Speech Codecs: Pros & Cons: Understanding Various Speech Codecs

A speech codec is a method of compressing and decompressing an audio file or stream that contains speech data. The word codec stands for coder/decoder.

Various kinds of speech codecs are available. Because these codecs are built on different algorithms, they have different specifications and suit different applications. Most of them comply with industry standards such as those of the ITU.

The various software speech codecs are:

G.711
G.722
G.723 & G.723.1
G.726
G.728
G.729
AMR, AMR-WB, AMR-NB

These speech codecs differ technically in several respects, including compression technology or algorithm, supported platforms, bandwidth, and data rates.

One can easily look up and compare the various speech codecs on Wikipedia. Even so, it is often unclear which speech codec is appropriate where, and the answer depends on the application. Understanding the pros and cons of some of these codecs gives better insight.

G.711

Overview

G.711 specifies pulse code modulation (PCM) of voice frequencies on a 64 kbit/s channel. It uses a sampling rate of 8,000 samples per second, and non-uniform quantization with 8 bits represents each sample, resulting in a 64 kbit/s bit rate.

Two standard companding algorithms are used: (1) the µ-law algorithm and (2) the A-law algorithm.
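To make the idea of non-uniform quantization concrete, here is a minimal Python sketch of µ-law companding. It uses the continuous textbook µ-law curve with µ = 255 followed by a uniform 8-bit quantizer, which is an assumption made for illustration: the actual G.711 tables use a piecewise-linear approximation of this curve, and all function names below are hypothetical.

```python
import math

MU = 255  # assumed µ value for 8-bit µ-law companding

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto [-1, 1] with the continuous µ-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode_8bit(x: float) -> int:
    """Compand a sample, then quantize it uniformly to a code in 0..255."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return min(255, int((y + 1.0) / 2.0 * 256))

def decode_8bit(code: int) -> float:
    """Reconstruct a sample from the midpoint of its quantization cell."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mu_law_expand(y)

# Bit rate from the figures in the text: 8,000 samples/s at 8 bits each.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # 64000 bit/s
```

Because the curve is steep near zero, quiet samples are reconstructed with a much smaller absolute error than loud ones, which is the point of companding speech before 8-bit quantization.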

Pros

Designed to deliver precise transmission of speech
Very low processing overheads

Cons

Poor network efficiency
Lacks missing packet interpolation
Including packet overheads, it uses more than 64 kbit/s in each direction (about 80 kbit/s at the IP layer with 20 ms packets), so links should be provisioned with headroom above the codec rate
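The overhead mentioned in the list above can be estimated with a short calculation. This is a sketch under stated assumptions: voice is sent over RTP/UDP/IPv4 with 40 bytes of headers per packet (20 + 8 + 12), link-layer framing is ignored, and the function name is hypothetical.

```python
def ip_bandwidth_bps(codec_bps: int, packet_ms: float, header_bytes: int = 40) -> float:
    """Estimate one-way IP-layer bandwidth for a constant-bit-rate codec.

    header_bytes defaults to 40, an assumed IPv4 + UDP + RTP header size.
    """
    payload_bytes = codec_bps / 8 * packet_ms / 1000  # speech bytes per packet
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second

# G.711 at 64 kbit/s with two common packet durations.
g711_20ms = ip_bandwidth_bps(64000, 20)
g711_10ms = ip_bandwidth_bps(64000, 10)
print(g711_20ms, g711_10ms)
```

Shorter packets lower the delay but raise the overhead, since the same headers are sent more often.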

Other Version

G.711.1 is an extension of G.711 that allows the addition of narrowband and/or wideband (16,000 samples/s) enhancements, leading to data rates of 64, 80 or 96 kbit/s.

G.722

Overview

G.722 is an ITU standard wideband speech codec operating at 48, 56 or 64 kbit/s. The codec is based on split-band (sub-band) ADPCM.

Pros

It is useful in fixed network voice over IP applications, where the required bandwidth is typically not prohibitive
It also offers a significant improvement in speech quality over older narrowband codecs such as G.711

Cons

It is not optimal for broadcast remotes

Other Version

G.722.1 is an ITU-T standard audio codec used for high-quality speech. It is a transform-based compressor that is optimized for both speech and music. Its computational complexity is quite low, and the end-to-end algorithmic delay is 40 ms.

G.722.2 is also known as AMR-WB. It is a speech coding standard developed after AMR that uses similar ACELP technology. See the AMR section for further details.

G.723 & G.723.1

G.723 is completely different from G.723.1.

G.723 Overview:

G.723 is an ITU standard for speech codecs that uses the ADPCM method and provides good quality audio at 24 and 40 kbit/s.

Note: The G.723 codec was mainly used for digital circuit multiplication equipment (DCME) applications and was later folded into G.726. See the G.726 section.

G.723.1 Overview:

G.723.1 is a speech codec that compresses voice audio in 30 ms frames. An algorithmic look-ahead of 7.5 ms duration means that total algorithmic delay is 37.5 ms.
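The delay figure above is simply the frame length plus the look-ahead. As a small illustrative sketch, the helper below reproduces that sum and converts both durations to sample counts; the 8,000 samples/s rate is an assumption carried over from narrowband telephony, and the function names are hypothetical.

```python
def algorithmic_delay_ms(frame_ms: float, lookahead_ms: float) -> float:
    """The encoder must buffer one full frame plus the look-ahead before coding."""
    return frame_ms + lookahead_ms

def samples_in(duration_ms: float, sample_rate_hz: int = 8000) -> int:
    """Number of samples in a duration at an assumed 8 kHz sampling rate."""
    return int(sample_rate_hz * duration_ms / 1000)

# G.723.1 figures taken from the text: 30 ms frames, 7.5 ms look-ahead.
delay_ms = algorithmic_delay_ms(30, 7.5)
frame_samples = samples_in(30)
lookahead_samples = samples_in(7.5)
print(delay_ms, frame_samples, lookahead_samples)
```

This counts only algorithmic delay; processing time, packetization and network jitter buffers add to it in a real system.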

Pros

Very high compression whilst maintaining high quality audio.
Allows simultaneous encode & decode in software (on fast computers)
G.723.1 is very effective for the audio portion of videoconferencing and telephony over the public telephone network (POTS).

Cons

Requires a lot of processor power.
Not well-suited to music or sound effects
Lower quality than many other codecs at similar data rates

G.726

Overview

G.726 is an ADPCM speech codec for the transmission of voice at rates of 16, 24, 32, and 40 kbit/s. G.721 and G.723 have been folded into G.726.

Pros

At 32 kbit/s, it uses half the bit rate of G.711, increasing the usable network capacity by 100%
Widely used on international trunks in the phone network
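The capacity claim above follows from simple arithmetic on the G.726 rates. The sketch below assumes the same 8,000 samples/s sampling rate as G.711 and compares each rate against a single 64 kbit/s channel; the variable names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000          # assumed narrowband sampling rate
G711_RATE_BPS = 64000          # one G.711 channel, used as the reference
G726_RATES_BPS = [16000, 24000, 32000, 40000]

# Bits spent on each sample at every G.726 rate.
bits_per_sample = {rate: rate // SAMPLE_RATE_HZ for rate in G726_RATES_BPS}

# Whole G.726 calls that fit in the space of one 64 kbit/s channel.
calls_per_64k = {rate: G711_RATE_BPS // rate for rate in G726_RATES_BPS}

for rate in G726_RATES_BPS:
    print(rate, bits_per_sample[rate], calls_per_64k[rate])
```

At 32 kbit/s two calls fit where one G.711 call did, which is the 100% capacity increase listed under Pros.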

Cons

Not well-suited to music or sound effects

G.728

Overview

G.728 uses Low-Delay Code Excited Linear Prediction (LD-CELP) compression technology at 16 kbit/s.

Pros

G.728 rates as “toll quality”, so its voice quality is good compared with earlier speech codecs.
G.728 is a low-delay speech coder, hence its use in applications including satellite, cellular, and video conferencing systems

Cons

Few bits are available for error protection

G.729

Overview

The G.729 speech codec uses an audio data compression algorithm and compresses speech at 8 kbit/s, with extensions providing rates of 6.4 and 11.8 kbit/s.

Pros

Low delay: speech is compressed in frames as short as 10 milliseconds
Because of its low bit rate of around 8 kbps, it is mostly used in Voice over IP (VoIP) applications

Cons

Speech quality decreases marginally
Music and tones such as DTMF or fax tones cannot be transported reliably with this codec
License required for use

Other Version

G.729A is a reduced-complexity version of G.729, and G.729B adds silence compression; both use the Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-ACELP) compression algorithm. The reduction in complexity may result in a small decrease in voice quality. G.729A is suitable for VoIP or similar applications using multimedia, voice, and/or data.

AMR

Overview

Adaptive Multi-Rate (AMR) is an audio data compression scheme optimized for speech coding. AMR was adopted as the standard speech codec by 3GPP.

Pros

Superior sound quality in the wideband variant (AMR-WB) due to wider speech bandwidth

Cons

The disadvantage is, of course, the delay it introduces in the voice path.

Other Version

AMR-WB (Adaptive Multi-Rate Wideband) is a speech coding standard developed after AMR, using similar ACELP technology.

AMR-NB (Adaptive Multi-Rate Narrowband) is a speech codec employed in low-bitrate applications like mobile phones. It is a form of ACELP.

To commercialize these speech codecs, a couple of portals are available where one can promote and procure them. Portals such as design-reuse, chipestimates, and IPsupermarket.com allow you to buy, sell, or license various speech codecs.

Friday, January 9, 2009
