Is EVS a solution to the limitations of AMR codecs?

Hi all,
Considering:
Traditional telephony voice bandwidth: up to 3.4 kHz
AMR-WB bandwidth: up to 7 kHz
EVS bandwidth: up to 20 kHz
Can we consider EVS a solution to the limitations of AMR codecs?
Is there any “side-effect”?

AMR-WB, Adaptive Multi-Rate - Wideband Speech Codec.

Most speech coding systems in use today are based on telephone-bandwidth narrowband speech, nominally limited to about 200-3400 Hz and sampled at a rate of 8 kHz.
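The 8 kHz sampling rate and the roughly 3.4 kHz usable band are tied together by the Nyquist limit (half the sampling rate). A minimal Python sketch of that relationship, using hypothetical helper names and including the 16 kHz wideband rate for comparison:

```python
def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Highest frequency a sampled signal can represent (half the sampling rate)."""
    return sample_rate_hz / 2


def min_sample_rate_hz(bandwidth_hz: int) -> int:
    """Theoretical lower bound on the sampling rate for a given audio bandwidth."""
    return 2 * bandwidth_hz


if __name__ == "__main__":
    for label, rate in [("narrowband", 8000), ("wideband", 16000)]:
        print(f"{label}: {rate} Hz sampling -> up to {nyquist_limit_hz(rate):.0f} Hz")
    print(f"20 kHz audio needs at least {min_sample_rate_hz(20000)} Hz sampling")
```

In practice the usable band stops short of the Nyquist limit (3.4 kHz rather than 4 kHz, 7 kHz rather than 8 kHz) because the anti-aliasing filters need some room to roll off.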

This limitation built into the public switched telephone network (PSTN) dates back to the first transcontinental telephone service at the beginning of the 20th century and imposes a constraint on communication quality.

Today, the increasing penetration of end-to-end digital networks such as the second- and third-generation wireless systems (2G and 3G) and voice over packet networks permits the use of wider speech bandwidth.

The AMR-WB speech codec uses ACELP (Algebraic Code Excited Linear Prediction) technology, which is also employed in the AMR narrowband and EFR speech codecs as well as in ITU-T G.729 and G.723.1 at 5.3 kbps, among others.

The AMR-WB speech codec consists of nine speech codec modes with bit rates of 23.85, 23.05, 19.85, 18.25, 15.85, 14.25, 12.65, 8.85 and 6.6 kbps.
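To make these rates more concrete, the sketch below converts each mode into speech bits per frame. It assumes 20 ms frames, which is the AMR-WB frame length as far as I know but is not stated in this post; the names are made up for illustration.

```python
FRAME_MS = 20  # assumption: 20 ms speech frames (not stated in the post)

# The nine AMR-WB modes listed above, in bits per second.
AMR_WB_RATES_BPS = [6600, 8850, 12650, 14250, 15850, 18250, 19850, 23050, 23850]


def bits_per_frame(rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Speech payload bits carried by one frame at the given bit rate."""
    return rate_bps * frame_ms // 1000


if __name__ == "__main__":
    for rate in AMR_WB_RATES_BPS:
        print(f"{rate / 1000:5.2f} kbps -> {bits_per_frame(rate)} bits per frame")
```

Under that assumption the lowest mode carries 132 bits per frame and the highest 477 bits.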

AMR-WB also includes a background noise mode that is designed to be used in discontinuous transmission (DTX) operation in GSM and as a low bit rate source-dependent mode for coding background noise in other systems.

In GSM the bit rate of this mode is 1.75 kbps.

Wideband speech coding results in major subjective improvements in speech quality. Compared to narrowband telephone speech, low-frequency enhancement in AMR-WB from 50 to 200 Hz contributes to increased naturalness, presence, and comfort.

The high-frequency extension from 3400 to 7000 Hz provides better fricative differentiation (for example, between words like fin and thin), and therefore higher intelligibility.

The adoption of AMR-WB by ETSI/3GPP and ITU-T (where it is referred to as G.722.2) is of significant importance because, for the first time, the same codec has been adopted for wireless as well as wireline services. This eliminates the need for transcoding and eases the implementation of wideband voice applications and services across a wide range of communication systems and platforms.

AMR-WB is an audio data compression scheme optimized for speech coding in GSM (Global System for Mobile Communications) and UMTS (Universal Mobile Telecommunications System). It is an elaboration of AMR and uses Algebraic Code Excited Linear Prediction (ACELP) compression coding.

The AMR-WB bitrates are 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05 and 23.85 kbps.

  • The lowest rate that provides excellent speech quality in a clean environment is 12.65 kbps.
  • Higher rates are useful in background noise conditions and in the case of music.
  • Rates of 6.60 and 8.85 kbps provide reasonable quality when compared to narrowband codecs like AMR.

In summary: Compared to narrowband speech codecs (like AMR) optimized for traditional telephone voice quality of 300-3400 Hz, the AMR-WB codec’s wider bandwidth of 50-7000 Hz provides excellent speech quality.

In traditional AMR-NB, the assumption is that the 0-4 kHz range contains most of the frequency components of a voice source, so a moderate-quality voice service can be provided if the source signal is band-limited to 4 kHz before the analog signal is encoded.

In AMR-WB, the source signal's frequency components are kept up to 7 kHz and that 7 kHz signal is encoded, which improves voice quality over traditional AMR-NB.

In narrowband AMR, the analog-to-digital sampling rate is 8000 samples/sec. Telephone-network PCM is 8-bit (companded A-law/μ-law), while the AMR-NB encoder itself takes 13-bit uniform PCM as input.
In wideband AMR, the analog-to-digital sampling rate is 16000 samples/sec and the PCM encoding is 14-bit.
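Taking the sampling rates and the 8-bit and 14-bit sample sizes quoted above at face value, the uncompressed PCM bit rate is just samples per second times bits per sample. The sketch below (hypothetical function names) works that out and compares the wideband figure with two of the AMR-WB rates listed earlier:

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed mono PCM bit rate: samples per second times bits per sample."""
    return sample_rate_hz * bits_per_sample


def compression_ratio(pcm_bps: int, codec_bps: int) -> float:
    """How many times smaller the coded stream is than the raw PCM stream."""
    return pcm_bps / codec_bps


if __name__ == "__main__":
    nb_pcm = pcm_bitrate_bps(8000, 8)    # narrowband figures from the post
    wb_pcm = pcm_bitrate_bps(16000, 14)  # wideband figures from the post
    print(f"narrowband PCM: {nb_pcm / 1000:.0f} kbps")
    print(f"wideband PCM:   {wb_pcm / 1000:.0f} kbps")
    for codec_bps in (12650, 23850):
        ratio = compression_ratio(wb_pcm, codec_bps)
        print(f"AMR-WB {codec_bps / 1000:.2f} kbps: about {ratio:.1f}x smaller than raw wideband PCM")
```

So the wideband input is 224 kbps of raw PCM, and the 12.65 kbps mode carries it in roughly one eighteenth of that.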

AMR-WB comprises nine coding rates. The first three rates, 6.60 kbps, 8.85 kbps and 12.65 kbps, make up the mandatory multirate configuration (set 0) for wideband voice telephony. Two optional configurations with 15.85 kbps or 23.85 kbps modes have been defined for use with specific telephony applications, such as multiparty conferencing.
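As a rough illustration of how such configurations could be represented, the sketch below models set 0 plus one optional top mode and picks the highest rate that fits a bit budget. This is only a toy: `build_mode_set` and `pick_mode` are hypothetical helpers, and real mode adaptation in the network is driven by link-quality measurements rather than a simple budget check.

```python
from typing import Optional, Tuple

SET_0: Tuple[float, ...] = (6.60, 8.85, 12.65)      # mandatory configuration, kbps
OPTIONAL_TOP_MODES: Tuple[float, ...] = (15.85, 23.85)  # optional additions, kbps


def build_mode_set(extra_mode: Optional[float] = None) -> Tuple[float, ...]:
    """Return set 0, optionally extended with one of the optional top modes."""
    if extra_mode is None:
        return SET_0
    if extra_mode not in OPTIONAL_TOP_MODES:
        raise ValueError(f"{extra_mode} kbps is not an optional top mode")
    return SET_0 + (extra_mode,)


def pick_mode(budget_kbps: float, modes: Tuple[float, ...] = SET_0) -> Optional[float]:
    """Highest mode in the set that does not exceed the budget, or None if none fits."""
    candidates = [m for m in modes if m <= budget_kbps]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    print(pick_mode(10.0))                         # mandatory set only
    print(pick_mode(30.0, build_mode_set(23.85)))  # with the 23.85 kbps option
```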