C:\WINWORD\CCITTREC.DOT INTERNATIONAL TELECOMMUNICATION UNION CCITT G.727 THE INTERNATIONAL TELEGRAPH AND TELEPHONE CONSULTATIVE COMMITTEE GENERAL ASPECTS OF DIGITAL TRANSMISSION SYSTEMS; TERMINAL EQUIPMENTS 5-, 4-, 3- AND 2-BITS SAMPLE EMBEDDED ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION (ADPCM) Recommendation G.727 Geneva, 1990 Printed in Switzerland FOREWORD The CCITT (the International Telegraph and Telephone Consultative Committee) is a permanent organ of the International Telecommuni- cation Union (ITU). CCITT is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis. The Plenary Assembly of CCITT which meets every four years, establishes the topics for study and approves Recommendations pre- pared by its Study Groups. The approval of Recommendations by the members of CCITT between Plenary Assemblies is covered by the procedure laid down in CCITT Resolution No. 2 (Melbourne, 1988). Recommendation G.727 was prepared by Study Group XV and was approved under the Resolution No. 2 procedure on the 14th of December 1990. ___________________ CCITT NOTE In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication Administration and a recognized private operating agency. ãITU1990 All rights reserved. No part of this publication may be reproduced or uti- lized in any form or by any means, electronic or mechanical, including pho- tocopying and microfilm, without permission in writing from the ITU. PAGE BLANCHE Recommendation G.727 Recommendation G.727 5-, 4-, 3- AND 2-bits SAMPLE EMBEDDED ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION (ADPCM) 1 Introduction This Recommendation contains the specification of an embedded Adaptive Differential Pulse Code Modulation (ADPCM) algorithms with 5- , 4-, 3- and 2-bits per sample (i.e., at rates of 40, 32, 24 and 16kbit/s). The characteristics below are recommended for the conversion of 64kbit/s. A-law or m-law PCM channels to/ from variable rate-embedded ADPCM channels. The Recommendation defines the transcoding law when the source signal is a pulse-code modulation signal at a pulse rate of 64kbit/s devel- oped from voice frequency analog signals as fully specified by Blue Book Volume, RecommendationG.711. Applications where the encoder is aware and the decoder is not aware of the way in which the ADPCM codeword bits have been altered, or when both the encoder and decoder are aware of the ways the codewords are altered, or where neither the encoder nor the decoder are aware of the ways in which the bits have been altered can benefit from other embedded ADPCM algorithms. 2 General The embedded ADPCM algorithms specified here are extensions of the ADPCM algorithms defined in RecommendationG.726 and are recom- mended for use in packetized speech systems operating according to the Packetized Voice Protocol (PVP) specified in draft RecommendationG.764. PVP is able to relieve congestion by modifying the size of a speech packet when the need arises. Utilizing the embedded property of the algo- rithm described here, the least significant bit(s) of each codeword can be disregarded at packetization points and/or intermediate nodes to relieve con- gestion. This provides for significantly better performance than by dropping packets during congestion. Section 3 outlines a description of the ADPCM transcoding algo- rithm. Figure1/G.727 shows a simplified block diagram of the encoder and the decoder. Sections4 and5 provide the principles and functional descrip- tions of the ADPCM encoding and decoding algorithms, respectively. Section6 contains the computational details of the algorithm. In this sec- tion, each sub-block in the encoder and decoder is precisely defined using one particular logical sequence. If other methods of computation are used, extreme care should be taken to ensure that they yield exactly the same value for the output processing variables. Any further departures from the processes detailed in Section6 will incur performance penalties which may be severe. Figure 1/G.727 = 14,5 cm 3 Embedded ADPCM algorithms Embedded ADPCM algorithms are variable bit rate coding algorthms with the capability of bit dropping outside the encoder and decoder blocks. They consist of a series of algorithms such that the decision levels of the lower rates quantizers are subsets of the quantizer at the highest rate. This allows bit reductions at any point in the network without the need of coordi- nation between the transmitter and the receiver. In contrast, the decision lev- els of the conventional ADPCM algorithms. such as those in RecommendationG.726, are not subsets of one another and therefore, the transmitter must inform the receiver of the coding rate the encoding algo- rithm. Embedded algorithms can accommodate the unpredictable and bursty characteristics of traffic patterns that require congestion relief. Because con- gestion relief may occur after the encoding is performed, embedded coding is different from variable-rate coding where the encoder and decoder must use the same number of bits in each sample. In both cases, the decoder must be told the number of bits to use in each sample. Embedded algorithms produce code words which contain enhance- ment bits and core bits. The Feed-Forward (FF) path utilizes enhancement and core bits, while the Feedback (FB) path uses core bits only. The inverse quantizer and the predictor of both the encoder and the decoder use the core bits. With this structure, enhancement bits can be discarded or dropped dur- ing network congestion. However, the number of core bits in the FB paths of both the encoder and decoder must remain the same to avoid mistracking. The four embedded ADPCM rates are 40, 32, 24 and 16kbit/s, where the decision levels for the 32, 24 and 16kbit/s quantizers are sub-sets of those for the 40kbit/s quantizer. Embedded ADPCM algorithms are referred to by (x, y) pairs where x refers to the FF (enhancement and core) ADPCM bits and y refers to the FB (core) ADPCM bits. For example, if y is set to 2bits, (5,2) will represent the 40kbits/s embedded algorithm, (4,2) will represent the 32kbit/s embedded algorithm, (3,2) will represent the 24kbit/s embedded algorithm and (2,2) the 16 kbit/s algorithm. The bit rate is never less than 16kbit/s because the minimum number of core bits is2. Simplified block diagrams of both the embedded ADPCM encoder and decoder are shown in Figure1/G.727. The Recommendation provides coding rates of 40, 32, 24 and 16kbit/ s and core rates of 32, 24 and 16kbit/s. This corresponds to the following pairs: (5,2), (4,2), (3,2), (2,2); (5,3), (4,3), (3,3); (5,4), (4,4). 3.1 ADPCM encoder Subsequent to the conversion of the A-law or m-law PCM input sig- nal to uniform PCM, a difference signal is obtained by subtracting an esti- mate of the input signal from the input signal itself. An adaptive4-, 8-, 16- or 32-level quantizer is used to assign 2, 3, 4 or 5 binary digits to the value of the difference signal for transmission to the decoder. (Not all the bits nec- essarily arrive at the decoder since some of these bits can be dropped to relieve congestion in the packet network. For a given received sample, how- ever, the core bits are guaranteed arrival if there are no transmission errors and the packets arrive at destination.) FB bits are fed to the inverse quan- tizer. The number of core bits depends on the embedded algorithm selected. For example, the (5,2) algorithm will always contain 2core bits. The inverse quantizer produces a quantized difference signal from these binary digits. The signal estimate is added to this quantized difference signal to produce the reconstructed version of the input signal. Both the reconstructed signal and the quantized difference signal are operated upon by an adaptive predictor which produces the estimate of the input signal, thereby complet- ing the feedback loop. 3.2 ADPCM decoder The decoder includes a structure identical to the FB portion of the encoder. In addition, there is also an FF path that contains a uniform PCM to A-law or m-law conversion. The core as well as the enhancement bits are used by the synchronous coding adjustment block to prevent cumulative distortion on synchronous tandem codings (ADPCM-PCM-ADPCM, etc., digital connections) under certain conditions (see §5.10). The synchronous coding adjustment is achieved by adjusting the PCM output codes to elimi- nate quantizing distorsion in the next ADPCM encoding stage. 3.3 One's density requirements These algorithms produce the all-zero code words. If requirements on one's density exist in national networks, other methods should be used to ensure that this requirement is satisfied. 3.4 Applications In the anticipated application with G.764, the Coding Type (CT) field and the block Dropping Indicator (BDI) field in the packet header defined in G.764 will inform the coder of what algorithm to use. For all other applica- tions, the information that PVP supplies must be made known to the decoder. 4 ADPCM encoder principles Figure 2/G.727 is a block schematic of the encoder. For each variable to be described, k is the sampling index and samples are taken at 125ms intervals. A description of each block is given in §§4.1 to 4.9. Fig. 2/G.727 = 9,5 cm 4.1 Input PCM format conversion This block converts the input signal s(k) from A-law or m-law PCM to a uniform PCM signal sl(k). 4.2 Difference signal computation This block calculates the difference signal d(k) from the uniform PCM signal sl(k) and the signal estimate se(k). 4.3 Adaptive quantizer A 4-, 8-, 16- or 32-level non-uniform mid-rise adaptive quantizer is used to quantize the difference signal d(k). Prior to quantization, d(k) is con- verted to a base 2logarithmic representation and scaled by y(k) which is computed by the scale factor adaptation block. The normalized input/output characteristic (infinite precision values) of the quantizer is given in Tables1/G.727 through 4/G.727 for the 16, 24, 32 and 40kbit/s algorithms, respectively. Two, three, four or five binary digits are used to specify the quantized level representing d(k) (the most significant bit represents the sign bit and the remaining bits represent the magnitude). The 2-, 3-, 4- or 5- bit quantizer output I(k) forms the 16, 24, 32 or 40kbit/s output signal and is also fed to the bit-masking block. I(k) includes both the enhancement and core bits. 4.4 Bit masking This block produces the core bits Ic(k) by logically right-shifting the input signal I(k) so as to mask the maximum droppable (least significant) bits. The number of bits to mask and the number of places to right shift depend on the embedded algorithm selected. For example, this block will mask the two least significant bits (LSB's) and shift the remaining bits two places to the right when the (4,2) algorithm is selected. The output of the bit-masking block Ic(k) is fed to the inverse adaptive quantizer, the quan- tizer scale factor adaptation and the adaptation speed control blocks. 4.5 Inverse adaptive quantizer The inverse quantizer uses the core bits to compute a quantized ver- sion dq(k) of the difference signal using the scale factor y(k) and Table1/ G.727, 2/G.727, 3/G.727 or 4/G.727 and then taking the antilog to the base2 of the result. The estimated difference se(k) is added to dq(k) to pro- duce the reconstructed version sr(k) of the input signal. Table1/G.727, 2/ G.727, 3/G.727 or 4/G.727 will be applicable only when there are 2, 3, 4 or 5bits, respectively, in the FF path. 4.6 Quantizer scale factor adaptation This bloc computes y(k), the scaling factor for the quantizer and the inverse quantizer. (The scaling factor y(k) is also fed to the adaptation speed control block.) The inputs are the bit-masked output Ic(k) and the adaptation speed control parameter al(k). The basic principle used in scaling the quantizer is bimodal adapta- tion: — fast for signals (e.g., speech) that produce difference signals with large fluctuations, — slow for signals (e.g., voiceband data, tones) that produce differ- ence signals with small fluctuations. The speed of adaptation is controlled by a combination of fast and slow scale factors. The fast (unlocked) scale factor yu(k) is recursively computed in the base2 logarithmic domain from the resultant logarithmic scale factor y(k): where yu(k) is limited by 1.06 £ yu(k) £ 10.00. For 2-core-bit operation (1 sign bit), the discrete function W[Ic(k)] is defined as follows (infinite precision values): For 3-core-bit operation (1 sign bit), the discrete function W[Ic(k)] is defined as follows (infinite precision values): For 4-core-bit (1 sign bit) operation, the discrete function W[Ic(k)] is defined as follows (infinite precision values): The factor (1 – 2–5) introduces finite memory into the adaptive pro- cess so that the states of the encoder and decoder converge following trans- mission errors. The slow (locked) scale factor yl(k) is derived from yu(k) with a low pass filter operation: The fast and slow scale factors are then combined to form the result- ant scale for: where 0 £ al(k) £ 1. 4.7 Adaptation speed control The controlling parameter al(k) can assume values in the range [0, 1]. It tends towards unity for speech signals and towards zero for voiceband data signals. It is derived from a measure of the rate-of-change of the differ- ence signal values. Two measures of the average magnitude of Ic(k) are computed: and where F[Ic(k)] is defined by for 2-core-bit (1 sign bit) operation; or for 3-core-bit (1 sign bit) operation; or for 4-core-bit (1 sign bit) operation. Thus, dms(k) is a relatively short term average of F[Ic(k)] and dml(k) is a relatively long term average of F[Ic(k)]. Using these two averages, the variable ap(k) is defined: