Recommendation H.320

NARROW-BAND VISUAL TELEPHONE SYSTEMS AND TERMINAL EQUIPMENT

1 Scope

This Recommendation covers the technical requirements for narrow-band visual telephone services defined in H.200/AV.120-Series Recommendations, where channel rates do not exceed 1920 kbit/s.

Note - It is anticipated that this Recommendation will be extended to a number of Recommendations, each of which would cover a single videoconferencing or videophone service (narrow-band, broadband, etc.). However, large parts of these Recommendations would have identical wording, while in the points of divergence the actual choices between alternatives have not yet been made; for the time being, therefore, it is convenient to treat all the text in a single Recommendation.

The service requirements for visual telephone services are presented in Recommendation H.200/AV.120-Series; video and audio coding systems and other technical aspects common to audiovisual services are covered in other Recommendations in the H.200/AV.200-Series.

2 Definitions

bit-rate allocation signal (BAS)
Bit position within the frame structure of H.221 used to transmit, e.g., commands, control and indication signals, and capabilities.

control and indication (C&I)
End-to-end signalling between terminals, consisting of control, which causes a state change in the receiver, and indication, which provides information as to the functioning of the system; see also Recommendation H.230.

data port
Input/output gate for the user data transmitted within the service channel or sub-channels according to Recommendation H.221.

lip synchronization
Operation to give the impression that the speaking motion of the displayed person is synchronized with that person's voice.

in-band signalling
Signalling via BAS of the H.221 frame structure.

multipoint control unit (MCU)
A piece of equipment located in a node of the network or in a terminal which receives several channels from access ports and, according to certain criteria, processes audiovisual signals and distributes them to the connected channels.

man-machine interface (MMI)
Interface between user and terminal/system which consists of a physical section (electro-acoustic and electro-optic transducers, keys, etc.) and a logical section dealing with functional operation states.

narrow-band
Bit rates ranging from 64 kbit/s to 1920 kbit/s. This channel capacity may be provided as a single B/H0/H11/H12-channel or multiple B/H0-channels in ISDN.

out-band signalling
Signalling via a channel not being part of the B/H0/H11/H12-channel (according to I.400-Series Recommendations).

visual telephone services
A group of audiovisual services including videophone, defined in Recommendation F.721, and videoconferencing, to be defined in H.200/AV.112-Series Recommendations.

FIGURE 1/H.320 (figure not reproduced)

3 System description

3.1 Block diagram and identification of elements

A generic visual telephone system is shown in Figure 1/H.320. It consists of terminal equipment, network, multipoint control unit (MCU) and other system operation entities.

A configuration of the terminal equipment consisting of several functional units is also shown in Figure 1/H.320.

Video I/O equipment includes cameras, monitors and video processing units to provide functions such as a split-screen scheme.

Audio I/O equipment includes microphones, loudspeakers and audio processing units to provide functions such as acoustic echo cancellation.

Telematic equipment comprises visual aids, such as an electronic blackboard or a still picture transceiver, which enhance basic visual telephone communication.

System control unit carries out such functions as network access through end-to-network signalling, and end-to-end control to establish a common mode of operation and signalling for proper operation of the terminal through end-to-end signalling.

Video codec carries out redundancy reduction coding and decoding for video signals, while audio codec does the same for audio signals.

Delay in the audio path compensates for video codec delay to maintain lip synchronization.

Mux/dmux unit multiplexes transmitted video, audio, data and control signals into a single bit stream and demultiplexes a received bit stream into its constituent multimedia signals.

Network interface makes the necessary adaptation between the network and the terminal according to the user-network interface requirements defined in the I.400-Series Recommendations.

3.2 Signals

Visual telephone signals are classified into video, audio, data and control as follows:

Audio signals are continuous traffic and require real-time transmission.

Note - In order to reduce the average bit rate of audio signals, voice activation can be introduced (in which case the audio signals are no longer continuous).

Video signals are also continuous traffic; the bit rate allocated to video signals should be as high as possible, in order to maximize the quality within the available channel capacity.

Data signals include still pictures, facsimile and documents, or other facilities; these signals may occur only occasionally as required and may temporarily displace all or part of the audiovisual signal content. It should be noted that data signals are associated only with optional enhancements to the basic visual telephone system; therefore, the opening of a path to carry such signals is preceded by negotiation between the terminals.

Control signals are system control signals by definition.
The path for the terminal-to-network control signals is provided in the D-channel, while the path for the terminal-to-terminal control signals is provided in BAS or the service channel, only when necessary, by the mechanism defined in Recommendation H.221.

3.3 Bit rate options and infrastructure

3.3.1 Communication modes of visual telephone

Communication modes of visual telephone are defined in Table 1/H.320 according to their channel configuration and coding.

TABLE 1/H.320
Communication modes of visual telephone

Mode  Channel rate (kbit/s)  ISDN channel (Note 2)  ISDN interface (Basic/Primary rate)  Audio coding                        Video coding
a0    64                     B                                                           Rec. G.711                          Not applicable
a1    64                     B                                                           Rec. H.200/AV.254
b1    128                    2B                                                          Rec. G.711
b2    128                    2B                                                          Rec. G.722
b3    128                    2B                                                          Rec. H.200/AV.254, AV.253 (Note 1)
c     192                    3B
d     256                    4B
e     320                    5B                                                                                              Rec. H.261
f     384                    6B
g     384                    H0                     Applicable                           Rec. G.722
h     768                    2H0
i     1152                   3H0                    Not applicable
j     1536                   4H0
k     1536                   H11
l     1920                   5H0
m     1920                   H12

Note 1 - (Audio coding of mode b3) In addition to H.200/AV.254, higher quality audio coding such as H.200/AV.253 may be used for this mode.

Note 2 - For multiple channels of B/H0, all channels are synchronized at the terminal according to § 2.7 of Recommendation H.221.

3.3.2 Terminal types of visual telephone

Table 2/H.320 lists terminal types of visual telephone. The terminal type is categorized according to the communication modes and the type of communication channels with which the terminal can communicate: mxB (type X with parameter a-f), nxH0 (type Y with parameter 1-5; see Note), H11/H12 (type Z with parameter a-ß) or their combinations.

Note - Type Y terminals must have the H0-6B compatibility mode defined in Recommendation H.221 for interworking of evolving networks.

3.3.2.1 Examples:

a) type Xb3 is a terminal capable of operating at modes a0, a1, b1, b2 and b3 through B or 2xB-channel;
b) type Xb3Y1 is a terminal capable of operating at modes a0, a1, b1, b2, b3 and g through B, 2xB- or H0-channel;
c) type XfY4Za is a terminal capable of operating at modes a0-k through (1-6)xB, (1-4)xH0- or H11-channel.

For MxB and NxH0 categories, the terminal should in principle be able to operate at all values of m and n not higher than M and N (see Note). The type of remote terminal is identified through the transfer rate capability exchange defined in Recommendation H.242.

Note - Until Recommendation H.200/AV.254 is recommended, exceptions may arise.

3.3.3 Video codec

As per Recommendation H.261.

3.3.4 Audio codec

As per Recommendations G.711, G.722, H.200/AV.254, AV.253 (see Table 1/H.320).

3.3.5 Frame structure

As per Recommendation H.221.

3.3.6 Control and indication (C&I)

An identified subset of H.230 is used (see § 4.4).

3.3.7 Communication procedure

As per Recommendation H.242.

3.4 Call control arrangements

To establish intercommunication between various audiovisual terminals, it is necessary to carry out in-band and out-band procedures according to Recommendation H.242 and other relevant Recommendations.

The different stages of the call are described with reference to a point-to-point configuration where terminal X is the calling terminal and Y the called terminal.

TABLE 2/H.320
Visual telephone terminal type

                              Type X (Note 2)                 Type Y (Note 3)  Type Z
Mode                          a  b1 b2 b3 b4 b5 c  d  e  f    1  2  3  4  5    a  b
a0  B (audio only)            X  X  X  X  X  X  X  X  X  X    -  -  -  -  -    -  -
a1  B (H.200/AV.254 audio)    X  X  X  X  -  -  X  X  X  X    -  -  -  -  -    -  -
b1  2B (G.711 audio)          -  X  X  X  X  X  X  X  X  X    -  -  -  -  -    -  -
b2  2B (G.722 audio)          -  -  X  X  -  X  X  X  X  X    -  -  -  -  -    -  -
b3  2B (H.200/AV.254 audio)   -  -  -  X  -  -  X  X  X  X    -  -  -  -  -    -  -
c   3B                        -  -  -  -  -  -  X  X  X  X    -  -  -  -  -    -  -
d   4B                        -  -  -  -  -  -  -  X  X  X    -  -  -  -  -    -  -
e   5B                        -  -  -  -  -  -  -  -  X  X    -  -  -  -  -    -  -
f   6B                        -  -  -  -  -  -  -  -  -  X    -  -  -  -  -    -  -
g   H0                        -  -  -  -  -  -  -  -  -  -    X  X  X  X  X    -  -
h   2H0                       -  -  -  -  -  -  -  -  -  -    -  X  X  X  X    -  -
i   3H0                       -  -  -  -  -  -  -  -  -  -    -  -  X  X  X    -  -
j   4H0                       -  -  -  -  -  -  -  -  -  -    -  -  -  X  X    -  -
k   H11                       -  -  -  -  -  -  -  -  -  -    -  -  -  -  -    X  -
l   5H0                       -  -  -  -  -  -  -  -  -  -    -  -  -  -  X    -  -
m   H12                       -  -  -  -  -  -  -  -  -  -    -  -  -  -  -    -  X

Note 1 - X means that a terminal of the type is equipped with the mode.

Note 2 - Types Xb4 and Xb5 are defined to take into account that H.200/AV.254 has not yet been recommended.

Note 3 - A terminal of this type must have the H0-6B compatible mode defined in Recommendation H.221.
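
The terminal type notation of § 3.3.2 can be illustrated with a short Python sketch. It is not part of this Recommendation: the function name `modes_for`, the data layout and the spelling of the type Z parameters as `a` and `b` are assumptions, and the assignment of modes to each type parameter is an inferred reading of Table 2/H.320, chosen so that it agrees with examples a) to c) of § 3.3.2.1.

```python
import re

# Communication modes in the order of Table 1/H.320.
MODE_ORDER = ["a0", "a1", "b1", "b2", "b3", "c", "d", "e", "f",
              "g", "h", "i", "j", "k", "l", "m"]

# Type X (m x B). Assumed reading of Table 2/H.320.
X_MODES = {
    "a":  {"a0", "a1"},
    "b1": {"a0", "a1", "b1"},
    "b2": {"a0", "a1", "b1", "b2"},
    "b3": {"a0", "a1", "b1", "b2", "b3"},
    "b4": {"a0", "b1"},          # without H.200/AV.254 audio
    "b5": {"a0", "b1", "b2"},    # without H.200/AV.254 audio
}
# Types Xc..Xf add modes c..f cumulatively on top of type Xb3.
for i, p in enumerate("cdef"):
    X_MODES[p] = X_MODES["b3"] | set("cdef"[: i + 1])

# Type Y (n x H0): Y1 -> g, Y2 -> g,h ... Y5 -> g,h,i,j,l.
Y_MODES = {str(n): set(["g", "h", "i", "j", "l"][:n]) for n in range(1, 6)}

# Type Z (H11/H12); parameter names "a"/"b" are an assumption.
Z_MODES = {"a": {"k"}, "b": {"m"}}

TYPE_RE = re.compile(r"(?:X(a|b[1-5]|[c-f]))?(?:Y([1-5]))?(?:Z([ab]))?")


def modes_for(terminal_type):
    """Return the modes of a terminal type such as 'Xb3Y1', in table order."""
    match = TYPE_RE.fullmatch(terminal_type)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized terminal type: {terminal_type!r}")
    x, y, z = match.groups()
    modes = set()
    if x:
        modes |= X_MODES[x]
    if y:
        modes |= Y_MODES[y]
    if z:
        modes |= Z_MODES[z]
    return [mode for mode in MODE_ORDER if mode in modes]
```

Under these assumptions, `modes_for("Xb3Y1")` yields modes a0, a1, b1, b2, b3 and g, which matches example b) above.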

3.4.1 Establishment of a visual telephone call - Normal procedure

The provision of the communication is made in the following main steps:

- phase A: call set-up, out-band signalling;
- phase B1: mode initialization on initial channel;
- phase CA: call set-up of additional channel(s), if relevant;
- phase CB1: initialization on additional channel(s);
- phase B2 (or CB2): establishment of common parameters;
- phase C: visual telephone communication;
- phase D: termination phase;
- phase E: call release.

3.4.1.1 Phase A - Call set-up

After initiation by the user, terminal X performs a call set-up procedure. As soon as the terminal receives an indication from the network that the connection is established, a bidirectional channel is opened from end to end, and the terminal overlays H.221 framing on the channel.

Following the connection establishment, all the terminals will start to work in the following modes defined in Recommendation H.221:

- type X: mode 0F (A-law or µ-law);
- type Y and type Z: mode 0F (A-law or µ-law), audio only.

The in-band procedure is activated.

3.4.1.2 Phase B1 - Mode initialization

3.4.1.3 Phase B1-1

Using the procedures provided in Recommendation H.242, framed PCM audio is transmitted in both directions; after frame and multiframe alignment, terminal capabilities are exchanged.

3.4.1.4 Phase B1-2 (terminal procedure)

Determination of the appropriate mode to be transmitted. This will normally be the highest common mode (see Table 3/H.320 for the case using a B or 2xB-channel), but a lower compatible mode could be chosen instead.

In the case that both terminals have announced the capability to work on additional channel(s), terminal X initiates the request for supplementary call set-up. Alternatively, this action may be suspended until the user at X has given the go-ahead; the Y user may also control the additional channel requests. This is for further study.

Note - If the user at either terminal does not wish the call to proceed to two or more channels, even though his terminal has this capability, he must set the terminal such that only single-channel capability is declared in phase B1-1. In that case, the active capability, as wished by the users, should be distinguished from the virtual capability of the terminal.

3.4.1.5 Phase B1-3 (mode switching)

Both terminals switch to the mode they have identified in phase B1-2, using the procedure of Recommendation H.242.

Note - If the terminals have not both adopted the common mode, an asymmetric communication may result.

TABLE 3/H.320
Common mode (default) for communication between different types of visual telephones using a B or 2xB-channel

Xa   Xb1  Xb2  Xb3  Xb4  Xb5  Terminal type
a1   a1   a1   a1   a0   a0   Xa
     b1   b1   b1   b1   b1   Xb1
          b2   b2   b1   b2   Xb2
               b3   b1   b2   Xb3
                    b1   b1   Xb4
                         b2   Xb5

Note - The communication modes indicated in the table above include the possibility to use the CIF format as well as the QCIF format. The CIF format is used in both directions of transmission if both terminals indicate this capability. In all other cases the QCIF format is used. Each terminal may use a minimum decodable picture interval in its sending direction which makes best use of the capability indicated by the other terminal. This table does not include interworking situations between visual telephones and telephone terminals. If visual telephone terminals are connected to telephones, mode a0 is used for the communication.

3.4.1.6 Phase CA - Call set-up of the additional channel(s)

Following phases B1-3 and B2 if relevant, the communication phase C proceeds on that channel. If additional channels have been requested, these again go through phase A (hence the nomenclature phase CA), exactly as in phase A above, and additional call set-ups are performed by the terminals. On each of the established channels H.221 framing is overlaid (see Note).

Note - During phase CA an intermediate audiovisual mode could be offered on the initial channel used for initialization, until full completion of the initialization phase.

3.4.1.7 Phase CB1 - Mode initialization on additional channel(s)

3.4.1.8 Phase CB1-11

Using the procedure provided in Recommendation H.242, frame and multiframe alignments are gained.

3.4.1.9 Phase CB1-12

Synchronization of the channels is achieved.

3.4.1.10 Phase CB1-2 (terminal procedure)

Determination of the appropriate mode to be transmitted. This will normally be the highest common mode, but a lower compatible mode could be chosen instead.

3.4.1.11 Phase CB1-3 (mode switching)

Both terminals switch to the mode they have identified in phase B1-2 using the procedure of Recommendation H.242.

Note - Here again, if the terminals have not both adopted the common mode, an asymmetric communication will result.

3.4.1.12 Phase B2 (or CB2) - Establishment of common parameters

This phase establishes common operational parameters specific to visual telephone (e.g. encryption) after the phase B1 process is finished. Capabilities or requirements of the receiving side are first indicated; then the sending side decides operational parameters and controls the receiving side. BAS codes for this purpose are defined in Recommendation H.221.

3.4.1.13 Phase C - Visual telephone communication

In the case where more than one channel is used, there will be intermediate phases CA, CB1, CB2 as described in this section. Likewise, if additional channels are dropped during the call, there will be intermediate phases CD, CE as described in § 3.4.4. The provisions of this paragraph apply to any channel, initial or additional, for which phases B1 and B2 have been completed and phase D not yet started.

3.4.1.13.1 Mode switching

According to action by either user (for example, starting a facsimile machine) a different mode from the highest common mode may become more appropriate.
Switching to this mode is made according to the procedure of Recommendation H.242.

3.4.1.13.2 Capability change

The user may change the capability of his terminal during the call (for example, by connecting or switching on auxiliary telematic equipment); the terminal must then initiate the capability exchange procedure defined in Recommendation H.242.

3.4.1.14 Phase D - Termination phase

3.4.1.15 Phase D1 (terminal procedure)

When one of the users hangs up, the terminal invokes phase D2 directly.

3.4.1.16 Phase D2 (mode switching)

Mode 0F is forced according to Recommendation H.242 (or taking into account the result of phase D1 if different, for further study).

3.4.1.17 Phase E - Call termination (release)

The terminal which has initiated the hang-up sends messages over the D-channel with respect to all channels and idles all of them (that is, no more information is sent over them). At the other terminal, the first disconnect message received will idle all channels. The actual disconnection occurs at reception of the other disconnect message(s).

3.4.2 Exceptional procedures for phases A and B

In case of an unsuccessful outcome during phases A and B (which may have many causes), exceptional procedures are provided in order to ensure a suitable service. The matter is for further study.

3.4.3 Exceptional procedures during phase C

During the actual exchange of audiovisual data, problems may occur in some channels. Fallback procedures, managed by the terminal, are activated. The description of the procedures and the appropriate indications are for further study.

3.4.4 Addition and dropping of channels during a visual telephone call

3.4.4.1 Addition

According to action by a user (for example, the activation of auxiliary equipment) one or more additional channels are requested. The procedure follows those described for phases CA and CB1.

3.4.4.2 Dropping

Two phases are envisaged:

3.4.4.2.1 Phase CD1

The common mode, appropriate to the channel(s) which remains, is selected.
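
Phase CD1 can be read as picking the highest mode that both terminals support and that still fits the channels that remain. The following Python sketch illustrates that reading under stated assumptions: the function name `common_mode`, the ordering of modes (taken from Table 1/H.320, later entries counted as higher) and the restriction to the B-channel modes a0 to f are illustrative choices, not requirements of this Recommendation.

```python
# Number of B-channels used by each mxB mode, per Table 1/H.320.
# Insertion order doubles as the assumed "lowest to highest" ordering.
B_CHANNELS = {"a0": 1, "a1": 1, "b1": 2, "b2": 2, "b3": 2,
              "c": 3, "d": 4, "e": 5, "f": 6}
MODE_ORDER = list(B_CHANNELS)


def common_mode(local_modes, remote_modes, channels_remaining):
    """Highest mode supported by both sides that fits the remaining B-channels."""
    candidates = [
        mode for mode in MODE_ORDER
        if mode in local_modes
        and mode in remote_modes
        and B_CHANNELS[mode] <= channels_remaining
    ]
    if not candidates:
        raise ValueError("no common mode fits the remaining channels")
    return candidates[-1]


# Mode sets for terminal types Xb3 and Xb5 (assumed reading of Table 2/H.320).
XB3 = {"a0", "a1", "b1", "b2", "b3"}
XB5 = {"a0", "b1", "b2"}
```

With these sets, `common_mode(XB3, XB5, 2)` gives b2, in line with the Xb3/Xb5 entry of Table 3/H.320, and dropping to one channel gives a0.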

3.4.4.2.2 Phase CD2

The mode switching procedure of Recommendation H.242 is applied to invoke the mode identified in phase CD1; the remaining channel is the channel used for initialization (see phase A). It supports an appropriate fallback mode. The matter is for further study.

3.4.5 Transmission and display of pictures at the start of a visual telephone call

According to the chosen terminal procedures, pictures may or may not be visible to both users as soon as initialization is complete. In the case that either phase B1-3 or phase CB1-3 has activated a common mode including video, mutual visibility of the users is possible. The following paragraphs collect alternative procedures which can be used to suspend picture display until user intervention (by mutual agreement or otherwise) causes pictures to be displayed.

1) No video transmitted: In phase B1-2 and (if relevant) phase CB1-2 the mode selected includes video OFF. During phase C either user may unilaterally switch to video ON; alternatively, his terminal may send the C&I BAS code VIR (video indicate ready-to-activate), but not switch to video ON until VIR is also received from the other terminal. While the incoming video-OFF state remains, the visual telephone screen should display a symbol or message indicating this (i.e. there is no fault). As already noted in § 3.4.1, phase B1-2, the request for additional channels may, according to terminal procedure, be delayed while video OFF is maintained; user action to activate video would then result in procedure phases CA1, CB1 (CB2 if required).

2) Video pattern transmitted: An electronically generated or other pattern is transmitted instead of the signal from a normal camera. The C&I BAS code VIS (video indicate suppressed) is used to indicate the situation to the remote party.

3) Video transmitted but not displayed: Terminal procedures simply involve local action to display not the incoming signal but an explanatory symbol or message.
User action would cause the incoming signal to be displayed, but if this should depend on mutual action by both users, then a new C&I BAS code VRD (video ready-to-display) must be defined. This point is for further study.

3.5 Optional enhancements

3.5.1 Data ports

Data ports, as physical I/O ports of the terminal for telematic and other equipment, are activated/deactivated by BAS commands. Depending on the transmission capability of a connection, e.g. multiples of B/H0 channels, etc., various bit rates are available at these ports. Allocation of bit streams to the port(s) is performed by in-band signalling. Data conveyed at the port(s) is transparent, the data rates being listed in Annex A to Recommendation H.221.

3.5.2 Encryption

Encryption may be applied to audio and video signals separately (preferably for multipoint connections) or to the multiplexed audio and video signals. Switching the encryption process on and off has to be signalled between the terminals (or between terminal and MCU respectively) via in-band signalling.

4 Terminal requirements

4.1 Environments

Under study.

4.2 Audio and video arrangements

Under study.

4.3 Delay compensation in the audio path

The H.261 video codecs require some processing delay, while the H.200/AV.250-Series audio codecs involve much less delay. Hence, if lip synchronization is to be maintained, the video processing delay must be compensated in the audio path. Since video coder and decoder delays may vary according to implementation, delay compensation must be carried out individually at the coder and decoder. A reference measurement method of video coder and decoder delays is defined in Recommendation H.261.

4.4 Control and indications (C&I)

C&I are chosen from the general audiovisual set contained in Recommendation H.230. For visual telephone systems, those signals in Table 4/H.320 are used, where their source, sink, synchronization with picture, transmission channel and codewords are indicated.

All visual telephone terminals have a video source providing a picture of participants, and some terminals may have additional video sources; the participant-picture source is designated #1, having the associated symbol VIA. When incoming video is ON (BAS command (010) [1 or 2]) and VIA, VIA2, VIA3 have not been transmitted, source #1 is assumed.

5 Intercommunications

The mechanisms for intercommunication with other services are described in the H.200/AV.240-Series Recommendations.

5.1 Intercommunication between different visual telephone terminal types

A common mode of operation is determined as described in § 3.4.1 above. D-channel signalling should include new LLC and HLC which are appropriate for audiovisual services, but this point is for further study.

5.2 Intercommunication with telephony

Note - This paragraph describes communications using a B-channel.

TABLE 4/H.320
C&I signals for visual telephone

C&I signal                                               C/I  Source            Sink                Sync. with picture  Transmission channel  Codeword definition
Picture format                                           I    Decoder           Coder               No                  BAS                   H.221
Picture format                                           C    Coder             Decoder             Yes                 Embedded in video     H.261
Minimum decodable picture interval                       I    Decoder           Coder               No                  BAS                   H.221
Freeze picture request control, VCF                      C    Coder or MCU      Decoder             No                  BAS                   H.221
Fast update request control, VCU                         C    Decoder or MCU    Coder               No                  BAS                   H.221
Freeze picture release control                           C    Coder             Decoder             Yes                 Embedded in video     H.261
MCU related message                                      C    Terminal or MCU   Terminal or MCU     No                  MLP                   H.200/AV.270-Series
Multipoint command conference cancel MCC/cancel-MCC      C    MCU               Terminal            No                  BAS                   H.230
Multipoint command symmetrical data transmission, MCS    C    MCU               Terminal            No                  BAS                   H.230
Multipoint command negating MCS, MCN                     C    MCU               Terminal            No                  BAS                   H.230
Audio loop request control, LCA                          C    Terminal          Terminal            No                  BAS                   H.221
Video loop request control, LCV                          C    Terminal          Terminal            No                  BAS                   H.221
Digital loop request control, LCD                        C    Terminal          Terminal            No                  BAS                   H.221
Loop off request, LCO                                    C    Terminal          Terminal            No                  BAS                   H.221
Split-screen indication                                  I    Sending terminal  Receiving terminal  Yes                 Embedded in video     H.261
Document camera indication                               I    Sending terminal  Receiving terminal  Yes                 Embedded in video     H.261
Audio active/muted indication, AIA/AIM                   I    Sending terminal  Receiving terminal  No                  BAS                   H.230
Video active indication, VIA, VIA2, VIA3                 I    Sending terminal  Receiving terminal  No                  BAS                   H.230
Video ready to activate/suppressed indication, VIR/VIS   I    Sending terminal  Receiving terminal  No                  BAS                   H.230

5.2.1 Intercommunication with ISDN telephones

A call from a visual telephone to an ISDN telephone is first placed as an audiovisual call. If the ISDN telephone returns "incompatible destination", or the network returns "recovery on timer expiry" because there is no response from the called side, the visual telephone may switch to a speech or 7 kHz audio bearer service call.

A call from an ISDN telephone to a visual telephone is accepted by the visual telephone because every audiovisual terminal is equipped with this telephone capability as a minimum function.

For both of the above cases, the operational mode of communication is G.711 speech or G.722 audio.

5.2.2 Intercommunication with PSTN telephones

A call from a visual telephone to a PSTN telephone may be initiated as an audiovisual call. If the network returns "no route to destination", the visual telephone may switch to a speech or 3.1 kHz audio bearer service call. The operational mode of communication is G.711 audio coding.

A call from a PSTN telephone is routed into the ISDN as a 3.1 kHz audio call, which can be answered by the visual telephone for the same reason as described in § 5.2.1. The operational mode of communication is 3.1 kHz audio.

5.3 Intercommunication with other audiovisual terminals

A common mode of operation is determined according to H.200/AV.242-Series Recommendations.

6 Maintenance

Some loop-back functions are envisaged to allow verification of the functional aspects of the terminal in order to ensure correct operation of the system and satisfactory quality of the service to the remote party.

The following loop-back functions (see Figure 2/H.320) are envisaged:

a) Loop at terminal-network interface (towards network)
Upon receiving the digital loop-back BAS, loop-back is activated at the digital interface of the terminal towards the network side. In case of a multiple B/H0 channel arrangement, loop-back is activated in each connection.

b) Loop at terminal-network interface (towards terminal)
The procedure is for further study.

c) Loop at analogue I/O interface
Upon receiving the video loop-back or audio loop-back BAS, loop-back is activated at the analogue interface of the video/audio codec towards the video/audio codec.

The possibility of having a self-checking procedure at the terminal is for further study.

7 Human factor aspects

To achieve error-free and uncomplicated utilization of terminal equipment and service from the user's standpoint, human factor related aspects have to be studied and recommended. These aspects deal with the flow of information between user and terminal/network. This information can be divided into a physical section and a logical section of the MMI.

FIGURE 2/H.320 (figure not reproduced)

7.1 Physical section

Figures and properties of transducers (camera, microphone, etc.). Signals particularly related to the service, keys, pictograms.

7.2 Logical section

Procedures, e.g. for call establishment/release, during communication phase. Consistency between the MMIs of visual telephone and terminals of other teleservices.

INTERNATIONAL TELECOMMUNICATION UNION

CCITT H.320
THE INTERNATIONAL TELEGRAPH AND TELEPHONE CONSULTATIVE COMMITTEE

LINE TRANSMISSION OF NON-TELEPHONE SIGNALS

NARROW-BAND VISUAL TELEPHONE SYSTEMS AND TERMINAL EQUIPMENT

Recommendation H.320

Geneva, 1990

FOREWORD

The CCITT (the International Telegraph and Telephone Consultative Committee) is the permanent organ of the International Telecommunication Union (ITU). CCITT is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The Plenary Assembly of CCITT, which meets every four years, establishes the topics for study and approves Recommendations prepared by its Study Groups. The approval of Recommendations by the members of CCITT between Plenary Assemblies is covered by the procedure laid down in CCITT Resolution No. 2 (Melbourne, 1988).

Recommendation H.320 was prepared by Study Group XV and was approved under the Resolution No. 2 procedure on 14 December 1990.

___________________

CCITT NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication Administration and a recognized private operating agency.

© ITU 1990

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.