The drawings contained in this Recommendation have been done in Autocad.

SECTION 3

MONITORING AND MEASUREMENTS

Recommendation Q.791

MONITORING AND MEASUREMENTS FOR SIGNALLING SYSTEM No. 7 NETWORKS

1 General

1.1 Introduction

1.1.1 In order to effectively manage the resources provided by a Signalling System No. 7 network, it is necessary to monitor and measure the present and estimate the future performance, utilization, and availability of these resources.

Recommendation Q.791 is limited to measurements and monitoring of the MTP and SCCP. The principles and scope of this Recommendation are:

- measurements made on the signalling network resources are known as "raw" or primitive measurements and in general only these measurements are identified in this Recommendation;
- the recommended primitive measurements and, at times, other derived measurements whose computation using the primitive measurements is described, are those required for the effective management of the signalling network resources;
- a basic subset of signalling network measurements is recommended for international networks, but it is intended that this subset also be useful for national networks, which, however, may need additional measurements;
- monitoring and measuring are considered to be passive processes and although the results of monitoring and measuring may be used to invoke test and maintenance actions and procedures, it is left to other Recommendations, e.g. Recommendation Q.795, to provide details of such actions and procedures;
- Recommendation Q.791 is not intended to provide signalling network testing and maintenance procedures; it is left to other Recommendations to provide such procedures, e.g. Recommendations Q.707, Q.795 etc.

1.2 Local and global view

1.2.1 The signalling network measurements can provide both a local view and a global network view of the performance of the signalling network. The primitive measurements which provide the two views are not necessarily different.
Rather, the global view is a result of a summary of measurements from more than a single signalling point so that the behaviour of the signalling network is centrally observable. A global view of the performance of the signalling network, in general, becomes more useful as the network becomes larger (i.e. more signalling points or multiple users).

1.3 Grouping of measurements

1.3.1 Each primitive measurement is classified, for the purpose of guidance, into one or more categories, called operations, maintenance and administration, which indicate its general area of use (see §§ 2 and 5). The listing of measurements includes for each measurement an indication of the appropriate categories (operations, administration and maintenance) and reference to the pertinent Recommendations.

1.4 Guidelines for uses of measurements

1.4.1 The measurements may be used singly, or in conjunction with other measurements. It is not the intent of the Recommendation to specify the computations and algorithms to be applied to the primitive measurements. Guidelines, however, are provided (see § 5) for some uses of measurements so that, for example, the view at both ends of an international link is consistent.

Fascicle VI.9 - Rec. Q.791

2 Definition of terms

2.1 Operations (O)

2.1.1 The operation of network resources utilizes measurements that are used in real time, or are retained for short time intervals. Operations activities include signalling network surveillance.

2.1.2 Signalling network management "on occurrence" events and measurements include those which monitor and measure the signalling network response to abnormal conditions. (Requires further study.)

2.1.3 Signalling network surveillance measurements include those which monitor and measure the signalling network resources to ensure that the appropriate network performance is maintained.
2.2 Maintenance (M)

2.2.1 Maintenance of the signalling network resources may involve the monitoring of the facility and equipment resources and maintaining network performance by expediting preventive and corrective effort when the measurements indicate a problem.

2.3 Administration (A)

2.3.1 The administration of the signalling network resources involves measurements that are used on a long-term basis and are in general retained external to the signalling network resources (see Recommendation Q.795, § 2.6).

2.3.2 Administration activities include planning and dimensioning (engineering) the signalling network resources, including determination of the resource quantities, e.g. number of links set, and resource configuration, e.g. routing.

3 Listing of measurements

3.1 General

3.1.1 The recommended measurements are presented in Tables 1/Q.791 to 9/Q.791. Explanatory notes relating to the contents of these tables are given below.

3.1.2 The obligatory column is used to indicate those measurements which must be provided at a signalling point. The additional ACT/PERM column indicates whether these measurements are permanently activated, or activated on demand. In non-obligatory cases, if the measurement is provided, the administration must also decide whether the measurement will be activated on demand or be permanently active.

3.1.3 The count items in the tables, identified in the units column as "events/SP", "MSUs/SL" etc., imply the total count of events in the specified period and implicitly indicate the identity of what is being counted, i.e. "events/SP" identifies the Signalling Point, "MSUs/SL" identifies the Signalling Link, etc.

3.1.4 The event items in the tables which are recorded "on occurrence" are intended to be recorded with a time stamp, giving the unique network time when the event indicator was generated (see Recommendation Q.795, § 2.7).
The resolution and accuracy of the time stamp should be as high as possible, to increase the ability to resolve complex and rapid sequences of events.

3.1.5 The periods of measurement are specified in the Duration of Measurement column.

3.2 Table 1/Q.791

However, the specific cause for the failure (Items 1.3-1.6) is an additional optional measurement.

3.2.2 The measurement of "number of Signal Units received in error" contains the number of items (not necessarily the number of Signal Units) between what are perceived as "Flags" plus the number of sets of 16 octets received in the "octet counting" mode.

3.3 Table 2/Q.791

3.3.1 Local busy is defined as the period during which busy LSSUs are transmitted.

3.4 Table 3/Q.791

3.4.1 The notation "3/2" in the Level column indicates that the measured octets are those transferred across the Level 3/Level 2 boundary in the appropriate direction.

3.4.2 The opening flag and the check bits are included in Item 3.2.

3.4.3 The signalling link congestion (Items 3.6-3.11) refers to link status "congested" at Level 3. A link is marked at Level 3 as congested when a congestion threshold is reached at the transmit side (see Recommendation Q.704, § 3.6 on Signalling Network Congestion and § 11 on Signalling Traffic Flow Control). These measurements should be kept as "thresholds 1, 2 and 3 separately" if that national option is selected.

3.5 Table 4/Q.791

3.5.1 Measurements 4.9 through 4.12 are required at Signalling Points in international networks if measurements 5.1 through 5.4 are not available to an RPOA. In other networks, measurements 5.1 through 5.4 at consecutive Signalling Points from origination to destination of a call might be used to derive measurements 4.9 through 4.12; consequently, real time collection of the latter may not be necessary.

3.5.2 Measurements 4.9 and 4.10 are only obligatory in international networks.
3.5.3 Measurements 4.5 and 4.6 are only required at Signalling Transfer Points.

3.6 Table 5/Q.791

3.6.1 Measurement 5.5, the number of MSUs discarded due to a routing data error, can be used to trigger the MTP Route Verification Test (MRVT) described in Recommendation Q.795, § 2.3.

3.7 Table 6/Q.791

3.7.1 Activation of the measurements in Table 6/Q.791 is recommended on a per Point Code (PC) or set of Point Codes and/or Service Information Octet (SIO) basis. The measurements are not obligatory.

3.7.2 Some of the measurements in Table 6/Q.791 may be of interest for accounting purposes.

3.8 Table 7/Q.791

o of measurements.

3.9 Table 8/Q.791

3.9.1 Coordinated State Change Control measurements (Items 8.6 and 8.7) are to be taken at the signalling point of the sub-system requesting to go out of service. These measurements are only applicable at nodes with replicated sub-systems.

3.9.2 Unavailability measurements 8.1, 8.2, 8.3, 8.4 and 8.5 are architecturally dependent.

Tables 1/Q.791 to 9/Q.791

3.10 Table 9/Q.791

3.10.1 SCCP management messages are included in the totals of Items 9.3, 9.4, 9.6 and 9.7.

3.10.2 SCCP utilization measurements, Items 9.3 and 9.4, refer to all messages processed by SCCP Routing Control, whether or not the message is processed or delivered successfully.

3.10.3 Measurement 9.5 measures the utilization of the translation function within SCCP Routing Control and is a count of all messages for which global title translation is attempted. The measurement is only applicable at nodes with translation capabilities.

3.10.4 Measurement 9.8 refers only to those messages which would normally have been routed to a sub-system but, because of a change in the translation process (e.g. due to a routing failure towards that sub-system), are directed to a backup sub-system.
The measurement is only applicable at replicated nodes with translation capabilities.

4 Operations and maintenance part support

4.1 The measurements defined in this Recommendation are intended to be controlled through the use of the operations and maintenance application part defined in Recommendation Q.795. Recommendation Q.795 defines the functions needed to initiate and stop the measurements and the procedures to handle the transfer of data after collection. Long-term measurement collection procedures are defined in § 2.6 of Recommendation Q.795 and on-occurrence measurement reporting procedures in § 2.7.

5 Uses of measurements

5.1 Introduction

5.1.1 This section provides a context for the measurements listed in Tables 1/Q.791 to 9/Q.791. It describes briefly the operational, maintenance and administrative activities likely to be associated with a Signalling System No. 7 network and how the measurements may be used to support these activities.

5.1.2 A list of supporting measurements (if any) follows each description. Each measurement is identified by its table number followed by a decimal point and the sequence number of the measurement within the table (e.g., Item 1.2 is the second measurement of Table 1/Q.791).

5.2 Operational uses

5.2.1 Message Transfer Part (MTP)

5.2.1.1 Surveillance of network status

This activity is concerned with surveillance of the network as a whole in order to coordinate and assign priorities to maintenance actions. The information to support this activity will come from indicators of the operational and congestion status. These indicators may be found in the tables designated as Usage "O" and duration of measurement "on-occurrence".
Measurements to survey network status:

- local automatic changeover (Item 1.10);
- local automatic changeback (Item 1.11);
- start of remote processor outage (Item 2.10);
- stop of remote processor outage (Item 2.11);
- SL congestion indications (Item 3.6);
- stop of SL congestion (Item 3.9);
- number of congestion events resulting in loss of MSUs (Item 3.11);
- start of linkset failure (Item 4.3);
- stop of linkset failure (Item 4.4);
- initiation of Broadcast TFP due to failure of measured linkset (Item 4.5);
- initiation of Broadcast TFA for recovery of measured linkset (Item 4.6);
- start of unavailability in measurement 4.9 (Item 4.11);
- stop of unavailability in measurement 4.9 (Item 4.12);
- adjacent signalling point inaccessible (Item 5.1);
- stop of adjacent signalling point inaccessible (Item 5.4).

Additional measurements may be provided to the operations user for determining the integrity of the network. These measurements will be provided on a five or thirty minute basis.

Measurements:

- duration of link in the in-service state (Item 1.1);
- duration of SL unavailability (for any reason) (Item 2.1);
- local management inhibit (Item 2.13);
- local management uninhibit (Item 2.14);
- duration of local busy (Item 2.15);
- number of SIF and SIO octets received (Item 3.4);
- unavailability of route set to a given destination or set of destinations (Item 4.9);
- duration of adjacent signalling point inaccessible (Item 5.2).

5.2.1.2 Monitoring of link and network traffic performance

This activity is concerned with ensuring that congestion thresholds and the numbers of discarded messages are within specification. If, for example, the number of Message Signal Units (MSUs) discarded due to a routing data error exceeds limits, the Routing Verification Test described in Recommendation Q.795 could be initiated to identify the source and type of routing data error.
Discarded message counts may be gathered signalling point by signalling point and added together to give a total network performance measure. One aspect of traffic performance can be monitored by measuring the amount of time that a given link is congested. The link loading or congestion duration must match the criteria upon which provisioning of links has been based.

Measurements to monitor links:

- number of Signalling Information Field (SIF) and Service Information Octet (SIO) octets transmitted (Item 3.1);
- SL congestion indications (Item 3.6);
- cumulative duration of SL congestion (Item 3.7).

Measurements of MSUs discarded:

- due to SL congestion (Item 3.10);
- due to routing data error (Item 5.5).

Duration measurements measure the effects of signalling link set and route set availability, by individual link set and route set. These measurements identify the effects of congestion or failure upon the surrounding network.

Measurements:

- duration of link in the in-service state (Item 1.1);
- duration of SL unavailability (for any reason) (Item 2.1);
- duration of SL unavailability due to remote processor outage (Item 2.9);
- duration of local busy (Item 2.15);
- cumulative duration of SL congestion (Item 3.7);
- duration of unavailability of signalling linkset (Item 4.2);
- duration of unavailability in measurement 4.9 (Item 4.10);
- duration of adjacent signalling point inaccessible (Item 5.2).

5.2.2 Signalling Connection Control Part (SCCP)

5.2.2.1 SCCP Routing Performance

The monitoring of routing failures allows the SCCP Routing and Translation function to detect any abnormal number of messages which cannot be routed, independent of the originator being informed through message return.
Measurements:

Routing Failure due to:

- no translation for address of such nature (Item 7.1);
- no translation for this specific address (Item 7.2);
- network failure (point code not available) (Item 7.3);
- network congestion (Item 7.4);
- sub-system failure (unavailable) (Item 7.5);
- sub-system congestion (Item 7.6);
- unequipped user (sub-system) (Item 7.7);
- reason unknown (Item 7.9).

In addition, the following measurements can be used as a consistency check or a network protection mechanism:

- UDTS messages sent (Item 9.1);
- UDTS messages received (Item 9.2).

5.2.2.2 SCCP unavailability

The monitoring of SCCP unavailability may prove useful in the activation/deactivation of other network measurements.

Measurements:

Start of local SCCP unavailable due to:

- failure (Item 8.1);
- maintenance made busy (Item 8.2);
- congestion (Item 8.3).

Stop of local SCCP unavailable:

- all reasons (Item 8.4).

5.2.3 Telephony User Part

For further study.

5.2.4 Integrated Services Digital Network User Part (ISDN-UP)

For further study.

5.2.5 Transaction Capabilities Application Part (TCAP)

For further study.

5.3 Maintenance uses

The activities described in this section relate basically to the detection of degraded performance and to the maintenance of a particular signalling point and the signalling links associated with that signalling point. They may be used on a near real time basis, or may be monitored over a period of days or weeks to detect unfavourable trends. They are designed so that one signalling point can monitor its own status without relying on measurements from adjacent signalling points.

5.3.1 Message Transfer Part (MTP)

5.3.1.1 Detection of increases in link SU error rates

This activity ensures that the signalling data link error rate is not rising beyond specification. The SU Error Rate Monitor is the basic instrument for monitoring signalling data link performance.
Basic traffic counts are used to normalize performance measurements in order to compare system performance measurements.

Measurements:

- number of SIF and SIO octets transmitted (Item 3.1);
- number of SIF and SIO octets received (Item 3.4).

Operational measurements counting error events provide supplementary information to warn of impending failures or give a running assessment of signalling data link quality.

Measurements:

- number of Signal Units (SUs) in error (monitors incoming performance) (Item 1.8);
- number of Negative Acknowledgements (NACKs) received (monitors outgoing performance) (Item 1.9).

Counting total Signal Unit errors allows the estimation of Signalling Data Link bit error rates (see Recommendation Q.706, § 3.1) assuming that errors are random. The estimate uses measurement 1.1 (duration of link in the in-service state) multiplied by the link transmission rate, which gives the number of bits transferred during the period.

Measurements:

- duration of link in the in-service state (Item 1.1);
- duration of link unavailability (any reason) (Item 2.1).

5.3.1.2 Detection of marginal link performance

The SU Error Rate Monitor applies to lost alignment as well as corrupted data. Usually both conditions are caused by degraded performance of the transmission facility. Alignment and proving failures often indicate a marginally performing link.

Measurements:

- SL alignment failure (Item 1.7).

This activity is concerned with detecting routing instabilities caused by marginal link performance.

Measurements:

- local automatic changeover (Item 1.10);
- local automatic changeback (Item 1.11);
- SL congestion indications (Item 3.6);
- cumulative duration of SL congestion (Item 3.7);
- number of congestion events resulting in loss of MSUs (Item 3.11).

5.3.1.3 Detection of link failure events in either direction

By "link failure" is meant an event which causes a particular link to be unavailable for signalling (i.e. a failure at Level 1 or Level 2).
Signalling link failures are detected in order to require preventive and corrective maintenance actions to restore the network capabilities. This maintenance action can be required on a single failure event or when the number of signalling links in failure for a link set or across different link sets exceeds a threshold.

Signalling link failure measurements are summarized not only for specific link sets, but also across many different link sets, where these may involve common transmission systems or signalling points. The distribution of failure and degradation sources may be randomly located, but if specific network elements appear to be common to a large number of the failures, then they are suspect as a significant failure source requiring further maintenance action.

Measurements:

- number of link failures:
  - all reasons (Item 1.2);
  - abnormal FIBR/BSNR (Item 1.3);
  - excessive delay of acknowledgement (Item 1.4);
  - excessive error rate (Item 1.5);
  - excessive duration of congestion (Item 1.6);
- signalling link restoration (Item 1.12).

5.3.1.4 Detection of routing and distribution table errors

In operation, the Signalling System No. 7 routing data will be updated frequently as the network changes. It is necessary to keep track of signalling point status and routing problems on a routine basis (see Recommendation Q.795, § 2.1).

Measurements:

- duration of unavailability of signalling linkset (Item 4.2);
- start of linkset failure (Item 4.3);
- stop of linkset failure (Item 4.4);
- initiation of Broadcast TFP due to failure of measured linkset (Item 4.5);
- initiation of Broadcast TFA for recovery of measured linkset (Item 4.6);
- unavailability of route set to a given destination or set of destinations (Item 4.9);
- duration of unavailability in measurement 4.9 (Item 4.10);
- start of unavailability in measurement 4.9 (Item 4.11);
- stop of unavailability in measurement 4.9 (Item 4.12);
- adjacent SP inaccessible (Item 5.1);
- duration of adjacent SP inaccessible (Item 5.2);
- stop of adjacent SP inaccessible (Item 5.4);
- MSUs discarded due to a routing data error (Item 5.5).

5.3.1.5 Component reliability and maintainability studies

These studies are concerned with calculating the mean time between failures (MTBF) and mean time to repair (MTTR) for each type of component in the Signalling System No. 7 network. It may be useful for some purposes to have MTBF and MTTR data by Signalling System No. 7 function with which to correlate associated maintenance action.

Measurements:

- number of link failures:
  - all reasons (Item 1.2);
  - abnormal FIBR/BSNR (Item 1.3);
  - excessive delay of acknowledgement (Item 1.4);
  - excessive error rate (Item 1.5);
  - excessive duration of congestion (Item 1.6);
- duration of SL inhibition due to local management actions (Item 2.5);
- duration of SL inhibition due to remote management actions (Item 2.6);
- duration of SL unavailability due to link failure (Item 2.7);
- duration of SL unavailability due to remote processor outage (Item 2.9);
- start of remote processor outage (Item 2.10);
- stop of remote processor outage (Item 2.11);
- local management inhibit (Item 2.13);
- local management uninhibit (Item 2.14).

5.3.2 Signalling connection control part (SCCP)

5.3.2.1 SCCP routing performance

interworking difficulties.

Measurements:

Routing failures:

- no translation for address of such nature (Item 7.1);
- no translation for this specific address (Item 7.2);
- network failure (point code not available) (Item 7.3);
- network congestion (Item 7.4);
- sub-system failure (unavailable) (Item 7.5);
- sub-system congestion (Item 7.6);
- unequipped user (sub-system) (Item 7.7);
- reason unknown (Item 7.9).

Protocol interworking:

- syntax error detected (Item 7.8).

5.3.2.2 SCCP availability

It is useful to monitor the effectiveness of Coordinated State Change Control.
Measurements:

- sub-system out of service request granted (Item 8.6);
- sub-system out of service request denied (Item 8.7).

5.3.3 Telephony user part

For further study.

5.3.4 Integrated services digital network user part (ISDN-UP)

For further study.

5.3.5 Transaction capabilities application part (TCAP)

For further study.

5.4 Administrative uses

5.4.1 Message transfer part (MTP)

5.4.1.1 Monitoring of link and signalling point utilization

MTP utilization measurement is concerned with evaluating message flows to ensure that they are not beginning to exceed stated link and signalling point capacities. It also ensures that existing routing is resulting in proportionate utilization of available capacity.

Measurements by link:

- duration of link in the in-service state (Item 1.1);
- duration of SL unavailable (for any reason) (Item 2.1);
- number of SIF and SIO octets transmitted (Item 3.1);
- octets retransmitted (Item 3.2);
- number of message signal units transmitted (Item 3.3);
- number of SIF and SIO octets received (Item 3.4);
- number of message signal units received (Item 3.5);
- SL congestion indications (Item 3.6);
- cumulative duration of SL congestion (Item 3.7).

Measurements by signalling point:

- number of SIF and SIO octets received:
  - with given Origination Point Code (OPC) (Item 6.1);
  - with given OPC and SIO (Item 6.4);
- number of SIF and SIO octets transmitted:
  - with given Destination Point Code (DPC) (Item 6.2);
  - with given DPC and SIO (Item 6.5);
- number of SIF and SIO octets handled:
  - with given SIO (Item 6.3);
  - with given OPC, DPC, and SIO (Item 6.6).

Measurements by signalling route set:

- unavailability of route set to a given destination or set of destinations (Item 4.9);
- duration of unavailability in measurement 4.9 (Item 4.10);
- MSUs discarded due to routing data error (Item 5.5).
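As an illustration of how the link measurements listed in § 5.4.1.1 might be combined into a derived utilization figure, the following is a minimal sketch and not a computation defined by this Recommendation. The function name, the 64 kbit/s link rate, the per-MSU overhead of 6 octets (added because Items 3.1 and 3.4 count only SIF and SIO octets) and the sample figures are all assumptions made for the example.

```python
def link_occupancy(sif_sio_octets, msus, in_service_seconds,
                   link_rate_bps=64_000, overhead_octets_per_msu=6):
    """Estimate the fraction of link capacity used in one direction.

    sif_sio_octets     -- Item 3.1 (transmitted) or Item 3.4 (received)
    msus               -- Item 3.3 (transmitted) or Item 3.5 (received)
    in_service_seconds -- Item 1.1 over the same measurement period
    link_rate_bps and overhead_octets_per_msu are assumed values.
    """
    if in_service_seconds <= 0:
        return None  # link never in service: occupancy is undefined
    # Items 3.1/3.4 count SIF and SIO octets only, so an assumed
    # per-MSU framing overhead is added to approximate bits on the link.
    total_bits = 8 * (sif_sio_octets + overhead_octets_per_msu * msus)
    return total_bits / (in_service_seconds * link_rate_bps)


# Hypothetical 30-minute period with the link in service throughout.
occupancy = link_occupancy(sif_sio_octets=2_400_000, msus=100_000,
                           in_service_seconds=1800)
print(f"occupancy = {occupancy:.3f}")
```

The result would then be compared against whatever loading criterion the administration used when provisioning the link; that threshold is outside the scope of the sketch.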
5.4.2 Signalling connection control part (SCCP)

5.4.2.1 SCCP utilization

Network administration is interested in monitoring SCCP utilization for use in analyzing the current network and designing future network configurations. One way to monitor SCCP utilization is to measure the amount of SCCP traffic.

Measurements:

SCCP traffic received:

- UDTS messages (Item 9.2);
- total messages (for connectionless only) (classes 0 & 1) (Item 9.7).

SCCP traffic sent:

- UDTS messages (Item 9.1);
- total messages (for connectionless only) (classes 0 & 1) (Item 9.6).

General:

- total messages handled (from local or remote sub-systems) (Item 9.3);
- total messages intended for local sub-systems (Item 9.4);
- total messages requiring global title translation (Item 9.5);
- messages sent to a backup sub-system (Item 9.8).

5.4.2.2 SCCP routing performance

Network Administration is also interested in tracking long-term message routing performance of the SCCP. This can be obtained from the following measurements or their sum.

Measurements:

SCCP Routing failures:

- no translation for address of such nature (Item 7.1);
- no translation for this specific address (Item 7.2);
- network failure (point code not available) (Item 7.3);
- network congestion (Item 7.4);
- sub-system failure (unavailable) (Item 7.5);
- sub-system congestion (Item 7.6);
- unequipped user (sub-system) (Item 7.7);
- reason unknown (Item 7.9).

SCCP unavailability:

- duration of local SCCP unavailable (all reasons) (Item 8.5).

5.4.3 Telephony user part

For further study.

5.4.4 Integrated services digital network user part (ISDN-UP)

For further study.

5.4.5 Transaction capabilities application part (TCAP)

For further study.

5.5 Preparation of traffic forecasts

5.5.1 This activity is concerned with the calculation of values which will be entered into provisioning tables to determine future equipment quantities required.
The data to be used are those already collected to support activities mentioned in §§ 5.2.1.2 and 5.4.1.1. Depending upon implementation, more detailed measurements may be required to provision such items as internal buffers or number of processors where these may vary.

5.6 Network planning

5.6.1 This activity requires longer-term traffic forecasts, based as much upon marketing intentions as upon extrapolations of existing patterns. Nevertheless, to understand existing patterns, planners need knowledge of traffic origins and destinations.

5.6.2 The measurements in Table 6/Q.791 and Table 9/Q.791 indicate how much traffic is being originated at the measured signalling point, and how much traffic has that signalling point as a destination. These measurements are useful for calculating traffic flows by origination/destination pair.

5.6.3 In reality, however, traffic flows do not spread randomly through a network. For each origin, distance and other factors result in a concentration of flows to favoured destinations. As a result, it will be necessary to measure flows on the network by destination.

5.6.4 Given the large potential number of destinations, measurements may have to be grouped (see explanatory notes for Table 6/Q.791 and Table 9/Q.791 in § 3).

5.7 Evaluation of maintenance force effectiveness

This activity consists of managerial control of the maintenance function, through examination of failure trends, equipment availabilities and the amount of outage due to manual as opposed to automatic busying of components. The activity is usually carried out with the aid of indices based upon data listed in § 5.3.
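One possible shape for the indices mentioned in § 5.7 is sketched below, using link measurements listed in § 5.3.1.5 together with Item 2.1. It is an illustration under stated assumptions rather than a method defined by this Recommendation: the formulas are the conventional MTBF, MTTR and availability definitions, treating management inhibition as the "manual" share of outage is an interpretation, and the function name, dictionary keys and sample values are hypothetical.

```python
def maintenance_indices(period_s, link_failures, unavail_failure_s,
                        inhibit_local_s, inhibit_remote_s, unavail_total_s):
    """Derive simple per-link indices for one measurement period.

    link_failures     -- Item 1.2, number of link failures (all reasons)
    unavail_failure_s -- Item 2.7, SL unavailability due to link failure
    inhibit_local_s   -- Item 2.5, SL inhibition by local management
    inhibit_remote_s  -- Item 2.6, SL inhibition by remote management
    unavail_total_s   -- Item 2.1, SL unavailability for any reason
    """
    indices = {"availability": 1.0 - unavail_total_s / period_s}
    if link_failures > 0:
        # Assumed conventional definitions: operating time per failure,
        # and failure-related downtime per failure.
        indices["mtbf_s"] = (period_s - unavail_total_s) / link_failures
        indices["mttr_s"] = unavail_failure_s / link_failures
    if unavail_total_s > 0:
        # Assumption: management inhibition stands for manual busying.
        manual_s = inhibit_local_s + inhibit_remote_s
        indices["manual_outage_share"] = manual_s / unavail_total_s
    return indices


# Hypothetical one-week period for a single signalling link.
week = maintenance_indices(period_s=604_800, link_failures=4,
                           unavail_failure_s=1_200, inhibit_local_s=600,
                           inhibit_remote_s=0, unavail_total_s=2_000)
for name, value in week.items():
    print(f"{name}: {value:.4f}")
```

Tracking such indices over successive periods, rather than reading a single value, is what would expose the failure trends that § 5.7 refers to.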