ANSI C Rationale  -> 2 Environment                                                       Index 

1  Introduction

This Rationale summarizes the deliberations of X3J11, the Technical Committee charged by ANSI with devising a standard for the C programming language.  It has been published along with the draft Standard to assist the process of formal public review. 

The X3J11 Committee represents a cross-section of the C community: it consists of about fifty active members representing hardware manufacturers, vendors of compilers and other software development tools, software designers, consultants, academics, authors, applications programmers, and others.  In the course of its deliberations, it has reviewed related American and international standards both published and in progress.  It has attempted to be responsive to the concerns of the broader community: as of September 1988, it had received and reviewed almost 200 letters, including dozens of formal comments from the first public review, suggesting modifications and additions to the various preliminary drafts of the Standard. 

Upon publication of the Standard, the primary role of the Committee will be to offer interpretations of the Standard.  It will consider and respond to all correspondence received. 

1.1  Purpose

The Committee's overall goal was to develop a clear, consistent, and unambiguous Standard for the C programming language which codifies the common, existing definition of C and which promotes the portability of user programs across C language environments. 

The X3J11 charter clearly mandates the Committee to codify common existing practice.  The Committee has held fast to precedent wherever this was clear and unambiguous.  The vast majority of the language defined by the Standard is precisely the same as is defined in Appendix A of The C Programming Language by Brian Kernighan and Dennis Ritchie, and as is implemented in almost all C translators.  (This document is hereinafter referred to as K&R.)

K&R is not the only source of ``existing practice.''  Much work has been done over the years to improve the C language by addressing its weaknesses.  The Committee has formalized enhancements of proven value which have become part of the various dialects of C. 

Existing practice, however, has not always been consistent.  Various dialects of C have approached problems in different and sometimes diametrically opposed ways.  This divergence has happened for several reasons.  First, K&R, which has served as the language specification for almost all C translators, is imprecise in some areas (thereby allowing divergent interpretations), and it does not address some issues (such as a complete specification of a library)  important for code portability.  Second, as the language has matured over the years, various extensions have been added in different dialects to address limitations and weaknesses of the language; these extensions have not been consistent across dialects. 

One of the Committee's goals was to consider such areas of divergence and to establish a set of clear, unambiguous rules consistent with the rest of the language.  This effort included the consideration of extensions made in various C dialects, the specification of a complete set of required library functions, and the development of a complete, correct syntax for C. 

The work of the Committee was in large part a balancing act.  The Committee has tried to improve portability while retaining the definition of certain features of C as machine-dependent.  It attempted to incorporate valuable new ideas without disrupting the basic structure and fabric of the language.  It tried to develop a clear and consistent language without invalidating existing programs.  All of the goals were important and each decision was weighed in the light of sometimes contradictory requirements in an attempt to reach a workable compromise. 

In specifying a standard language, the Committee used several guiding principles, the most important of which are:

Existing code is important, existing implementations are not. A large body of C code exists of considerable commercial value.  Every attempt has been made to ensure that the bulk of this code will be acceptable to any implementation conforming to the Standard.  The Committee did not want to force most programmers to modify their C programs just to have them accepted by a conforming translator. 

On the other hand, no one implementation was held up as the exemplar by which to define C: it is assumed that all existing implementations must change somewhat to conform to the Standard. 

C code can be portable. Although the C language was originally born with the UNIX operating system on the DEC PDP-11, it has since been implemented on a wide variety of computers and operating systems.  It has also seen considerable use in cross-compilation of code for embedded systems to be executed in a free-standing environment.  The Committee has attempted to specify the language and the library to be as widely implementable as possible, while recognizing that a system must meet certain minimum criteria to be considered a viable host or target for the language. 

C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the Committee did not want to force programmers into writing portably, to preclude the use of C as a ``high-level assembler'': the ability to write machine-specific code is one of the strengths of C.  It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program (§1.7). 

Avoid ``quiet changes.'' Any change to widespread practice altering the meaning of existing code causes problems.  Changes that cause code to be so ill-formed as to require diagnostic messages are at least easy to detect.  As much as seemed possible consistent with its other goals, the Committee has avoided changes that quietly alter one valid program to another with different semantics, that cause a working program to work differently without notice.  In important places where this principle is violated, the Rationale points out a QUIET CHANGE.

A standard is a treaty between implementor and programmer. Some numerical limits have been added to the Standard to give both implementors and programmers a better understanding of what must be provided by an implementation, of what can be expected and depended upon to exist.  These limits are presented as minimum maxima (i.e., lower limits placed on the values of upper limits specified by an implementation)  with the understanding that any implementor is at liberty to provide higher limits than the Standard mandates.  Any program that takes advantage of these more tolerant limits is not strictly conforming, however, since other implementations are at liberty to enforce the mandated limits. 
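As an illustration of minimum maxima, the following sketch checks a few of the limits published in <limits.h> against the smallest values the Standard permits.  The function names limits_meet_minima and fits_in_any_int are hypothetical, chosen only for this example; the numeric thresholds are the Standard's mandated minimum magnitudes for the corresponding macros.

```c
#include <limits.h>

/* Returns 1 if this implementation's limits are at least as generous
   as the minimum maxima the Standard mandates, 0 otherwise.
   A conforming implementation should always yield 1. */
int limits_meet_minima(void)
{
    return CHAR_BIT >= 8
        && INT_MAX >= 32767
        && UINT_MAX >= 65535U
        && LONG_MAX >= 2147483647L;
}

/* Returns 1 if v can be stored in an int on every conforming
   implementation, i.e. it lies within the mandated minimum range. */
int fits_in_any_int(long v)
{
    return v >= -32767L && v <= 32767L;
}
```

A program that stores 40000 in an int relies on a more tolerant limit than the Standard mandates: it may work on many implementations, but in the sense of the paragraph above it is not strictly conforming.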

Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C.  There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based.  Some of the facets of the spirit of C can be summarized in phrases like

The last proverb needs a little explanation.  The potential for efficient code generation is one of the most important strengths of C.  To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule.  An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine. 
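The char widening rule can be observed with a small sketch.  The function names below are hypothetical and serve only as an illustration; the sketch assumes an implementation where int can represent every unsigned char value, which holds on most common machines but is not guaranteed.

```c
#include <limits.h>

/* Returns 1 if plain char widens as a signed quantity on this
   implementation, 0 if it widens as an unsigned quantity. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widening an explicitly unsigned char never yields a negative
   value, whatever the implementation chose for plain char. */
int widen_unsigned(unsigned char c)
{
    return c;
}
```

Code that needs a particular behavior can say so with signed char or unsigned char; code that uses plain char for small integers accepts whichever choice the implementation made.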

One of the goals of the Committee was to avoid interfering with the ability of translators to generate compact, efficient code.  In several cases the Committee has introduced features to improve the possible efficiency of the generated code; for instance, floating point operations may be performed in single-precision if both operands are float rather than double.
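One visible consequence of the single-precision rule is that the product of two float operands has type float rather than double.  The sketch below uses hypothetical function names; note that the size comparison is uninformative on an implementation where float and double happen to have the same size.

```c
/* Returns 1 if the product of two float operands has the size of
   float, i.e. the operands were not widened to double.
   sizeof does not evaluate its operand, only its type. */
int product_has_float_size(float a, float b)
{
    return sizeof(a * b) == sizeof(float);
}

/* The translator may carry out this multiplication entirely in
   single precision. */
float multiply(float a, float b)
{
    return a * b;
}
```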

1.2  Scope

This Rationale focuses primarily on additions, clarifications, and changes made to the language as described in the Base Documents (see §1.5).  It is not a rationale for the C language as a whole: the Committee was charged with codifying an existing language, not designing a new one.  No attempt is made in this Rationale to defend the pre-existing syntax of the language, such as the syntax of declarations or the binding of operators. 

The Standard is contrived as carefully as possible to permit a broad range of implementations, from direct interpreters to highly optimizing compilers with separate linkers, from ROM-based embedded microcomputers to multi-user multi-processing host systems.  A certain amount of specialized terminology has therefore been chosen to minimize the bias toward compiler implementations shown in the Base Documents. 

The Rationale discusses some language or library features which were not adopted into the Standard.  These are usually features which are popular in some C implementations, so that a user of those implementations might question why they do not appear in the Standard. 

1.3  References

1.4  Organization of the document

This Rationale is organized to parallel the Standard as closely as possible, to facilitate finding relevant discussions.  Some subsections of the Rationale comprise just the subsection title from the Standard: this indicates that the Committee thought no special comment was necessary. Where a given discussion touches on several areas, attempts have been made to include cross-references within the text.  Such references, unless they specify the Standard or the Rationale, are deliberately ambiguous. 

As for the organization of the Standard itself, Base Documents existed only for Sections 3 (Language) and 4 (Library) of the Standard.  Section 1 (Introduction) was modeled after the introductory matter in several other standards for procedural languages.  Section 2 (Environment) was added to fill a need, identified from the start, to place a C program in context and describe the way it interacts with its surroundings.  The Appendices were added as a repository for related material not included in the Standard itself, or to bring together in a single place information about a topic which was scattered throughout the Standard. 

Just as the Standard proper excludes all examples, footnotes, references, and appendices, this Rationale is not part of the Standard.  The C language is defined by the Standard alone.  If any part of this Rationale is not in accord with that definition, the Committee would very much like to be so informed.

1.5  Base documents

The Base Document for Section 3 (Language) was ``The C Reference Manual'' by Dennis M. Ritchie, which was used for several years within AT&T Bell Laboratories and reflects enhancements to C within the UNIX environment.  A version of this manual was published as Appendix A of The C Programming Language by Kernighan and Ritchie (K&R).  Several deviations in the Base Document from K&R were challenged during Committee deliberations, but most changes from K&R ultimately included in the Standard were readily endorsed by the Committee since they were widely known and accepted outside the UNIX user community. 

The Base Document for Section 4 (Library) was the 1984 /usr/group Standard.  (/usr/group is a UNIX system users group.)  In defining what a UNIX-like environment looks like to an applications programmer writing in C, /usr/group was obliged to describe library functions usable in any C environment.  The Committee found /usr/group's work to be an excellent codification of existing practice in defining C libraries, once the UNIX-specific functions had been removed.

The work begun by /usr/group is being continued by the IEEE Committee 1003 to define a portable operating system interface (``POSIX'')  based on the UNIX environment.  The X3J11 Committee has been working with IEEE 1003 to resolve potential areas of overlap or conflict between the two Committees.  The result of this coordination has been to divide responsibility for standardizing library functions into two areas.  Those functions needed for a C implementation in any environment are the responsibility of X3J11 and are included in the Standard.  IEEE 1003 retains responsibility for those functions which are operating-system-specific; the POSIX standard will refer to the ANSI C Standard for C library function definitions. 

Many of the discussions in this Rationale employ the formula ``feature X has been changed (added, removed) because ... .''  The changes (additions, removals) should be understood as being with respect to the appropriate Base Document. 

1.6  Definitions of terms

The definitions of object, bit, byte, and alignment reflect a strong consensus, reached after considerable discussion, about the fundamental nature of the memory organization of a C environment:

(Thus, for instance, on a machine with 36-bit words, a byte can be defined to consist of 9, 12, 18, or 36 bits, these numbers being all the exact divisors of 36 which are not less than 8.)  These strictures codify the widespread presumption that any object can be treated as an array of characters, the size of which is given by the sizeof operator with that object's type as its operand. 
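The presumption that any object can be treated as an array of characters can be sketched as follows.  The helper names copy_bytes and round_trip are hypothetical; the sketch copies an object one byte at a time through unsigned char pointers, using sizeof to obtain the number of bytes.

```c
#include <stddef.h>

/* Copy n bytes from src to dst, viewing both objects as arrays of
   unsigned char. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t i;

    for (i = 0; i < n; i++)
        d[i] = s[i];
}

/* Copy a long through its byte representation and return the copy. */
long round_trip(long v)
{
    long out = 0;

    copy_bytes(&out, &v, sizeof v);
    return out;
}
```

Because the copy covers every byte of the object, the destination ends up with the same value as the source, whatever the byte width or byte order of the machine.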

These definitions do not preclude ``holes'' in struct objects. Such holes are in fact often mandated by alignment and packing requirements.  The holes simply do not participate in representing the (composite) value of an object. 
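A minimal sketch of how such holes show up in practice, using a hypothetical struct sample: the size of the structure may exceed the sum of the sizes of its members, and the difference is padding that carries no part of the value.  How much padding appears, if any, depends on the implementation.

```c
#include <stddef.h>

/* Hypothetical structure; many implementations insert padding after
   tag so that value is suitably aligned. */
struct sample {
    char tag;
    long value;
};

/* Number of bytes in struct sample that belong to neither member. */
size_t sample_padding(void)
{
    return sizeof(struct sample) - sizeof(char) - sizeof(long);
}

/* Byte offset of value within the structure. */
size_t value_offset(void)
{
    return offsetof(struct sample, value);
}
```

Since the holes do not participate in the value, comparing two such structures byte by byte is not a reliable test of whether their members are equal.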

The definition of object does not employ the notion of type.  Thus an object has no type in and of itself.  However, since an object may only be designated by an lvalue (see §3.2.2.1), the phrase ``the type of an object''  is taken to mean, here and in the Standard, ``the type of the lvalue designating this object,''  and ``the value of an object'' means ``the contents of the object interpreted as a value of the type of the lvalue designating the object.'' 

The concept of multi-byte character has been added to C to support very large character sets. See §2.2.1.2.

The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe.  The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard.  Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.

Unspecified behavior gives the implementor some latitude in translating programs.  This latitude does not extend as far as failing to translate the program. 

Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose.  It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior. 

Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user.  Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition.  Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be.  As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response. 
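Two of these categories can be illustrated with a short sketch; all function names are hypothetical.  The order in which function arguments are evaluated is unspecified, so the first function may return either of two values.  The result of right-shifting a negative signed value is implementation-defined, so the implementation must document which result it produces.  No example of undefined behavior is executed here, since by definition nothing could be said about its result.

```c
static int counter;

/* Side effect that reveals the order in which calls are made. */
static int next_count(void)
{
    return ++counter;
}

static int first_of(int a, int b)
{
    (void)b;
    return a;
}

/* Unspecified behavior: the two calls to next_count may be made in
   either order, so the result is 1 or 2.  The translator need not
   document its choice, but it must translate the program. */
int unspecified_order(void)
{
    counter = 0;
    return first_of(next_count(), next_count());
}

/* Implementation-defined behavior when v is negative: the result
   may come from an arithmetic or a logical shift, and the
   implementation must say which. */
int shift_right_one(int v)
{
    return v >> 1;
}
```

A programmer can make a meaningful decision from the documented shift behavior, for example whether v >> 1 may stand in for division of a negative value, which is the criterion the text gives for classing a behavior as implementation-defined.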

1.7  Compliance

The three-fold definition of compliance is used to broaden the population of conforming programs and distinguish between conforming programs using a single implementation and portable conforming programs. 

A strictly conforming program is another term for a maximally portable program.  The goal is to give the programmer a fighting chance to make powerful C programs that are also highly portable, without demeaning perfectly useful C programs that happen not to be portable.  Thus the adverb strictly.

By defining conforming implementations in terms of the programs they accept, the Standard leaves open the door for a broad class of extensions as part of a conforming implementation.  By defining both conforming hosted and conforming freestanding implementations, the Standard recognizes the use of C to write such programs as operating systems and ROM-based applications, as well as more conventional hosted applications.  Beyond this two-level scheme, no additional subsetting is defined for C, since the Committee felt strongly that too many levels dilute the effectiveness of a standard.

Conforming program is thus the most tolerant of all categories, since only one conforming implementation need accept a program to rule it conforming.  The primary limitation on this license is §2.1.1.3.

Diverse sections of the Standard comprise the ``treaty'' between programmers and implementors regarding various name spaces --- if the programmer follows the rules of the Standard the implementation will not impose any further restrictions or surprises:

One proposal long entertained by the Committee was to mandate that each implementation have a translate-time switch for turning off extensions and making a pure Standard-conforming implementation.  It was pointed out, however, that virtually every translate-time switch setting effectively creates a different ``implementation,''  however close may be the effect of translating with two different switch settings.  Whether an implementor chooses to offer a family of conforming implementations, or to offer an assortment of non-conforming implementations along with one that conforms, was not the business of the Committee to mandate.  The Standard therefore confines itself to describing conformance, and merely suggests areas where extensions will not compromise conformance. 

Other proposals rejected more quickly were to provide a validation suite, and to provide the source code for an acceptable library.  Both were recognized to be major undertakings, and both were seen to compromise the integrity of the Standard by giving concrete examples that might bear more weight than the Standard itself.  The potential legal implications were also a concern. 

Standardization of such tools as program consistency checkers and symbolic debuggers lies outside the mandate of the Committee.  However, the Committee has taken pains to allow such programs to work with conforming programs and implementations. 

1.8  Future directions

