SECTION 8.7 EXAMPLE-A STORAGE ALLOCATOR

    static Header base;             /* empty list to get started */
    static Header *freep = NULL;    /* start of free list */

    /* malloc: general-purpose storage allocator */
    void *malloc(unsigned nbytes)
    {
        Header *p, *prevp;
        Header *morecore(unsigned);
        unsigned nunits;

        nunits = (nbytes+sizeof(Header)-1)/sizeof(Header) + 1;
        if ((prevp = freep) == NULL) {    /* no free list yet */
            base.s.ptr = freep = prevp = &base;
            base.s.size = 0;
        }
        for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
            if (p->s.size >= nunits) {    /* big enough */
                if (p->s.size == nunits)  /* exactly */
                    prevp->s.ptr = p->s.ptr;
                else {                    /* allocate tail end */
                    p->s.size -= nunits;
                    p += p->s.size;
                    p->s.size = nunits;
                }
                freep = prevp;
                return (void *)(p+1);
            }
            if (p == freep)    /* wrapped around free list */
                if ((p = morecore(nunits)) == NULL)
                    return NULL;    /* none left */
        }
    }

The function morecore obtains storage from the operating system. The details of how it does this vary from system to system. Since asking the system for memory is a comparatively expensive operation, we don't want to do that on every call to malloc, so morecore requests at least NALLOC units; this larger block will be chopped up as needed. After setting the size field, morecore inserts the additional memory into the arena by calling free.

The UNIX system call sbrk(n) returns a pointer to n more bytes of storage. sbrk returns -1 if there was no space, even though NULL would have been a better design. The -1 must be cast to char * so it can be compared with the return value. Again, casts make the function relatively immune to the details of pointer representation on different machines. There is still one assumption, however, that pointers to different blocks returned by sbrk can be meaningfully compared. This is not guaranteed by the standard, which permits pointer comparisons only within an array. Thus this version of malloc is portable only among machines for which general pointer comparison is meaningful.
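The rounding that malloc performs on each request can be looked at in isolation. The following is only a sketch: the Header layout is assumed to match the one declared earlier in this section, and units_needed and check_units are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long Align;             /* assumed: most restrictive alignment type */

union header {                  /* assumed block header layout */
    struct {
        union header *ptr;      /* next block if on free list */
        unsigned size;          /* size of this block, in units */
    } s;
    Align x;                    /* never used; forces alignment of blocks */
};
typedef union header Header;

/* units_needed (hypothetical helper): number of Header-sized units for
   a request of nbytes, including one extra unit for the header itself */
unsigned units_needed(unsigned nbytes)
{
    return (unsigned)((nbytes + sizeof(Header) - 1) / sizeof(Header) + 1);
}

/* check_units: a few requests and the unit counts they round up to */
void check_units(void)
{
    unsigned hsize = (unsigned)sizeof(Header);

    assert(units_needed(0) == 1);            /* header only */
    assert(units_needed(1) == 2);            /* one data unit */
    assert(units_needed(hsize) == 2);        /* exact fit in one unit */
    assert(units_needed(hsize + 1) == 3);    /* spills into a second unit */
}
```

Because every block is a whole number of header-sized units, the pointer returned by malloc (one unit past the header) inherits the alignment of the header.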
THE UNIX SYSTEM INTERFACE CHAPTER 8

    #define NALLOC 1024    /* minimum #units to request */

    /* morecore: ask system for more memory */
    static Header *morecore(unsigned nu)
    {
        char *cp, *sbrk(int);
        Header *up;

        if (nu < NALLOC)
            nu = NALLOC;
        cp = sbrk(nu * sizeof(Header));
        if (cp == (char *) -1)    /* no space at all */
            return NULL;
        up = (Header *) cp;
        up->s.size = nu;
        free((void *)(up+1));
        return freep;
    }

free itself is the last thing. It scans the free list, starting at freep, looking for the place to insert the free block. This is either between two existing blocks or at one end of the list. In any case, if the block being freed is adjacent to either neighbor, the adjacent blocks are combined. The only troubles are keeping the pointers pointing to the right things and the sizes correct.

    /* free: put block ap in free list */
    void free(void *ap)
    {
        Header *bp, *p;

        bp = (Header *)ap - 1;    /* point to block header */
        for (p = freep; !(bp > p && bp < p->s.ptr); p = p->s.ptr)
            if (p >= p->s.ptr && (bp > p || bp < p->s.ptr))
                break;    /* freed block at start or end of arena */

        if (bp + bp->s.size == p->s.ptr) {    /* join to upper nbr */
            bp->s.size += p->s.ptr->s.size;
            bp->s.ptr = p->s.ptr->s.ptr;
        } else
            bp->s.ptr = p->s.ptr;
        if (p + p->s.size == bp) {            /* join to lower nbr */
            p->s.size += bp->s.size;
            p->s.ptr = bp->s.ptr;
        } else
            p->s.ptr = bp;
        freep = p;
    }

Although storage allocation is intrinsically machine-dependent, the code above illustrates how the machine dependencies can be controlled and confined to a very small part of the program. The use of typedef and union handles
alignment (given that sbrk supplies an appropriate pointer). Casts arrange that pointer conversions are made explicit, and even cope with a badly-designed system interface. Even though the details here are related to storage allocation, the general approach is applicable to other situations as well.

Exercise 8-6. The standard library function calloc(n,size) returns a pointer to n objects of size size, with the storage initialized to zero. Write calloc, by calling malloc or by modifying it.

Exercise 8-7. malloc accepts a size request without checking its plausibility; free believes that the block it is asked to free contains a valid size field. Improve these routines so they take more pains with error checking.

Exercise 8-8. Write a routine bfree(p,n) that will free an arbitrary block p of n characters into the free list maintained by malloc and free. By using bfree, a user can add a static or external array to the free list at any time.
APPENDIX A: Reference Manual

A1. Introduction

This manual describes the C language specified by the draft submitted to ANSI on 31 October, 1988, for approval as "American National Standard for Information Systems-Programming Language C, X3.159-1989." The manual is an interpretation of the proposed standard, not the Standard itself, although care has been taken to make it a reliable guide to the language.

For the most part, this document follows the broad outline of the Standard, which in turn follows that of the first edition of this book, although the organization differs in detail. Except for renaming a few productions, and not formalizing the definitions of the lexical tokens or the preprocessor, the grammar given here for the language proper is equivalent to that of the Standard.

    Throughout this manual, commentary material is indented and written in smaller type, as this is. Most often these comments highlight ways in which ANSI Standard C differs from the language defined by the first edition of this book, or from refinements subsequently introduced in various compilers.

A2. Lexical Conventions

A program consists of one or more translation units stored in files. It is translated in several phases, which are described in §A12. The first phases do low-level lexical transformations, carry out directives introduced by lines beginning with the # character, and perform macro definition and expansion. When the preprocessing of §A12 is complete, the program has been reduced to a sequence of tokens.

A2.1 Tokens

There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments as described below (collectively, "white space") are ignored except as they separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and constants.
If the input stream has been separated into tokens up to a given character, the next token is the longest string of characters that could constitute a token.

A2.2 Comments

The characters /* introduce a comment, which terminates with the characters */. Comments do not nest, and they do not occur within string or character literals.

A2.3 Identifiers

An identifier is a sequence of letters and digits. The first character must be a letter; the underscore _ counts as a letter. Upper and lower case letters are different. Identifiers may have any length, and for internal identifiers, at least the first 31 characters are significant; some implementations may make more characters significant. Internal identifiers include preprocessor macro names and all other names that do not have external linkage (§A11.2). Identifiers with external linkage are more restricted: implementations may treat as few as the first six characters as significant, and may ignore case distinctions.

A2.4 Keywords

The following identifiers are reserved for use as keywords, and may not be used otherwise:

    auto        double      int         struct
    break       else        long        switch
    case        enum        register    typedef
    char        extern      return      union
    const       float       short       unsigned
    continue    for         signed      void
    default     goto        sizeof      volatile
    do          if          static      while

Some implementations also reserve the words fortran and asm.

    The keywords const, signed, and volatile are new with the ANSI standard; enum and void are new since the first edition, but in common use; entry, formerly reserved but never used, is no longer reserved.

A2.5 Constants

There are several kinds of constants. Each has a data type; §A4.2 discusses the basic types.

    constant:
        integer-constant
        character-constant
        floating-constant
        enumeration-constant
A2.5.1 Integer Constants

An integer constant consisting of a sequence of digits is taken to be octal if it begins with 0 (digit zero), decimal otherwise. Octal constants do not contain the digits 8 or 9. A sequence of digits preceded by 0x or 0X (digit zero) is taken to be a hexadecimal integer. The hexadecimal digits include a or A through f or F with values 10 through 15.

An integer constant may be suffixed by the letter u or U, to specify that it is unsigned. It may also be suffixed by the letter l or L to specify that it is long.

The type of an integer constant depends on its form, value and suffix. (See §A4 for a discussion of types.) If it is unsuffixed and decimal, it has the first of these types in which its value can be represented: int, long int, unsigned long int. If it is unsuffixed octal or hexadecimal, it has the first possible of these types: int, unsigned int, long int, unsigned long int. If it is suffixed by u or U, then unsigned int, unsigned long int. If it is suffixed by l or L, then long int, unsigned long int.

    The elaboration of the types of integer constants goes considerably beyond the first edition, which merely caused large integer constants to be long. The U suffixes are new.

A2.5.2 Character Constants

A character constant is a sequence of one or more characters enclosed in single quotes, as in 'x'. The value of a character constant with only one character is the numeric value of the character in the machine's character set at execution time. The value of a multi-character constant is implementation-defined.

Character constants do not contain the ' character or newlines; in order to represent them, and certain other characters, the following escape sequences may be used.

    newline          NL (LF)  \n      backslash      \    \\
    horizontal tab   HT       \t      question mark  ?    \?
    vertical tab     VT       \v      single quote   '    \'
    backspace        BS       \b      double quote   "    \"
    carriage return  CR       \r      octal number   ooo  \ooo
    formfeed         FF       \f      hex number     hh   \xhh
    audible alert    BEL      \a

The escape \ooo consists of the backslash followed by 1, 2, or 3 octal digits, which are taken to specify the value of the desired character. A common example of this construction is \0 (not followed by a digit), which specifies the character NUL. The escape \xhh consists of the backslash, followed by x, followed by hexadecimal digits, which are taken to specify the value of the desired character. There is no limit on the number of digits, but the behavior is undefined if the resulting character value exceeds that of the largest character. For either octal or hexadecimal escape characters, if the implementation treats the char type as signed, the value is sign-extended as if cast to char type. If the character following the \ is not one of those specified, the behavior is undefined.

In some implementations, there is an extended set of characters that cannot be represented in the char type. A constant in this extended set is written with a preceding L, for example L'x', and is called a wide character constant. Such a constant has type wchar_t, an integral type defined in the standard header <stddef.h>. As with
ordinary character constants, octal or hexadecimal escapes may be used; the effect is undefined if the specified value exceeds that representable with wchar_t.

    Some of these escape sequences are new, in particular the hexadecimal character representation. Extended characters are also new. The character sets commonly used in the Americas and western Europe can be encoded to fit in the char type; the main intent in adding wchar_t was to accommodate Asian languages.

A2.5.3 Floating Constants

A floating constant consists of an integer part, a decimal point, a fraction part, an e or E, an optionally signed integer exponent and an optional type suffix, one of f, F, l, or L. The integer and fraction parts both consist of a sequence of digits. Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e and the exponent (not both) may be missing. The type is determined by the suffix; F or f makes it float, L or l makes it long double; otherwise it is double.

    Suffixes on floating constants are new.

A2.5.4 Enumeration Constants

Identifiers declared as enumerators (see §A8.4) are constants of type int.

A2.6 String Literals

A string literal, also called a string constant, is a sequence of characters surrounded by double quotes, as in "...". A string has type "array of characters" and storage class static (see §A4 below) and is initialized with the given characters. Whether identical string literals are distinct is implementation-defined, and the behavior of a program that attempts to alter a string literal is undefined.

Adjacent string literals are concatenated into a single string. After any concatenation, a null byte \0 is appended to the string so that programs that scan the string can find its end. String literals do not contain newline or double-quote characters; in order to represent them, the same escape sequences as for character constants are available.
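The concatenation rule and the appended null byte can be checked directly. This is a small sketch, not part of the manual; the names greeting, quoted, and check_strings are made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Adjacent string literals are joined into one string; a single
   null byte is appended after the concatenation. */
static const char greeting[] = "hello, " "world";

/* The same escape sequences as in character constants are available. */
static const char quoted[] = "say \"hi\"\n";

/* check_strings: lengths and contents implied by the rules above */
void check_strings(void)
{
    assert(strcmp(greeting, "hello, world") == 0);
    assert(strlen(greeting) == 12);
    assert(sizeof greeting == 13);    /* 12 characters plus the null byte */
    assert(greeting[12] == '\0');
    assert(strlen(quoted) == 9);      /* s a y space " h i " newline */
    assert(quoted[4] == '"' && quoted[8] == '\n');
}
```

Note that sizeof counts the terminating null byte while strlen does not, which is why the two results differ by one.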
As with character constants, string literals in an extended character set are written with a preceding L, as in L"...". Wide-character string literals have type "array of wchar_t." Concatenation of ordinary and wide string literals is undefined.

    The specification that string literals need not be distinct, and the prohibition against modifying them, are new in the ANSI standard, as is the concatenation of adjacent string literals. Wide-character string literals are new.

A3. Syntax Notation

In the syntax notation used in this manual, syntactic categories are indicated by italic type, and literal words and characters in typewriter style. Alternative categories are usually listed on separate lines; in a few cases, a long set of narrow alternatives is presented on one line, marked by the phrase "one of." An optional terminal or nonterminal symbol carries the subscript "opt," so that, for example,
    { expression_opt }

means an optional expression, enclosed in braces. The syntax is summarized in §A13.

    Unlike the grammar given in the first edition of this book, the one given here makes precedence and associativity of expression operators explicit.

A4. Meaning of Identifiers

Identifiers, or names, refer to a variety of things: functions; tags of structures, unions, and enumerations; members of structures or unions; enumeration constants; typedef names; and objects. An object, sometimes called a variable, is a location in storage, and its interpretation depends on two main attributes: its storage class and its type. The storage class determines the lifetime of the storage associated with the identified object; the type determines the meaning of the values found in the identified object. A name also has a scope, which is the region of the program in which it is known, and a linkage, which determines whether the same name in another scope refers to the same object or function. Scope and linkage are discussed in §A11.

A4.1 Storage Class

There are two storage classes: automatic and static. Several keywords, together with the context of an object's declaration, specify its storage class. Automatic objects are local to a block (§A9.3), and are discarded on exit from the block. Declarations within a block create automatic objects if no storage class specification is mentioned, or if the auto specifier is used. Objects declared register are automatic, and are (if possible) stored in fast registers of the machine.

Static objects may be local to a block or external to all blocks, but in either case retain their values across exit from and reentry to functions and blocks. Within a block, including a block that provides the code for a function, static objects are declared with the keyword static. The objects declared outside all blocks, at the same level as function definitions, are always static.
They may be made local to a particular translation unit by use of the static keyword; this gives them internal linkage. They become global to an entire program by omitting an explicit storage class, or by using the keyword extern; this gives them external linkage.

A4.2 Basic Types

There are several fundamental types. The standard header <limits.h> described in Appendix B defines the largest and smallest values of each type in the local implementation. The numbers given in Appendix B show the smallest acceptable magnitudes.

Objects declared as characters (char) are large enough to store any member of the execution character set. If a genuine character from that set is stored in a char object, its value is equivalent to the integer code for the character, and is non-negative. Other quantities may be stored into char variables, but the available range of values, and especially whether the value is signed, is implementation-dependent.

Unsigned characters declared unsigned char consume the same amount of space as plain characters, but always appear non-negative; explicitly signed characters declared signed char likewise take the same space as plain characters.
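Whether plain char is signed can be discovered from <limits.h>. The sketch below only restates the guarantees in the text as assertions; plain_char_is_signed and check_char_types are hypothetical names, and the numeric bounds are the minimum magnitudes that the text attributes to Appendix B.

```c
#include <assert.h>
#include <limits.h>

/* plain_char_is_signed (hypothetical helper): reports the local
   implementation's choice for plain char */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* check_char_types: properties shared by the three char types */
void check_char_types(void)
{
    char c = 'A';    /* a genuine member of the execution character set */

    assert(c >= 0);  /* such a character is stored as a non-negative value */
    assert(sizeof(char) == sizeof(unsigned char));
    assert(sizeof(char) == sizeof(signed char));
    assert(UCHAR_MAX >= 255);                        /* minimum magnitudes */
    assert(SCHAR_MAX >= 127 && SCHAR_MIN <= -127);
    if (plain_char_is_signed())
        assert(CHAR_MAX == SCHAR_MAX);
    else
        assert(CHAR_MAX == UCHAR_MAX);
}
```

Code that stores arbitrary small integers in a char is therefore safer declaring the variable signed char or unsigned char explicitly.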
    unsigned char type does not appear in the first edition of this book, but is in common use. signed char is new.

Besides the char types, up to three sizes of integer, declared short int, int, and long int, are available. Plain int objects have the natural size suggested by the host machine architecture; the other sizes are provided to meet special needs. Longer integers provide at least as much storage as shorter ones, but the implementation may make plain integers equivalent to either short integers, or long integers. The int types all represent signed values unless specified otherwise.

Unsigned integers, declared using the keyword unsigned, obey the laws of arithmetic modulo 2^n where n is the number of bits in the representation, and thus arithmetic on unsigned quantities can never overflow. The set of non-negative values that can be stored in a signed object is a subset of the values that can be stored in the corresponding unsigned object, and the representation for the overlapping values is the same.

Any of single precision floating point (float), double precision floating point (double), and extra precision floating point (long double) may be synonymous, but the ones later in the list are at least as precise as those before.

    long double is new. The first edition made long float equivalent to double; the locution has been withdrawn.

Enumerations are unique types that have integral values; associated with each enumeration is a set of named constants (§A8.4). Enumerations behave like integers, but it is common for a compiler to issue a warning when an object of a particular enumeration type is assigned something other than one of its constants, or an expression of its type.

Because objects of these types can be interpreted as numbers, they will be referred to as arithmetic types. Types char, and int of all sizes, each with or without sign, and also enumeration types, will collectively be called integral types.
The types float, double, and long double will be called floating types.

The void type specifies an empty set of values. It is used as the type returned by functions that generate no value.

A4.3 Derived Types

Besides the basic types, there is a conceptually infinite class of derived types constructed from the fundamental types in the following ways:

    arrays of objects of a given type;
    functions returning objects of a given type;
    pointers to objects of a given type;
    structures containing a sequence of objects of various types;
    unions capable of containing any one of several objects of various types.

In general these methods of constructing objects can be applied recursively.

A4.4 Type Qualifiers

An object's type may have additional qualifiers. Declaring an object const announces that its value will not be changed; declaring it volatile announces that it has special properties relevant to optimization. Neither qualifier affects the range of values or arithmetic properties of the object. Qualifiers are discussed in §A8.2.
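The rule in §A4.2 that unsigned arithmetic is carried out modulo 2^n can be observed at the ends of the range. A minimal sketch follows; wrap_increment, wrap_negate, and check_unsigned_wrap are names invented for the illustration.

```c
#include <assert.h>
#include <limits.h>

/* wrap_increment (hypothetical helper): u + 1 modulo 2^n */
unsigned wrap_increment(unsigned u)
{
    return u + 1u;
}

/* wrap_negate (hypothetical helper): 0 - u modulo 2^n */
unsigned wrap_negate(unsigned u)
{
    return 0u - u;
}

/* check_unsigned_wrap: results at the edges of the unsigned range */
void check_unsigned_wrap(void)
{
    assert(wrap_increment(UINT_MAX) == 0u);    /* wraps, does not overflow */
    assert(wrap_negate(1u) == UINT_MAX);       /* 2^n - 1 */
    assert(wrap_negate(0u) == 0u);
    assert(wrap_increment(wrap_negate(1u)) == 0u);
}
```

The corresponding signed operations have no such guarantee, so the same tests written with int and INT_MAX would not be portable.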
A5. Objects and Lvalues

An object is a named region of storage; an lvalue is an expression referring to an object. An obvious example of an lvalue expression is an identifier with suitable type and storage class. There are operators that yield lvalues: for example, if E is an expression of pointer type, then *E is an lvalue expression referring to the object to which E points. The name "lvalue" comes from the assignment expression E1 = E2 in which the left operand E1 must be an lvalue expression. The discussion of each operator specifies whether it expects lvalue operands and whether it yields an lvalue.

A6. Conversions

Some operators may, depending on their operands, cause conversion of the value of an operand from one type to another. This section explains the result to be expected from such conversions. §A6.5 summarizes the conversions demanded by most ordinary operators; it will be supplemented as required by the discussion of each operator.

A6.1 Integral Promotion

A character, a short integer, or an integer bit-field, all either signed or not, or an object of enumeration type, may be used in an expression wherever an integer may be used. If an int can represent all the values of the original type, then the value is converted to int; otherwise the value is converted to unsigned int. This process is called integral promotion.

A6.2 Integral Conversions

Any integer is converted to a given unsigned type by finding the smallest non-negative value that is congruent to that integer, modulo one more than the largest value that can be represented in the unsigned type. In a two's complement representation, this is equivalent to left-truncation if the bit pattern of the unsigned type is narrower, and to zero-filling unsigned values and sign-extending signed values if the unsigned type is wider.
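Both rules can be exercised with a few assertions. The sketch below uses made-up names (to_uchar, check_integral_conversions) and relies only on the behavior described above.

```c
#include <assert.h>
#include <limits.h>

/* to_uchar (hypothetical helper): convert an int to unsigned char,
   that is, reduce it modulo UCHAR_MAX + 1 */
unsigned char to_uchar(int v)
{
    return (unsigned char)v;
}

/* check_integral_conversions: promotion and conversion to unsigned */
void check_integral_conversions(void)
{
    signed char c = 1;

    /* integral promotion: c + c is computed in int, not in signed char */
    assert(sizeof(c + c) == sizeof(int));

    /* conversion to unsigned: smallest non-negative congruent value */
    assert(to_uchar(-1) == UCHAR_MAX);
    assert(to_uchar(-2) == UCHAR_MAX - 1);
    assert(to_uchar(5) == 5);
    assert((unsigned)-1 == UINT_MAX);
}
```

The results for -1 and -2 do not depend on a two's complement representation; they follow from the congruence rule alone.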
When any integer is converted to a signed type, the value is unchanged if it can be represented in the new type and is implementation-defined otherwise.

A6.3 Integer and Floating

When a value of floating type is converted to integral type, the fractional part is discarded; if the resulting value cannot be represented in the integral type, the behavior is undefined. In particular, the result of converting negative floating values to unsigned integral types is not specified.

When a value of integral type is converted to floating, and the value is in the representable range but is not exactly representable, then the result may be either the next higher or next lower representable value. If the result is out of range, the behavior is undefined.
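Since an out-of-range conversion from floating to integral type is undefined, a program may want to test the value first. The following is a sketch under stated assumptions: to_int_checked and check_float_to_int are hypothetical names, and the range test assumes that double can represent INT_MIN - 1 and INT_MAX + 1 exactly, which holds for a 32-bit int with a 64-bit IEEE double but is not guaranteed by the language.

```c
#include <assert.h>
#include <limits.h>

/* to_int_checked (hypothetical helper): store the truncated value of d
   in *out and return 1 if it fits in an int; otherwise return 0.
   Assumes double holds INT_MIN - 1 and INT_MAX + 1 exactly. */
int to_int_checked(double d, int *out)
{
    /* written so that a NaN also fails the test */
    if (!(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0))
        return 0;
    *out = (int)d;    /* the fractional part is discarded */
    return 1;
}

/* check_float_to_int: truncation toward zero, and rejection when out of range */
void check_float_to_int(void)
{
    int n = 0;

    assert(to_int_checked(3.9, &n) && n == 3);
    assert(to_int_checked(-3.9, &n) && n == -3);
    assert(to_int_checked(-0.5, &n) && n == 0);
    assert(!to_int_checked(1e30, &n));
}
```

Discarding the fractional part means rounding toward zero, so -3.9 becomes -3 rather than -4.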
A6.4 Floating Types

When a less precise floating value is converted to an equally or more precise floating type, the value is unchanged. When a more precise floating value is converted to a less precise floating type, and the value is within representable range, the result may be either the next higher or the next lower representable value. If the result is out of range, the behavior is undefined.

A6.5 Arithmetic Conversions

Many operators cause conversions and yield result types in a similar way. The effect is to bring operands into a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions.

First, if either operand is long double, the other is converted to long double.

Otherwise, if either operand is double, the other is converted to double.

Otherwise, if either operand is float, the other is converted to float.

Otherwise, the integral promotions are performed on both operands; then, if either operand is unsigned long int, the other is converted to unsigned long int.

Otherwise, if one operand is long int and the other is unsigned int, the effect depends on whether a long int can represent all values of an unsigned int; if so, the unsigned int operand is converted to long int; if not, both are converted to unsigned long int.

Otherwise, if one operand is long int, the other is converted to long int.

Otherwise, if either operand is unsigned int, the other is converted to unsigned int.

Otherwise, both operands have type int.

    There are two changes here. First, arithmetic on float operands may be done in single precision, rather than double; the first edition specified that all floating arithmetic was double precision. Second, shorter unsigned types, when combined with a larger signed type, do not propagate the unsigned property to the result type; in the first edition, the unsigned always dominated.
    The new rules are slightly more complicated, but reduce somewhat the surprises that may occur when an unsigned quantity meets signed. Unexpected results may still occur when an unsigned expression is compared to a signed expression of the same size.

A6.6 Pointers and Integers

An expression of integral type may be added to or subtracted from a pointer; in such a case the integral expression is converted as specified in the discussion of the addition operator (§A7.7).

Two pointers to objects of the same type, in the same array, may be subtracted; the result is converted to an integer as specified in the discussion of the subtraction operator (§A7.7).

An integral constant expression with value 0, or such an expression cast to type void *, may be converted, by a cast, by assignment, or by comparison, to a pointer of any type. This produces a null pointer that is equal to another null pointer of the same type, but unequal to any pointer to a function or object.

Certain other conversions involving pointers are permitted, but have implementation-dependent aspects. They must be specified by an explicit type-conversion operator, or
cast (§§A7.5 and A8.8).

A pointer may be converted to an integral type large enough to hold it; the required size is implementation-dependent. The mapping function is also implementation-dependent.

An object of integral type may be explicitly converted to a pointer. The mapping always carries a sufficiently wide integer converted from a pointer back to the same pointer, but is otherwise implementation-dependent.

A pointer to one type may be converted to a pointer to another type. The resulting pointer may cause addressing exceptions if the subject pointer does not refer to an object suitably aligned in storage. It is guaranteed that a pointer to an object may be converted to a pointer to an object whose type requires less or equally strict storage alignment and back again without change; the notion of "alignment" is implementation-dependent, but objects of the char types have least strict alignment requirements. As described in §A6.8, a pointer may also be converted to type void * and back again without change.

A pointer may be converted to another pointer whose type is the same except for the addition or removal of qualifiers (§§A4.4, A8.2) of the object type to which the pointer refers. If qualifiers are added, the new pointer is equivalent to the old except for restrictions implied by the new qualifiers. If qualifiers are removed, operations on the underlying object remain subject to the qualifiers in its actual declaration.

Finally, a pointer to a function may be converted to a pointer to another function type. Calling the function specified by the converted pointer is implementation-dependent; however, if the converted pointer is reconverted to its original type, the result is identical to the original pointer.

A6.7 Void

The (nonexistent) value of a void object may not be used in any way, and neither explicit nor implicit conversion to any non-void type may be applied.
Because a void expression denotes a nonexistent value, such an expression may be used only where the value is not required, for example as an expression statement (§A9.2) or as the left operand of a comma operator (§A7.18).

An expression may be converted to type void by a cast. For example, a void cast documents the discarding of the value of a function call used as an expression statement.

    void did not appear in the first edition of this book, but has become common since.

A6.8 Pointers to Void

Any pointer to an object may be converted to type void * without loss of information. If the result is converted back to the original pointer type, the original pointer is recovered. Unlike the pointer-to-pointer conversions discussed in §A6.6, which generally require an explicit cast, pointers may be assigned to and from pointers of type void *, and may be compared with them.

    This interpretation of void * pointers is new; previously, char * pointers played the role of generic pointer. The ANSI standard specifically blesses the meeting of void * pointers with object pointers in assignments and relationals, while requiring explicit casts for other pointer mixtures.
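The round trip through void * and the use of a void cast can be seen in a few lines. This is a sketch only; same_object and check_void_pointer are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* same_object (hypothetical helper): generic comparison through void * */
int same_object(const void *a, const void *b)
{
    return a == b;
}

/* check_void_pointer: conversions to and from void * need no casts */
void check_void_pointer(void)
{
    int x = 42;
    int *ip = &x;
    void *vp = ip;      /* object pointer assigned to void * */
    int *back = vp;     /* and converted back again */

    assert(back == ip); /* the original pointer is recovered */
    assert(*back == 42);
    assert(vp == ip);   /* void * compared with an object pointer */
    assert(same_object(ip, &x));
    (void) same_object(vp, ip);    /* void cast: value deliberately discarded */
}
```

A conversion between two different object pointer types, such as int * to char *, would still need an explicit cast.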
A7. Expressions

The precedence of expression operators is the same as the order of the major subsections of this section, highest precedence first. Thus, for example, the expressions referred to as the operands of + (§A7.7) are those expressions defined in §§A7.1-A7.6. Within each subsection, the operators have the same precedence. Left- or right-associativity is specified in each subsection for the operators discussed therein. The grammar in §A13 incorporates the precedence and associativity of the operators.

The precedence and associativity of operators is fully specified, but the order of evaluation of expressions is, with certain exceptions, undefined, even if the subexpressions involve side effects. That is, unless the definition of an operator guarantees that its operands are evaluated in a particular order, the implementation is free to evaluate operands in any order, or even to interleave their evaluation. However, each operator combines the values produced by its operands in a way compatible with the parsing of the expression in which it appears.

    This rule revokes the previous freedom to reorder expressions with operators that are mathematically commutative and associative, but can fail to be computationally associative. The change affects only floating-point computations near the limits of their accuracy, and situations where overflow is possible.

The handling of overflow, divide check, and other exceptions in expression evaluation is not defined by the language. Most existing implementations of C ignore overflow in evaluation of signed integral expressions and assignments, but this behavior is not guaranteed.
Treatment of division by 0, and all floating-point exceptions, varies among implementations; sometimes it is adjustable by a non-standard library function.

A7.1 Pointer Generation

If the type of an expression or subexpression is "array of T," for some type T, then the value of the expression is a pointer to the first object in the array, and the type of the expression is altered to "pointer to T." This conversion does not take place if the expression is the operand of the unary & operator, or of ++, --, sizeof, or as the left operand of an assignment operator or the . operator. Similarly, an expression of type "function returning T," except when used as the operand of the & operator, is converted to "pointer to function returning T."

A7.2 Primary Expressions

Primary expressions are identifiers, constants, strings, or expressions in parentheses.

    primary-expression:
        identifier
        constant
        string
        ( expression )

An identifier is a primary expression, provided it has been suitably declared as discussed below. Its type is specified by its declaration. An identifier is an lvalue if it refers to an object (§A5) and if its type is arithmetic, structure, union, or pointer.

A constant is a primary expression. Its type depends on its form as discussed in §A2.5.

A string literal is a primary expression. Its type is originally "array of char" (for wide-character strings, "array of wchar_t"), but following the rule given in §A7.1, this
is usually modified to "pointer to char" (wchar_t) and the result is a pointer to the first character in the string. The conversion also does not occur in certain initializers; see §A8.7.

A parenthesized expression is a primary expression whose type and value are identical to those of the unadorned expression. The presence of parentheses does not affect whether the expression is an lvalue.

A7.3 Postfix Expressions

The operators in postfix expressions group left to right.

    postfix-expression:
        primary-expression
        postfix-expression [ expression ]
        postfix-expression ( argument-expression-list_opt )
        postfix-expression . identifier
        postfix-expression -> identifier
        postfix-expression ++
        postfix-expression --

    argument-expression-list:
        assignment-expression
        argument-expression-list , assignment-expression

A7.3.1 Array References

A postfix expression followed by an expression in square brackets is a postfix expression denoting a subscripted array reference. One of the two expressions must have type "pointer to T", where T is some type, and the other must have integral type; the type of the subscript expression is T. The expression E1[E2] is identical (by definition) to *((E1)+(E2)). See §A8.6.2 for further discussion.

A7.3.2 Function Calls

A function call is a postfix expression, called the function designator, followed by parentheses containing a possibly empty, comma-separated list of assignment expressions (§A7.17), which constitute the arguments to the function. If the postfix expression consists of an identifier for which no declaration exists in the current scope, the identifier is implicitly declared as if the declaration

    extern int identifier();

had been given in the innermost block containing the function call. The postfix expression (after possible implicit declaration and pointer generation, §A7.1) must be of type "pointer to function returning T," for some type T, and the value of the function call has type T.
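
The subscripting identity and the pointer-to-function rule can be illustrated with a minimal sketch. The function names third_element, twice, apply_both, and call_twice are hypothetical, chosen only for this illustration; the sketch assumes the array passed in has at least three elements.

```c
/* Hypothetical helper: reads a[2] in three equivalent ways.
   Assumes a points to an array of at least three ints. */
int third_element(const int *a)
{
    int x = a[2];          /* ordinary subscript */
    int y = *((a) + (2));  /* the definition of a[2] */
    int z = 2[a];          /* also legal: one operand is a pointer, the other integral */
    return (x == y && y == z) ? x : -1;
}

static int twice(int n) { return 2 * n; }

/* Calls f through a pointer, with and without an explicit *. */
int apply_both(int (*f)(int), int n)
{
    int r1 = f(n);      /* same syntax as an ordinary call */
    int r2 = (*f)(n);   /* explicit indirection through the pointer */
    return r1 == r2 ? r1 : -1;
}

/* The function name twice is converted to "pointer to function" (pointer generation). */
int call_twice(int n) { return apply_both(twice, n); }
```

With an array holding 10, 20, 30, 40, third_element yields 30, and call_twice(21) yields 42 by either call syntax.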

    In the first edition, the type was restricted to "function," and an explicit * operator was required to call through pointers to functions. The ANSI standard blesses the practice of some existing compilers by permitting the same syntax for calls to functions and to functions specified by pointers. The older syntax is still usable.

The term argument is used for an expression passed by a function call; the term parameter is used for an input object (or its identifier) received by a function definition,
or described in a function declaration. The terms "actual argument (parameter)" and "formal argument (parameter)" respectively are sometimes used for the same distinction.

In preparing for the call to a function, a copy is made of each argument; all argument-passing is strictly by value. A function may change the values of its parameter objects, which are copies of the argument expressions, but these changes cannot affect the values of the arguments. However, it is possible to pass a pointer on the understanding that the function may change the value of the object to which the pointer points.

There are two styles in which functions may be declared. In the new style, the types of parameters are explicit and are part of the type of the function; such a declaration is also called a function prototype. In the old style, parameter types are not specified. Function declaration is discussed in §§A8.6.3 and A10.1.

If the function declaration in scope for a call is old-style, then default argument promotion is applied to each argument as follows: integral promotion (§A6.1) is performed on each argument of integral type, and each float argument is converted to double. The effect of the call is undefined if the number of arguments disagrees with the number of parameters in the definition of the function, or if the type of an argument after promotion disagrees with that of the corresponding parameter. Type agreement depends on whether the function's definition is new-style or old-style. If it is old-style, then the comparison is between the promoted type of the argument of the call, and the promoted type of the parameter; if the definition is new-style, the promoted type of the argument must be that of the parameter itself, without promotion.

If the function declaration in scope for a call is new-style, then the arguments are converted, as if by assignment, to the types of the corresponding parameters of the function's prototype.
The number of arguments must be the same as the number of explicitly described parameters, unless the declaration's parameter list ends with the ellipsis notation (, ...). In that case, the number of arguments must equal or exceed the number of parameters; trailing arguments beyond the explicitly typed parameters suffer default argument promotion as described in the preceding paragraph. If the definition of the function is old-style, then the type of each parameter in the prototype visible at the call must agree with the corresponding parameter in the definition, after the definition parameter's type has undergone argument promotion.

    These rules are especially complicated because they must cater to a mixture of old- and new-style functions. Mixtures are to be avoided if possible.

The order of evaluation of arguments is unspecified; take note that various compilers differ. However, the arguments and the function designator are completely evaluated, including all side effects, before the function is entered. Recursive calls to any function are permitted.

A7.3.3 Structure References

A postfix expression followed by a dot followed by an identifier is a postfix expression. The first operand expression must be a structure or a union, and the identifier must name a member of the structure or union. The value is the named member of the structure or union, and its type is the type of the member. The expression is an lvalue if the first expression is an lvalue, and if the type of the second expression is not an array type.

A postfix expression followed by an arrow (built from - and >) followed by an identifier is a postfix expression. The first operand expression must be a pointer to a structure or a union, and the identifier must name a member of the structure or union. The result refers to the named member of the structure or union to which the pointer expression points, and the type is the type of the member; the result is an lvalue if the type is not an array type.

Thus the expression E1->MOS is the same as (*E1).MOS. Structures and unions are discussed in §A8.3.

    In the first edition of this book, it was already the rule that a member name in such an expression had to belong to the structure or union mentioned in the postfix expression; however, a note admitted that this rule was not firmly enforced. Recent compilers, and ANSI, do enforce it.

A7.3.4 Postfix Incrementation

A postfix expression followed by a ++ or -- operator is a postfix expression. The value of the expression is the value of the operand. After the value is noted, the operand is incremented (++) or decremented (--) by 1. The operand must be an lvalue; see the discussion of additive operators (§A7.7) and assignment (§A7.17) for further constraints on the operand and details of the operation. The result is not an lvalue.

A7.4 Unary Operators

Expressions with unary operators group right-to-left.

    unary-expression:
        postfix-expression
        ++ unary-expression
        -- unary-expression
        unary-operator cast-expression
        sizeof unary-expression
        sizeof ( type-name )

    unary-operator: one of
        & * + - ~ !

A7.4.1 Prefix Incrementation Operators

A unary expression preceded by a ++ or -- operator is a unary expression. The operand is incremented (++) or decremented (--) by 1. The value of the expression is the value after the incrementation (decrementation). The operand must be an lvalue; see the discussion of additive operators (§A7.7) and assignment (§A7.17) for further constraints on the operand and details of the operation.
The result is not an lvalue.

A7.4.2 Address Operator

The unary & operator takes the address of its operand. The operand must be an lvalue referring neither to a bit-field nor to an object declared as register, or must be of function type. The result is a pointer to the object or function referred to by the lvalue. If the type of the operand is T, the type of the result is "pointer to T."
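
The difference between the postfix and prefix forms, and the type produced by &, can be seen in a small sketch. The function name increment_demo is hypothetical, and the sketch writes through the pointer with the unary * operator of §A7.4.3.

```c
/* Hypothetical demo: returns 1 if the rules for ++ and & behave as described. */
int increment_demo(void)
{
    int i = 5;
    int post = i++;   /* value noted first: post == 5, then i becomes 6 */
    int pre = ++i;    /* incremented first: i becomes 7, and pre == 7 */
    int *p = &i;      /* operand has type int, so the result is "pointer to int" */

    *p = 10;          /* the pointer refers to i itself, not to a copy */
    return post == 5 && pre == 7 && i == 10 && p == &i;
}
```

Neither i++ nor ++i is an lvalue, so an expression such as &(i++) or (++i) = 0 is rejected by a conforming C compiler.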

A7.4.3 Indirection Operator

The unary * operator denotes indirection, and returns the object or function to which its operand points. It is an lvalue if the operand is a pointer to an object of arithmetic, structure, union, or pointer type. If the type of the expression is "pointer to T," the type of the result is T.

A7.4.4 Unary Plus Operator

The operand of the unary + operator must have arithmetic type, and the result is the value of the operand. An integral operand undergoes integral promotion. The type of the result is the type of the promoted operand.

    The unary + is new with the ANSI standard. It was added for symmetry with unary -.

A7.4.5 Unary Minus Operator

The operand of the unary - operator must have arithmetic type, and the result is the negative of its operand. An integral operand undergoes integral promotion. The negative of an unsigned quantity is computed by subtracting the promoted value from the largest value of the promoted type and adding one; but negative zero is zero. The type of the result is the type of the promoted operand.

A7.4.6 One's Complement Operator

The operand of the ~ operator must have integral type, and the result is the one's complement of its operand. The integral promotions are performed. If the operand is unsigned, the result is computed by subtracting the value from the largest value of the promoted type. If the operand is signed, the result is computed by converting the promoted operand to the corresponding unsigned type, applying ~, and converting back to the signed type. The type of the result is the type of the promoted operand.

A7.4.7 Logical Negation Operator

The operand of the ! operator must have arithmetic type or be a pointer, and the result is 1 if the value of its operand compares equal to 0, and 0 otherwise. The type of the result is int.

A7.4.8 Sizeof Operator

The sizeof operator yields the number of bytes required to store an object of the type of its operand.
The operand is either an expression, which is not evaluated, or a parenthesized type name. When sizeof is applied to a char, the result is 1; when applied to an array, the result is the total number of bytes in the array. When applied to a structure or union, the result is the number of bytes in the object, including any padding required to make the object tile an array: the size of an array of n elements is n times the size of one element. The operator may not be applied to an operand of function type, or of incomplete type, or to a bit-field. The result is an unsigned integral constant; the particular type is implementation-defined. The standard header <stddef.h> (see Appendix B) defines this type as size_t.
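
The following sketch checks the sizeof properties just listed. The structure sample and the function sizeof_demo are hypothetical names; the amount of padding inside the structure is implementation-defined, so the sketch tests only relationships that hold on any implementation.

```c
#include <stddef.h>

struct sample {      /* hypothetical structure; its padding is implementation-defined */
    char c;
    int n;
};

/* Returns 1 if the sizeof relationships described in the text hold. */
int sizeof_demo(void)
{
    int arr[10];
    struct sample two[2];
    int i = 0;
    size_t s = sizeof(i++);   /* the operand is not evaluated, so i stays 0 */

    return sizeof(char) == 1
        && sizeof arr == 10 * sizeof(int)            /* total bytes in the array */
        && sizeof two == 2 * sizeof(struct sample)   /* padding lets the object tile an array */
        && sizeof(struct sample) >= sizeof(char) + sizeof(int)
        && s == sizeof(int)
        && i == 0;
}
```

A common use of the array rule is the element count sizeof arr / sizeof arr[0], which works only where arr is still an array and has not been converted to a pointer.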

A7.5 Casts

A unary expression preceded by the parenthesized name of a type causes conversion of the value of the expression to the named type.

    cast-expression:
        unary-expression
        ( type-name ) cast-expression

This construction is called a cast. Type names are described in §A8.8. The effects of conversions are described in §A6. An expression with a cast is not an lvalue.

A7.6 Multiplicative Operators

The multiplicative operators *, /, and % group left-to-right.

    multiplicative-expression:
        cast-expression
        multiplicative-expression * cast-expression
        multiplicative-expression / cast-expression
        multiplicative-expression % cast-expression

The operands of * and / must have arithmetic type; the operands of % must have integral type. The usual arithmetic conversions are performed on the operands, and predict the type of the result.

The binary * operator denotes multiplication.

The binary / operator yields the quotient, and the % operator the remainder, of the division of the first operand by the second; if the second operand is 0, the result is undefined. Otherwise, it is always true that (a/b)*b + a%b is equal to a. If both operands are non-negative, then the remainder is non-negative and smaller than the divisor; if not, it is guaranteed only that the absolute value of the remainder is smaller than the absolute value of the divisor.

A7.7 Additive Operators

The additive operators + and - group left-to-right. If the operands have arithmetic type, the usual arithmetic conversions are performed. There are some additional type possibilities for each operator.

    additive-expression:
        multiplicative-expression
        additive-expression + multiplicative-expression
        additive-expression - multiplicative-expression

The result of the + operator is the sum of the operands. A pointer to an object in an array and a value of any integral type may be added.
The latter is converted to an address offset by multiplying it by the size of the object to which the pointer points. The sum is a pointer of the same type as the original pointer, and points to another object in the same array, appropriately offset from the original object. Thus if P is a pointer to an object in an array, the expression P+1 is a pointer to the next object in the array. If the sum pointer points outside the bounds of the array, except at the first location beyond the high end, the result is undefined.

    The provision for pointers just beyond the end of an array is new. It legitimizes a common idiom for looping over the elements of an array.

The result of the - operator is the difference of the operands. A value of any
integral type may be subtracted from a pointer, and then the same conversions and conditions as for addition apply.

If two pointers to objects of the same type are subtracted, the result is a signed integral value representing the displacement between the pointed-to objects; pointers to successive objects differ by 1. The type of the result depends on the implementation, but is defined as ptrdiff_t in the standard header <stddef.h>. The value is undefined unless the pointers point to objects within the same array; however, if P points to the last member of an array, then (P+1)-P has value 1.

A7.8 Shift Operators

The shift operators << and >> group left-to-right. For both operators, each operand must be integral, and is subject to the integral promotions. The type of the result is that of the promoted left operand. The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression's type.

    shift-expression:
        additive-expression
        shift-expression << additive-expression
        shift-expression >> additive-expression

The value of E1<<E2 is E1 (interpreted as a bit pattern) left-shifted E2 bits; in the absence of overflow, this is equivalent to multiplication by 2 raised to the power E2. The value of E1>>E2 is E1 right-shifted E2 bit positions. The right shift is equivalent to division by 2 raised to the power E2 if E1 is unsigned or if it has a non-negative value; otherwise the result is implementation-defined.

A7.9 Relational Operators

The relational operators group left-to-right, but this fact is not useful; a<b<c is parsed as (a<b)<c, and a<b evaluates to either 0 or 1.

    relational-expression:
        shift-expression
        relational-expression < shift-expression
        relational-expression > shift-expression
        relational-expression <= shift-expression
        relational-expression >= shift-expression

The operators < (less), > (greater), <= (less or equal) and >= (greater or equal) all yield 0 if the specified relation is false and 1 if it is true.
The type of the result is int. The usual arithmetic conversions are performed on arithmetic operands. Pointers to objects of the same type (ignoring any qualifiers) may be compared; the result depends on the relative locations in the address space of the pointed-to objects. Pointer comparison is defined only for parts of the same object: if two pointers point to the same simple object, they compare equal; if the pointers are to members of the same structure, pointers to objects declared later in the structure compare higher; if the pointers are to members of the same union, they compare equal; if the pointers refer to members of an array, the comparison is equivalent to comparison of the corresponding subscripts. If P points to the last member of an array, then P+1 compares higher than P, even though P+1 points outside the array. Otherwise, pointer comparison is undefined.

    These rules slightly liberalize the restrictions stated in the first edition, by permitting comparison of pointers to different members of a structure or union. They also legalize comparison with a pointer just off the end of an array.
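
Several rules from §§A7.6-A7.9 can be gathered into one short sketch: the one-past-the-end loop idiom, pointer subtraction, the quotient and remainder identity, and the left shift as multiplication. All four function names are hypothetical. The sketch assumes that the divisor is nonzero, that the division does not overflow, and that the shift count is smaller than the number of bits in unsigned.

```c
#include <stddef.h>

/* Sums n elements, using the pointer just beyond the array as the loop bound. */
int sum_array(const int *a, size_t n)
{
    const int *end = a + n;   /* may point one past the last element, never further */
    const int *p;
    int total = 0;

    for (p = a; p < end; p++) /* comparison stays within, or just past, one array */
        total += *p;
    return total;
}

/* Displacement in elements between two pointers into the same array. */
ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;
}

/* Checks that (a/b)*b + a%b equals a; b is assumed to be nonzero. */
int div_identity_holds(int a, int b)
{
    return (a / b) * b + a % b == a;
}

/* For an unsigned operand and no overflow, x << k is x times 2 to the power k. */
unsigned times_pow2(unsigned x, unsigned k)
{
    return x << k;
}
```

For an array v of four ints, distance(&v[1], &v[4]) is 3: forming the address of v[4] is permitted because it is the first location beyond the high end, although that location must not be read or written.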

A7.10 Equality Operators

    equality-expression:
        relational-expression
        equality-expression == relational-expression
        equality-expression != relational-expression

The == (equal to) and the != (not equal to) operators are analogous to the relational operators except for their lower precedence. (Thus a<b == c<d is 1 whenever a<b and c<d have the same truth-value.)

The equality operators follow the same rules as the relational operators, but permit additional possibilities: a pointer may be compared to a constant integral expression with value 0, or to a pointer to void. See §A6.6.

A7.11 Bitwise AND Operator

    AND-expression:
        equality-expression
        AND-expression & equality-expression

The usual arithmetic conversions are performed; the result is the bitwise AND function of the operands. The operator applies only to integral operands.

A7.12 Bitwise Exclusive OR Operator

    exclusive-OR-expression:
        AND-expression
        exclusive-OR-expression ^ AND-expression

The usual arithmetic conversions are performed; the result is the bitwise exclusive OR function of the operands. The operator applies only to integral operands.

A7.13 Bitwise Inclusive OR Operator

    inclusive-OR-expression:
        exclusive-OR-expression
        inclusive-OR-expression | exclusive-OR-expression

The usual arithmetic conversions are performed; the result is the bitwise inclusive OR function of its operands. The operator applies only to integral operands.

A7.14 Logical AND Operator

    logical-AND-expression:
        inclusive-OR-expression
        logical-AND-expression && inclusive-OR-expression

The && operator groups left-to-right. It returns 1 if both its operands compare unequal to zero, 0 otherwise. Unlike &, && guarantees left-to-right evaluation: the first operand is evaluated, including all side effects; if it is equal to 0, the value of the expression is 0. Otherwise, the right operand is evaluated, and if it is equal to 0, the expression's value is 0, otherwise 1.

The operands need not have the same type, but each must have arithmetic type or be a pointer. The result is int.
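
The guaranteed left-to-right evaluation of && is what makes guard expressions safe. In the sketch below, the names starts_with, bump, and short_circuit_calls are hypothetical; it shows a pointer operand tested before it is dereferenced, and a right operand whose side effect never happens.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is non-null and its first character is c. */
int starts_with(const char *s, char c)
{
    /* *s is evaluated only when s != NULL has already compared unequal to 0. */
    return s != NULL && *s == c;
}

static int calls = 0;                          /* counts evaluations of bump() */
static int bump(void) { calls++; return 1; }

/* Returns how many times bump() ran while evaluating 0 && bump(). */
int short_circuit_calls(void)
{
    calls = 0;
    if (0 && bump())
        return -1;    /* not reached: the left operand is 0 */
    return calls;     /* bump() was never called */
}
```

With the bitwise & in place of &&, both operands would be evaluated, in an unspecified order, so the guard in starts_with would no longer protect the dereference.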

A7.15 Logical OR Operator

    logical-OR-expression:
        logical-AND-expression
        logical-OR-expression || logical-AND-expression

The || operator groups left-to-right. It returns 1 if either of its operands compares unequal to zero, and 0 otherwise. Unlike |, || guarantees left-to-right evaluation: the first operand is evaluated, including all side effects; if it is unequal to 0, the value of the expression is 1. Otherwise, the right operand is evaluated, and if it is unequal to 0, the expression's value is 1, otherwise 0.

The operands need not have the same type, but each must have arithmetic type or be a pointer. The result is int.

A7.16 Conditional Operator

    conditional-expression:
        logical-OR-expression
        logical-OR-expression ? expression : conditional-expression

The first expression is evaluated, including all side effects; if it compares unequal to 0, the result is the value of the second expression, otherwise that of the third expression. Only one of the second and third operands is evaluated. If the second and third operands are arithmetic, the usual arithmetic conversions are performed to bring them to a common type, and that is the type of the result. If both are void, or structures or unions of the same type, or pointers to objects of the same type, the result has the common type. If one is a pointer and the other the constant 0, the 0 is converted to the pointer type, and the result has that type. If one is a pointer to void and the other is another pointer, the other pointer is converted to a pointer to void, and that is the type of the result.

In the type comparison for pointers, any type qualifiers (§A8.2) in the type to which the pointer points are insignificant, but the result type inherits qualifiers from both arms of the conditional.

A7.17 Assignment Expressions

There are several assignment operators; all group right-to-left.

    assignment-expression:
        conditional-expression
        unary-expression assignment-operator assignment-expression

    assignment-operator: one of
        = *= /= %= += -= <<= >>= &= ^= |=

All require an lvalue as left operand, and the lvalue must be modifiable: it must not be an array, and must not have an incomplete type, or be a function. Also, its type must not be qualified with const; if it is a structure or union, it must not have any member or, recursively, submember qualified with const. The type of an assignment expression is that of its left operand, and the value is the value stored in the left operand after the assignment has taken place.

In the simple assignment with =, the value of the expression replaces that of the object referred to by the lvalue. One of the following must be true: both operands have arithmetic type, in which case the right operand is converted to the type of the left by the assignment; or both operands are structures or unions of the same type; or one
operand is a pointer and the other is a pointer to void; or the left operand is a pointer and the right operand is a constant expression with value 0; or both operands are pointers to functions or objects whose types are the same except for the possible absence of const or volatile in the right operand.

An expression of the form E1 op= E2 is equivalent to E1 = E1 op (E2) except that E1 is evaluated only once.

A7.18 Comma Operator

    expression:
        assignment-expression
        expression , assignment-expression

A pair of expressions separated by a comma is evaluated left-to-right, and the value of the left expression is discarded. The type and value of the result are the type and value of the right operand. All side effects from the evaluation of the left operand are completed before beginning evaluation of the right operand. In contexts where comma is given a special meaning, for example in lists of function arguments (§A7.3.2) and lists of initializers (§A8.7), the required syntactic unit is an assignment expression, so the comma operator appears only in a parenthetical grouping; for example,

    f(a, (t=3, t+2), c)

has three arguments, the second of which has the value 5.

A7.19 Constant Expressions

Syntactically, a constant expression is an expression restricted to a subset of operators:

    constant-expression:
        conditional-expression

Expressions that evaluate to a constant are required in several contexts: after case, as array bounds and bit-field lengths, as the value of an enumeration constant, in initializers, and in certain preprocessor expressions.

Constant expressions may not contain assignments, increment or decrement operators, function calls, or comma operators, except in an operand of sizeof. If the constant expression is required to be integral, its operands must consist of integer, enumeration, character, and floating constants; casts must specify an integral type, and any floating constants must be cast to an integer.
This necessarily rules out arrays, indirection, address-of, and structure member operations. (However, any operand is permitted for sizeof.)

More latitude is permitted for the constant expressions of initializers; the operands may be any type of constant, and the unary & operator may be applied to external or static objects, and to external or static arrays subscripted with a constant expression. The unary & operator can also be applied implicitly by appearance of unsubscripted arrays and functions. Initializers must evaluate either to a constant or to the address of a previously declared external or static object plus or minus a constant.

Less latitude is allowed for the integral constant expressions after #if; sizeof expressions, enumeration constants, and casts are not permitted. See §A12.5.
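
The comma operator, the single evaluation of E1 in E1 op= E2, and the two kinds of constant expression can be seen together in a short sketch. Every identifier in it (ROWS, COLS, grid, last, second_arg, and the three demo functions) is hypothetical and chosen only for this illustration.

```c
enum { ROWS = 3, COLS = 4 };    /* enumeration constants */

static int grid[ROWS * COLS];   /* array bound: an integral constant expression */

/* Initializer: the address of a static array element with a constant subscript. */
static int *const last = &grid[ROWS * COLS - 1];

static int second_arg(int a, int b, int c) { (void)a; (void)c; return b; }

/* Mirrors f(a, (t=3, t+2), c): the parenthesized comma expression is one argument. */
int comma_demo(void)
{
    int t;
    return second_arg(1, (t = 3, t + 2), 9);
}

/* a[i++] += 5 evaluates a[i++] once, so i advances by exactly 1. */
int compound_demo(void)
{
    int a[2] = {10, 20};
    int i = 0;

    a[i++] += 5;                    /* a[0] becomes 15, i becomes 1 */
    return a[0] * 100 + a[1] + i;   /* 15*100 + 20 + 1 */
}

/* Number of elements between the first and last entries of grid. */
int grid_span(void) { return (int)(last - grid); }
```

Writing the update as a[i++] = a[i++] + 5 would not be equivalent: it modifies i twice without an intervening ordering guarantee, which is exactly the case the "evaluated only once" wording excludes.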

A8. Declarations

Declarations specify the interpretation given to each identifier; they do not necessarily reserve storage associated with the identifier. Declarations that reserve storage are called definitions. Declarations have the form

    declaration:
        declaration-specifiers init-declarator-list_opt ;

The declarators in the init-declarator-list contain the identifiers being declared; the declaration-specifiers consist of a sequence of type and storage class specifiers.

    declaration-specifiers:
        storage-class-specifier declaration-specifiers_opt
        type-specifier declaration-specifiers_opt
        type-qualifier declaration-specifiers_opt

    init-declarator-list:
        init-declarator
        init-declarator-list , init-declarator

    init-declarator:
        declarator
        declarator = initializer

Declarators will be discussed later (§A8.5); they contain the names being declared. A declaration must have at least one declarator, or its type specifier must declare a structure tag, a union tag, or the members of an enumeration; empty declarations are not permitted.

A8.1 Storage Class Specifiers

The storage class specifiers are:

    storage-class-specifier:
        auto
        register
        static
        extern
        typedef

The meanings of the storage classes were discussed in §A4.

The auto and register specifiers give the declared objects automatic storage class, and may be used only within functions. Such declarations also serve as definitions and cause storage to be reserved. A register declaration is equivalent to an auto declaration, but hints that the declared objects will be accessed frequently. Only a few objects are actually placed into registers, and only certain types are eligible; the restrictions are implementation-dependent. However, if an object is declared register, the unary & operator may not be applied to it, explicitly or implicitly.

    The rule that it is illegal to calculate the address of an object declared register, but actually taken to be auto, is new.

The static specifier gives the declared objects static storage class, and may be used either inside or outside functions. Inside a function, this specifier causes storage to be allocated, and serves as a definition; for its effect outside a function, see §A11.2.

A declaration with extern, used inside a function, specifies that the storage for the declared objects is defined elsewhere; for its effects outside a function, see §A11.2.
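
A minimal sketch of the difference between static and automatic objects inside a function follows. The function name next_id is hypothetical. The sketch relies on the property, described in §A4, that a static object retains its value across calls while an automatic object is created afresh each time.

```c
/* Hypothetical counter: id is allocated once and keeps its value between calls. */
int next_id(void)
{
    static int id = 0;   /* static storage class; initialized only once */
    int step = 1;        /* automatic (auto is the default inside a function) */

    id += step;
    return id;
}
```

Successive calls return 1, 2, 3, and so on. If id were automatic instead, every call would return 1.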

The typedef specifier does not reserve storage and is called a storage class specifier only for syntactic convenience; it is discussed in §A8.9.

At most one storage class specifier may be given in a declaration. If none is given, these rules are used: objects declared inside a function are taken to be auto; functions declared within a function are taken to be extern; objects and functions declared outside a function are taken to be static, with external linkage. See §§A10-A11.

A8.2 Type Specifiers

The type-specifiers are

    type-specifier:
        void
        char
        short
        int
        long
        float
        double
        signed
        unsigned
        struct-or-union-specifier
        enum-specifier
        typedef-name

At most one of the words long or short may be specified together with int; the meaning is the same if int is not mentioned. The word long may be specified together with double. At most one of signed or unsigned may be specified together with int or any of its short or long varieties, or with char. Either may appear alone, in which case int is understood. The signed specifier is useful for forcing char objects to carry a sign; it is permissible but redundant with other integral types.

Otherwise, at most one type-specifier may be given in a declaration. If the type-specifier is missing from a declaration, it is taken to be int.

Types may also be qualified, to indicate special properties of the objects being declared.

    type-qualifier:
        const
        volatile

Type qualifiers may appear with any type specifier. A const object may be initialized, but not thereafter assigned to. There are no implementation-independent semantics for volatile objects.

    The const and volatile properties are new with the ANSI standard. The purpose of const is to announce objects that may be placed in read-only memory, and perhaps to increase opportunities for optimization. The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur. For
    example, for a machine with memory-mapped input/output, a pointer to a device register might be declared as a pointer to volatile, in order to prevent the compiler from removing apparently redundant references through the pointer. Except that it should diagnose explicit attempts to change const objects, a compiler may ignore these qualifiers.
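
The two qualifiers can be sketched as follows. A real device register lives at a machine-specific address, so the sketch substitutes an ordinary static object, fake_status, as a stand-in; that object and the names read_twice, limit, within_limit, and demo_status are assumptions made for this illustration only.

```c
/* Stand-in for a memory-mapped device register (an assumption of this sketch). */
static volatile unsigned fake_status = 0;

/* Reads through a pointer to volatile twice; both reads are kept. */
unsigned read_twice(volatile unsigned *reg)
{
    unsigned first = *reg;
    unsigned second = *reg;   /* not removed as apparently redundant */
    return first + second;
}

static const int limit = 100;   /* may be initialized, but not assigned to afterwards */

int within_limit(int n)
{
    /* limit = 200;   an explicit attempt to change a const object is diagnosed */
    return n <= limit;
}

/* Stores v in the stand-in register, then reads it back twice. */
unsigned demo_status(unsigned v)
{
    fake_status = v;
    return read_twice(&fake_status);
}
```

On real hardware the two reads in read_twice could return different values, which is why the compiler must not merge them.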

A8.3 Structure and Union Declarations

A structure is an object consisting of a sequence of named members of various types. A union is an object that contains, at different times, any one of several members of various types. Structure and union specifiers have the same form.

    struct-or-union-specifier:
        struct-or-union identifier_opt { struct-declaration-list }
        struct-or-union identifier

    struct-or-union:
        struct
        union

A struct-declaration-list is a sequence of declarations for the members of the structure or union:

    struct-declaration-list:
        struct-declaration
        struct-declaration-list struct-declaration

    struct-declaration:
        specifier-qualifier-list struct-declarator-list ;

    specifier-qualifier-list:
        type-specifier specifier-qualifier-list_opt
        type-qualifier specifier-qualifier-list_opt

    struct-declarator-list:
        struct-declarator
        struct-declarator-list , struct-declarator

Usually, a struct-declarator is just a declarator for a member of a structure or union. A structure member may also consist of a specified number of bits. Such a member is also called a bit-field, or merely field; its length is set off from the declarator for the field name by a colon.

    struct-declarator:
        declarator
        declarator_opt : constant-expression

A type specifier of the form

    struct-or-union identifier { struct-declaration-list }

declares the identifier to be the tag of the structure or union specified by the list. A subsequent declaration in the same or an inner scope may refer to the same type by using the tag in a specifier without the list:

    struct-or-union identifier

If a specifier with a tag but without a list appears when the tag is not declared, an incomplete type is specified. Objects with an incomplete structure or union type may be mentioned in contexts where their size is not needed, for example in declarations (not definitions), for specifying a pointer, or for creating a typedef, but not otherwise.
The type becomes complete on occurrence of a subsequent specifier with that tag, and containing a declaration list. Even in specifiers with a list, the structure or union type being declared is incomplete within the list, and becomes complete only at the } terminating the specifier.

A structure may not contain a member of incomplete type. Therefore, it is impossible to declare a structure or union containing an instance of itself. However, besides
giving a name to the structure or union type, tags allow definition of self-referential structures; a structure or union may contain a pointer to an instance of itself, because pointers to incomplete types may be declared.

A very special rule applies to declarations of the form

    struct-or-union identifier ;

that declare a structure or union, but have no declaration list and no declarators. Even if the identifier is a structure or union tag already declared in an outer scope (§A11.1), this declaration makes the identifier the tag of a new, incompletely-typed structure or union in the current scope.

    This recondite rule is new with ANSI. It is intended to deal with mutually-recursive structures declared in an inner scope, but whose tags might already be declared in the outer scope.

A structure or union specifier with a list but no tag creates a unique type; it can be referred to directly only in the declaration of which it is a part.

The names of members and tags do not conflict with each other or with ordinary variables. A member name may not appear twice in the same structure or union, but the same member name may be used in different structures or unions.

    In the first edition of this book, the names of structure and union members were not associated with their parent. However, this association became common in compilers well before the ANSI standard.

A non-field member of a structure or union may have any object type. A field member (which need not have a declarator and thus may be unnamed) has type int, unsigned int, or signed int, and is interpreted as an object of integral type of the specified length in bits; whether an int field is treated as signed is implementation-dependent. Adjacent field members of structures are packed into implementation-dependent storage units in an implementation-dependent direction.
When a field following another field will not fit into a partially-filled storage unit, it may be split between units, or the unit may be padded. An unnamed field with width 0 forces this padding, so that the next field will begin at the edge of the next allocation unit.

    The ANSI standard makes fields even more implementation-dependent than did the first edition. It is advisable to read the language rules for storing bit-fields as "implementation-dependent" without qualification. Structures with bit-fields may be used as a portable way of attempting to reduce the storage required for a structure (with the probable cost of increasing the instruction space, and time, needed to access the fields), or as a non-portable way to describe a storage layout known at the bit level. In the second case, it is necessary to understand the rules of the local implementation.

The members of a structure have addresses increasing in the order of their declarations. A non-field member of a structure is aligned at an addressing boundary depending on its type; therefore, there may be unnamed holes in a structure. If a pointer to a structure is cast to the type of a pointer to its first member, the result refers to the first member.

A union may be thought of as a structure all of whose members begin at offset 0 and whose size is sufficient to contain any of its members. At most one of the members can be stored in a union at any time. If a pointer to a union is cast to the type of a pointer to a member, the result refers to that member.

A simple example of a structure declaration is
    struct tnode {
        char tword[20];
        int count;
        struct tnode *left;
        struct tnode *right;
    };

which contains an array of 20 characters, an integer, and two pointers to similar structures. Once this declaration has been given, the declaration

    struct tnode s, *sp;

declares s to be a structure of the given sort and sp to be a pointer to a structure of the given sort. With these declarations, the expression

    sp->count

refers to the count field of the structure to which sp points;

    s.left

refers to the left subtree pointer of the structure s; and

    s.right->tword[0]

refers to the first character of the tword member of the right subtree of s.

In general, a member of a union may not be inspected unless the value of the union has been assigned using that same member. However, one special guarantee simplifies the use of unions: if a union contains several structures that share a common initial sequence, and if the union currently contains one of these structures, it is permitted to refer to the common initial part of any of the contained structures. For example, the following is a legal fragment:

    union {
        struct {
            int type;
        } n;
        struct {
            int type;
            int intnode;
        } ni;
        struct {
            int type;
            float floatnode;
        } nf;
    } u;

    u.nf.type = FLOAT;
    u.nf.floatnode = 3.14;

    if (u.n.type == FLOAT)
        ... sin(u.nf.floatnode)

A8.4 Enumerations

Enumerations are unique types with values ranging over a set of named constants called enumerators. The form of an enumeration specifier borrows from that of structures and unions.
    enum-specifier:
        enum identifier_opt { enumerator-list }
        enum identifier

    enumerator-list:
        enumerator
        enumerator-list , enumerator

    enumerator:
        identifier
        identifier = constant-expression

The identifiers in an enumerator list are declared as constants of type int, and may appear wherever constants are required. If no enumerators with = appear, then the values of the corresponding constants begin at 0 and increase by 1 as the declaration is read from left to right. An enumerator with = gives the associated identifier the value specified; subsequent identifiers continue the progression from the assigned value.

Enumerator names in the same scope must all be distinct from each other and from ordinary variable names, but the values need not be distinct.

The role of the identifier in the enum-specifier is analogous to that of the structure tag in a struct-specifier; it names a particular enumeration. The rules for enum-specifiers with and without tags and lists are the same as those for structure or union specifiers, except that incomplete enumeration types do not exist; the tag of an enum-specifier without an enumerator list must refer to an in-scope specifier with a list.

    Enumerations are new since the first edition of this book, but have been part of the language for some years.

A8.5 Declarators

Declarators have the syntax:

    declarator:
        pointer_opt direct-declarator

    direct-declarator:
        identifier
        ( declarator )
        direct-declarator [ constant-expression_opt ]
        direct-declarator ( parameter-type-list )
        direct-declarator ( identifier-list_opt )

    pointer:
        * type-qualifier-list_opt
        * type-qualifier-list_opt pointer

    type-qualifier-list:
        type-qualifier
        type-qualifier-list type-qualifier

The structure of declarators resembles that of indirection, function, and array expressions; the grouping is the same.
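The remark that declarators group like the corresponding expressions can be illustrated with a small sketch. The function and variable names below are hypothetical, chosen only for illustration; the point is that an expression written in the same shape as the declarator yields the declared base type.

```c
#include <assert.h>

/* All names here are illustrative assumptions, not part of the manual. */

/* [] groups before *, so ap is an array of 3 pointers to int. */
int second_via_array_of_pointers(void)
{
    int a = 1, b = 2, c = 3;
    int *ap[3] = { &a, &b, &c };

    return *ap[1];          /* same shape as the declarator: *ap[i] is an int */
}

/* Parentheses regroup, so pa is a pointer to an array of 3 int. */
int second_via_pointer_to_array(void)
{
    int row[3] = { 10, 20, 30 };
    int (*pa)[3] = &row;

    return (*pa)[1];        /* same shape as the declarator: (*pa)[i] is an int */
}

static int cell = 7;

/* () groups before *, so fpi is a function returning pointer to int. */
int *fpi(void)
{
    return &cell;
}

int seven(void)
{
    return 7;
}

/* Parentheses regroup, so pfi is a pointer to a function returning int. */
int call_through_pointer(void)
{
    int (*pfi)(void) = seven;

    return (*pfi)() + *fpi();   /* each expression mirrors its declarator */
}
```

Calling the three functions from a test driver should give 2, 20, and 14 respectively; the sketch assumes nothing beyond a hosted C compiler.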
A8.6 Meaning of Declarators

A list of declarators appears after a sequence of type and storage class specifiers. Each declarator declares a unique main identifier, the one that appears as the first alternative of the production for direct-declarator. The storage class specifiers apply directly to this identifier, but its type depends on the form of its declarator. A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.

Considering only the type parts of the declaration specifiers (§A8.2) and a particular declarator, a declaration has the form "T D," where T is a type and D is a declarator. The type attributed to the identifier in the various forms of declarator is described inductively using this notation.

In a declaration T D where D is an unadorned identifier, the type of the identifier is T.

In a declaration T D where D has the form

    ( D1 )

then the type of the identifier in D1 is the same as that of D. The parentheses do not alter the type, but may change the binding of complex declarators.

A8.6.1 Pointer Declarators

In a declaration T D where D has the form

    * type-qualifier-list_opt D1

and the type of the identifier in the declaration T D1 is "type-modifier T," the type of the identifier of D is "type-modifier type-qualifier-list pointer to T." Qualifiers following * apply to the pointer itself, rather than to the object to which the pointer points.

For example, consider the declaration

    int *ap[];

Here ap[] plays the role of D1; a declaration "int ap[]" (below) would give ap the type "array of int," the type-qualifier list is empty, and the type-modifier is "array of." Hence the actual declaration gives ap the type "array of pointers to int."

As other examples, the declarations

    int i, *pi, *const cpi = &i;
    const int ci = 3, *pci;

declare an integer i and a pointer to an integer pi.
The value of the constant pointer cpi may not be changed; it will always point to the same location, although the value to which it refers may be altered. The integer ci is constant, and may not be changed (though it may be initialized, as here). The type of pci is "pointer to const int," and pci itself may be changed to point to another place, but the value to which it points may not be altered by assigning through pci.

A8.6.2 Array Declarators

In a declaration T D where D has the form

    D1 [ constant-expression_opt ]

and the type of the identifier in the declaration T D1 is "type-modifier T," the type of the identifier of D is "type-modifier array of T." If the constant-expression is present, it must have integral type, and value greater than 0. If the constant expression specifying
the bound is missing, the array has an incomplete type.

An array may be constructed from an arithmetic type, from a pointer, from a structure or union, or from another array (to generate a multi-dimensional array). Any type from which an array is constructed must be complete; it must not be an array or structure of incomplete type. This implies that for a multi-dimensional array, only the first dimension may be missing. The type of an object of incomplete array type is completed by another, complete, declaration for the object (§A10.2), or by initializing it (§A8.7). For example,

    float fa[17], *afp[17];

declares an array of float numbers and an array of pointers to float numbers. Also,

    static int x3d[3][5][7];

declares a static three-dimensional array of integers, with rank 3×5×7. In complete detail, x3d is an array of three items; each item is an array of five arrays; each of the latter arrays is an array of seven integers. Any of the expressions x3d, x3d[i], x3d[i][j], x3d[i][j][k] may reasonably appear in an expression. The first three have type "array," the last has type int. More specifically, x3d[i][j] is an array of 7 integers, and x3d[i] is an array of 5 arrays of 7 integers.

The array subscripting operation is defined so that E1[E2] is identical to *(E1+E2). Therefore, despite its asymmetric appearance, subscripting is a commutative operation. Because of the conversion rules that apply to + and to arrays (§§A6.6, A7.1, A7.7), if E1 is an array and E2 an integer, then E1[E2] refers to the E2-th member of E1.

In the example, x3d[i][j][k] is equivalent to *(x3d[i][j] + k). The first subexpression x3d[i][j], an array of 7 integers, is converted by §A7.1 to type "pointer to integer," pointing to the first of those integers; by §A7.7, the addition involves multiplication by the size of an integer.
It follows from the rules that arrays are stored by rows (last subscript varies fastest) and that the first subscript in the declaration helps determine the amount of storage consumed by an array, but plays no other part in subscript calculations.

A8.6.3 Function Declarators

In a new-style function declaration T D where D has the form

    D1 ( parameter-type-list )

and the type of the identifier in the declaration T D1 is "type-modifier T," the type of the identifier of D is "type-modifier function with arguments parameter-type-list returning T."

The syntax of the parameters is

    parameter-type-list:
        parameter-list
        parameter-list , ...

    parameter-list:
        parameter-declaration
        parameter-list , parameter-declaration

    parameter-declaration:
        declaration-specifiers declarator
        declaration-specifiers abstract-declarator_opt

In the new-style declaration, the parameter list specifies the types of the parameters. As
a special case, the declarator for a new-style function with no parameters has a parameter type list consisting solely of the keyword void. If the parameter type list ends with an ellipsis ", ...", then the function may accept more arguments than the number of parameters explicitly described; see §A7.3.2.

The types of parameters that are arrays or functions are altered to pointers, in accordance with the rules for parameter conversions; see §A10.1. The only storage class specifier permitted in a parameter's declaration specifier is register, and this specifier is ignored unless the function declarator heads a function definition. Similarly, if the declarators in the parameter declarations contain identifiers and the function declarator does not head a function definition, the identifiers go out of scope immediately. Abstract declarators, which do not mention the identifiers, are discussed in §A8.8.

In an old-style function declaration T D where D has the form

    D1 ( identifier-list_opt )

and the type of the identifier in the declaration T D1 is "type-modifier T," the type of the identifier of D is "type-modifier function of unspecified arguments returning T." The parameters (if present) have the form

    identifier-list:
        identifier
        identifier-list , identifier

In the old-style declarator, the identifier list must be absent unless the declarator is used in the head of a function definition (§A10.1). No information about the types of the parameters is supplied by the declaration.

For example, the declaration

    int f(), *fpi(), (*pfi)();

declares a function f returning an integer, a function fpi returning a pointer to an integer, and a pointer pfi to a function returning an integer. In none of these are the parameter types specified; they are old-style.
In the new-style declaration

    int strcpy(char *dest, const char *source), rand(void);

strcpy is a function returning int, with two arguments, the first a character pointer, and the second a pointer to constant characters. The parameter names are effectively comments. The second function rand takes no arguments and returns int.

    Function declarators with parameter prototypes are, by far, the most important language change introduced by the ANSI standard. They offer an advantage over the "old-style" declarators of the first edition by providing error-detection and coercion of arguments across function calls, but at a cost: turmoil and confusion during their introduction, and the necessity of accommodating both forms. Some syntactic ugliness was required for the sake of compatibility, namely void as an explicit marker of new-style functions without parameters.

    The ellipsis notation ", ..." for variadic functions is also new, and, together with the macros in the standard header <stdarg.h>, formalizes a mechanism that was officially forbidden but unofficially condoned in the first edition.

    These notations were adapted from the C++ language.

A8.7 Initialization

When an object is declared, its init-declarator may specify an initial value for the identifier being declared. The initializer is preceded by =, and is either an expression, or a list of initializers nested in braces. A list may end with a comma, a nicety for neat
formatting.

    initializer:
        assignment-expression
        { initializer-list }
        { initializer-list , }

    initializer-list:
        initializer
        initializer-list , initializer

All the expressions in the initializer for a static object or array must be constant expressions as described in §A7.19. The expressions in the initializer for an auto or register object or array must likewise be constant expressions if the initializer is a brace-enclosed list. However, if the initializer for an automatic object is a single expression, it need not be a constant expression, but must merely have appropriate type for assignment to the object.

    The first edition did not countenance initialization of automatic structures, unions, or arrays. The ANSI standard allows it, but only by constant constructions unless the initializer can be expressed by a simple expression.

A static object not explicitly initialized is initialized as if it (or its members) were assigned the constant 0. The initial value of an automatic object not explicitly initialized is undefined.

The initializer for a pointer or an object of arithmetic type is a single expression, perhaps in braces. The expression is assigned to the object.

The initializer for a structure is either an expression of the same type, or a brace-enclosed list of initializers for its members in order. Unnamed bit-field members are ignored, and are not initialized. If there are fewer initializers in the list than members of the structure, the trailing members are initialized with 0. There may not be more initializers than members.

The initializer for an array is a brace-enclosed list of initializers for its members. If the array has unknown size, the number of initializers determines the size of the array, and its type becomes complete. If the array has fixed size, the number of initializers may not exceed the number of members of the array; if there are fewer, the trailing members are initialized with 0.
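The zero-fill and size-inference rules for structures and arrays stated above can be checked with a short sketch. The type struct point3 and the function names are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type used only for this illustration. */
struct point3 {
    int x, y, z;
};

/* Fewer initializers than members: the trailing members become 0. */
int trailing_member_sum(void)
{
    struct point3 p = { 5 };        /* y and z are initialized with 0 */

    return p.y + p.z;
}

/* Unknown size: the number of initializers completes the array type. */
size_t inferred_length(void)
{
    int x[] = { 1, 3, 5 };

    return sizeof x / sizeof x[0];
}

/* Fixed size with a short list: the trailing elements become 0. */
int last_of_short_list(void)
{
    int v[4] = { 1, 2 };

    return v[3];
}

/* A static object with no initializer behaves as if assigned 0. */
int static_default(void)
{
    static int counter;

    return counter;
}
```

Under the rules of this section, the four functions should return 0, 3, 0, and 0. An uninitialized automatic object is deliberately not shown, since its initial value is undefined.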
As a special case, a character array may be initialized by a string literal; successive characters of the string initialize successive members of the array. Similarly, a wide character literal (§A2.6) may initialize an array of type wchar_t. If the array has unknown size, the number of characters in the string, including the terminating null character, determines its size; if its size is fixed, the number of characters in the string, not counting the terminating null character, must not exceed the size of the array.

The initializer for a union is either a single expression of the same type, or a brace-enclosed initializer for the first member of the union.

    The first edition did not allow initialization of unions. The "first-member" rule is clumsy, but is hard to generalize without new syntax. Besides allowing unions to be explicitly initialized in at least a primitive way, this ANSI rule makes definite the semantics of static unions not explicitly initialized.

An aggregate is a structure or array. If an aggregate contains members of aggregate type, the initialization rules apply recursively. Braces may be elided in the initialization as follows: if the initializer for an aggregate's member that is itself an aggregate begins with a left brace, then the succeeding comma-separated list of initializers initializes the
members of the subaggregate; it is erroneous for there to be more initializers than members. If, however, the initializer for a subaggregate does not begin with a left brace, then only enough elements from the list are taken to account for the members of the subaggregate; any remaining members are left to initialize the next member of the aggregate of which the subaggregate is a part.

For example,

    int x[] = { 1, 3, 5 };

declares and initializes x as a 1-dimensional array with three members, since no size was specified and there are three initializers.

    float y[4][3] = {
        { 1, 3, 5 },
        { 2, 4, 6 },
        { 3, 5, 7 },
    };

is a completely-bracketed initialization: 1, 3, and 5 initialize the first row of the array y[0], namely y[0][0], y[0][1], and y[0][2]. Likewise the next two lines initialize y[1] and y[2]. The initializer ends early, and therefore the elements of y[3] are initialized with 0. Precisely the same effect could have been achieved by

    float y[4][3] = {
        1, 3, 5, 2, 4, 6, 3, 5, 7
    };

The initializer for y begins with a left brace, but that for y[0] does not; therefore three elements from the list are used. Likewise the next three are taken successively for y[1] and then for y[2]. Also,

    float y[4][3] = {
        { 1 }, { 2 }, { 3 }, { 4 }
    };

initializes the first column of y (regarded as a two-dimensional array) and leaves the rest 0.

Finally,

    char msg[] = "Syntax error on line %s\n";

shows a character array whose members are initialized with a string; its size includes the terminating null character.

A8.8 Type Names

In several contexts (to specify type conversions explicitly with a cast, to declare parameter types in function declarators, and as an argument of sizeof) it is necessary to supply the name of a data type. This is accomplished using a type name, which is syntactically a declaration for an object of that type omitting the name of the object.
    type-name:
        specifier-qualifier-list abstract-declarator_opt

    abstract-declarator:
        pointer
        pointer_opt direct-abstract-declarator
    direct-abstract-declarator:
        ( abstract-declarator )
        direct-abstract-declarator_opt [ constant-expression_opt ]
        direct-abstract-declarator_opt ( parameter-type-list_opt )

It is possible to identify uniquely the location in the abstract-declarator where the identifier would appear if the construction were a declarator in a declaration. The named type is then the same as the type of the hypothetical identifier. For example,

    int
    int *
    int *[3]
    int (*)[]
    int *()
    int (*[])(void)

name respectively the types "integer," "pointer to integer," "array of 3 pointers to integers," "pointer to an array of an unspecified number of integers," "function of unspecified parameters returning pointer to integer," and "array, of unspecified size, of pointers to functions with no parameters each returning an integer."

A8.9 Typedef

Declarations whose storage class specifier is typedef do not declare objects; instead they define identifiers that name types. These identifiers are called typedef names.

    typedef-name:
        identifier

A typedef declaration attributes a type to each name among its declarators in the usual way (see §A8.6). Thereafter, each such typedef name is syntactically equivalent to a type specifier keyword for the associated type.

For example, after

    typedef long Blockno, *Blockptr;
    typedef struct { double r, theta; } Complex;

the constructions

    Blockno b;
    extern Blockptr bp;
    Complex z, *zp;

are legal declarations. The type of b is long, that of bp is "pointer to long," and that of z is the specified structure; zp is a pointer to such a structure.

typedef does not introduce new types, only synonyms for types that could be specified in another way. In the example, b has the same type as any other long object.

Typedef names may be redeclared in an inner scope, but a non-empty set of type specifiers must be given.
For example,

    extern Blockno;

does not redeclare Blockno, but

    extern int Blockno;

does.

A8.10 Type Equivalence

Two type specifier lists are equivalent if they contain the same set of type specifiers, taking into account that some specifiers can be implied by others (for example, long
alone implies long int). Structures, unions, and enumerations with different tags are distinct, and a tagless union, structure, or enumeration specifies a unique type.

Two types are the same if their abstract declarators (§A8.8), after expanding any typedef types, and deleting any function parameter identifiers, are the same up to equivalence of type specifier lists. Array sizes and function parameter types are significant.

A9. Statements

Except as described, statements are executed in sequence. Statements are executed for their effect, and do not have values. They fall into several groups.

    statement:
        labeled-statement
        expression-statement
        compound-statement
        selection-statement
        iteration-statement
        jump-statement

A9.1 Labeled Statements

Statements may carry label prefixes.

    labeled-statement:
        identifier : statement
        case constant-expression : statement
        default : statement

A label consisting of an identifier declares the identifier. The only use of an identifier label is as a target of goto. The scope of the identifier is the current function. Because labels have their own name space, they do not interfere with other identifiers and cannot be redeclared. See §A11.1.

Case labels and default labels are used with the switch statement (§A9.4). The constant expression of case must have integral type.

Labels in themselves do not alter the flow of control.

A9.2 Expression Statement

Most statements are expression statements, which have the form

    expression-statement:
        expression_opt ;

Most expression statements are assignments or function calls. All side effects from the expression are completed before the next statement is executed. If the expression is missing, the construction is called a null statement; it is often used to supply an empty body to an iteration statement or to place a label.

A9.3 Compound Statement

So that several statements can be used where one is expected, the compound statement (also called "block") is provided.
The body of a function definition is a compound statement.
    compound-statement:
        { declaration-list_opt statement-list_opt }

    declaration-list:
        declaration
        declaration-list declaration

    statement-list:
        statement
        statement-list statement

If an identifier in the declaration-list was in scope outside the block, the outer declaration is suspended within the block (see §A11.1), after which it resumes its force. An identifier may be declared only once in the same block. These rules apply to identifiers in the same name space (§A11); identifiers in different name spaces are treated as distinct.

Initialization of automatic objects is performed each time the block is entered at the top, and proceeds in the order of the declarators. If a jump into the block is executed, these initializations are not performed. Initializations of static objects are performed only once, before the program begins execution.

A9.4 Selection Statements

Selection statements choose one of several flows of control.

    selection-statement:
        if ( expression ) statement
        if ( expression ) statement else statement
        switch ( expression ) statement

In both forms of the if statement, the expression, which must have arithmetic or pointer type, is evaluated, including all side-effects, and if it compares unequal to 0, the first substatement is executed. In the second form, the second substatement is executed if the expression is 0. The else ambiguity is resolved by connecting an else with the last encountered else-less if at the same block nesting level.

The switch statement causes control to be transferred to one of several statements depending on the value of an expression, which must have integral type. The substatement controlled by a switch is typically compound. Any statement within the substatement may be labeled with one or more case labels (§A9.1). The controlling expression undergoes integral promotion (§A6.1), and the case constants are converted to the promoted type.
No two of the case constants associated with the same switch may have the same value after conversion. There may also be at most one default label associated with a switch. Switches may be nested; a case or default label is associated with the smallest switch that contains it.

When the switch statement is executed, its expression is evaluated, including all side effects, and compared with each case constant. If one of the case constants is equal to the value of the expression, control passes to the statement of the matched case label. If no case constant matches the expression, and if there is a default label, control passes to the labeled statement. If no case matches, and if there is no default, then none of the substatements of the switch is executed.

    In the first edition of this book, the controlling expression of switch, and the case constants, were required to have int type.
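The three outcomes of executing a switch (a matching case, the default label, or no substatement at all) can be sketched as follows. The function names and the return codes are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* classify: hypothetical function; several case labels may prefix
   one statement, and default catches every unmatched value. */
int classify(int c)
{
    int kind;

    switch (c) {
    case ' ':
    case '\t':
    case '\n':
        kind = 0;           /* white space */
        break;
    case '0':
        kind = 1;           /* the digit zero */
        break;
    default:
        kind = 2;           /* anything else */
        break;
    }
    return kind;
}

/* No case matches and there is no default: nothing in the switch runs. */
int untouched(int n)
{
    int r = -1;

    switch (n) {
    case 1:
        r = 10;
        break;
    case 2:
        r = 20;
        break;
    }
    return r;
}

/* A label does not alter the flow of control: without break,
   execution continues from the matched label into the next case. */
int run_on(int n)
{
    int r = 0;

    switch (n) {
    case 1:
        r += 1;
        /* fall through */
    case 2:
        r += 2;
        break;
    }
    return r;
}
```

With these definitions, classify('x') takes the default branch, untouched(3) leaves r at -1, and run_on(1) executes both additions because control passes to the matched label and simply continues.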
A9.5 Iteration Statements

Iteration statements specify looping.

    iteration-statement:
        while ( expression ) statement
        do statement while ( expression ) ;
        for ( expression_opt ; expression_opt ; expression_opt ) statement

In the while and do statements, the substatement is executed repeatedly so long as the value of the expression remains unequal to 0; the expression must have arithmetic or pointer type. With while, the test, including all side effects from the expression, occurs before each execution of the statement; with do, the test follows each iteration.

In the for statement, the first expression is evaluated once, and thus specifies initialization for the loop. There is no restriction on its type. The second expression must have arithmetic or pointer type; it is evaluated before each iteration, and if it becomes equal to 0, the for is terminated. The third expression is evaluated after each iteration, and thus specifies a re-initialization for the loop. There is no restriction on its type. Side-effects from each expression are completed immediately after its evaluation. If the substatement does not contain continue, a statement

    for ( expression1 ; expression2 ; expression3 ) statement

is equivalent to

    expression1 ;
    while ( expression2 ) {
        statement
        expression3 ;
    }

Any of the three expressions may be dropped. A missing second expression makes the implied test equivalent to testing a non-zero constant.

A9.6 Jump Statements

Jump statements transfer control unconditionally.

    jump-statement:
        goto identifier ;
        continue ;
        break ;
        return expression_opt ;

In the goto statement, the identifier must be a label (§A9.1) located in the current function. Control transfers to the labeled statement.

A continue statement may appear only within an iteration statement. It causes control to pass to the loop-continuation portion of the smallest enclosing such statement. More precisely, within each of the statements

    while (...) {        do {                  for (...) {
        ...                  ...                   ...
    contin: ;            contin: ;             contin: ;
    }                    } while (...);        }

a continue not contained in a smaller iteration statement is the same as goto contin.

A break statement may appear only in an iteration statement or a switch statement, and terminates execution of the smallest enclosing such statement; control passes
to the statement following the terminated statement.

A function returns to its caller by the return statement. When return is followed by an expression, the value is returned to the caller of the function. The expression is converted, as if by assignment, to the type returned by the function in which it appears.

Flowing off the end of a function is equivalent to a return with no expression. In either case, the returned value is undefined.

A10. External Declarations

The unit of input provided to the C compiler is called a translation unit; it consists of a sequence of external declarations, which are either declarations or function definitions.

    translation-unit:
        external-declaration
        translation-unit external-declaration

    external-declaration:
        function-definition
        declaration

The scope of external declarations persists to the end of the translation unit in which they are declared, just as the effect of declarations within blocks persists to the end of the block. The syntax of external declarations is the same as that of all declarations, except that only at this level may the code for functions be given.

A10.1 Function Definitions

Function definitions have the form

    function-definition:
        declaration-specifiers_opt declarator declaration-list_opt compound-statement

The only storage-class specifiers allowed among the declaration specifiers are extern or static; see §A11.2 for the distinction between them.

A function may return an arithmetic type, a structure, a union, a pointer, or void, but not a function or an array. The declarator in a function declaration must specify explicitly that the declared identifier has function type; that is, it must contain one of the forms (see §A8.6.3)

    direct-declarator ( parameter-type-list )
    direct-declarator ( identifier-list_opt )

where the direct-declarator is an identifier or a parenthesized identifier. In particular, it must not achieve function type by means of a typedef.
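The restriction on typedef applies to definitions, not to declarations, which a brief sketch may make clearer. The typedef name binop and the functions add and apply are hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical typedef naming the type "function of (int, int) returning int". */
typedef int binop(int, int);

/* A declaration may obtain function type through the typedef. */
binop add;

/* The definition must contain the function declarator explicitly;
   writing "binop add { ... }" is not a valid function definition. */
int add(int a, int b)
{
    return a + b;
}

/* A pointer to the typedef'd function type is an ordinary parameter. */
int apply(binop *op, int a, int b)
{
    return op(a, b);
}
```

Here add(2, 3) yields 5 and apply(add, 4, 5) yields 9; the declaration through binop and the explicit definition agree in type, so both may appear in the same translation unit.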
In the first form, the definition is a new-style function, and its parameters, together with their types, are declared in its parameter type list; the declaration-list following the function's declarator must be absent. Unless the parameter type list consists solely of void, showing that the function takes no parameters, each declarator in the parameter type list must contain an identifier. If the parameter type list ends with ", ..." then the function may be called with more arguments than parameters; the va_arg macro mechanism defined in the standard header <stdarg.h> and described in Appendix B must be used to refer to the extra arguments. Variadic functions must have at least one named parameter.

In the second form, the definition is old-style: the identifier list names the
parameters, while the declaration list attributes types to them. If no declaration is given for a parameter, its type is taken to be int. The declaration list must declare only parameters named in the list, initialization is not permitted, and the only storage-class specifier possible is register.

In both styles of function definition, the parameters are understood to be declared just after the beginning of the compound statement constituting the function's body, and thus the same identifiers must not be redeclared there (although they may, like other identifiers, be redeclared in inner blocks). If a parameter is declared to have type "array of type," the declaration is adjusted to read "pointer to type;" similarly, if a parameter is declared to have type "function returning type," the declaration is adjusted to read "pointer to function returning type." During the call to a function, the arguments are converted as necessary and assigned to the parameters; see §A7.3.2.

    New-style function definitions are new with the ANSI standard. There is also a small change in the details of promotion; the first edition specified that the declarations of float parameters were adjusted to read double. The difference becomes noticeable when a pointer to a parameter is generated within a function.

A complete example of a new-style function definition is

    int max(int a, int b, int c)
    {
        int m;

        m = (a > b) ? a : b;
        return (m > c) ? m : c;
    }

Here int is the declaration specifier; max(int a, int b, int c) is the function's declarator, and { ... } is the block giving the code for the function. The corresponding old-style definition would be

    int max(a, b, c)
    int a, b, c;
    {
        /* ... */
    }

where now int max(a, b, c) is the declarator, and int a, b, c; is the declaration list for the parameters.

A10.2 External Declarations

External declarations specify the characteristics of objects, functions and other identifiers.
The term "external" refers to their location outside functions, and is not directly connected with the extern keyword; the storage class for an externally-declared object may be left empty, or it may be specified as extern or static.

Several external declarations for the same identifier may exist within the same translation unit if they agree in type and linkage, and if there is at most one definition for the identifier.

Two declarations for an object or function are deemed to agree in type under the rules discussed in §A8.10. In addition, if the declarations differ because one type is an incomplete structure, union, or enumeration type (§A8.3) and the other is the corresponding completed type with the same tag, the types are taken to agree. Moreover, if one type is an incomplete array type (§A8.6.2) and the other is a completed
array type, the types, if otherwise identical, are also taken to agree. Finally, if one type specifies an old-style function, and the other an otherwise identical new-style function, with parameter declarations, the types are taken to agree.

If the first external declaration for a function or object includes the static specifier, the identifier has internal linkage; otherwise it has external linkage. Linkage is discussed in §A11.2.

An external declaration for an object is a definition if it has an initializer. An external object declaration that does not have an initializer, and does not contain the extern specifier, is a tentative definition. If a definition for an object appears in a translation unit, any tentative definitions are treated merely as redundant declarations. If no definition for the object appears in the translation unit, all its tentative definitions become a single definition with initializer 0.

Each object must have exactly one definition. For objects with internal linkage, this rule applies separately to each translation unit, because internally-linked objects are unique to a translation unit. For objects with external linkage, it applies to the entire program.

    Although the one-definition rule is formulated somewhat differently in the first edition of this book, it is in effect identical to the one stated here. Some implementations relax it by generalizing the notion of tentative definition. In the alternate formulation, which is usual in UNIX systems and recognized as a common extension by the Standard, all the tentative definitions for an externally-linked object, throughout all the translation units of a program, are considered together instead of in each translation unit separately. If a definition occurs somewhere in the program, then the tentative definitions become merely declarations, but if no definition appears, then all its tentative definitions become a definition with initializer 0.

A11. Scope and Linkage

A program need not all be compiled at one time: the source text may be kept in several files containing translation units, and precompiled routines may be loaded from libraries. Communication among the functions of a program may be carried out both through calls and through manipulation of external data.

Therefore, there are two kinds of scope to consider: first, the lexical scope of an identifier, which is the region of the program text within which the identifier's characteristics are understood; and second, the scope associated with objects and functions with external linkage, which determines the connections between identifiers in separately compiled translation units.

A11.1 Lexical Scope

Identifiers fall into several name spaces that do not interfere with one another; the same identifier may be used for different purposes, even in the same scope, if the uses are in different name spaces. These classes are: objects, functions, typedef names, and enum constants; labels; tags of structures, unions, and enumerations; and members of each structure or union individually.

    These rules differ in several ways from those described in the first edition of this manual. Labels did not previously have their own name space; tags of structures and unions each had a separate space, and in some implementations
    enumeration tags did as well; putting different kinds of tags into the same space is a new restriction. The most important departure from the first edition is that each structure or union creates a separate name space for its members, so that the same name may appear in several different structures. This rule has been common practice for several years.

The lexical scope of an object or function identifier in an external declaration begins at the end of its declarator and persists to the end of the translation unit in which it appears. The scope of a parameter of a function definition begins at the start of the block defining the function, and persists through the function; the scope of a parameter in a function declaration ends at the end of the declarator. The scope of an identifier declared at the head of a block begins at the end of its declarator, and persists to the end of the block. The scope of a label is the whole of the function in which it appears. The scope of a structure, union, or enumeration tag, or an enumeration constant, begins at its appearance in a type specifier, and persists to the end of the translation unit (for declarations at the external level) or to the end of the block (for declarations within a function).

If an identifier is explicitly declared at the head of a block, including the block constituting a function, any declaration of the identifier outside the block is suspended until the end of the block.

A11.2 Linkage

Within a translation unit, all declarations of the same object or function identifier with internal linkage refer to the same thing, and the object or function is unique to that translation unit. All declarations for the same object or function identifier with external linkage refer to the same thing, and the object or function is shared by the entire program.

As discussed in §A10.2, the first external declaration for an identifier gives the identifier internal linkage if
the static specifier is used, external linkage otherwise. If a declaration for an identifier within a block does not include the extern specifier, then the identifier has no linkage and is unique to the function. If it does include extern, and an external declaration for the identifier is active in the scope surrounding the block, then the identifier has the same linkage as the external declaration, and refers to the same object or function; but if no external declaration is visible, its linkage is external.

A12. Preprocessing

A preprocessor performs macro substitution, conditional compilation, and inclusion of named files. Lines beginning with #, perhaps preceded by white space, communicate with this preprocessor. The syntax of these lines is independent of the rest of the language; they may appear anywhere and have effect that lasts (independent of scope) until the end of the translation unit. Line boundaries are significant; each line is analyzed individually (but see §A12.2 for how to adjoin lines). To the preprocessor, a token is any language token, or a character sequence giving a file name as in the #include directive (§A12.4); in addition, any character not otherwise defined is taken as a token. However, the effect of white space characters other than space and horizontal tab is undefined within preprocessor lines.

Preprocessing itself takes place in several logically successive phases that may, in a
particular implementation, be condensed.

1. First, trigraph sequences as described in §A12.1 are replaced by their equivalents. Should the operating system environment require it, newline characters are introduced between the lines of the source file.
2. Each occurrence of a backslash character \ followed by a newline is deleted, thus splicing lines (§A12.2).
3. The program is split into tokens separated by white-space characters; comments are replaced by a single space. Then preprocessing directives are obeyed, and macros (§§A12.3-A12.10) are expanded.
4. Escape sequences in character constants and string literals (§§A2.5.2, A2.6) are replaced by their equivalents; then adjacent string literals are concatenated.
5. The result is translated, then linked together with other programs and libraries, by collecting the necessary programs and data, and connecting external function and object references to their definitions.

A12.1 Trigraph Sequences

The character set of C source programs is contained within seven-bit ASCII, but is a superset of the ISO 646-1983 Invariant Code Set. In order to enable programs to be represented in the reduced set, all occurrences of the following trigraph sequences are replaced by the corresponding single character. This replacement occurs before any other processing.

    ??=  #        ??(  [        ??<  {
    ??/  \        ??)  ]        ??>  }
    ??'  ^        ??!  |        ??-  ~

No other such replacements occur.

    Trigraph sequences are new with the ANSI standard.

A12.2 Line Splicing

Lines that end with the backslash character \ are folded by deleting the backslash and the following newline character. This occurs before division into tokens.

A12.3 Macro Definition and Expansion

A control line of the form

    # define identifier token-sequence

causes the preprocessor to replace subsequent instances of the identifier with the given sequence of tokens; leading and trailing white space around the token sequence is discarded.
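As a rough illustration of line splicing (§A12.2) and of this first form of #define, the sketch below may help. The names GREETING, greeting, and table_len are hypothetical and chosen only for the example; TABSIZE is the manifest constant used later in this section.

```c
/* The backslash-newline pair is deleted before tokenization, so the two
   physical lines below form a single #define line.  The adjacent string
   literals are concatenated in a later phase. */
#define GREETING "hello, " \
                 "world"

/* White space around the token sequence 100 is discarded. */
#define TABSIZE     100

static int table[TABSIZE];

/* Returns the spliced and concatenated string. */
const char *greeting(void) { return GREETING; }

/* Returns the number of elements in table, i.e. the value of TABSIZE. */
unsigned long table_len(void)
{
    return (unsigned long)(sizeof table / sizeof table[0]);
}
```

Calling greeting() should give the string hello, world, and table_len() should give 100.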
A second #define for the same identifier is erroneous unless the second token sequence is identical to the first, where all white space separations are taken to be equivalent.

A line of the form

    # define identifier( identifier-list ) token-sequence

where there is no space between the first identifier and the (, is a macro definition with parameters given by the identifier list. As with the first form, leading and trailing white space around the token sequence is discarded, and the macro may be redefined only with
a definition in which the number and spelling of parameters, and the token sequence, is identical.

A control line of the form

    # undef identifier

causes the identifier's preprocessor definition to be forgotten. It is not erroneous to apply #undef to an unknown identifier.

When a macro has been defined in the second form, subsequent textual instances of the macro identifier followed by optional white space, and then by (, a sequence of tokens separated by commas, and a ) constitute a call of the macro. The arguments of the call are the comma-separated token sequences; commas that are quoted or protected by nested parentheses do not separate arguments. During collection, arguments are not macro-expanded. The number of arguments in the call must match the number of parameters in the definition. After the arguments are isolated, leading and trailing white space is removed from them. Then the token sequence resulting from each argument is substituted for each unquoted occurrence of the corresponding parameter's identifier in the replacement token sequence of the macro. Unless the parameter in the replacement sequence is preceded by #, or preceded or followed by ##, the argument tokens are examined for macro calls, and expanded as necessary, just before insertion.

Two special operators influence the replacement process. First, if an occurrence of a parameter in the replacement token sequence is immediately preceded by #, string quotes (") are placed around the corresponding parameter, and then both the # and the parameter identifier are replaced by the quoted argument. A \ character is inserted before each " or \ character that appears surrounding, or inside, a string literal or character constant in the argument.
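The following sketch tries out the # operator and the rule that an argument next to # is not macro-expanded. The macro names STR and XSTR and the three functions are hypothetical, introduced only for illustration.

```c
#define TABSIZE 100

/* # turns the argument's tokens into a string literal.  Because the
   parameter is an operand of #, the argument is not macro-expanded. */
#define STR(x)  #x

/* One extra level: here the parameter is not next to #, so the argument
   is expanded first and only the result is handed to STR. */
#define XSTR(x) STR(x)

/* Yields "TABSIZE": no expansion takes place. */
const char *unexpanded(void) { return STR(TABSIZE); }

/* Yields "100": TABSIZE is replaced before stringification. */
const char *expanded(void) { return XSTR(TABSIZE); }

/* The argument is the string literal "a\n".  A backslash is inserted
   before each " and \ in it, so the result spells out the five
   characters  " a \ n "  rather than containing a newline. */
const char *quoted(void) { return STR("a\n"); }
```

The two-level idiom (XSTR here) is a common way to stringify the value of a macro rather than its name.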
Second, if the definition token sequence for either kind of macro contains a ## operator, then just after replacement of the parameters, each ## is deleted, together with any white space on either side, so as to concatenate the adjacent tokens and form a new token. The effect is undefined if invalid tokens are produced, or if the result depends on the order of processing of the ## operators. Also, ## may not appear at the beginning or end of a replacement token sequence.

In both kinds of macro, the replacement token sequence is repeatedly rescanned for more defined identifiers. However, once a given identifier has been replaced in a given expansion, it is not replaced if it turns up again during rescanning; instead it is left unchanged.

Even if the final value of a macro expansion begins with #, it is not taken to be a preprocessing directive.

    The details of the macro-expansion process are described more precisely in the ANSI standard than in the first edition. The most important change is the addition of the # and ## operators, which make quotation and concatenation admissible. Some of the new rules, especially those involving concatenation, are bizarre. (See example below.)

For example, this facility may be used for "manifest constants," as in

    #define TABSIZE 100
    int table[TABSIZE];

The definition

    #define ABSDIFF(a, b) ((a)>(b) ? (a)-(b) : (b)-(a))

defines a macro to return the absolute value of the difference between its arguments. Unlike a function to do the same thing, the arguments and returned value may have any
arithmetic type or even be pointers. Also, the arguments, which might have side effects, are evaluated twice, once for the test and once to produce the value.

Given the definition

    #define tempfile(dir) #dir "/%s"

the macro call tempfile(/usr/tmp) yields

    "/usr/tmp" "/%s"

which will subsequently be catenated into a single string. After

    #define cat(x, y) x ## y

the call cat(var, 123) yields var123. However, the call cat(cat(1,2),3) is undefined: the presence of ## prevents the arguments of the outer call from being expanded. Thus it produces the token string

    cat ( 1 , 2 )3

and )3 (the catenation of the last token of the first argument with the first token of the second) is not a legal token. If a second level of macro definition is introduced,

    #define xcat(x,y) cat(x,y)

things work more smoothly; xcat(xcat(1, 2), 3) does produce 123, because the expansion of xcat itself does not involve the ## operator.

Likewise, ABSDIFF(ABSDIFF(a,b),c) produces the expected, fully-expanded result.

A12.4 File Inclusion

A control line of the form

    # include <filename>

causes the replacement of that line by the entire contents of the file filename. The characters in the name filename must not include > or newline, and the effect is undefined if it contains any of ", ', \, or /*. The named file is searched for in a sequence of implementation-dependent places.

Similarly, a control line of the form

    # include "filename"

searches first in association with the original source file (a deliberately implementation-dependent phrase), and if that search fails, then as if in the first form. The effect of using ', \, or /* in the filename remains undefined, but > is permitted.

Finally, a directive of the form

    # include token-sequence

not matching one of the previous forms is interpreted by expanding the token sequence as for normal text; one of the two forms with <...> or "..." must result, and it is then treated as previously described.
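The tempfile, cat, and xcat definitions from §A12.3 and the third form of #include can be tried together in a small sketch. HEADER and the three function names are hypothetical, used only to make the results observable; the sketch assumes a hosted implementation that provides <string.h>.

```c
/* Third form of #include: HEADER is expanded as normal text, and the
   result must have the <...> or "..." form. */
#define HEADER <string.h>
#include HEADER

#define tempfile(dir)  #dir "/%s"
#define cat(x, y)      x ## y
#define xcat(x, y)     cat(x, y)

/* tempfile(/usr/tmp) yields "/usr/tmp" "/%s", which is concatenated
   into the single string "/usr/tmp/%s" (11 characters). */
size_t template_length(void) { return strlen(tempfile(/usr/tmp)); }

/* cat(var, 123) forms the single identifier var123. */
int pasted_identifier(void)
{
    int var123 = 7;
    return cat(var, 123);
}

/* xcat expands its arguments before cat pastes them, so the nested
   call produces the integer constant 123. */
int two_level(void) { return xcat(xcat(1, 2), 3); }
```

The undefined call cat(cat(1,2),3) is deliberately left out, since a conforming compiler is free to reject it.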
#include files may be nested.

A12.5 Conditional Compilation

Parts of a program may be compiled conditionally, according to the following schematic syntax.
    preprocessor-conditional:
        if-line text elif-parts else-part_opt #endif

    if-line:
        # if constant-expression
        # ifdef identifier
        # ifndef identifier

    elif-parts:
        elif-line text
        elif-parts_opt

    elif-line:
        # elif constant-expression

    else-part:
        else-line text

    else-line:
        # else

Each of the directives (if-line, elif-line, else-line, and #endif) appears alone on a line. The constant expressions in #if and subsequent #elif lines are evaluated in order until an expression with a non-zero value is found; text following a line with a zero value is discarded. The text following the successful directive line is treated normally. "Text" here refers to any material, including preprocessor lines, that is not part of the conditional structure; it may be empty. Once a successful #if or #elif line has been found and its text processed, succeeding #elif and #else lines, together with their text, are discarded. If all the expressions are zero, and there is an #else, the text following the #else is treated normally. Text controlled by inactive arms of the conditional is ignored except for checking the nesting of conditionals.

The constant expression in #if and #elif is subject to ordinary macro replacement. Moreover, any expressions of the form

    defined identifier

or

    defined ( identifier )

are replaced, before scanning for macros, by 1L if the identifier is defined in the preprocessor, and by 0L if not. Any identifiers remaining after macro expansion are replaced by 0L. Finally, each integer constant is considered to be suffixed with L, so that all arithmetic is taken to be long or unsigned long.

The resulting constant expression (§A7.19) is restricted: it must be integral, and may not contain sizeof, a cast, or an enumeration constant.

The control lines

    #ifdef identifier
    #ifndef identifier

are equivalent to

    # if defined identifier
    # if ! defined identifier

respectively.

    #elif is new since the first edition, although it has been available in some preprocessors.
The defined preprocessor operator is also new.
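A short sketch of the rules in this section follows. DEBUG_LEVEL, MODE, TRACE, TRACING, and the three functions are hypothetical names chosen for the example; none of them comes from the text.

```c
#define DEBUG_LEVEL 2

/* The expressions are evaluated in order; the first non-zero one
   selects its text and the remaining arms are discarded. */
#if DEBUG_LEVEL >= 3
#define MODE "verbose"
#elif DEBUG_LEVEL == 2
#define MODE "normal"
#else
#define MODE "quiet"
#endif

/* TRACE is never defined, so defined(TRACE) is 0, and the bare
   identifier TRACE left after macro expansion is replaced by 0. */
#if defined(TRACE) || TRACE
#define TRACING 1
#else
#define TRACING 0
#endif

/* Equivalent to: #if !defined TABSIZE */
#ifndef TABSIZE
#define TABSIZE 100
#endif

const char *mode(void) { return MODE; }
int tracing(void)      { return TRACING; }
int tabsize(void)      { return TABSIZE; }
```

With these definitions, mode() gives "normal", tracing() gives 0, and tabsize() gives 100.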
A12.6 Line Control

For the benefit of other preprocessors that generate C programs, a line in one of the forms

    # line constant "filename"
    # line constant

causes the compiler to believe, for purposes of error diagnostics, that the line number of the next source line is given by the decimal integer constant and the current input file is named by the quoted filename. If the quoted filename is absent, the remembered name does not change. Macros in the line are expanded before it is interpreted.

A12.7 Error Generation

A preprocessor line of the form

    # error token-sequence_opt

causes the processor to write a diagnostic message that includes the token sequence.

A12.8 Pragmas

A control line of the form

    # pragma token-sequence_opt

causes the processor to perform an implementation-dependent action. An unrecognized pragma is ignored.

A12.9 Null Directive

A preprocessor line of the form

    #

has no effect.

A12.10 Predefined Names

Several identifiers are predefined, and expand to produce special information. They, and also the preprocessor expression operator defined, may not be undefined or redefined.

    __LINE__    A decimal constant containing the current source line number.
    __FILE__    A string literal containing the name of the file being compiled.
    __DATE__    A string literal containing the date of compilation, in the form "Mmm dd yyyy".
    __TIME__    A string literal containing the time of compilation, in the form "hh:mm:ss".
    __STDC__    The constant 1. It is intended that this identifier be defined to be 1 only in standard-conforming implementations.

    #error and #pragma are new with the ANSI standard; the predefined preprocessor macros are new, but some of them have been available in some implementations.
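The directives of §§A12.6-A12.7 and the predefined names can be exercised with a small sketch. The file name renamed.c and the function names are hypothetical; the sketch assumes a compiler that defines __STDC__.

```c
/* #error stops translation with a diagnostic if the condition holds. */
#ifndef __STDC__
#error "this sketch assumes a standard-conforming compiler"
#endif

/* __DATE__ has the form "Mmm dd yyyy" (11 characters) and __TIME__ the
   form "hh:mm:ss" (8 characters); adjacent literals are concatenated. */
const char *build_stamp(void) { return __DATE__ " " __TIME__; }

/* After this directive the next source line is numbered 500 and the
   remembered file name becomes renamed.c. */
#line 500 "renamed.c"
int line_now(void) { return __LINE__; }            /* this is line 500 */
const char *file_now(void) { return __FILE__; }    /* "renamed.c" */
```

Diagnostics for anything following the #line directive will also be reported against renamed.c, which is the behavior a code generator relies on to point errors back at its own input.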
A13. Grammar

Below is a recapitulation of the grammar that was given throughout the earlier part of this appendix. It has exactly the same content, but is in a different order.

The grammar has undefined terminal symbols integer-constant, character-constant, floating-constant, identifier, string, and enumeration-constant; the typewriter style words and symbols are terminals given literally. This grammar can be transformed mechanically into input acceptable to an automatic parser-generator. Besides adding whatever syntactic marking is used to indicate alternatives in productions, it is necessary to expand the "one of" constructions, and (depending on the rules of the parser-generator) to duplicate each production with an opt symbol, once with the symbol and once without. With one further change, namely deleting the production typedef-name: identifier and making typedef-name a terminal symbol, this grammar is acceptable to the YACC parser-generator. It has only one conflict, generated by the if-else ambiguity.

    translation-unit:
        external-declaration
        translation-unit external-declaration

    external-declaration:
        function-definition
        declaration

    function-definition:
        declaration-specifiers_opt declarator declaration-list_opt compound-statement

    declaration:
        declaration-specifiers init-declarator-list_opt ;

    declaration-list:
        declaration
        declaration-list declaration

    declaration-specifiers:
        storage-class-specifier declaration-specifiers_opt
        type-specifier declaration-specifiers_opt
        type-qualifier declaration-specifiers_opt

    storage-class-specifier: one of
        auto register static extern typedef

    type-specifier: one of
        void char short int long float double signed unsigned
        struct-or-union-specifier enum-specifier typedef-name

    type-qualifier: one of
        const volatile

    struct-or-union-specifier:
        struct-or-union identifier_opt { struct-declaration-list }
        struct-or-union identifier

    struct-or-union: one of
        struct union

    struct-declaration-list:
        struct-declaration
        struct-declaration-list struct-declaration
    init-declarator-list:
        init-declarator
        init-declarator-list , init-declarator

    init-declarator:
        declarator
        declarator = initializer

    struct-declaration:
        specifier-qualifier-list struct-declarator-list ;

    specifier-qualifier-list:
        type-specifier specifier-qualifier-list_opt
        type-qualifier specifier-qualifier-list_opt

    struct-declarator-list:
        struct-declarator
        struct-declarator-list , struct-declarator

    struct-declarator:
        declarator
        declarator_opt : constant-expression

    enum-specifier:
        enum identifier_opt { enumerator-list }
        enum identifier

    enumerator-list:
        enumerator
        enumerator-list , enumerator

    enumerator:
        identifier
        identifier = constant-expression

    declarator:
        pointer_opt direct-declarator

    direct-declarator:
        identifier
        ( declarator )
        direct-declarator [ constant-expression_opt ]
        direct-declarator ( parameter-type-list )
        direct-declarator ( identifier-list_opt )

    pointer:
        * type-qualifier-list_opt
        * type-qualifier-list_opt pointer

    type-qualifier-list:
        type-qualifier
        type-qualifier-list type-qualifier

    parameter-type-list:
        parameter-list
        parameter-list , ...

    parameter-list:
        parameter-declaration
        parameter-list , parameter-declaration
    parameter-declaration:
        declaration-specifiers declarator
        declaration-specifiers abstract-declarator_opt

    identifier-list:
        identifier
        identifier-list , identifier

    initializer:
        assignment-expression
        { initializer-list }
        { initializer-list , }

    initializer-list:
        initializer
        initializer-list , initializer

    type-name:
        specifier-qualifier-list abstract-declarator_opt

    abstract-declarator:
        pointer
        pointer_opt direct-abstract-declarator

    direct-abstract-declarator:
        ( abstract-declarator )
        direct-abstract-declarator_opt [ constant-expression_opt ]
        direct-abstract-declarator_opt ( parameter-type-list_opt )

    typedef-name:
        identifier

    statement:
        labeled-statement
        expression-statement
        compound-statement
        selection-statement
        iteration-statement
        jump-statement

    labeled-statement:
        identifier : statement
        case constant-expression : statement
        default : statement

    expression-statement:
        expression_opt ;

    compound-statement:
        { declaration-list_opt statement-list_opt }

    statement-list:
        statement
        statement-list statement

    selection-statement:
        if ( expression ) statement
        if ( expression ) statement else statement
        switch ( expression ) statement