@c Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc. This is part of the GNU C Intro and Reference Manual and covered by its license. @node Preprocessing @chapter Preprocessing @c man begin DESCRIPTION @cindex preprocessing As the first stage of compiling a C source module, GCC transforms the text with text substitutions and file inclusions. This is called @dfn{preprocessing}. @menu * Preproc Overview:: * Directives:: * Preprocessing Tokens:: * Header Files:: * Macros:: * Conditionals:: * Diagnostics:: * Line Control:: * Null Directive:: @end menu @node Preproc Overview @section Preprocessing Overview GNU C performs preprocessing on each line of a C program as the first stage of compilation. Preprocessing operates on a line only when it contains a @dfn{preprocessing directive} or uses a @dfn{macro}---all other lines pass through preprocessing unchanged. Here are some jobs that preprocessing does. The rest of this chapter gives the details. @itemize @bullet @item Inclusion of header files. These are files (usually containing declarations and macro definitions) that can be substituted into your program. @item Macro expansion. You can define @dfn{macros}, which are abbreviations for arbitrary fragments of C code. Preprocessing replaces the macros with their definitions. Some macros are automatically predefined. @item Conditional compilation. You can include or exclude parts of the program according to various conditions. @item Line control. If you use a program to combine or rearrange source files into an intermediate file that is then compiled, you can use line control to inform the compiler where each source line originally came from. @item Compilation control. @code{#pragma} and @code{_Pragma} invoke some special compiler features in how to handle certain constructs. @item Diagnostics. You can detect problems at compile time and issue errors or warnings. 
@end itemize

Except for expansion of predefined macros, all these operations happen only if you use preprocessing directives to request them.

@node Directives
@section Directives
@cindex directives
@cindex preprocessing directives
@cindex directive line
@cindex directive name

@dfn{Preprocessing directives} are lines in the program that start with @samp{#}. Whitespace is allowed before and after the @samp{#}. The @samp{#} is followed by an identifier, the @dfn{directive name}. It specifies the operation to perform. Here are a few examples:

@example
#define LIMIT 51
# undef LIMIT
# error You screwed up!
@end example

We usually refer to a directive as @code{#@var{name}} where @var{name} is the directive name. For example, @code{#define} means the directive that defines a macro.

The @samp{#} that begins a directive cannot come from a macro expansion. Also, the directive name is not macro expanded. Thus, if @code{foo} is defined as a macro expanding to @code{define}, that does not make @code{#foo} a valid preprocessing directive.

The set of valid directive names is fixed. Programs cannot define new preprocessing directives.

Some directives require arguments; these make up the rest of the directive line and must be separated from the directive name by whitespace. For example, @code{#define} must be followed by a macro name and the intended expansion of the macro.

A preprocessing directive cannot cover more than one line. The line can, however, be continued with backslash-newline, or by a @samp{/*@r{@dots{}}*/}-style comment that extends past the end of the line. These will be replaced (by nothing, or by whitespace) before the directive is processed.

@node Preprocessing Tokens
@section Preprocessing Tokens
@cindex preprocessing tokens

Preprocessing divides C code (minus its comments) into @dfn{tokens} that are similar to C tokens, but not exactly the same. Here are the quirks of preprocessing tokens.
The main classes of preprocessing tokens are identifiers, preprocessing numbers, string constants, character constants, and punctuators; there are a few others too. @table @asis @item identifier @cindex identifiers An @dfn{identifier} preprocessing token is syntactically like an identifier in C: any sequence of letters, digits, or underscores, as well as non-ASCII characters represented using @samp{\U} or @samp{\u}, that doesn't begin with a digit. During preprocessing, the keywords of C have no special significance; at that stage, they are simply identifiers. Thus, you can define a macro whose name is a keyword. The only identifier that is special during preprocessing is @code{defined} (@pxref{defined}). @item preprocessing number @cindex numbers, preprocessing @cindex preprocessing numbers A @dfn{preprocessing number} is something that preprocessing treats textually as a number, including C numeric constants, and other sequences of characters which resemble numeric constants. Preprocessing does not try to verify that a preprocessing number is a valid number in C, and indeed it need not be one. More precisely, preprocessing numbers begin with an optional period, a required decimal digit, and then continue with any sequence of letters, digits, underscores, periods, and exponents. Exponents are the two-character sequences @samp{e+}, @samp{e-}, @samp{E+}, @samp{E-}, @samp{p+}, @samp{p-}, @samp{P+}, and @samp{P-}. (The exponents that begin with @samp{p} or @samp{P} are new to C99. They are used for hexadecimal floating-point constants.) The reason behind this unusual syntactic class is that the full complexity of numeric constants is irrelevant during preprocessing. The distinction between lexically valid and invalid floating-point numbers, for example, doesn't matter at this stage. 
The use of preprocessing numbers makes it possible to split an identifier at any position and get exactly two tokens, and reliably paste them together using the @code{##} operator (@pxref{Concatenation}).

@item punctuator
A @dfn{punctuator} is syntactically like an operator. These are the valid punctuators:

@example
[ ] ( ) @{ @} . ->
++ -- & * + - ~ !
/ % << >> < > <= >= == != ^ | && ||
? : ; ...
= *= /= %= += -= <<= >>= &= ^= |=
, # ##
<: :> <% %> %: %:%:
@end example

@item string constant
A string constant in the source code is recognized by preprocessing as a single preprocessing token.

@item character constant
A character constant in the source code is recognized by preprocessing as a single preprocessing token.

@item header name
Within the @code{#include} directive, preprocessing recognizes a @dfn{header name} token. It consists of @samp{"@var{name}"}, where @var{name} is a sequence of source characters other than newline and @samp{"}, or @samp{<@var{name}>}, where @var{name} is a sequence of source characters other than newline and @samp{>}.

In practice, it is more convenient to think that the @code{#include} line is exempt from tokenization.

@item other
Any other character that's valid in a C source program is treated as a separate preprocessing token.
@end table

Once the program is broken into preprocessing tokens, they remain separate until the end of preprocessing. Macros that generate two consecutive tokens insert whitespace to keep them separate, if necessary. For example,

@example
@group
#define foo() bar
foo()baz
     @expansion{} bar baz
@emph{not}
     @expansion{} barbaz
@end group
@end example

The only exception is with the @code{##} preprocessing operator, which pastes tokens together (@pxref{Concatenation}).

Preprocessing treats the null character (code 0) as whitespace, but generates a warning for it because it may be invisible to the user (many terminals do not display it at all) and its presence in the file is probably a mistake.
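As a small illustration of the rule that tokens stay separate after macro expansion, here is a sketch that uses a hypothetical macro @code{NEG} (invented for this example) expanding to a single @samp{-} token. Writing @code{-NEG v} yields the three tokens @samp{-}, @samp{-} and @code{v}, not the single token @samp{--}, so the function should simply return its argument. The names @code{negate_twice} and @code{check_negate_twice} are likewise assumptions made for the example.

```c
#include <assert.h>

/* NEG is a hypothetical macro used only for this illustration.  */
#define NEG -

/* After expansion the return expression is `- - v': two separate
   minus tokens.  Had the text merely been pasted together it would
   read `--v', a pre-decrement, which is a different expression.  */
int
negate_twice (int v)
{
  return -NEG v;
}

/* Small self-check; call it from main or from a test harness.  */
void
check_negate_twice (void)
{
  assert (negate_twice (5) == 5);
  assert (negate_twice (-3) == -3);
}
```

Running the file through @code{gcc -E} is one way to inspect the separated tokens directly.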
@node Header Files @section Header Files @cindex header file A header file is a file of C code, typically containing C declarations and macro definitions (@pxref{Macros}), to be shared between several source files. You request the use of a header file in your program by @dfn{including} it, with the C preprocessing directive @code{#include}. Header files serve two purposes. @itemize @bullet @item @cindex system header files System header files declare the interfaces to parts of the operating system. You include them in your program to supply the definitions and declarations that you need to invoke system calls and libraries. @item Program-specific header files contain declarations for interfaces between the source files of a particular program. It is a good idea to create a header file for related declarations and macro definitions if all or most of them are needed in several different source files. @end itemize Including a header file produces the same results as copying the header file into each source file that needs it. Such copying would be time-consuming and error-prone. With a header file, the related declarations appear in only one place. If they need to be changed, you can change them in one place, and programs that include the header file will then automatically use the new version when next recompiled. The header file eliminates the labor of finding and changing all the copies as well as the risk that a failure to change one copy will result in inconsistencies within a program. In C, the usual convention is to give header files names that end with @file{.h}. It is most portable to use only letters, digits, dashes, and underscores in header file names, and at most one dot. The operation of including another source file isn't actually limited to the sort of code we put into header files. You can put any sort of C code into a separate file, then use @code{#include} to copy it virtually into other C source files. But that is a strange thing to do. 
@menu
* include Syntax::
* include Operation::
* Search Path::
* Once-Only Headers::
@c * Alternatives to Wrapper #ifndef::
* Computed Includes::
@c * Wrapper Headers::
@c * System Headers::
@end menu

@node include Syntax
@subsection @code{#include} Syntax
@findex #include

You can specify inclusion of user and system header files with the preprocessing directive @code{#include}. It has two variants:

@table @code
@item #include <@var{file}>
This variant is used for system header files. It searches for a file named @var{file} in a standard list of system directories. You can prepend directories to this list with the @option{-I} option (@pxref{Invocation, Invoking GCC, Invoking GCC, gcc, Using the GNU Compiler Collection}).

@item #include "@var{file}"
This variant is used for header files of your own program. It searches for a file named @var{file} first in the directory containing the current file, then in the quote directories, then the same directories used for @code{<@var{file}>}. You can prepend directories to the list of quote directories with the @option{-iquote} option.
@end table

The argument of @code{#include}, whether delimited with quote marks or angle brackets, behaves like a string constant in that comments are not recognized, and macro names are not expanded. Thus, @code{@w{#include <x/*y>}} specifies inclusion of a system header file named @file{x/*y}.

However, if backslashes occur within @var{file}, they are considered ordinary text characters, not escape characters: character escape sequences such as used in string constants in C are not meaningful here. Thus, @code{@w{#include "x\n\\y"}} specifies a filename containing three backslashes. By the same token, there is no way to escape @samp{"} or @samp{>} to include it in the header file name if it would instead end the file name.

Some systems interpret @samp{\} as a file name component separator. All these systems also interpret @samp{/} the same way. It is most portable to use only @samp{/}.
It is an error to put anything other than comments on the @code{#include} line after the file name.

@node include Operation
@subsection @code{#include} Operation

The @code{#include} directive works by scanning the specified header file as input before continuing with the rest of the current file. The result of preprocessing consists of the text already generated, followed by the result of preprocessing the included file, followed by whatever results from the text after the @code{#include} directive. For example, if you have a header file @file{header.h} as follows,

@example
char *test (void);
@end example

@noindent
and a main program called @file{program.c} that uses the header file, like this,

@example
int x;
#include "header.h"

int
main (void)
@{
  puts (test ());
@}
@end example

@noindent
the result is equivalent to putting this text in @file{program.c}:

@example
int x;
char *test (void);

int
main (void)
@{
  puts (test ());
@}
@end example

Included files are not limited to declarations and macro definitions; those are merely the typical uses. Any fragment of a C program can be included from another file. The include file could even contain the beginning of a statement that is concluded in the containing file, or the end of a statement that was started in the including file. However, an included file must consist of complete tokens. Comments and string literals that have not been closed by the end of an included file are invalid. For error recovery, the compiler terminates them at the end of the file.

To avoid confusion, it is best if header files contain only complete syntactic units---function declarations or definitions, type declarations, etc.

The line following the @code{#include} directive is always treated as a separate line, even if the included file lacks a final newline. There is no problem putting a preprocessing directive there.

@node Search Path
@subsection Search Path

GCC looks in several different places for header files to be included.
On the GNU system, and Unix systems, the default directories for system header files are:

@example
@var{libdir}/gcc/@var{target}/@var{version}/include
/usr/local/include
@var{libdir}/gcc/@var{target}/@var{version}/include-fixed
@var{libdir}/@var{target}/include
/usr/include/@var{target}
/usr/include
@end example

@noindent
The list may be different in some operating systems. Other directories are added for C++.

In the above, @var{target} is the canonical name of the system GCC was configured to compile code for; often but not always the same as the canonical name of the system it runs on. @var{version} is the version of GCC in use.

You can add to this list with the @option{-I@var{dir}} command-line option. All the directories named by @option{-I} are searched, in left-to-right order, @emph{before} the default directories. The only exception is when @file{dir} is already searched by default. In this case, the option is ignored and the search order for system directories remains unchanged.

Duplicate directories are removed from the quote and bracket search chains before the two chains are merged to make the final search chain. Thus, it is possible for a directory to occur twice in the final search chain if it was specified in both the quote and bracket chains.

You can prevent GCC from searching any of the default directories with the @option{-nostdinc} option. This is useful when you are compiling an operating system kernel or some other program that does not use the standard C library facilities, or the standard C library itself. @option{-I} options are not ignored as described above when @option{-nostdinc} is in effect.

GCC looks for headers requested with @code{@w{#include "@var{file}"}} first in the directory containing the current file, then in the @dfn{quote directories} specified by @option{-iquote} options, then in the same places it looks for a system header.
For example, if @file{/usr/include/sys/stat.h} contains @code{@w{#include "types.h"}}, GCC looks for @file{types.h} first in @file{/usr/include/sys}, then in the quote directories and then in its usual search path.

@code{#line} (@pxref{Line Control}) does not change GCC's idea of the directory containing the current file.

@cindex quote directories
The @option{-I-} option is an old-fashioned, deprecated way to specify the quote directories. To look for headers in a directory named @file{-}, specify @option{-I./-}. There are several more ways to adjust the header search path. @xref{Invocation, Invoking GCC, Invoking GCC, gcc, Using the GNU Compiler Collection}.

@node Once-Only Headers
@subsection Once-Only Headers
@cindex repeated inclusion
@cindex including just once
@cindex wrapper @code{#ifndef}

If a header file happens to be included twice, the compiler will process its contents twice. This is very likely to cause an error, e.g.@: when the compiler sees the same structure definition twice.

The standard way to prevent this is to enclose the entire real contents of the file in a conditional, like this:

@example
@group
/* File foo.  */
#ifndef FILE_FOO_SEEN
#define FILE_FOO_SEEN

@var{the entire file}

#endif /* !FILE_FOO_SEEN */
@end group
@end example

This construct is commonly known as a @dfn{wrapper #ifndef}. When the header is included again, the conditional will be false, because @code{FILE_FOO_SEEN} is defined. Preprocessing skips over the entire contents of the file, so that compilation will never ``see'' the file contents twice in one module.

GCC optimizes this case even further. It remembers when a header file has a wrapper @code{#ifndef}. If a subsequent @code{#include} specifies that header, and the macro in the @code{#ifndef} is still defined, it does not bother to rescan the file at all.

You can put comments in the header file outside the wrapper. They do not interfere with this optimization.
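To see the wrapper at work in a single self-contained file, the following sketch pastes the contents of a hypothetical header @file{point.h} twice, as two @code{#include "point.h"} directives would. The names @code{POINT_H_SEEN}, @code{struct point}, @code{point_sum} and @code{check_point_sum} are all invented for this illustration. Without the wrapper, the second copy would define @code{point_sum} a second time, which the compiler rejects; with it, the second copy is skipped entirely.

```c
#include <assert.h>

/* ---- first copy of the hypothetical header point.h ---- */
#ifndef POINT_H_SEEN
#define POINT_H_SEEN

struct point { int x, y; };

static int
point_sum (struct point p)
{
  return p.x + p.y;
}

#endif /* !POINT_H_SEEN */

/* ---- second copy, as if the header were included again ---- */
#ifndef POINT_H_SEEN      /* false now, so everything below is skipped */
#define POINT_H_SEEN

struct point { int x, y; };

static int
point_sum (struct point p)
{
  return p.x + p.y;
}

#endif /* !POINT_H_SEEN */

/* Small self-check; call it from main or from a test harness.  */
void
check_point_sum (void)
{
  struct point p = { 2, 3 };
  assert (point_sum (p) == 5);
}
```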
@cindex controlling macro
@cindex guard macro
The macro @code{FILE_FOO_SEEN} is called the @dfn{controlling macro} or @dfn{guard macro}. In a user header file, the macro name should not begin with @samp{_}. In a system header file, it should begin with @samp{__} (or @samp{_} followed by an upper-case letter) to avoid conflicts with user programs. In any kind of header file, the macro name should contain the name of the file and some additional text, to avoid conflicts with other header files.

@node Computed Includes
@subsection Computed Includes
@cindex computed includes
@cindex macros in include

Sometimes it is necessary to select one of several different header files to be included into your program. They might specify configuration parameters to be used on different sorts of operating systems, for instance. You could do this with a series of conditionals,

@example
#if SYSTEM_1
# include "system_1.h"
#elif SYSTEM_2
# include "system_2.h"
#elif SYSTEM_3
/* @r{@dots{}} */
#endif
@end example

That rapidly becomes tedious. Instead, GNU C offers the ability to use a macro for the header name. This is called a @dfn{computed include}. Instead of writing a header name as the direct argument of @code{#include}, you simply put a macro name there:

@example
#define SYSTEM_H "system_1.h"
/* @r{@dots{}} */
#include SYSTEM_H
@end example

@noindent
@code{SYSTEM_H} is expanded, then @file{system_1.h} is included as if the @code{#include} had been written with that name. @code{SYSTEM_H} could be defined by your Makefile with a @option{-D} option.

You must be careful when you define such a macro. @code{#define} saves tokens, not text. GCC has no way of knowing that the macro will be used as the argument of @code{#include}, so it generates ordinary tokens, not a header name. This is unlikely to cause problems if you use double-quote includes, which are syntactically similar to string constants. If you use angle brackets, however, you may have trouble.
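Here is a minimal sketch of a computed include that can be compiled on its own. The macro name @code{CONFIG_HEADER} and the function names are hypothetical, chosen for this example. It assumes there is no file called @file{limits.h} in the directory of the source file or in the quote directories, so that the double-quoted form falls through to the standard header of that name.

```c
#include <assert.h>

/* CONFIG_HEADER is a hypothetical name.  A Makefile could define it
   with a -D option; otherwise this default is used.  */
#ifndef CONFIG_HEADER
# define CONFIG_HEADER "limits.h"
#endif

/* The macro expands to a single string constant, so this behaves
   like `#include "limits.h"'.  */
#include CONFIG_HEADER

/* INT_MAX is available only because the computed include worked.  */
int
int_max_via_computed_include (void)
{
  return INT_MAX;
}

/* Small self-check; call it from main or from a test harness.  */
void
check_computed_include (void)
{
  assert (int_max_via_computed_include () >= 32767);
}
```

Using a string constant rather than angle brackets keeps the example within the form that is least likely to cause trouble.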
The syntax of a computed include is actually a bit more general than the above. If the first non-whitespace character after @code{#include} is not @samp{"} or @samp{<}, then the entire line is macro-expanded like running text would be.

If the line expands to a single string constant, the contents of that string constant are the file to be included. Preprocessing does not re-examine the string for embedded quotes, but neither does it process backslash escapes in the string. Therefore

@example
#define HEADER "a\"b"
#include HEADER
@end example

@noindent
looks for a file named @file{a\"b}. Preprocessing searches for the file according to the rules for double-quoted includes.

If the line expands to a token stream beginning with a @samp{<} token and including a @samp{>} token, then the tokens between the @samp{<} and the first @samp{>} are combined to form the filename to be included. Any whitespace between tokens is reduced to a single space; then any space after the initial @samp{<} is retained, but a trailing space before the closing @samp{>} is ignored. Preprocessing searches for the file according to the rules for angle-bracket includes.

In either case, if there are any tokens on the line after the file name, an error occurs and the directive is not processed. It is also an error if the result of expansion does not match either of the two expected forms.

These rules are implementation-defined behavior according to the C standard. To minimize the risk of different compilers interpreting your computed includes differently, we recommend you use only a single object-like macro that expands to a string constant. That also makes it clear to people reading your program.

@node Macros
@section Macros
@cindex macros

A @dfn{macro} is a fragment of code that has been given a name. Whenever the name is used, it is replaced by the contents of the macro. There are two kinds of macros. They differ mostly in what they look like when they are used.
@dfn{Object-like} macros resemble data objects when used; @dfn{function-like} macros resemble function calls.

You may define any valid identifier as a macro, even if it is a C keyword. In the preprocessing stage, GCC does not know anything about keywords. This can be useful if you wish to hide a keyword such as @code{const} from an older compiler that does not understand it. However, the preprocessing operator @code{defined} (@pxref{defined}) can never be defined as a macro, and C@code{++}'s named operators (@pxref{C++ Named Operators, C++ Named Operators, C++ Named Operators, gcc, Using the GNU Compiler Collection}) cannot be macros when compiling C@code{++} code.

The operator @code{#} is used in macros for stringification of an argument (@pxref{Stringification}), and @code{##} is used for concatenation of arguments into larger tokens (@pxref{Concatenation}).

@menu
* Object-like Macros::
* Function-like Macros::
@c * Macro Pragmas::
* Macro Arguments::
* Stringification::
* Concatenation::
* Variadic Macros::
* Predefined Macros::
* Undefining and Redefining Macros::
* Directives Within Macro Arguments::
* Macro Pitfalls::
@end menu

@node Object-like Macros
@subsection Object-like Macros
@cindex object-like macro
@cindex symbolic constants
@cindex manifest constants

An @dfn{object-like macro} is a simple identifier that will be replaced by a code fragment. It is called object-like because in most cases the use of the macro looks like a reference to a data object in code that uses it. These macros are most commonly used to give symbolic names to numeric constants.

@findex #define
The way to define macros is with the @code{#define} directive. @code{#define} is followed by the name of the macro and then the token sequence it should be an abbreviation for, which is variously referred to as the macro's @dfn{body}, @dfn{expansion} or @dfn{replacement list}.
For example,

@example
#define BUFFER_SIZE 1024
@end example

@noindent
defines a macro named @code{BUFFER_SIZE} as an abbreviation for the token @code{1024}. If somewhere after this @code{#define} directive there comes a C statement of the form

@example
foo = (char *) malloc (BUFFER_SIZE);
@end example

@noindent
then preprocessing will recognize and @dfn{expand} the macro @code{BUFFER_SIZE}, so that compilation will see the tokens:

@example
foo = (char *) malloc (1024);
@end example

By convention, macro names are written in upper case. Programs are easier to read when it is possible to tell at a glance which names are macros. Macro names that start with @samp{__} are reserved for internal uses, and many of them are defined automatically, so don't define such macro names unless you really know what you're doing. Likewise for macro names that start with @samp{_} and an upper-case letter.

The macro's body ends at the end of the @code{#define} line. You may continue the definition onto multiple lines, if necessary, using backslash-newline. When the macro is expanded, however, it will all come out on one line. For example,

@example
#define NUMBERS 1, \
                2, \
                3
int x[] = @{ NUMBERS @};
     @expansion{} int x[] = @{ 1, 2, 3 @};
@end example

@noindent
The most common visible consequence of this is surprising line numbers in error messages.

There is no restriction on what can go in a macro body provided it decomposes into valid preprocessing tokens. Parentheses need not balance, and the body need not resemble valid C code. (If it does not, you may get error messages from the C compiler when you use the macro.)

Preprocessing scans the program sequentially. A macro definition takes effect right after its appearance.
Therefore, the following input

@example
foo = X;
#define X 4
bar = X;
@end example

@noindent
produces

@example
foo = X;
bar = 4;
@end example

When preprocessing expands a macro name, the macro's expansion replaces the macro invocation, then the expansion is examined for more macros to expand. For example,

@example
@group
#define TABLESIZE BUFSIZE
#define BUFSIZE 1024
TABLESIZE
     @expansion{} BUFSIZE
     @expansion{} 1024
@end group
@end example

@noindent
@code{TABLESIZE} is expanded first to produce @code{BUFSIZE}, then that macro is expanded to produce the final result, @code{1024}.

Notice that @code{BUFSIZE} was not defined when @code{TABLESIZE} was defined. The @code{#define} for @code{TABLESIZE} uses exactly the expansion you specify---in this case, @code{BUFSIZE}---and does not check to see whether it too contains macro names. Only when you @emph{use} @code{TABLESIZE} is the result of its expansion scanned for more macro names.

This makes a difference if you change the definition of @code{BUFSIZE} at some point in the source file. @code{TABLESIZE}, defined as shown, will always expand using the definition of @code{BUFSIZE} that is currently in effect:

@example
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE
#undef BUFSIZE
#define BUFSIZE 37
@end example

@noindent
Now @code{TABLESIZE} expands (in two stages) to @code{37}.

If the expansion of a macro contains its own name, either directly or via intermediate macros, it is not expanded again when the expansion is examined for more macros. This prevents infinite recursion. @xref{Self-Referential Macros}, for the precise details.

@node Function-like Macros
@subsection Function-like Macros
@cindex function-like macros

You can also define macros whose use looks like a function call. These are called @dfn{function-like macros}. To define one, use the @code{#define} directive with a pair of parentheses immediately after the macro name.
For example,

@example
#define lang_init()  c_init ()
lang_init ()
     @expansion{} c_init ()
lang_init    ()
     @expansion{} c_init ()
lang_init()
     @expansion{} c_init ()
@end example

There must be no space between the macro name and the following open-parenthesis in the @code{#define} directive; that's what indicates you're defining a function-like macro. However, you can add unnecessary whitespace around the open-parenthesis (and around the close-parenthesis) when you @emph{call} the macro; it doesn't change anything.

A function-like macro is expanded only when its name appears with a pair of parentheses after it. If you write just the name, without parentheses, it is left alone. This can be useful when you have a function and a macro of the same name, and you wish to use the function sometimes. Whitespace and line breaks before or between the parentheses are ignored when the macro is called.

@example
extern void foo(void);
#define foo() /* @r{optimized inline version} */
/* @r{@dots{}} */
  foo();
  funcptr = foo;
@end example

Here the call to @code{foo()} expands the macro, but the function pointer @code{funcptr} gets the address of the real function @code{foo}. If the macro were to be expanded there, it would cause a syntax error.

If you put spaces between the macro name and the parentheses in the macro definition, that does not define a function-like macro; it defines an object-like macro whose expansion happens to begin with a pair of parentheses. Here is an example:

@example
#define lang_init ()    c_init()
lang_init()
     @expansion{} () c_init()()
@end example

The first two pairs of parentheses in this expansion come from the macro. The third is the pair that was originally after the macro invocation. Since @code{lang_init} is an object-like macro, it does not consume those parentheses.

Any name can have at most one macro definition at a time. Thus, you can't define the same name as an object-like macro and a function-like macro at once.
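The interplay between a function and a function-like macro of the same name can be sketched in compilable form as follows. The names @code{square}, @code{calls}, @code{square_by_macro}, @code{square_by_pointer} and @code{check_square} are invented for this illustration. The counter shows which of the two was actually used: the macro never touches it, while the real function increments it.

```c
#include <assert.h>

static int calls;   /* counts calls to the real function */

int
square (int x)
{
  calls++;
  return x * x;
}

/* A function-like macro with the same name, defined after the
   function so that the function definition itself is unaffected.  */
#define square(x) ((x) * (x))

/* The name is followed by parentheses: the macro is expanded.  */
int
square_by_macro (int v)
{
  return square (v);
}

/* The name is not followed by parentheses: it is left alone and
   refers to the real function.  */
int
square_by_pointer (int v)
{
  int (*fp) (int) = square;
  return fp (v);
}

/* Small self-check; call it from main or from a test harness.  */
void
check_square (void)
{
  calls = 0;
  assert (square_by_macro (3) == 9 && calls == 0);
  assert (square_by_pointer (3) == 9 && calls == 1);
}
```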
@node Macro Arguments
@subsection Macro Arguments
@cindex arguments
@cindex macros with arguments
@cindex arguments in macro definitions

Function-like macros can take @dfn{arguments}, just like true functions. To define a macro that uses arguments, you insert @dfn{parameters} between the pair of parentheses in the macro definition that make the macro function-like. The parameters must be valid C identifiers, separated by commas and optionally whitespace.

To invoke a macro that takes arguments, you write the name of the macro followed by a list of @dfn{actual arguments} in parentheses, separated by commas. The invocation of the macro need not be restricted to a single logical line---it can cross as many lines in the source file as you wish. The number of arguments you give must match the number of parameters in the macro definition. When the macro is expanded, each use of a parameter in its body is replaced by the tokens of the corresponding argument. (The macro body is not required to use all of the parameters.)

As an example, here is a macro that computes the minimum of two numeric values, as it is defined in many C programs, and some uses.

@example
#define min(X, Y)  ((X) < (Y) ? (X) : (Y))
  x = min(a, b);      @expansion{} x = ((a) < (b) ? (a) : (b));
  y = min(1, 2);      @expansion{} y = ((1) < (2) ? (1) : (2));
  z = min(a+28, *p);  @expansion{} z = ((a+28) < (*p) ? (a+28) : (*p));
@end example

@noindent
In this small example you can already see several of the dangers of macro arguments. @xref{Macro Pitfalls}, for detailed explanations.

Leading and trailing whitespace in each argument is dropped, and all whitespace between the tokens of an argument is reduced to a single space. Parentheses within each argument must balance; a comma within such parentheses does not end the argument. However, there is no requirement for square brackets or braces to balance, and they do not prevent a comma from separating arguments.
Thus,

@example
macro (array[x = y, x + 1])
@end example

@noindent
passes two arguments to @code{macro}: @code{array[x = y} and @code{x + 1]}. If you want to supply @code{array[x = y, x + 1]} as an argument, you can write it as @code{array[(x = y, x + 1)]}, which is equivalent C code. However, putting an assignment inside an array subscript is to be avoided anyway.

All arguments to a macro are completely macro-expanded before they are substituted into the macro body. After substitution, the complete text is scanned again for macros to expand, including the arguments. This rule may seem strange, but it is carefully designed so you need not worry about whether any function call is actually a macro invocation. You can run into trouble if you try to be too clever, though. @xref{Argument Prescan}, for detailed discussion.

For example, @code{min (min (a, b), c)} is first expanded to

@example
min (((a) < (b) ? (a) : (b)), (c))
@end example

@noindent
and then to

@example
@group
((((a) < (b) ? (a) : (b))) < (c)
 ? (((a) < (b) ? (a) : (b)))
 : (c))
@end group
@end example

@noindent
(The line breaks shown here for clarity are not actually generated.)

@cindex empty macro arguments
You can leave macro arguments empty without error, but many macros will then expand to invalid code. You cannot leave out arguments entirely; if a macro takes two arguments, there must be exactly one comma at the top level of its argument list. Here are some silly examples using @code{min}:

@smallexample
min(, b)   @expansion{} (( ) < (b) ? ( ) : (b))
min(a, )   @expansion{} ((a ) < ( ) ? (a ) : ( ))
min(,)     @expansion{} (( ) < ( ) ? ( ) : ( ))
min((,),)  @expansion{} (((,)) < ( ) ? ((,)) : ( ))

min()      @error{} macro "min" requires 2 arguments, but only 1 given
min(,,)    @error{} macro "min" passed 3 arguments, but takes just 2
@end smallexample

Whitespace is not a preprocessing token, so if a macro @code{foo} takes one argument, @code{@w{foo ()}} and @code{@w{foo ( )}} both supply it an empty argument.
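The argument rules above can be exercised with a short compilable sketch. It reuses the @code{min} macro from this section; the helper names @code{min3}, @code{min_of_successor} and @code{check_min} are assumptions made for the example. The first helper nests one invocation inside another, and the second passes an argument containing a comma protected by parentheses.

```c
#include <assert.h>

#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

/* A nested invocation: the inner min is expanded as part of the
   first argument of the outer one.  */
int
min3 (int a, int b, int c)
{
  return min (min (a, b), c);
}

/* The comma sits inside parentheses, so it does not separate
   arguments; min still receives exactly two.  The first argument
   is the comma expression (t = a, t + 1), whose value is a + 1.  */
int
min_of_successor (int a, int b)
{
  int t;
  return min ((t = a, t + 1), b);
}

/* Small self-check; call it from main or from a test harness.  */
void
check_min (void)
{
  assert (min3 (4, 2, 9) == 2);
  assert (min3 (7, 8, 1) == 1);
  assert (min_of_successor (1, 5) == 2);
  assert (min_of_successor (9, 5) == 5);
}
```

Note that each argument appears twice in the expansion, so the comma expression in the second helper may be evaluated twice; that is harmless here but is one of the hazards of such macros.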
@ignore @c How long ago was this? Previous GNU preprocessor implementations and documentation were incorrect on this point, insisting that a function-like macro that takes a single argument be passed a space if an empty argument was required. @end ignore Macro parameters appearing inside string literals are not replaced by their corresponding actual arguments. @example #define foo(x) x, "x" foo(bar) @expansion{} bar, "x" @end example @noindent See the next subsection for how to insert macro arguments into a string literal. The token following the macro call and the last token of the macro expansion do not become one token even if it looks like they could: @example #define foo() abc foo()def @expansion{} abc def @end example @node Stringification @subsection Stringification @cindex stringification @cindex @code{#} operator Sometimes you may want to convert a macro argument into a string constant. Parameters are not replaced inside string constants, but you can use the @code{#} preprocessing operator instead. When a macro parameter is used with a leading @code{#}, preprocessing replaces it with the literal text of the actual argument, converted to a string constant. Unlike normal parameter replacement, the argument is not macro-expanded first. This is called @dfn{stringification}. There is no way to combine an argument with surrounding text and stringify it all together. But you can write a series of string constants and stringified arguments. After preprocessing replaces the stringified arguments with string constants, the consecutive string constants will be concatenated into one long string constant (@pxref{String Constants}). 
Here is an example that uses stringification and concatenation of string constants: @example @group #define WARN_IF(EXP) \ do @{ if (EXP) \ fprintf (stderr, "Warning: " #EXP "\n"); @} \ while (0) WARN_IF (x == 0); @expansion{} do @{ if (x == 0) fprintf (stderr, "Warning: " "x == 0" "\n"); @} while (0); @end group @end example @noindent The argument for @code{EXP} is substituted once, as is, into the @code{if} statement, and once, stringified, into the argument to @code{fprintf}. If @code{x} were a macro, it would be expanded in the @code{if} statement but not in the string. The @code{do} and @code{while (0)} are a kludge to make it possible to write @code{WARN_IF (@var{arg});}. The resemblance of @code{WARN_IF} to a function makes that a natural way to write it. @xref{Swallowing the Semicolon}. Stringification in C involves more than putting double-quote characters around the fragment. It also backslash-escapes the quotes surrounding embedded string constants, and all backslashes within string and character constants, in order to get a valid C string constant with the proper contents. Thus, stringifying @code{@w{p = "foo\n";}} results in @t{@w{"p = \"foo\\n\";"}}. However, backslashes that are not inside string or character constants are not duplicated: @samp{\n} by itself stringifies to @t{"\n"}. All leading and trailing whitespace in text being stringified is ignored. Any sequence of whitespace in the middle of the text is converted to a single space in the stringified result. Comments are replaced by whitespace long before stringification happens, so they never appear in stringified text. There is no way to convert a macro argument into a character constant. 
To stringify the result of expansion of a macro argument, you have to use two levels of macros, like this: @example #define xstr(S) str(S) #define str(s) #s #define foo 4 str (foo) @expansion{} "foo" xstr (foo) @expansion{} xstr (4) @expansion{} str (4) @expansion{} "4" @end example @code{s} is stringified when it is used in @code{str}, so it is not macro-expanded first. But @code{S} is an ordinary argument to @code{xstr}, so it is completely macro-expanded before @code{xstr} itself is expanded (@pxref{Argument Prescan}). Therefore, by the time @code{str} gets to its argument text, that text has already been macro-expanded. @node Concatenation @subsection Concatenation @cindex concatenation @cindex token pasting @cindex token concatenation @cindex @code{##} operator It is often useful to merge two tokens into one while expanding macros. This is called @dfn{token pasting} or @dfn{token concatenation}. The @code{##} preprocessing operator performs token pasting. When a macro is expanded, the two tokens on either side of each @code{##} operator are combined into a single token, which then replaces the @code{##} and the two original tokens in the macro expansion. Usually both will be identifiers, or one will be an identifier and the other a preprocessing number. When pasted, they make a longer identifier. Concatenation into an identifier isn't the only valid case. It is also possible to concatenate two numbers (or a number and a name, such as @code{1.5} and @code{e3}) into a number. Also, multi-character operators such as @code{+=} can be formed by token pasting. However, two tokens that don't together form a valid token cannot be pasted together. For example, you cannot concatenate @code{x} with @code{+} in either order. Trying this issues a warning and keeps the two tokens separate. Whether it puts white space between the tokens is undefined. It is common to find unnecessary uses of @code{##} in complex macros.
If you get this warning, it is likely that you can simply remove the @code{##}. The tokens combined by @code{##} could both come from the macro body, but then you could just as well write them as one token in the first place. Token pasting is useful when one or both of the tokens comes from a macro argument. If either of the tokens next to an @code{##} is a parameter name, it is replaced by its actual argument before @code{##} executes. As with stringification, the actual argument is not macro-expanded first. If the argument is empty, that @code{##} has no effect. Keep in mind that preprocessing converts comments to whitespace before it looks for uses of macros. Therefore, you cannot create a comment by concatenating @samp{/} and @samp{*}. You can put as much whitespace between @code{##} and its operands as you like, including comments, and you can put comments in arguments that will be concatenated. It is an error to use @code{##} at the beginning or end of a macro body. Multiple @code{##} operators are handled left-to-right, so that @samp{1 ## e ## -2} pastes into @samp{1e-2}. (Right-to-left processing would first generate @samp{e-2}, which is an invalid token.) When @code{#} and @code{##} are used together, they are all handled left-to-right. Consider a C program that interprets named commands. There probably needs to be a table of commands, perhaps an array of structures declared as follows: @example @group struct command @{ char *name; void (*function) (void); @}; @end group @group struct command commands[] = @{ @{ "quit", quit_command @}, @{ "help", help_command @}, /* @r{@dots{}} */ @}; @end group @end example It would be cleaner not to have to write each command name twice, once in the string constant and once in the function name. A macro that takes the name of a command as an argument can make this unnecessary. It can create the string constant with stringification, and the function name by concatenating the argument with @samp{_command}. 
Here is how it is done: @example #define COMMAND(NAME) @{ #NAME, NAME ## _command @} struct command commands[] = @{ COMMAND (quit), COMMAND (help), /* @r{@dots{}} */ @}; @end example @node Variadic Macros @subsection Variadic Macros @cindex variable number of arguments @cindex macros with variable arguments @cindex variadic macros A macro can be declared to accept a variable number of arguments much as a function can. The syntax for defining the macro is similar to that of a function. Here is an example: @example #define eprintf(@dots{}) fprintf (stderr, __VA_ARGS__) @end example This kind of macro is called @dfn{variadic}. When the macro is invoked, all the tokens in its argument list after the last named argument (this macro has none), including any commas, become the @dfn{variable argument}. This sequence of tokens replaces the identifier @code{@w{__VA_ARGS__}} in the macro body wherever it appears. Thus, we have this expansion: @example eprintf ("%s:%d: ", input_file, lineno) @expansion{} fprintf (stderr, "%s:%d: ", input_file, lineno) @end example The variable argument is completely macro-expanded before it is inserted into the macro expansion, just like an ordinary argument. You may use the @code{#} and @code{##} operators to stringify the variable argument or to paste its leading or trailing token with another token. (But see below for an important special case for @code{##}.) @strong{Warning:} don't use the identifier @code{@w{__VA_ARGS__}} for anything other than this. If your macro is complicated, you may want a more descriptive name for the variable argument than @code{@w{__VA_ARGS__}}. 
You can write an argument name immediately before the @samp{@dots{}}; that name is used for the variable argument.@footnote{GNU C extension.} The @code{eprintf} macro above could be written thus: @example #define eprintf(args@dots{}) fprintf (stderr, args) @end example A variadic macro can have named arguments as well as variable arguments, so @code{eprintf} can be defined like this, instead: @example #define eprintf(format, @dots{}) \ fprintf (stderr, format, __VA_ARGS__) @end example @noindent This formulation is more descriptive, but what if you want to specify a format string that takes no arguments? In GNU C, you can omit the comma before the variable arguments if they are empty, but that puts an extra comma in the expansion: @example eprintf ("success!\n") @expansion{} fprintf(stderr, "success!\n", ) @end example @noindent That's an error in the call to @code{fprintf}. To get rid of that comma, the @code{##} token paste operator has a special meaning when placed between a comma and a variable argument.@footnote{GNU C extension.} If you write @example #define eprintf(format, @dots{}) \ fprintf (stderr, format, ##__VA_ARGS__) @end example @noindent then use the macro @code{eprintf} with empty variable arguments, @code{##} deletes the preceding comma. @example eprintf ("success!\n") @expansion{} fprintf(stderr, "success!\n") @end example @noindent This does @emph{not} happen if you pass an empty argument, nor does it happen if the token preceding @code{##} is anything other than a comma. @noindent When the only macro parameter is a variable arguments parameter, and the macro call has no argument at all, it is not obvious whether that means an empty argument or a missing argument. Should the comma be kept, or deleted? The C standard says to keep the comma, but the preexisting GNU C extension deleted the comma. Nowadays, GNU C retains the comma when implementing a specific C standard, and deletes it otherwise. 
C99 mandates that the only place the identifier @code{@w{__VA_ARGS__}} can appear is in the replacement list of a variadic macro. It may not be used as a macro name, macro parameter name, or within a different type of macro. It may also be forbidden in open text; the standard is ambiguous. We recommend you avoid using that name except for its special purpose. Variadic macros in which you specify the parameter name are a GNU C feature that has been supported for a long time. Standard C, as of C99, supports only the form where the parameter is called @code{@w{__VA_ARGS__}}. For portability to previous versions of GNU C you should use only named variable argument parameters. On the other hand, for portability to other C99 compilers, you should use only @code{@w{__VA_ARGS__}}. @node Predefined Macros @subsection Predefined Macros @cindex predefined macros Several object-like macros are predefined; you use them without supplying their definitions. Here we explain the ones user programs often need to use. Many other macro names starting with @samp{__} are predefined; in general, you should not define such macro names yourself. @table @code @item __FILE__ This macro expands to the name of the current input file, in the form of a C string constant. This is the full name by which GCC opened the file, not the short name specified in @code{#include} or as the input file name argument. For example, @code{"/usr/local/include/myheader.h"} is a possible expansion of this macro. @item __LINE__ This macro expands to the current input line number, in the form of a decimal integer constant. While we call it a predefined macro, it's a pretty strange macro, since its ``definition'' changes with each new line of source code. @item __func__ @itemx __FUNCTION__ These names are like variables that have as value a string containing the name of the current function definition. They are not really macros, but this is the best place to mention them.
@code{__FUNCTION__} is the name that has been defined in GNU C since time immemorial; @code{__func__} is defined by the C standard. With the following conditionals, you can use whichever one is defined. @example #if __STDC_VERSION__ < 199901L # if __GNUC__ >= 2 # define __func__ __FUNCTION__ # else # define __func__ "" # endif #endif @end example @item __PRETTY_FUNCTION__ This is equivalent to @code{__FUNCTION__} in C, but in C@code{++} the string includes argument type information as well. It is a GNU C extension. @end table Those features are useful in generating an error message to report an inconsistency detected by the program; the message can state the source line where the inconsistency was detected. For example, @example fprintf (stderr, "Internal error: " "negative string length " "in function %s " "%d at %s, line %d.", __func__, length, __FILE__, __LINE__); @end example A @code{#line} directive changes @code{__LINE__}, and may change @code{__FILE__} as well. @xref{Line Control}. @table @code @item __DATE__ This macro expands to a string constant that describes the date of compilation. The string constant contains eleven characters and looks like @code{@w{"Feb 12 1996"}}. If the day of the month is just one digit, an extra space precedes it so that the date is always eleven characters. If the compiler cannot determine the current date, it emits a warning message (once per compilation) and @code{__DATE__} expands to @code{@w{"??? ?? ????"}}. We deprecate the use of @code{__DATE__} for the sake of reproducible compilation. @item __TIME__ This macro expands to a string constant that describes the time of compilation. The string constant contains eight characters and looks like @code{"23:59:01"}. If the compiler cannot determine the current time, it emits a warning message (once per compilation) and @code{__TIME__} expands to @code{"??:??:??"}. We deprecate the use of @code{__TIME__} for the sake of reproducible compilation.
@item __STDC__ In normal operation, this macro expands to the constant 1, to signify that this compiler implements ISO Standard C@. @item __STDC_VERSION__ This macro expands to the C Standard's version number, a long integer constant of the form @code{@var{yyyy}@var{mm}L} where @var{yyyy} and @var{mm} are the year and month of the Standard version. This states which version of the C Standard the compiler implements. The current default value is @code{201112L}, which signifies the C 2011 standard. @item __STDC_HOSTED__ This macro is defined, with value 1, if the compiler's target is a @dfn{hosted environment}. A hosted environment provides the full facilities of the standard C library. @end table The rest of the predefined macros are GNU C extensions. @table @code @item __COUNTER__ This macro expands to sequential integral values starting from 0. In other words, each time the program uses this macro, it generates the next successive integer. This, with the @code{##} operator, provides a convenient means for macros to generate unique identifiers. @item __GNUC__ @itemx __GNUC_MINOR__ @itemx __GNUC_PATCHLEVEL__ These macros expand to the major version, minor version, and patch level of the compiler, as integer constants. For example, GCC 3.2.1 expands @code{__GNUC__} to 3, @code{__GNUC_MINOR__} to 2, and @code{__GNUC_PATCHLEVEL__} to 1. If all you need to know is whether or not your program is being compiled by GCC, or a non-GCC compiler that claims to accept the GNU C extensions, you can simply test @code{__GNUC__}. If you need to write code that depends on a specific version, you must check more carefully. Each change in the minor version resets the patch level to zero; each change in the major version (which happens rarely) resets the minor version and the patch level to zero. 
To use the predefined macros directly in the conditional, write it like this: @example /* @r{Test for GCC > 3.2.0} */ #if __GNUC__ > 3 || \ (__GNUC__ == 3 && (__GNUC_MINOR__ > 2 || \ (__GNUC_MINOR__ == 2 && \ __GNUC_PATCHLEVEL__ > 0))) @end example @noindent Another approach is to use the predefined macros to calculate a single number, then compare that against a threshold: @example #define GCC_VERSION (__GNUC__ * 10000 \ + __GNUC_MINOR__ * 100 \ + __GNUC_PATCHLEVEL__) /* @r{@dots{}} */ /* @r{Test for GCC > 3.2.0} */ #if GCC_VERSION > 30200 @end example @noindent Many people find this form easier to understand. @item __VERSION__ This macro expands to a string constant that describes the version of the compiler in use. You should not rely on its contents' having any particular form, but you can count on it to contain at least the release number. @item __TIMESTAMP__ This macro expands to a string constant that describes the date and time of the last modification of the current source file. The string constant contains abbreviated day of the week, month, day of the month, time in hh:mm:ss form, and the year, in the format @code{@w{"Sun Sep 16 01:03:52 1973"}}. If the day of the month is less than 10, it is padded with a space on the left. If GCC cannot determine that information, it emits a warning message (once per compilation) and @code{__TIMESTAMP__} expands to @code{@w{"??? ??? ?? ??:??:?? ????"}}. We deprecate the use of this macro for the sake of reproducible compilation. @end table @node Undefining and Redefining Macros @subsection Undefining and Redefining Macros @cindex undefining macros @cindex redefining macros @findex #undef You can @dfn{undefine} a macro with the @code{#undef} directive. @code{#undef} takes a single argument, the name of the macro to undefine. You use the bare macro name, even if the macro is function-like. It is an error if anything appears on the line after the macro name.
@code{#undef} has no effect if the name is not a macro. @example #define FOO 4 x = FOO; @expansion{} x = 4; #undef FOO x = FOO; @expansion{} x = FOO; @end example Once a macro has been undefined, that identifier may be @dfn{redefined} as a macro by a subsequent @code{#define} directive. The new definition need not have any resemblance to the old definition. You can define a macro again without first undefining it only if the new definition is @dfn{effectively the same} as the old one. Two macro definitions are effectively the same if: @itemize @bullet @item Both are the same type of macro (object- or function-like). @item All the tokens of the replacement list are the same. @item If there are any parameters, they are the same. @item Whitespace appears in the same places in both. It need not be exactly the same amount of whitespace, though. Remember that comments count as whitespace. @end itemize @noindent These definitions are effectively the same: @example #define FOUR (2 + 2) #define FOUR (2 + 2) #define FOUR (2 /* @r{two} */ + 2) @end example @noindent but these are not: @example #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) @end example This allows two different header files to define a common macro. You can redefine an existing macro with @code{#define}, but redefining an existing macro name with a different definition results in a warning. @node Directives Within Macro Arguments @subsection Directives Within Macro Arguments @cindex macro arguments and directives GNU C permits and handles preprocessing directives in the text provided as arguments for a macro. That case is undefined in the C standard, but in GNU C@ conditional directives in macro arguments are clear and valid. A paradoxical case is to redefine a macro within the call to that same macro.
What happens is, the new definition takes effect in time for pre-expansion of @emph{all} the arguments, then the original definition is expanded to replace the call. Here is a pathological example: @example #define f(x) x x f (first f second #undef f #define f 2 f) @end example @noindent which expands to @example first 2 second 2 first 2 second 2 @end example @noindent with the semantics described above. We suggest you avoid writing code which does this sort of thing. @node Macro Pitfalls @subsection Macro Pitfalls @cindex problems with macros @cindex pitfalls of macros In this section we describe some special rules that apply to macros and macro expansion, and point out certain cases in which the rules have counter-intuitive consequences that you must watch out for. @menu * Misnesting:: * Operator Precedence Problems:: * Swallowing the Semicolon:: * Duplication of Side Effects:: * Macros and Auto Type:: * Self-Referential Macros:: * Argument Prescan:: @end menu @node Misnesting @subsubsection Misnesting When a macro is called with arguments, the arguments are substituted into the macro body and the result is checked, together with the rest of the input file, for more macro calls. It is possible to piece together a macro call coming partially from the macro body and partially from the arguments. For example, @example #define twice(x) (2*(x)) #define call_with_1(x) x(1) call_with_1 (twice) @expansion{} twice(1) @expansion{} (2*(1)) @end example Macro definitions do not have to have balanced parentheses. By writing an unbalanced open parenthesis in a macro body, it is possible to create a macro call that begins inside the macro body but ends outside of it. 
For example, @example #define strange(file) fprintf (file, "%s %d", /* @r{@dots{}} */ strange(stderr) p, 35) @expansion{} fprintf (stderr, "%s %d", p, 35) @end example The ability to piece together a macro call can be useful, but the use of unbalanced open parentheses in a macro body is just confusing, and should be avoided. @node Operator Precedence Problems @subsubsection Operator Precedence Problems @cindex parentheses in macro bodies You may have noticed that in most of the macro definition examples shown above, each occurrence of a macro parameter name had parentheses around it. In addition, another pair of parentheses usually surrounds the entire macro definition. Here is why it is best to write macros that way. Suppose you define a macro as follows, @example #define ceil_div(x, y) (x + y - 1) / y @end example @noindent whose purpose is to divide, rounding up. (One use for this operation is to compute how many @code{int} objects are needed to hold a certain number of @code{char} objects.) Then suppose it is used as follows: @example a = ceil_div (b & c, sizeof (int)); @expansion{} a = (b & c + sizeof (int) - 1) / sizeof (int); @end example @noindent This does not do what is intended. The operator-precedence rules of C make it equivalent to this: @example a = (b & (c + sizeof (int) - 1)) / sizeof (int); @end example @noindent What we want is this: @example a = ((b & c) + sizeof (int) - 1) / sizeof (int); @end example @noindent Defining the macro as @example #define ceil_div(x, y) ((x) + (y) - 1) / (y) @end example @noindent provides the desired result. Unintended grouping can result in another way. Consider @code{sizeof ceil_div(1, 2)}. That has the appearance of a C expression that would compute the size of the type of @code{ceil_div (1, 2)}, but in fact it means something very different. Here is what it expands to: @example sizeof ((1) + (2) - 1) / (2) @end example @noindent This would take the size of an integer and divide it by two.
The precedence rules have put the division outside the @code{sizeof} when it was intended to be inside. Parentheses around the entire macro definition prevent such problems. Here, then, is the recommended way to define @code{ceil_div}: @example #define ceil_div(x, y) (((x) + (y) - 1) / (y)) @end example @node Swallowing the Semicolon @subsubsection Swallowing the Semicolon @cindex semicolons (after macro calls) Often it is desirable to define a macro that expands into a compound statement. Consider, for example, the following macro, that advances a pointer (the parameter @code{p} says where to find it) across whitespace characters: @example #define SKIP_SPACES(p, limit) \ @{ char *lim = (limit); \ while (p < lim) @{ \ if (*p++ != ' ') @{ \ p--; break; @}@}@} @end example @noindent Here backslash-newline is used to split the macro definition, which must be a single logical line, so that it resembles the way such code would be laid out if not part of a macro definition. A call to this macro might be @code{SKIP_SPACES (p, lim)}. Strictly speaking, the call expands to a compound statement, which is a complete statement with no need for a semicolon to end it. However, since it looks like a function call, it minimizes confusion if you can use it like a function call, writing a semicolon afterward, as in @code{SKIP_SPACES (p, lim);} This can cause trouble before @code{else} statements, because the semicolon is actually a null statement. Suppose you write @example if (*p != 0) SKIP_SPACES (p, lim); else /* @r{@dots{}} */ @end example @noindent The presence of two statements---the compound statement and a null statement---in between the @code{if} condition and the @code{else} makes invalid C code. The definition of the macro @code{SKIP_SPACES} can be altered to solve this problem, using a @code{do @r{@dots{}} while} statement. 
Here is how: @example #define SKIP_SPACES(p, limit) \ do @{ char *lim = (limit); \ while (p < lim) @{ \ if (*p++ != ' ') @{ \ p--; break; @}@}@} \ while (0) @end example Now @code{SKIP_SPACES (p, lim);} expands into @example do @{ /* @r{@dots{}} */ @} while (0); @end example @noindent which is one statement. The loop executes exactly once; most compilers generate no extra code for it. @node Duplication of Side Effects @subsubsection Duplication of Side Effects @cindex side effects (in macro arguments) @cindex unsafe macros Many C programs define a macro @code{min}, for ``minimum'', like this: @example #define min(X, Y) ((X) < (Y) ? (X) : (Y)) @end example When you use this macro with an argument containing a side effect, as shown here, @example next = min (x + y, foo (z)); @end example @noindent it expands as follows: @example next = ((x + y) < (foo (z)) ? (x + y) : (foo (z))); @end example @noindent where @code{x + y} has been substituted for @code{X} and @code{foo (z)} for @code{Y}. The function @code{foo} is used only once in the statement as it appears in the program, but the expression @code{foo (z)} has been substituted twice into the macro expansion. As a result, @code{foo} might be called twice when the statement is executed. If it has side effects or if it takes a long time to compute, that may be undesirable. We say that @code{min} is an @dfn{unsafe} macro. The best solution to this problem is to define @code{min} in a way that computes the value of @code{foo (z)} only once. In general, that requires using @code{__auto_type} (@pxref{Auto Type}). How to use it for this is described in the following section. @xref{Macros and Auto Type}. Otherwise, you will need to be careful when @emph{using} the macro @code{min}. For example, you can calculate the value of @code{foo (z)}, save it in a variable, and use that variable in @code{min}: @example @group #define min(X, Y) ((X) < (Y) ? 
(X) : (Y)) /* @r{@dots{}} */ @{ int tem = foo (z); next = min (x + y, tem); @} @end group @end example @noindent (where we assume that @code{foo} returns type @code{int}). When the repeated value appears as the condition of the @code{?:} operator and again as its @var{iftrue} expression, you can avoid repeated execution by omitting the @var{iftrue} expression, like this: @example #define x_or_y(X, Y) ((X) ? : (Y)) @end example @noindent In GNU C, this expands to use the first macro argument's value if that isn't zero. If that's zero, it computes the second argument and uses that value. @xref{Conditional Expression}. @node Macros and Auto Type @subsubsection Using @code{__auto_type} for Local Variables @cindex local variables in macros @cindex variables, local, in macros @cindex macros, local variables in The keyword @code{__auto_type} makes it possible to define macros that can work on any data type even though they need to generate local variable declarations. @xref{Auto Type}. For instance, here's how to define a safe ``maximum'' macro that operates on any arithmetic type and computes each of its arguments exactly once: @example #define max(a,b) \ (@{ __auto_type _a = (a); \ __auto_type _b = (b); \ _a > _b ? _a : _b; @}) @end example The @samp{(@{ @dots{} @})} notation produces a @dfn{statement expression}---a statement that can be used as an expression (@pxref{Statement Exprs}). Its value is the value of its last statement. This permits us to define local variables and store each argument value into one. @cindex underscores in variables in macros @cindex @samp{_} in variables in macros The reason for using names that start with underscores for the local variables is to avoid conflicts with variable names that occur within the expressions that are substituted for @code{a} and @code{b}. Underscore followed by a lower case letter won't be predefined by the system in any way.
@c We hope someday to extend C with a new form of declaration syntax @c which all the newly declared variables' scopes would begin at the end @c of the entire declaration, rather than as soon as each variable's @c declaration begins. This way, all the variables' initializers would @c be interpreted in the context before the declaration. Then we could @c use any names whatsoever for the local variables and always get correct @c behavior for the macro. @node Self-Referential Macros @subsubsection Self-Referential Macros @cindex self-reference A @dfn{self-referential} macro is one whose name appears in its definition. Recall that all macro definitions are rescanned for more macros to replace. If the self-reference were considered a use of the macro, it would produce an infinitely large expansion. To prevent this, the self-reference is not considered a macro call: preprocessing leaves it unchanged. Consider an example: @example #define foo (4 + foo) @end example @noindent where @code{foo} is also a variable in your program. Following the ordinary rules, each reference to @code{foo} will expand into @code{(4 + foo)}; then this will be rescanned and will expand into @code{(4 + (4 + foo))}; and so on until the computer runs out of memory. The self-reference rule cuts this process short after one step, at @code{(4 + foo)}. Therefore, this macro definition has the possibly useful effect of causing the program to add 4 to the value of @code{foo} wherever @code{foo} is referred to. In most cases, it is a bad idea to take advantage of this feature. A person reading the program who sees that @code{foo} is a variable will not expect that it is a macro as well. The reader will come across the identifier @code{foo} in the program and think its value should be that of the variable @code{foo}, whereas in fact the value is four greater. It is useful to make a macro definition that expands to the macro name itself. 
If you write @example #define EPERM EPERM @end example @noindent then the macro @code{EPERM} expands to @code{EPERM}. Effectively, preprocessing leaves it unchanged in the source code. You can tell that it's a macro with @code{#ifdef}. You might do this if you want to define numeric constants with an @code{enum}, but have @code{#ifdef} be true for each constant. If a macro @code{x} expands to use a macro @code{y}, and the expansion of @code{y} refers to the macro @code{x}, that is an @dfn{indirect self-reference} of @code{x}. @code{x} is not expanded in this case either. Thus, if we have @example #define x (4 + y) #define y (2 * x) @end example @noindent then @code{x} and @code{y} expand as follows: @example @group x @expansion{} (4 + y) @expansion{} (4 + (2 * x)) y @expansion{} (2 * x) @expansion{} (2 * (4 + y)) @end group @end example @noindent Each macro is expanded when it appears in the definition of the other macro, but not when it indirectly appears in its own definition. @node Argument Prescan @subsubsection Argument Prescan @cindex expansion of arguments @cindex macro argument expansion @cindex prescan of macro arguments Macro arguments are completely macro-expanded before they are substituted into a macro body, unless they are stringified or pasted with other tokens. After substitution, the entire macro body, including the substituted arguments, is scanned again for macros to be expanded. The result is that the arguments are scanned @emph{twice} to expand macro calls in them. Most of the time, this has no effect. If the argument contained any macro calls, they were expanded during the first scan. The result therefore contains no macro calls, so the second scan does not change it. If the argument were substituted as given, with no prescan, the single remaining scan would find the same macro calls and produce the same results. 
You might expect the double scan to change the results when a self-referential macro is used in an argument of another macro (@pxref{Self-Referential Macros}): the self-referential macro would be expanded once in the first scan, and a second time in the second scan. However, this is not what happens. The self-references that do not expand in the first scan are marked so that they will not expand in the second scan either. You might wonder, ``Why mention the prescan, if it makes no difference? And why not skip it and make preprocessing go faster?'' The answer is that the prescan does make a difference in three special cases: @itemize @bullet @item Nested calls to a macro. We say that @dfn{nested} calls to a macro occur when a macro's argument contains a call to that very macro. For example, if @code{f} is a macro that expects one argument, @code{f (f (1))} is a nested pair of calls to @code{f}. The desired expansion is made by expanding @code{f (1)} and substituting that into the definition of @code{f}. The prescan causes the expected result to happen. Without the prescan, @code{f (1)} itself would be substituted as an argument, and the inner use of @code{f} would appear during the main scan as an indirect self-reference and would not be expanded. @item Macros that call other macros that stringify or concatenate. If an argument is stringified or concatenated, the prescan does not occur. If you @emph{want} to expand a macro, then stringify or concatenate its expansion, you can do that by causing one macro to call another macro that does the stringification or concatenation. For instance, if you have @example #define AFTERX(x) X_ ## x #define XAFTERX(x) AFTERX(x) #define TABLESIZE 1024 #define BUFSIZE TABLESIZE @end example @noindent then @code{AFTERX(BUFSIZE)} expands to @code{X_BUFSIZE}, and @code{XAFTERX(BUFSIZE)} expands to @code{X_1024}. (Not to @code{X_TABLESIZE}. Prescan always does a complete expansion.) 
@item Macros used in arguments, whose expansions contain unshielded commas. This can cause a macro expanded on the second scan to be called with the wrong number of arguments. Here is an example: @example #define foo a,b #define bar(x) lose(x) #define lose(x) (1 + (x)) @end example We would like @code{bar(foo)} to turn into @code{(1 + (foo))}, which would then turn into @code{(1 + (a,b))}. Instead, @code{bar(foo)} expands into @code{lose(a,b)}, which gives an error because @code{lose} requires a single argument. In this case, the problem is easily solved by the same parentheses that ought to be used to prevent misnesting of arithmetic operations: @example #define foo (a,b) @exdent or #define bar(x) lose((x)) @end example The extra pair of parentheses prevents the comma in @code{foo}'s definition from being interpreted as an argument separator. @end itemize @ignore @c This is commented out because pragmas are not supposed @c to alter the meaning of the program. @c Microsoft did something stupid in defining these. @node Macro Pragmas @subsection Macro Pragmas A pragma is a way of specifying special directions to the C compiler. @xref{Pragmas}, for the basic syntax of pragmas. Here we describe two pragmas that save the current definition of a macro on a stack, and restore it later. This makes it possible to redefine a macro temporarily and later go back to the previous definition. @table @code @item #pragma push_macro (@var{macro_name}) @itemx _Pragma ("push_macro (@var{macro_name})") The @samp{push_macro} pragma saves the current macro definition of @var{macro_name} on the macro definition stack. @item #pragma pop_macro (@var{macro_name}) @itemx _Pragma ("pop_macro (@var{macro_name})") The @samp{pop_macro} pragma pops a saved macro definition off the macro definition stack and defines @var{macro_name} with that definition. @end table Each macro name has a separate stack, and @samp{pop_macro} when the stack is empty has no effect. 
Here's an example of using these two pragmas to override temporarily
the definition of @code{FOO}.

@example
#define FOO 42

/* @r{Do something with @var{FOO} defined as 42...} */

_Pragma ("push_macro (\"FOO\")")
#undef FOO
#define FOO 47

/* @r{Do something with @var{FOO} defined as 47...} */

_Pragma ("pop_macro (\"FOO\")")

/* @r{@var{FOO} is now restored}
   @r{to its previous definition of 42.} */
@end example
@end ignore

@node Conditionals
@section Conditionals
@cindex conditionals

A @dfn{conditional} is a preprocessing directive that controls whether
or not to include a chunk of code in the final token stream that is
compiled. Preprocessing conditionals can test arithmetic expressions,
or whether a name is defined as a macro, or both together using the
special @code{defined} operator.

A preprocessing conditional in C resembles in some ways an @code{if}
statement in C, but it is important to understand the difference between
them. The condition in an @code{if} statement is tested during the
execution of your program. Its purpose is to allow your program to
behave differently from run to run, depending on the data it is
operating on. The condition in a preprocessing conditional directive is
tested when your program is compiled. Its purpose is to allow different
code to be included in the program depending on the situation at the
time of compilation.

Sometimes this distinction makes no practical difference. GCC and
other modern compilers often do test @code{if} statements when a program
is compiled, if their conditions are known not to vary at run time, and
eliminate code that can never be executed. If you can count on your
compiler to do this, you may find that your program is more readable if
you use @code{if} statements with constant conditions (perhaps
determined by macros).
Of course, you can only use this to exclude code, not type definitions or other preprocessing directives, and you can only do it if the file remains syntactically valid when that code is not used. @menu * Conditional Uses:: * Conditional Syntax:: * Deleted Code:: @end menu @node Conditional Uses @subsection Uses of Conditional Directives There are three usual reasons to use a preprocessing conditional. @itemize @bullet @item A program may need to use different code depending on the machine or operating system it is to run on. In some cases the code for one operating system may be erroneous on another operating system; for example, it might refer to data types or constants that do not exist on the other system. When this happens, it is not enough to avoid executing the invalid code. Its mere presence will cause the compiler to reject the program. With a preprocessing conditional, the offending code can be effectively excised from the program when it is not valid. @item You may want to be able to compile the same source file into two different programs. One version might make frequent time-consuming consistency checks on its intermediate data, or print the values of those data for debugging, and the other not. @item A conditional whose condition is always false is one way to exclude code from the program but keep it as a sort of comment for future reference. @end itemize Simple programs that do not need system-specific logic or complex debugging hooks generally will not need to use preprocessing conditionals. @node Conditional Syntax @subsection Syntax of Preprocessing Conditionals @findex #if A preprocessing conditional begins with a @dfn{conditional directive}: @code{#if}, @code{#ifdef} or @code{#ifndef}. 
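
The three conditional directives can be sketched side by side as
follows. This is only a minimal illustration; the macros
@code{HAVE_FAST_PATH} and @code{CHECK_LEVEL} and the function names are
hypothetical and are defined here just so the example is self-contained.

```c
/* Hypothetical feature macros, defined here only for illustration.
   In a real program they might come from -D options or a config header.  */
#define HAVE_FAST_PATH
#define CHECK_LEVEL 2

/* #ifdef succeeds because HAVE_FAST_PATH is defined.  */
int
uses_fast_path (void)
{
  int result = 0;
#ifdef HAVE_FAST_PATH
  result = 1;
#endif /* HAVE_FAST_PATH */
  return result;
}

/* #ifndef fails for the same reason, so the assignment is skipped.  */
int
uses_slow_path (void)
{
  int result = 0;
#ifndef HAVE_FAST_PATH
  result = 1;
#endif /* not HAVE_FAST_PATH */
  return result;
}

/* #if tests the value of an expression, not mere definedness.  */
int
check_level_is_high (void)
{
  int result = 0;
#if CHECK_LEVEL >= 2
  result = 1;
#endif /* CHECK_LEVEL >= 2 */
  return result;
}
```

Each function keeps a single @code{return} outside the conditional so
that the code stays valid whichever way the conditional goes.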
@menu * ifdef:: * if:: * defined:: * else:: * elif:: @end menu @node ifdef @subsubsection The @code{#ifdef} directive @findex #ifdef @findex #endif The simplest sort of conditional is @example @group #ifdef @var{MACRO} @var{controlled text} #endif /* @var{MACRO} */ @end group @end example @cindex conditional group This block is called a @dfn{conditional group}. The body, @var{controlled text}, will be included in compilation if and only if @var{MACRO} is defined. We say that the conditional @dfn{succeeds} if @var{MACRO} is defined, @dfn{fails} if it is not. The @var{controlled text} inside a conditional can include preprocessing directives. They are executed only if the conditional succeeds. You can nest conditional groups inside other conditional groups, but they must be completely nested. In other words, @code{#endif} always matches the nearest @code{#ifdef} (or @code{#ifndef}, or @code{#if}). Also, you cannot start a conditional group in one file and end it in another. Even if a conditional fails, the @var{controlled text} inside it is still run through initial transformations and tokenization. Therefore, it must all be lexically valid C@. Normally the only way this matters is that all comments and string literals inside a failing conditional group must still be properly ended. The comment following the @code{#endif} is not required, but it is a good practice if there is a lot of @var{controlled text}, because it helps people match the @code{#endif} to the corresponding @code{#ifdef}. Older programs sometimes put @var{macro} directly after the @code{#endif} without enclosing it in a comment. This is invalid code according to the C standard, but it only causes a warning in GNU C@. It never affects which @code{#ifndef} the @code{#endif} matches. @findex #ifndef Sometimes you wish to use some code if a macro is @emph{not} defined. You can do this by writing @code{#ifndef} instead of @code{#ifdef}. 
One common use of @code{#ifndef} is to include code only the first time a header file is included. @xref{Once-Only Headers}. Macro definitions can vary between compilations for several reasons. Here are some samples. @itemize @bullet @item Some macros are predefined on each kind of machine (@pxref{System-specific Predefined Macros, System-specific Predefined Macros, System-specific Predefined Macros, gcc, Using the GNU Compiler Collection}). This allows you to provide code specially tuned for a particular machine. @item System header files define more macros, associated with the features they implement. You can test these macros with conditionals to avoid using a system feature on a machine where it is not implemented. @item Macros can be defined or undefined with the @option{-D} and @option{-U} command-line options when you compile the program. You can arrange to compile the same source file into two different programs by choosing a macro name to specify which program you want, writing conditionals to test whether or how this macro is defined, and then controlling the state of the macro with command-line options, perhaps set in the file @file{Makefile}. @xref{Invocation, Invoking GCC, Invoking GCC, gcc, Using the GNU Compiler Collection}. @item Your program might have a special header file (often called @file{config.h}) that is adjusted when the program is compiled. It can define or not define macros depending on the features of the system and the desired capabilities of the program. The adjustment can be automated by a tool such as @command{autoconf}, or done by hand. @end itemize @node if @subsubsection The @code{#if} directive The @code{#if} directive allows you to test the value of an integer arithmetic expression, rather than the mere existence of one macro. 
Its syntax is

@example
@group
#if @var{expression}

@var{controlled text}

#endif /* @var{expression} */
@end group
@end example

@var{expression} is a C expression of integer type, subject to
stringent restrictions so its value can be computed at compile time.
It may contain

@itemize @bullet
@item
Integer constants.

@item
Character constants, which are interpreted as they would be in normal
code.

@item
Arithmetic operators for addition, subtraction, multiplication,
division, bitwise operations, shifts, comparisons, and logical
operations (@code{&&} and @code{||}). The latter two obey the usual
short-circuiting rules of standard C@.

@item
Macros. All macros in the expression are expanded before actual
computation of the expression's value begins.

@item
Uses of the @code{defined} operator, which lets you check whether macros
are defined in the middle of an @code{#if}.

@item
Identifiers that are not macros, which are all considered to be the
number zero. This allows you to write @code{@w{#if MACRO}} instead of
@code{@w{#ifdef MACRO}}, if you know that @code{MACRO}, when defined,
will always have a nonzero value. Function-like macros used without
their function call parentheses are also treated as zero.

In some contexts this shortcut is undesirable. The @option{-Wundef}
option requests warnings for any identifier in an @code{#if} that is
not defined as a macro.
@end itemize

Preprocessing does not know anything about the data types of C@.
Therefore, @code{sizeof} operators are not recognized in @code{#if};
@code{sizeof} is simply an identifier, and if it is not a macro, it
stands for zero. This is likely to make the expression invalid.
Preprocessing does not recognize @code{enum} constants; they too are
simply identifiers, so if they are not macros, they stand for zero.

Preprocessing calculates the value of @var{expression}, and carries
out all calculations in the widest integer type known to the compiler;
on most machines supported by GNU C this is 64 bits.
This is not the same rule as the compiler uses to calculate the value of a constant expression, and may give different results in some cases. If the value comes out to be nonzero, the @code{#if} succeeds and the @var{controlled text} is compiled; otherwise it is skipped. @node defined @subsubsection The @code{defined} test @cindex @code{defined} The special operator @code{defined} is used in @code{#if} and @code{#elif} expressions to test whether a certain name is defined as a macro. @code{defined @var{name}} and @code{defined (@var{name})} are both expressions whose value is 1 if @var{name} is defined as a macro at the current point in the program, and 0 otherwise. Thus, @code{@w{#if defined MACRO}} is precisely equivalent to @code{@w{#ifdef MACRO}}. @code{defined} is useful when you wish to test more than one macro for existence at once. For example, @example #if defined (__arm__) || defined (__PPC__) @end example @noindent would succeed if either of the names @code{__arm__} or @code{__PPC__} is defined as a macro---in other words, when compiling for ARM processors or PowerPC processors. Conditionals written like this: @example #if defined BUFSIZE && BUFSIZE >= 1024 @end example @noindent can generally be simplified to just @code{@w{#if BUFSIZE >= 1024}}, since if @code{BUFSIZE} is not defined, it will be interpreted as having the value zero. In GCC, you can include @code{defined} as part of another macro definition, like this: @example #define MACRO_DEFINED(X) defined X #if MACRO_DEFINED(BUFSIZE) @end example @noindent which would expand the @code{#if} expression to: @example #if defined BUFSIZE @end example @noindent Generating @code{defined} in this way is a GNU C extension. @node else @subsubsection The @code{#else} directive @findex #else The @code{#else} directive can be added to a conditional to provide alternative text to be used if the condition fails. 
This is what it looks like:

@example
@group
#if @var{expression}
@var{text-if-true}
#else /* Not @var{expression} */
@var{text-if-false}
#endif /* Not @var{expression} */
@end group
@end example

@noindent
If @var{expression} is nonzero, the @var{text-if-true} is included and
the @var{text-if-false} is skipped. If @var{expression} is zero, the
opposite happens.

You can use @code{#else} with @code{#ifdef} and @code{#ifndef}, too.

@node elif
@subsubsection The @code{#elif} directive
@findex #elif

A common use of nested conditionals is to check for more than two
possible alternatives. For example, you might have

@example
#if X == 1
/* @r{@dots{}} */
#else /* X != 1 */
#if X == 2
/* @r{@dots{}} */
#else /* X != 2 */
/* @r{@dots{}} */
#endif /* X != 2 */
#endif /* X != 1 */
@end example

Another conditional directive, @code{#elif}, allows this to be
abbreviated as follows:

@example
#if X == 1
/* @r{@dots{}} */
#elif X == 2
/* @r{@dots{}} */
#else /* X != 2 and X != 1 */
/* @r{@dots{}} */
#endif /* X != 2 and X != 1 */
@end example

@code{#elif} stands for ``else if''. Like @code{#else}, it goes in the
middle of a conditional group and subdivides it; it does not require a
matching @code{#endif} of its own. Like @code{#if}, the @code{#elif}
directive includes an expression to be tested. The text following the
@code{#elif} is processed only if the original @code{#if}-condition
failed and the @code{#elif} condition succeeds.

More than one @code{#elif} can go in the same conditional group. Then
the text after each @code{#elif} is processed only if the @code{#elif}
condition succeeds after the original @code{#if} and all previous
@code{#elif} directives within it have failed.

@code{#else} is allowed after any number of @code{#elif} directives, but
@code{#elif} may not follow @code{#else}.
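
A chain of @code{#if}, @code{#elif} and @code{#else} might be used
along the following lines. This is a sketch, not a recommended
interface: the macro @code{WORD_BITS} and the function
@code{bytes_per_word} are hypothetical names invented for the example.

```c
/* WORD_BITS is a hypothetical macro.  The #ifndef guard supplies a
   default while still letting a command-line -D option override it.  */
#ifndef WORD_BITS
#define WORD_BITS 32
#endif /* not WORD_BITS */

/* Exactly one of the return statements below survives preprocessing.  */
int
bytes_per_word (void)
{
#if WORD_BITS == 16
  return 2;
#elif WORD_BITS == 32
  return 4;
#elif WORD_BITS == 64
  return 8;
#else /* none of the sizes above */
  return 0;
#endif /* WORD_BITS */
}
```

With the default value of 32, only the @code{#elif WORD_BITS == 32}
group is compiled, so the function returns 4.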
@node Deleted Code
@subsection Deleted Code
@cindex commenting out code

If you replace or delete a part of the program but want to keep the
old code in the file for future reference, commenting it out is not so
straightforward in C@. Block comments do not nest, so the first comment
inside the old code will end the commenting-out. The probable result is
a flood of syntax errors.

One way to avoid this problem is to use an always-false conditional
instead. For instance, put @code{#if 0} before the deleted code and
@code{#endif} after it. This works even if the code being turned
off contains conditionals, but they must be entire conditionals
(balanced @code{#if} and @code{#endif}).

Some people use @code{#ifdef notdef} instead. This is risky, because
@code{notdef} might be accidentally defined as a macro, and then the
conditional would succeed. @code{#if 0} can be counted on to fail.

Do not use @code{#if 0} around text that is not C code. Use a real
comment, instead. The interior of @code{#if 0} must consist of complete
tokens; in particular, single-quote characters must balance. Comments
often contain unbalanced single-quote characters (known in English as
apostrophes). These confuse @code{#if 0}. They don't confuse
@samp{/*}.

@node Diagnostics
@section Diagnostics
@cindex diagnostic
@cindex reporting errors
@cindex reporting warnings

@findex #error
The directive @code{#error} reports a fatal error. The
tokens forming the rest of the line following @code{#error} are used
as the error message.

The usual place to use @code{#error} is inside a conditional that
detects a combination of parameters that you know the program does not
properly support. For example,

@smallexample
#if !defined(UNALIGNED_INT_ASM_OP) && defined(DWARF2_DEBUGGING_INFO)
#error "DWARF2_DEBUGGING_INFO requires UNALIGNED_INT_ASM_OP."
#endif
@end smallexample

@findex #warning
The directive @code{#warning} is like @code{#error}, but it reports a
warning instead of an error.
The tokens following @code{#warning} are used as the warning message.

You might use @code{#warning} in obsolete header files, with a message
saying which header file to use instead.

Neither @code{#error} nor @code{#warning} macro-expands its argument.
Internal whitespace sequences are each replaced with a single space.
The line must consist of complete tokens. It is wisest to make the
argument of these directives be a single string constant; this avoids
problems with apostrophes and the like.

@node Line Control
@section Line Control
@cindex line control

Due to C's widespread availability and low-level nature, it is often
used as the target language for translation of other languages, or for
the output of lexical analyzers and parsers (e.g., lex/flex and
yacc/bison). Line control enables the user to track diagnostics back to
the location in the original language.

The C compiler knows the location in the source file where each token
came from: file name, starting line and column, and final line and
column. (Column numbers are used only for error messages.)

When a program generates C source code, as the Bison parser generator
does, often it copies some of that C code from another file. For
instance, parts of the output from Bison are generated from scratch or
come from a standard parser file, but Bison copies the rest from Bison's
input file. Errors in that code, at compile time or run time, should
refer to that file, which is the real source code. To make that happen,
Bison generates line-control directives that the C compiler understands.

@findex #line
@code{#line} is a directive that specifies the original line number
and source file name for subsequent code. @code{#line} has three
variants:

@table @code
@item #line @var{linenum}
@var{linenum} is a non-negative decimal integer constant. It specifies
the line number that should be reported for the following line of
input. Subsequent lines are counted from @var{linenum}.
@item #line @var{linenum} @var{filename} @var{linenum} is the same as for the first form, and has the same effect. In addition, @var{filename} is a string constant that specifies the source file name. Subsequent source lines are recorded as coming from that file, until something else happens to change that. @var{filename} is interpreted according to the normal rules for a string constant. Backslash escapes are interpreted, in contrast to @code{#include}. @item #line @var{anything else} @var{anything else} is checked for macro calls, which are expanded. The result should match one of the above two forms. @end table @code{#line} directives alter the results of the @code{__FILE__} and @code{__LINE__} symbols from that point on. @xref{Predefined Macros}. @node Null Directive @section Null Directive @cindex null directive The @dfn{null directive} consists of a @code{#} followed by a newline, with only whitespace and comments in between. It has no effect on the output of the compiler.
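
To make the effect of line control and of the null directive concrete,
here is a small sketch. The file name @file{grammar.y} and the function
names are hypothetical; a real generator such as Bison would emit the
@code{#line} directive itself.

```c
#   /* A null directive: a lone '#' with only a comment after it.  */

int
reported_line (void)
{
  /* The directive below says that the NEXT line is line 100 of
     grammar.y (a hypothetical file name).  */
#line 100 "grammar.y"
  return __LINE__;
}

const char *
reported_file (void)
{
  /* Everything after the #line directive is attributed to grammar.y.  */
  return __FILE__;
}
```

The null directive has no effect, whereas the @code{#line} directive
makes @code{reported_line} return 100 and @code{reported_file} return
@code{"grammar.y"}, regardless of where the code really lives.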