The C compiler's facility for defining macros, typically implemented in the preprocessor, cpp. (It is therefore also the C++ preprocessor's, along with a gazillion other little languages' and maybe a few more programming languages', but we won't discuss them here.) #define sets up a macro: an identifier that is replaced by a particular expansion wherever it appears.

#define can take 3 forms (the first 2 below are really the same, but usually differ in intent, so I separate them). In each, macro names follow the same lexical rules as C identifiers. Note, however, that macros and identifiers are very different things.

#define MACRO
MACRO will expand as (i.e. be replaced by) nothing (but you will be able to test for it being defined).
#define MACRO EXPANSION
MACRO will expand as "EXPANSION". EXPANSION itself will also be subject to macro expansion, except for occurrences of MACRO there, which will be left alone.
#define MACRO(arg1,...,argn) EXPANSION
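"MACRO(x1,...,xn)" will be expanded as "EXPANSION", with each argi replaced by its corresponding xi. 0 arguments may be supplied (i.e. n above may be 0). Only occurrences of MACRO followed by a parenthesized list of n arguments are expanded by this rule: a bare MACRO with no parentheses is left alone, and supplying the wrong number of arguments is an error.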

No whitespace is allowed between MACRO and the opening parenthesis in the #define line (whitespace is fine where the macro is used). This allows you to define a macro which expands as a parenthesized string according to the form without arguments above.
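A minimal sketch of the three forms and of the whitespace rule. Every macro and function name below is made up for illustration.

```c
#include <assert.h>

#define FEATURE_ENABLED                 /* form 1: expands to nothing */
#define ANSWER 42                       /* form 2: expands to 42 */
#define SQUARE(x) ((x) * (x))           /* form 3: takes one argument */
#define NO_ARGS() 7                     /* form 3 with n == 0 */
#define PAREN (1 + 2)                   /* space before '(': form 2, not form 3 */

static int forms_demo(void)
{
    int total = 0;
#ifdef FEATURE_ENABLED                  /* defined, even though it expands to nothing */
    total += 1;
#endif
    total += ANSWER FEATURE_ENABLED;    /* same as: total += 42; */
    assert(SQUARE(3) == 9);
    assert(NO_ARGS() == 7);
    assert(PAREN * 2 == 6);             /* (1 + 2) * 2 */
    return total;
}
```

Had PAREN been written as "#define PAREN(1 + 2)", the preprocessor would have tried to read "1 + 2" as a parameter list and rejected the definition.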

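Macros with and without arguments occupy the same namespace. Redefining a macro with a different expansion, without an intervening #undef, is something ISO C requires the compiler to complain about; in practice many compilers merely warn and let the last definition take effect.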

A line undergoes macro expansion multiple times, until no new macros appearing in it can be expanded. At each expansion, the preprocessor will select one macro eligible for expansion, and expand all its occurrences in the text of the "line". That macro is then no longer eligible for further expansions in the line. So

#define	X	X+Y
#define	Y	Y+X
X;
Y;
expands to
X+Y+X;
Y+X+Y;
(or a lexical equivalent). Each macro gets expanded only once.

"It is easy to see" that these macro expansion rules guarantee the expansion process must terminate, and the end result is independent of the order of expanding the different macros. (Ponder the expansion of "X+Y" for the significance of the last statement...)
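The X and Y expansions above can be checked with a small sketch. STR, XSTR and strip_spaces are helpers invented here, not part of the original example; spaces are stripped before comparing because the preprocessor is only obliged to produce a lexical equivalent.

```c
#include <assert.h>
#include <string.h>

#define X X+Y
#define Y Y+X

#define STR(s)  #s          /* stringize the argument as written */
#define XSTR(s) STR(s)      /* expand the argument first, then stringize */

/* Copy src to dst, dropping spaces, so "X + Y + X" and "X+Y+X" compare equal. */
static void strip_spaces(char *dst, const char *src)
{
    for (; *src; src++)
        if (*src != ' ')
            *dst++ = *src;
    *dst = '\0';
}

static void expansion_demo(void)
{
    char buf[32];

    strip_spaces(buf, XSTR(X));
    assert(strcmp(buf, "X+Y+X") == 0);

    strip_spaces(buf, XSTR(Y));
    assert(strcmp(buf, "Y+X+Y") == 0);
}
```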

The preprocessor also offers the "#x" and "##" operators for use within macros: "#x" turns macro argument x into a string literal, and "##" pastes two adjacent tokens into one. They are the only way to perform these text transformations on macro arguments.
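A short sketch of both operators. STRINGIZE, CONCAT and DECLARE_COUNTER are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <string.h>

#define STRINGIZE(x)  #x            /* "#x": turn the argument into a string literal */
#define CONCAT(a, b)  a##b          /* "##": paste two tokens into one */

/* Hypothetical helper: declare a counter variable named count_<name>. */
#define DECLARE_COUNTER(name) static int CONCAT(count_, name) = 0

DECLARE_COUNTER(errors);            /* static int count_errors = 0; */

static void operators_demo(void)
{
    count_errors += 2;
    assert(count_errors == 2);
    assert(strcmp(STRINGIZE(count_errors), "count_errors") == 0);
    assert(CONCAT(12, 34) == 1234); /* pasting two numbers gives one number token */
}
```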

Additionally, the preprocessor can test whether a macro is defined by using the "defined(MACRO)" expression or the "#ifdef MACRO" and "#ifndef MACRO" commands.
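The tests can be sketched as follows. HAVE_FAST_PATH and HAVE_SLOW_PATH are hypothetical feature macros; only the first is defined.

```c
#include <assert.h>

#define HAVE_FAST_PATH              /* hypothetical feature macro */
/* HAVE_SLOW_PATH is deliberately left undefined */

static int pick_path(void)
{
#if defined(HAVE_FAST_PATH) && !defined(HAVE_SLOW_PATH)
    return 1;                       /* compiled: only the fast path is defined */
#elif defined(HAVE_SLOW_PATH)
    return 2;
#else
    return 0;
#endif
}

static int has_slow_path(void)
{
#ifndef HAVE_SLOW_PATH
    return 0;                       /* compiled: the macro is not defined */
#else
    return 1;
#endif
}

static void defined_demo(void)
{
    assert(pick_path() == 1);
    assert(has_slow_path() == 0);
}
```

"defined(MACRO)" is the form to use when several tests must be combined in one #if; #ifdef and #ifndef can only test a single macro.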

As an extension that is beyond the scope of the ISO C standard and other standard documents, many compilers will also let you define a macro on the command line using syntaxes like "-Dmacro", "-Dmacro=expansion" or "-Dmacro(args)=expansion", corresponding to the three forms above (although compilers such as GCC define a bare "-Dmacro" as 1 rather than as nothing). This is very useful for changing how a program is compiled without modifying a single file. Many configuration management systems employ this method to pass particular hints into code.
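A common way to cooperate with command-line definitions is to supply a default only when none was given. This is a sketch: BUFFER_SIZE and VERBOSE are hypothetical names, and the "cc" command line in the comment assumes a compiler that accepts -D.

```c
#include <assert.h>

/* Compile with, e.g.,
 *   cc -DBUFFER_SIZE=4096 -DVERBOSE file.c
 * to override the defaults below without editing the file. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256             /* default when no -DBUFFER_SIZE=... is given */
#endif

static char buffer[BUFFER_SIZE];

static int verbose_enabled(void)
{
#ifdef VERBOSE
    return 1;
#else
    return 0;
#endif
}

static void config_demo(void)
{
    assert(sizeof buffer == BUFFER_SIZE);
}
```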

Annoyances and/or features

#define provides only text substitution, albeit in a form that is very convenient for programming in a language like C. The EXPANSION part is treated in a lexically sensitive manner (i.e. whitespace is kept where needed), but the treatment is insensitive to all syntax issues.

This is a feature when used for good; see examples 3 and 4 below. On the other hand, it will potentially trample all over syntactic issues such as precedence, if/else nesting, and anything else where it can cause damage. Conventions 2 and (especially) 3 below are used precisely to mitigate this issue.
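The precedence trap is easy to reproduce. In this sketch BAD_DOUBLE and GOOD_DOUBLE are hypothetical macros that differ only in their parentheses.

```c
#include <assert.h>

#define BAD_DOUBLE(x)   x + x           /* no parentheses: pure text substitution */
#define GOOD_DOUBLE(x)  ((x) + (x))     /* parenthesize arguments and the whole body */

static void precedence_demo(void)
{
    assert(BAD_DOUBLE(3) * 2 == 9);     /* expands to 3 + 3 * 2 */
    assert(GOOD_DOUBLE(3) * 2 == 12);   /* expands to ((3) + (3)) * 2 */
}
```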

An issue which cannot be fixed is that of multiple evaluation of arguments. Suppose the standard library were to define (incorrectly!) isupper on an ASCII-only machine as

#define isupper(x)   ('A' <= (x) && (x) <= 'Z')
Everything's nicely wrapped up in parentheses, so issues of precedence have been properly dealt with. Unfortunately, this code
int c = 'Z';
int lowercase_Z = isupper(c++);
will expand to an expression that includes "c++" twice. There is a sequence point between the two increments (at the &&), so the result is not undefined; it just happens to be the wrong result: c is incremented twice, and the second comparison sees the character after 'Z'.

The moral is to be wary of side effects in macro arguments, to create macros (where possible) in a way which won't be affected by them (isupper does in fact guarantee single evaluation), and to document everything properly.
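The difference between a double-evaluating macro and a function can be sketched like this. IS_UPPER_MACRO and is_upper_func are hypothetical names, chosen so as not to clash with the real <ctype.h>.

```c
#include <assert.h>

/* Hypothetical ASCII-only macro; evaluates x twice. */
#define IS_UPPER_MACRO(x)  ('A' <= (x) && (x) <= 'Z')

/* Function version: the argument is evaluated exactly once, at the call. */
static int is_upper_func(int ch)
{
    return 'A' <= ch && ch <= 'Z';
}

static void side_effect_demo(void)
{
    int c = 'Z';
    int r = IS_UPPER_MACRO(c++);   /* ('A' <= (c++) && (c++) <= 'Z') */
    assert(r == 0);                /* the second test sees 'Z' + 1 */
    assert(c == 'Z' + 2);          /* incremented twice */

    c = 'Z';
    r = is_upper_func(c++);
    assert(r == 1);
    assert(c == 'Z' + 1);          /* incremented once */
}
```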

Conventions

Several conventions are popular among the C crowd. Naturally, none of them are in any way enforced. (We're talking about a language that doesn't even enforce what's obviously wrong; you want it to enforce some bozo's ideas of good taste?)

  1. Macro names should be all upper case. This is almost always followed for constants, but sometimes flouted when a macro is being used to replace a function.

    The standard libraries (as well as POSIX.1) usually adhere to this convention, although <ctype.h> seems usually to ignore it -- tolower and friends are (also) macros (but they do act exactly like functions...). Also, C++ defines a macro __cplusplus which is all lower case (even if in both languages' reserved name space).

    This is actually a nice finesse: Give a macro a lower case name only if it can be called exactly like a function. Every arg must be evaluated exactly once, and the macro must expand to a single expression.

  2. A statement-like macro should be written so that the caller supplies the terminating semicolon, just as after a function call. The do { ... } while(0) trick (note the lack of a terminating semicolon in the macro itself!) is particularly handy for this convention, but see also the if (...) ; else trick below.

    There's no syntactic obligation to do this. It just makes writing syntactically correct code easier, by preventing some of the more common traps.

  3. The seemingly useless parentheses in #define macros (which see) are, again, often highly necessary.
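Convention 2 can be illustrated with a sketch. SWAP_BAD, SWAP_INT and sort_pair are hypothetical; the point is only how each macro behaves in front of an else.

```c
#include <assert.h>

/* Hypothetical two-statement macro, written two ways. */
#define SWAP_BAD(a, b)   { int tmp_ = (a); (a) = (b); (b) = tmp_; }
#define SWAP_INT(a, b)   do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

static void sort_pair(int *lo, int *hi, int *swaps)
{
    /* "SWAP_BAD(*lo, *hi); else" would not compile here: the semicolon
     * after the closing brace ends the if statement before the else. */
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        return;                 /* already ordered */
    (*swaps)++;
}

static void semicolon_demo(void)
{
    int a = 5, b = 2, n = 0;

    sort_pair(&a, &b, &n);
    assert(a == 2 && b == 5 && n == 1);

    sort_pair(&a, &b, &n);      /* second call takes the else branch */
    assert(n == 1);
}
```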

Examples

  1. For conditional compilation, depending on (say) platform or other configuration details.

    /* (Maybe) in some header file ... */
    #define HAVE_USLEEP
    /* ... Later on, in real code ... */
    #ifndef HAVE_USLEEP
    /* Platform doesn't have usleep, define it using select */
    void usleep(int usec) { /* ... */ }
    #endif
    
    This popular use of macros makes it easy to write minor source code transformations, depending on the exact "flavour" of program required. Done right, it's a key towards writing reasonably portable code that has to interface with different operating systems or user expectations.

  2. For inlining useful constants.

    /* ... in math.h ... */
    # define M_PI		3.14159265358979323846	/* pi */
    

  3. Instead of inlined functions, or when an inlined function can't hack it.

    #define STRCMP(x,op,y) (strcmp((x),(y)) op 0)
    /* ... */
    /* Compare strings lexicographically... */
    int str_comes_before(const char * a, const char * b)
    {
      return STRCMP(a, <, b);
    }
    int str_equals(const char * a, const char * b)
    {
      return STRCMP(a, ==, b);
    }
    

  4. There are no variadic macros in C, at least not before ISO C99, but sometimes you can get close.

    extern int do_print_debugs;    /* Will be set to "1" somehow */
    extern int debug_printf(const char *fmt, ...);
    #define DEBUG_PRINTF if (!do_print_debugs) ; else debug_printf
    /* ... */
    int x=2;
    DEBUG_PRINTF("%d+%d==5", x,x);
    

    This example is typical of real-life code. You want to prevent any function call when !do_print_debugs. You want to use the same debug facility (the above is a tiny example, a full facility is more complex) from several compilation units. And you happen to know that your compiler (or one of them, for a typical project...) cannot inline functions across compilation units.

    What the above does is to do the test outside the function call. debug_printf doesn't even get called when it shouldn't; the result can be a considerably more efficient debug print facility.

    Of course, if you're using a C99 compiler (and, presumably, it's 2038 and the world didn't end), you do have variadic macros in C.
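For comparison, here is a sketch of the same facility written with a C99 variadic macro. The debug_printf body and the debug_calls counter are stand-ins invented for this illustration, and the flag is made static only to keep the example in one file.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int do_print_debugs = 0;
static int debug_calls = 0;         /* hypothetical counter, for illustration only */

/* Stand-in for the real debug_printf: formats to stderr and counts calls. */
static int debug_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    debug_calls++;
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}

/* C99 variadic macro: __VA_ARGS__ stands for everything passed in the "...". */
#define DEBUG_PRINTF(...) \
    do { if (do_print_debugs) debug_printf(__VA_ARGS__); } while (0)

static void variadic_demo(void)
{
    int x = 2;

    DEBUG_PRINTF("%d+%d==5\n", x, x);   /* flag is 0: debug_printf is not called */
    assert(debug_calls == 0);

    do_print_debugs = 1;
    DEBUG_PRINTF("%d+%d==5\n", x, x);   /* flag is 1: debug_printf is called once */
    assert(debug_calls == 1);
}
```

As in the pre-C99 version, the test happens outside the function call, so nothing is called when debugging is off.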
