m4 is a Unix macro processing language. Like cpp, the C preprocessor, m4 takes a byte stream as input, expands macros, and prints the resulting byte stream. However, m4 is much more powerful and general purpose than cpp. It can run shell commands, do integer arithmetic, and manipulate text in various ways.
m4 has been used for tasks such as building sendmail configuration files, implementing the GNU autoconf configuration and Makefile generation tool, supporting Kernighan's original ratfor Fortran system, and serving as an add-on preprocessor for languages that lack one, such as HTML or Java.
A string in an m4 input stream can be quoted by enclosing it between a backquote (`) and a forward quote or apostrophe ('). This may seem unusual, but in practice it works very well, since strings in the input are unlikely to be quoted this way unless they are meant to be processed by m4. Additionally, if these characters are unsuitable, they may be changed with the changequote command. This should give you an idea of the flexibility and adaptability of m4.
Quoted strings are exempt from regular m4 processing and appear unchanged (sans the outer quotes, of course) in the output stream.
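As a minimal sketch of quoting and of changing the quote characters, assuming GNU m4 with the default quotes in effect at the start (the macro name greeting and the choice of square brackets are arbitrary):

```m4
dnl Assumes the default quote characters are active at this point.
define(`greeting', `hello')dnl
dnl The quoted word passes through untouched; the bare word is expanded.
`greeting' becomes greeting
dnl Switch the quote characters to square brackets (an arbitrary choice).
changequote([, ])dnl
[greeting] becomes greeting
dnl Restore the default quote characters.
changequote(`, ')dnl
```

Both text lines should come out as greeting becomes hello: the quoted occurrence loses its quotes and is left alone, while the unquoted one is expanded.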
m4 treats any text appearing between a number sign (#) and a newline as a comment (except within quoted strings). Comments are copied to the output without macro expansion. Again, if this is unsuitable, it can be changed with the changecom command.
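A minimal sketch of comment handling, assuming GNU m4, whose changecom command sets the comment delimiters (the // delimiter is an arbitrary choice):

```m4
define(`foo', `bar')dnl
foo # foo is not expanded inside a comment
changecom(`//')dnl
foo // foo is still not expanded here
foo # but now foo is expanded here
```

In the first two text lines only the leading foo becomes bar and the comment is copied through as written. In the last line # no longer starts a comment, so both occurrences of foo are expanded.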
The real power of m4 is in the macros. A macro is defined by the define command, which corresponds to cpp's #define. The simplest type of macro replaces one string with another:
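A minimal sketch of such a definition, assuming the default quote characters (the sample sentence is made up for illustration):

```m4
dnl Define foo so that it expands to bar.
define(`foo', `bar')dnl
I like foo, and foo likes me.
```

The output is I like bar, and bar likes me.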
This macro will simply change all instances of foo into bar. Note that if there is a macro called bar, it will be invoked too, and the final output will be the definition of bar. For example:
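In this sketch, bar is itself defined as a macro (blat is an arbitrary placeholder), so the text produced by foo is rescanned and expanded again:

```m4
define(`foo', `bar')dnl
define(`bar', `blat')dnl
dnl foo expands to bar, which is rescanned and expands to blat.
foo
```

This prints blat.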
The second expansion can be avoided by quoting the expansion text of the foo macro twice:
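A sketch of the double-quoted form, again using bar and blat as placeholder names:

```m4
define(`bar', `blat')dnl
dnl The expansion text of foo is the quoted string `bar', quotes included.
define(`foo', ``bar'')dnl
foo
```

This prints bar rather than blat.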
Why two sets of quotes? The outer set is stripped when the macro is defined, leaving `bar' as the definition. When foo is expanded, the result `bar' is treated as a quoted string and output as bar.
Just as in cpp, m4 macros can accept arguments. Like most aspects of m4, the syntax is reminiscent of a Unix shell script: the first argument is represented as $1, the second as $2, and so on. The number of arguments is represented as $#.
define(foo, `argc is $#, argv is $1')
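Note that whenever a macro has arguments, you should quote the expansion so that the text, including the argument variables, is processed when the macro is invoked rather than when it is defined. Quoting also keeps the comma in this example from being read as an argument separator.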
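To see the argument variables in action, the definition above can be invoked with different argument lists (the argument values here are arbitrary, and GNU m4 is assumed):

```m4
define(`foo', `argc is $#, argv is $1')dnl
dnl Three arguments: $# is 3 and $1 is the first one.
foo(`one', `two', `three')
dnl No parentheses at all: $# is 0 and $1 is empty.
foo
dnl Empty parentheses count as a single empty argument: $# is 1.
foo()
```

The first call prints argc is 3, argv is one. The bare foo reports an argc of 0 and foo() an argc of 1, each with an empty argv.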
Macros can be temporarily redefined, too. To do this, the macro must be defined with the pushdef command rather than the define command. The old macro definition can be restored with popdef. An example:
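A minimal sketch, assuming foo was first defined as bar (blat is an arbitrary temporary value):

```m4
define(`foo', `bar')dnl
foo
dnl Temporarily stack a new definition on top of the old one.
pushdef(`foo', `blat')dnl
foo
dnl Discard the temporary definition; the original one is visible again.
popdef(`foo')dnl
foo
```

The three uses of foo print bar, blat, and bar, in that order.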
Note that the macro name foo is quoted in this example. If the quotes were left out, the foo in the pushdef command would be expanded to bar, pushing a definition for a macro called bar, not foo! Getting the quotes right in a complex m4 script is sometimes very difficult. Luckily, m4 has special commands for debugging: dumpdef, which outputs the definition of a macro to standard error, and traceon, which prints a message each time a macro is expanded.
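A short sketch of both debugging commands, assuming GNU m4 (traceoff, which stops tracing, is not mentioned above but is a standard companion to traceon). The exact wording of the diagnostics varies between implementations, so none is shown here:

```m4
define(`foo', `bar')dnl
dnl Print the current definition of foo on standard error.
dumpdef(`foo')dnl
dnl Report every expansion of foo on standard error from here on.
traceon(`foo')dnl
foo
dnl Stop tracing foo.
traceoff(`foo')dnl
```

Standard output still receives only bar; the definition dump and the trace message go to standard error.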
Perhaps you've noticed that if macros are defined on their own line in the input, a blank line appears in the output. This is because the newline is not part of the macro definition, and thus is simply copied from input to output. To avoid all these unsightly blank lines littering your output, you can follow any m4 statement with the dnl command. dnl, short for "discard to next line", causes m4 to discard everything up to and including the newline on the current line; it is not uncommon to see declarations like this:
define(`foo', ``bar'')dnl
define(`bar', ``blat'')dnl
Files can be included in m4 with the include and sinclude commands. They work identically, except that include produces an error if the given file does not exist, whereas sinclude simply returns an empty string.
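A small sketch of the silent variant, where optional.m4 is a hypothetical file name that may or may not exist:

```m4
dnl If optional.m4 exists, its contents are inserted here; if not, nothing happens.
sinclude(`optional.m4')dnl
dnl Using include on a missing file would report an error instead.
configuration loaded
```

When the file is absent, the only output is the line configuration loaded.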
The include commands can also be used to read a file into a macro definition, like so:
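A sketch of that idiom, assuming a file named afile.m4 exists in the current directory or on the include path:

```m4
dnl Read afile.m4 once and store its expanded text as the definition of infile.
define(`infile', include(`afile.m4'))dnl
dnl Each use of infile now expands to that text.
infile
```

Because the file's text is rescanned while define collects its arguments, this sketch also assumes that afile.m4 contains no unquoted commas or unbalanced parentheses.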
Now infile will expand to the entire contents of afile.m4.
m4 also has a ton of other useful functions for manipulating strings and doing arithmetic. Here are some of the most useful:
len(string) returns the length of string
index(string, substr) returns the zero-based index of substr in string, or -1 if it doesn't occur
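substr(string, start[, length]) returns the substring of string that begins at index start and is length characters long, or that runs to the end of string if length is omitted
translit(string, chars[, replacement]) replaces the characters in chars with the corresponding characters in replacement, much like the Unix tr command; if replacement is omitted, the characters in chars are deleted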
incr(number) returns a number one greater than number
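eval(expression) returns the result of the integer arithmetic expression expression
syscmd(cmd) runs the shell command cmd; the command's output is not captured by m4 and does not become part of the expansion (GNU m4 provides esyscmd for capturing it)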
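The string and arithmetic functions listed above can be tried out with a short script like this one, assuming GNU m4 (the sample strings and numbers are arbitrary):

```m4
dnl Each line below expands to one value; the values are listed after the block.
len(`hello')
index(`hello world', `world')
index(`hello', `xyz')
substr(`hello world', 6, 5)
translit(`hello', `el', `ip')
incr(41)
eval(2 + 3 * 4)
```

The seven lines expand to 5, 6, -1, world, hippo, 42, and 14 respectively.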