In C, a trigraph is
a special three-character sequence,
beginning with two question marks,
that the compiler replaces with a single character.
Trigraphs exist to support
machines whose character sets
lack certain characters that C needs
(specifically, the ones not
in the ISO 646-1983 Invariant Code Set).
The trigraphs are:
- ??= means #
- ??/ means \
- ??' means ^
- ??( means [
- ??) means ]
- ??! means |
- ??< means {
- ??> means }
- ??- means ~
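Trigraph replacement happens before any other
processing of the source, so it applies everywhere,
including inside string literals and comments.
For example: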
??=include <stdio.h>
void prompt(int x, int y)
??<
printf("??= %d + %d? ", x, y);
??>
is actually the same as
#include <stdio.h>
void prompt(int x, int y)
{
printf("# %d + %d? ", x, y);
}
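The replacement step can be sketched as an ordinary C function.
This is only an illustration of the mapping in the table above,
not how any particular compiler implements it; the names
trigraph_value and replace_trigraphs are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the character that the trigraph ending in c stands for,
   or 0 if c does not complete a trigraph. */
char trigraph_value(char c)
{
    switch (c) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Copy src to dst, replacing each trigraph with its single character.
   The scan runs left to right. dst must have room for strlen(src) + 1
   bytes, since the output is never longer than the input.
   Returns the number of characters written, not counting the NUL. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        char r = 0;

        /* src[2] is safe to read: src[1] is '?', so it is not the NUL. */
        if (src[0] == '?' && src[1] == '?')
            r = trigraph_value(src[2]);

        if (r != 0) {
            dst[n++] = r;      /* three input characters become one */
            src += 3;
        } else {
            dst[n++] = *src++; /* ordinary character, copied as-is */
        }
    }
    dst[n] = '\0';
    return n;
}
```

When calling this with a literal, the question marks are best written
with the \? escape, as in replace_trigraphs("?\?=define X", buf),
so that the compiler itself cannot mistake the literal for a trigraph.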
Trigraphs are rarely seen outside of
obfuscated code. GCC discourages
their use: it only processes them in
strict ANSI conformance mode (the -ansi
option) or when given the special -trigraphs
option.
The GCC texinfo documentation
says of them: "You don't want to know about this
brain-damage".
Apparently as a result of the dislike of trigraphs,
"digraphs" were added to the C standard.
These are alternate spellings of
certain tokens. Because they are tokens
rather than character replacements, they are
never expanded inside strings. They also
tend to be easier to read. However, there
are no digraphs for |, ~,
^, and \.
The first three can now be covered by
the alternate operator names in iso646.h
(bitor, compl, and xor).
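As a small sketch of how these look in practice, the functions below
use the digraphs <% %> for braces and <: :> for brackets, together with
the iso646.h names bitor, xor, and compl. The function names sum and mix
are invented for this example. The digraph %: could also stand in for
the # of the include line; it is left as # here for readability.

```c
#include <iso646.h>

/* Add up the first n elements of a, using digraphs for [ ] and { }. */
int sum(const int a<::>, int n)
<%
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += a<:i:>;
    return total;
%>

/* Same as (x | y) ^ ~x, spelled with the iso646.h operator names. */
unsigned mix(unsigned x, unsigned y)
<%
    return (x bitor y) xor (compl x);
%>
```

Unlike trigraphs, none of these spellings is touched inside a string
literal, so "<%" in a string stays as the two characters < and %.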
With help from "The C Programming Language, Second Edition"
(by Brian W. Kernighan and Dennis M. Ritchie)
and a recent draft of the C standard.