In C, a trigraph is a special three-character sequence that stands for a single other character. Trigraphs exist to support machines whose character sets lack certain characters that C needs (specifically, the ones not in the ISO 646-1983 Invariant Code Set).

The trigraphs are:

  • ??= means #
  • ??/ means \
  • ??' means ^
  • ??( means [
  • ??) means ]
  • ??! means |
  • ??< means {
  • ??> means }
  • ??- means ~

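Trigraph replacement can occur anywhere in the source, including inside string literals, character constants, and comments, because it happens before the source is split into tokens.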

For example:

??=include <stdio.h>

void prompt(int x, int y)
??<
  printf("??= %d + %d? ", x, y);
??>

is actually the same as

#include <stdio.h>

void prompt(int x, int y)
{
  printf("# %d + %d? ", x, y);
}
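
The replacement step itself is simple enough to sketch in ordinary C. The following is only an illustration of the mapping listed above, not how any particular compiler implements it; the function name replace_trigraphs is hypothetical and not part of any standard library. It rewrites a string in place and makes no distinction between code, strings, and comments, just as real trigraph replacement does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): rewrites the
   NUL-terminated string s in place, replacing every trigraph with the
   single character it stands for, and returns the new length. */
size_t replace_trigraphs(char *s)
{
    /* Third characters of the nine trigraphs and the characters they
       map to, kept index-aligned. */
    static const char third[]  = "=/'()!<>-";
    static const char result[] = "#\\^[]|{}~";

    const char *in = s;
    char *out = s;

    while (*in != '\0') {
        const char *hit;

        /* A trigraph is two question marks followed by one of the
           nine characters in the table. */
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0'
            && (hit = strchr(third, in[2])) != NULL) {
            *out++ = result[hit - third];
            in += 3;
        } else {
            /* Anything else is copied through unchanged. */
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Called on a buffer holding the first line of the example, ??=include <stdio.h>, this leaves #include <stdio.h> in the buffer. When writing such a test string in C source, one question mark can be spelled with the \? escape so that the compiler does not perform the replacement itself.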

Trigraphs are rarely seen outside of obfuscated code. GCC discourages their use by leaving them disabled unless it is run in a strict conformance mode (such as the -ansi option) or given the special -trigraphs option. Concerning them, the GCC texinfo documentation says: "You don't want to know about this brain-damage".

Apparently as a result of the dislike of trigraphs, "digraphs" were added to the C standard. These are alternate spellings of certain tokens: <: for [, :> for ], <% for {, %> for }, and %: for #. Because they are tokens rather than character replacements, they are not expanded unexpectedly inside strings, and they tend to be easier to read. There are no digraphs for |, ~, ^, and \, but the first three can be made up for with iso646.h's alternate names for operators (bitor, compl, and xor, among others).
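
As a small sketch of how digraphs and the iso646.h names combine, the function below avoids typing [, ], {, }, |, ^, ~, and & altogether. The function name common_bits is made up for illustration. It assumes a compiler that accepts digraphs without extra options, which is the usual case for compilers implementing the 1995 amendment to the C standard or anything later.

```c
#include <iso646.h>

/* Hypothetical example: computes (a | b) & ~(a ^ b), which is the
   same as a & b, using only digraphs and iso646.h operator names. */
unsigned common_bits(unsigned a, unsigned b)
<%
    unsigned parts<:2:>;              /* unsigned parts[2];       */

    parts<:0:> = a bitor b;           /* parts[0] = a | b;        */
    parts<:1:> = a xor b;             /* parts[1] = a ^ b;        */

    return parts<:0:> bitand compl parts<:1:>;   /* p0 & ~p1      */
%>
```

Unlike a trigraph, a digraph placed inside a string literal stays exactly as written, since the spelling is only recognized where a token is expected.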

With help from "The C Programming Language, Second Edition" (by Brian W. Kernighan and Dennis M. Ritchie) and a recent draft of the C standard.