Binary literals in C (thing) by call

The C programming language is an excellent one for all us bit twiddlers, blessed as it is with the full complement of bitwise operations, and no performance overheads that aren't visible in the language (in other words, integral variables are usually in registers).

Also of benefit to us, it has a variety of number bases in which we can write literal values: decimal (by just writing, eg. 9999), hexadecimal (by prefixing with '0x', eg 0xffff), and even octal (by prefixing with '0', eg. 07777).

Notable by its omission, however, and quite frustrating to us low-level types, is the presence of any way of representing binary literal values. This perplexed and annoyed me for many a year, until recently when I realised that there's a plethora of simple ways to construct an integer from something that looks very much like (but isn't really) a binary value.

Strings would be the obvious candidate; we all know it's possible to construct a binary value from a string. However, this almost always has to be done at run time; iteration over a variable length string can't be made constant, and few compilers are smart enough to realise that a const function with constant parameters can be entirely computed at compile time.

A better candidate for this is C's already existing numerical literals. Digits in every number base greater than binary can take on the values 0 and 1, as well as other values. We can exploit this by converting a literal (say, octal or decimal) which happens to look a lot like a binary number, (with a wildly differing value) into the value that the corresponding binary number would have.

Say, for example, we wanted to represent the binary constant 1101_bin (Eight and four and one; or thirteen) in our program source code. We can express this as a constant function of the integer 1101_dec (one thousand, one hundred and one). The value of the integer we want to represent (thirteen) is simply (1 if the 1's digit is 1) + (2 if the 10s digit is 1) + (4 if the 100s digit is 1) + (8 if the 1000s digit is 1). Since this is all simple arithmetic on constant integers, the literal can be completely evaluated by the compiler at compile time.

Here's a C macro that converts a binary-esque decimal value into an integer with the value that the corresponding binary would have had:


/* To extract each bit value; divide by appropriate power of 10,
   if result has a '1' set in that position, corresponds to 
   a 1 in that position in the literal => set bit. 

   This allows us to write the binary value 1101 (thirteen) as 
   BIN(1101), a constant function of the decimal value 1101 
   (one thousand one hundred and one).
 */
#define BIN_BIT(value, bit, dec) ((((unsigned) value/dec)&1 == 1)? (1<<bit) : 0)

/* Put 10 bits together to form a 10 bit constant. */
#define BIN(value) \
( BIN_BIT(value,  0, 1) | \
  BIN_BIT(value,  1, 10) | \
  BIN_BIT(value,  2, 100) | \
  BIN_BIT(value,  3, 1000) | \
  BIN_BIT(value,  4, 10000) | \
  BIN_BIT(value,  5, 100000) | \
  BIN_BIT(value,  6, 1000000) | \
  BIN_BIT(value,  7, 10000000) | \
  BIN_BIT(value,  8, 100000000) | \
  BIN_BIT(value,  9, 1000000000) )

/* GCC-specific: use a 64-bit long long literal to produce a 20-bit 
   binary number. */
#define BIN_BIT64(value, bit, dec) ((((unsigned long long) \
				value/dec)&1 == 1)? (1<<bit) : 0) 

#define BIN20(value) \
( BIN_BIT64(value,  0, 1ull) | \
  BIN_BIT64(value,  1, 10ull) | \
  BIN_BIT64(value,  2, 100ull) | \
  BIN_BIT64(value,  3, 1000ull) | \
  BIN_BIT64(value,  4, 10000ull) | \
  BIN_BIT64(value,  5, 100000ull) | \
  BIN_BIT64(value,  6, 1000000ull) | \
  BIN_BIT64(value,  7, 10000000ull) | \
  BIN_BIT64(value,  8, 100000000ull) | \
  BIN_BIT64(value,  9, 1000000000ull) | \
  BIN_BIT64(value, 10, 10000000000ull) | \
  BIN_BIT64(value, 11, 100000000000ull) | \
  BIN_BIT64(value, 12, 1000000000000ull) | \
  BIN_BIT64(value, 13, 10000000000000ull) | \
  BIN_BIT64(value, 14, 100000000000000ull) | \
  BIN_BIT64(value, 15, 1000000000000000ull) | \
  BIN_BIT64(value, 16, 10000000000000000ull) | \
  BIN_BIT64(value, 17, 100000000000000000ull) | \
  BIN_BIT64(value, 18, 1000000000000000000ull) | \
  BIN_BIT64(value, 19, 10000000000000000000ull) )

There are some obvious caveats with the above code:

When writing binary constants, it's tempting to write preceding zeros at the beginning of the number to pad it to a fixed number of bits. However, doing this here (eg. BIN(01001011)) would turn the decimal literal into an octal literal, giving entirely the wrong value.
Since decimal needs more than a single bit to represent each digit, the number of digits in literal values is smaller; only 10 bits can be represented by a 32-bit number, or 20 by a 64-bit number.

That being said, however, the above method can be generalised to any number base by using appropriate constants, so long as the base is smaller than the largest base that can be represented (base 16) in C's literal constants.

template meta-programming	In C	bit twiddling	variadic macros in C
Learn C, you hippies	Useful Number Theory functions in C	C99	metaprogramming
hexadecimal	Fire Is Alive	Mick	strlen
octal	Template	UL	why
Wicked	machine language	hex	source code