carry lookahead adder - Everything2.com

Very fast adder design for n-bit binary numbers. Whereas the naïve design (chaining full adders) have an O(n) component count and takes time O(n) to add, the carry lookahead adder requires O(n log n) components (which is a lot more), but adds in time O(log n).

The design is based on a fast method of calculating where the carry bits will be added. Suppose we're adding the 2 binary numbers x_n-1...x₀ and y_n-1...y₀. If we knew to which bits we need to add carry (define c_j to be 1 if a 1 is carried out from adding bit j, else 0), we could just XOR each of these columns to get the result:

c_n-1c_n-2...c₀
 x_n-1...x₁x₀
 y_n-1...y₁y₀

which takes constant time (and O(n) components).

So we need a fast way of calculating the c_j's. To do this, we define 3 possible carry states, and assign one such to each of the n bits:

If both x_j and y_j are 1, then a 1 will always be carried out of that bit; this is carry state S ("set").
If both x_j and y_j are 0, then 1 will never be carried out of that bit; this is carry state C ("clear").
If exactly one of x_j and y_j is 1, then 1 will be carried out of that bit iff a 1 was carried into that bit; this is carry state P ("propogate").

Note that we may define the same 3 states for any contiguous block of bits, specifying that a 1 will always be carried out of the block (S), that 1 will never be carried out of the block (C), or that 1 will be carried out if a 1 was carried in (P). We define "multiplication" on the carry states so that the carry state of the concatenation of two blocks is the multiplication of their carry states (read the state on the right on the column, and the state on the left on the row).

. S C P
S S C S
C S C C
P S C P

This operation certainly isn't commutative (nor does it define a group), but it turns out to be associative! We may calculate the carry state of each bit using the above rules in constant time and O(n) components; let k_j be the carry state for bit j. Then c_j is 1 iff k_j...k₁k₀ is S. So using the parallel prefix algorithm, we can calculate these block values, and therefore the c values, using O(n log n) components and in time O(log n).

The really amazing thing about this adder design is not that it works; it's that today similar designs are used in real CPUs!

ripple adder	parallel prefix algorithm	Strassen algorithm for polynomial multiplication	full adder
Binary mathematics	Digital circuit component	binary adder	Sudden Unexpected Death Syndrome
half adder	Compute Fibonacci numbers FAST!	Concealed carry law	Sea adder
Adder	IFF	fast exponentiation	semigroup
Discrete Logarithm