About Riddle 7 from hard interview questions.

This riddle is about the order (time complexity) of an algorithm. As posed, it is really either a joke or a trick question, and one detail is unintentionally wrong.
Maybe you can score extra points with some interviewers if you can point out the problems.

The riddle is posed like this:

*
How do you quickly count the number of set bits in a 32-bit integer in linear time (with respect to the number of set bits)? In constant time?
*

Now, you can always do it in constant time: since the integer never has more than 32 bits, you need at most 32 operations. The different answers above only change the number of operations, not the fact that the time is always constant. In fact, every algorithm that terminates does so in constant time if you keep the problem size fixed (here: at most 32 bits!). The order O of an algorithm only makes sense when the size of the problem (usually given by integers named n and m) is allowed to vary. For instance, look at Compute Fibonacci numbers FAST!.

Another example (this is a bit inaccurate): breaking an RSA key of 128 bits takes constant time, maybe a month, but roughly every 4 bits you add to the key double the time needed to break it. This makes the effort grow like the fourth root of the key space (whose logarithm is the key length), as you need only twice the effort for 16 times as many keys.

This means the riddle at best passes as a trick question, but not as it was intended. To fix this, suppose the riddle had been posed like this:

*
How do you quickly count the number of set bits in an N-bit integer in linear time (with respect to the number of set bits)? In constant time?
*

This is better, but the answers given above still do not achieve the stated complexity **with respect to the number of set bits**: the algorithm does not finish any faster if only one bit is set (e.g. the highest bit).

So, the riddle should read:

*
How do you quickly count the number of set bits in an N-bit integer in linear time? Can you give a fast implementation?
*

This makes sense.

You can speed the process up to something that only appears linear in the number of set bits if you have hardware that can check lots of bits for zero practically at once, for example an FPGA used as memory. The problem is as hard as adding up all N bits, which gives an addition tree whose number of nodes (each node stands for one addition) equals the number of additions needed. So you can solve it in logarithmic time (log N) using on the order of N logic gates, or in linear time (N) using a CPU that does the adding step by step. This means you trade software loops for hardware gates.

Using a lookup table is very smart, but it does not reduce the complexity once N grows beyond the lookup table size. If you follow the excellent suggestion of bis (start with a field size of one bit; mask every second field with a bitmask, then shift and add it to the adjacent field, all in one CPU operation; then repeat with double the field width until the field is as wide as the number), you reduce the number of CPU adding operations but increase the number of copying and shifting operations. This gives a speed increase, since you have the hardware to handle several bits quickly. In theory, however, the complexity is unchanged, since eventually you will run out of hardware that can do the masking/shifting/adding operations in parallel.
Still, at least bis' suggestion makes full use of the CPU, because in every CPU adding step every bit is significant and used.