Encyclopedia on a toothpick

by thismeansnothing

Wed Jan 22 2003 at 14:26:42

'How do you store an encyclopedia on a toothpick?' This theoretical puzzle is used by Haruki Murakami in his wonderful novel Hard-Boiled Wonderland and the End of the World to illustrate a point about eternity.

The theoretical answer to the puzzle is as elegant as it is simple.

To store an arbitrarily large amount of data on a toothpick, assign a numerical value to each letter of the alphabet (00 for 'a', 01 for 'b' and so on) - you could even use their ASCII codes. Next, string all the letters making up the encyclopedia together to form a very large number. You'll end up with something like this: 0419041713081924... (which, in this case, spells 'eternity').
Then turn this number into a fraction by prepending a 0. to it and make a mark on the toothpick at exactly this point of its length (where 0 is the start of the toothpick and 1 is its end). This mark contains all the information of the encoded encyclopedia in its position.
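
As a rough illustration of the scheme just described, here is a minimal Python sketch. The function names are made up for this example, it handles lowercase letters only, and it uses exact fractions because ordinary floating point would lose the text after about fifteen digits.

```python
from fractions import Fraction
import string

# Assumption for this sketch: lowercase letters only, 'a' -> 00 ... 'z' -> 25.
ALPHABET = string.ascii_lowercase

def text_to_digits(text):
    """Concatenate the two-digit code of every letter."""
    # ALPHABET.index raises ValueError for anything that is not a-z.
    return "".join(f"{ALPHABET.index(ch):02d}" for ch in text)

def digits_to_mark(digits):
    """Position of the mark as an exact fraction of the toothpick's length."""
    return Fraction(int(digits), 10 ** len(digits))

def mark_to_text(mark, n_digits):
    """Recover the text; the reader must know how many digits were stored."""
    digits = f"{int(mark * 10 ** n_digits):0{n_digits}d}"
    return "".join(ALPHABET[int(digits[i:i + 2])] for i in range(0, n_digits, 2))

digits = text_to_digits("eternity")
mark = digits_to_mark(digits)
print(digits)                           # the digit string for the word
print(mark_to_text(mark, len(digits)))  # the word again, read off the mark
```

Note that the digit count has to travel with the mark: a text starting with 'a' begins with 00, and those leading zeros vanish once the string is read as a number.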

Obviously this solution is not practical, but it does show how infinity (or eternity) is not necessarily a matter of size, but a matter of precision. What Murakami suggests with this puzzle is that experiencing eternity is not necessarily about living forever, it is about the level of detail at which you experience one moment.

see also: Zeno's Paradox

amnesiac said 're Encyclopedia on a Toothpick: cool! is the book good?'

I said 'very good, one third Ghost in the Shell, one third Princess Mononoke, one third Paul Auster'

Excalibre said re Encyclopedia on a toothpick : of course, even in theory this falls apart, as there is a planck length, something like 10^-35 meters, and there is no smaller meaningful unit of distance. you _can't_ get smaller than that. so your encyclopedia would have to be less than 35 digits worth of information (assuming your toothpick is no longer than a meter.) there IS such a thing as infinitely small, and it's bigger than you assume.

(idea)

by Professor Pi

Wed Jan 22 2003 at 21:06:45

The entire Encyclopædia Britannica is in the digits of pi; I'm just not telling you at what digit it starts.

Haruki Murakami's puzzle is a thought experiment that could under no circumstances be carried out; at least, not in the way it is described. Nitpicking perhaps, but the following analysis may add some further insight into scale, infinity, and precision.

There are two practical reasons why Murakami's experiment would fail. First of all, atoms are of finite size. Marking the toothpick would mean removing material (at least one atom). It would be impossible to mark the toothpick at an arbitrary location without splitting an atom in two.

The size of an atom is determined by its atomic radius or, since we're dealing with molecules and not atoms, by its covalent radius. For instance, consider the following nanotoothpick consisting of two strands of carbon, over a length of ten atoms.

   C-C-C-C-C-C-C-C-C-C 
    \ \ \ \ \ \ \ \ \ \
     C-C-C-C-C-C-C-C-C-C

The reason for the double-strand nanotoothpick is that it could be marked by removal of a carbon atom (provided that you could actually synthesize such a molecule reliably):

   C-C-C-C-C-C-C-C-C-C 
    \ \ \   \ \ \ \ \ \
     C-C-C   C-C-C-C-C-C

     0 1 2 3 4 5 6 7 8 9

The covalent radius of carbon is approximately 0.77 Angstrom. Thus, the length of this theoretical toothpick is an amazing 1.54 nm. The toothpick above is marked at the "3" position. The entire toothpick can encode 10 positions. Written as a fraction of the entire length of the toothpick these positions are: 0.0, 0.1, ..., 0.9. Thus, we need 10 atoms to mark the toothpick up to one digit accuracy.

If we extend this analysis to larger toothpicks to encode more digits, we arrive at the second reason why Murakami's experiment is only a thought experiment. To mark the toothpick up to two digits accuracy, the carbon strand needs to be 100 atoms long. In this case, the fractional positions are: 0.00, 0.01, ..., 0.99.

The number of atoms required for each successive digit increases exponentially: for n digits accuracy, 10ⁿ atoms are required. Even though the atoms are very small, the length of the toothpick increases dramatically.
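
To make the growth concrete, here is a small Python sketch of the nanotoothpick arithmetic. The constant and function names are invented for this example; the only physical input is the 0.77 Angstrom covalent radius quoted above, with neighbouring atoms assumed to sit two radii apart.

```python
# Assumption: atoms in the strand are spaced two covalent radii apart.
COVALENT_RADIUS_M = 0.77e-10
ATOM_SPACING_M = 2 * COVALENT_RADIUS_M

def atoms_required(n_digits):
    """Atoms needed so that one removed atom can encode n decimal digits."""
    return 10 ** n_digits

def toothpick_length_m(n_digits):
    """Length in metres of a strand with that many atoms."""
    return atoms_required(n_digits) * ATOM_SPACING_M

def atom_to_remove(fraction_digits):
    """Index of the atom to remove for a fraction written as 0.<fraction_digits>."""
    return int(fraction_digits)

for n in (1, 2, 9, 20):
    print(n, atoms_required(n), toothpick_length_m(n))
```

One digit needs a strand about 1.54 nm long; every further digit multiplies the length by ten.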

The 1999 edition of the Encyclopædia Britannica contains approximately 44 million words. The 2002 edition is even bigger: give or take 56 million words. For argument's sake, let's assume the average encyclopedia contains 50 million words, with an average word length of 5 characters. That's a total of 250 million (2.5 × 10⁸) characters. We're not counting spaces. Let's also assume that we need two digits to encode each character (lowercase & uppercase characters, digits, and special characters). This would mean that we need 500 million (5 × 10⁸) digits to encode the entire encyclopedia. We could probably pack this a little more efficiently, but considering the following analysis, it really doesn't matter:

A toothpick that could contain the entire Encyclopædia Britannica by the location of a single mark would require a length of:

10^{500 000 000} atoms

That is ten to the power of five hundred million, or a one followed by five hundred million zeros. To put that in perspective: a toothpick the size of our galaxy would be approximately 10³⁰ carbon atoms long. That's still 499 999 970 orders of magnitude smaller, but there is really no way to visualize numbers this large*.
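
The bookkeeping behind these figures fits in a few lines of Python. This is only a sketch of the estimate above: the word count, word length and digits per character are the assumptions stated in the text, and the names are made up for the example.

```python
# Assumptions taken from the estimate above, not measured values.
WORDS = 50_000_000
CHARS_PER_WORD = 5
DIGITS_PER_CHAR = 2
GALAXY_TOOTHPICK_EXPONENT = 30  # a galaxy-sized toothpick is ~10**30 atoms long

def digits_needed(words=WORDS, chars=CHARS_PER_WORD, digits=DIGITS_PER_CHAR):
    """Decimal digits needed to write out the whole encyclopedia."""
    return words * chars * digits

def shortfall_in_orders_of_magnitude():
    """How many powers of ten the galaxy-sized toothpick falls short."""
    # A single mark encoding n digits needs 10**n atoms, so the exponent is n.
    return digits_needed() - GALAXY_TOOTHPICK_EXPONENT

print(digits_needed())
print(shortfall_in_orders_of_magnitude())
```

Working with exponents rather than the numbers themselves is the only practical option here; 10 to the power of 500 million is far too large to be worth computing directly.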

Let's go the other way, and calculate how many digits we could encode on a toothpick of (say) 3 inches long. This toothpick would be a carbon strand 5 × 10⁸ atoms in length. This carbon strand could encode log₁₀(5 × 10⁸) ≈ 8.7, so not quite 9, digits of precision. Just enough to spell out one single word.

A more feasible (ahem) way to encode the Encyclopædia Britannica onto the toothpick would be to use multiple markings as a binary code. Every 8 carbon atoms along the length would act as one byte. In this scheme, we would have 5 × 10⁸ / 8 = 62.5 MB of data. That is within a factor of four of the 250 MB the raw text would need at one byte per character, so with some compression it is just about enough to fit the entire text of the encyclopedia.
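
For comparison, here is a Python sketch that puts the single-mark and many-marks schemes side by side for a 3 inch strand. The names are hypothetical, and the atom spacing of 1.54 Angstrom is the same assumption used for the nanotoothpick above.

```python
# Assumptions: 1.54e-10 m between neighbouring atoms, one bit per atom site.
INCH_M = 0.0254
ATOM_SPACING_M = 1.54e-10

def atoms_along(length_m):
    """Number of atom sites along a strand of the given length."""
    return int(length_m / ATOM_SPACING_M)

def full_decimal_digits(n_atoms):
    """Whole decimal digits a single mark can encode: floor(log10(n_atoms))."""
    return len(str(n_atoms)) - 1

def bytes_from_binary_marks(n_atoms):
    """Bytes available if every atom site is one bit (present or removed)."""
    return n_atoms // 8

n = atoms_along(3 * INCH_M)
print(n)                                   # a little under 5e8 sites
print(full_decimal_digits(500_000_000))    # single mark, rounded strand length
print(bytes_from_binary_marks(500_000_000))
```

The same strand goes from a handful of digits to tens of megabytes simply by letting every atom site carry its own bit instead of asking one mark to carry everything.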

* Ok, Jongleur seems to be better at visualizing these large numbers and calculated that a chance of 1 in 10 to the power of 500 million would be equal to winning the Powerball lottery in all 20 states that offer it with the same lucky number and getting attacked by sharks on 6 occasions in one week, every week for the next 5000 years.

(idea)

by spare aardvark

Wed Apr 09 2003 at 11:40:59

Quibble quibble. First, there is no guarantee at all that any particular sequence is in the digits of π, unless you've already calculated that far and found it. A common misconception, but see is pi normal? for the current state of play. So we can't be sure we'll find the Gödel number of the Britannica in that particular representation.

So let's store it as the data of a quantum computer. Toothpicks are made of wood: what's wood made out of? Carbon, oxygen, hydrogen, a few other elements. And how much does one weigh? Let's use a big chunky toothpick that contains a whopping 2 g of atoms the average size of carbon atoms: that's one sixth of a mole, and therefore contains 1/6 x Avogadro's number of atoms. That'll do for a round number: 10²³ atoms.

Call the content of the Britannica a round thousand million bits. Then each atom has to store 10¹⁴ bits, or about 2⁵⁰. A much more manageable number.* All we need now is a bit of quantum superposition

Read footnote, not this next bit...
Hmmm... tricky. A carbon atom's got about twelve nucleons and twelve electrons, oxygen a few more, but on average about 25 particles per atom. Each of these might have a few discrete states it can be in, like spin, but I foresee falling short if we rely on those. Let's instead just consider the six to eight shell electrons on each atom... say 8 because I can divide it into 2⁵⁰.
So we need to excite every electron in the toothpick simultaneously into the first 2⁴⁷ or so energy levels above their ground state.

* Hey, tdent points out an even easier way, which involves not getting the maths wrong. 10²³ atoms storing 10⁹ bits is a whopping 10¹⁴ atoms per bit. So we can pick and choose which of them get used in the superposition, and maybe even build a detector out of the rest.
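
tdent's version of the sum can be checked with a short Python sketch. The names are invented for this example, and it assumes, as above, a 2 g toothpick made of atoms with the molar mass of carbon and a Britannica of 10⁹ bits.

```python
# Assumptions: every atom weighs as much as carbon (12 g/mol); 1e9 bits of text.
AVOGADRO = 6.022e23
CARBON_MOLAR_MASS_G = 12.0

def atoms_in_toothpick(mass_g=2.0):
    """Number of atoms in a toothpick of the given mass."""
    return mass_g / CARBON_MOLAR_MASS_G * AVOGADRO

def atoms_per_bit(n_atoms, n_bits=1e9):
    """How many atoms are available for each bit to be stored."""
    return n_atoms / n_bits

n_atoms = atoms_in_toothpick()
print(f"{n_atoms:.1e} atoms")
print(f"{atoms_per_bit(n_atoms):.1e} atoms per bit")
```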

As tdent said, the paper Britannica itself is not many orders of magnitude larger than a toothpick. Let's say it's 10 000 times bigger in round numbers, 20 kg as against 2 g. This prompts the thought of whether we could actually keep the text on the paper copy: how much heavier is a sheet of printed paper compared to the same blank? A ten-thousandth more? Throw away the heavy covers. Keep just a representative portion (say a central line preserving the shape) of the printed letters, attached to a continuous carbon nanotube: if that's thin enough it should no more than double the mass. Then wind this around the toothpick.

Nanotubes, did I say? You could probably do it in DNA and set up your own miniature printing press.

(thing)

by Glowing Fish

Thu Oct 02 2008 at 3:54:38

The encyclopedia wand is a device mentioned in Haruki Murakami's seminal science fiction novel, Hard-Boiled Wonderland and the End of the World. I do not know if the book is the true locus classicus of the idea, but I can't find a reference for it elsewhere, and it seems likely enough that Murakami's inventive mind would have originated the idea.
The encyclopedia wand could theoretically encode any amount of data by inscribing a single point on a rod of some sort. The way this would be done is by first taking a piece of text and encoding it in some form, such as taking the word "word" and encoding it as 2315184 (in the interest of brevity, leave aside the technicality of how to distinguish this from BCAEAHD, as well as how to encode punctuation and various other character sets). However large the string ends up being, just put a decimal point in front of it, and then, using some type of laser, notch a point on your rod at that point. So, for example, the word "word" would be notched .2315184 of the way up the stick. Whatever the length of text, it can be compressed this way and then inscribed as a single point on a rod.

There are two possible problems with this. One is that even though the data is only physically cut in one place, the data only makes sense in the context of the total length of the stick. This in a way relates to the atomic paradox: if you take the notch as an atomic point, it doesn't make any sense; it only makes sense in terms of the entire continuum of the stick. That point being rather philosophical, the practical problems with the encyclopedia wand need to be mentioned. As I said above, the word "word", which is, after all, only four letters long, would be encoded as .2315184, a seven-digit number. If you were to take a meter-long stick, that means that the notch would have to be placed to within 100 nanometers. Using normal materials engineering, the smallest amount of material that could be notched on our encyclopedia wand is a single atom, which is about one ten-billionth of a meter across. This means that the amount of data that could be inscribed on a one-meter stick would be, at most, ten digits, or somewhere between five and ten letters. This sentence would not fit on an encyclopedia wand. If you were to go beyond normal materials engineering, perhaps by finding some unobtanium, and could get the encyclopedia wand notched down to the Planck length of about 10^-35 meters, you would still only be able to inscribe around 35 digits, and so at most 35 characters, on a one-meter-long stick. That means that even using this theoretical unobtanium construct, this paragraph would not fit on an encyclopedia wand.
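
The relationship between notch precision and capacity can be sketched in Python as below. The function name is made up, the lengths are passed as strings so the arithmetic stays exact, and the Planck length is taken as roughly 1.6 × 10^-35 meters, which is an assumption of this example.

```python
from decimal import Decimal

def max_digits(stick_length_m, resolution_m):
    """Whole decimal digits one notch can encode at a given placement precision.

    Both arguments are strings such as "1" or "1e-10" so that the
    division is exact. With N distinguishable positions, a single notch
    can carry floor(log10(N)) decimal digits.
    """
    positions = int(Decimal(stick_length_m) / Decimal(resolution_m))
    return len(str(positions)) - 1

print(max_digits("1", "1e-7"))     # notch placed to within 100 nanometers
print(max_digits("1", "1e-10"))    # notch placed to within one atom
print(max_digits("1", "1.6e-35"))  # notch placed to within a Planck length
```

Every extra digit costs a factor of ten in precision, which is why even absurd precision buys only a few dozen digits.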

So while the encyclopedia wand is a fascinating idea, and has a lot of meaning in terms of Murakami's book (although what that meaning could be is a matter of complicated hermeneutics), it does not seem to be a practical reality.

(thing)

by Pandeism Fish

Sun Sep 18 2011 at 6:04:06

Though the ability to generate a precise decimal number by cutting a notch a set length down a stick may be surprisingly limited, no such limitations would appear to apply if what is denoted by the mark is instead a fraction or some system indicating a series of fractions. For example, suppose we divide the stick with 22 equally spaced notches, and then place a mark at the seventh notch. This would indicate that the number to be generated would be 22/7 -- which was for some time the best approximation available for pi, and the reason why fraction lovers celebrate Pi Approximation Day on July 22 (using the European tradition of indicating day/month) instead of raising a ruckus at 1:59 on March 14. Then again, anyone who would be excited about Pi Approximation Day would likely already have been excited by regular Pi Day. But back to fractions: 22/7 is a fraction convertible to the decimal 3.142857..., where that last '...' represents an infinite repetition of the '142857' sequence.
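
The repeating block of a fraction such as 22/7 can be found by ordinary long division, stopping as soon as a remainder comes round a second time. A minimal Python sketch, with a function name invented for this example:

```python
def decimal_expansion(numerator, denominator):
    """Return (integer part, non-repeating digits, repeating digits).

    Plain long division: once a remainder shows up a second time, the
    digits produced since its first appearance repeat forever.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    first_seen = {}  # remainder -> index of the digit it produced
    while remainder and remainder not in first_seen:
        first_seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # The division came out evenly, so nothing repeats.
        return integer_part, "".join(digits), ""
    start = first_seen[remainder]
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(22, 7))  # (3, '', '142857')
print(decimal_expansion(3, 4))   # (0, '75', '')
```

Because a division by d has at most d - 1 distinct nonzero remainders, the loop always ends, which is another way of saying that every fraction either terminates or repeats.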

Such infinite decimal strings are in fact far more common than the alternative, there being far more potential divisions of numerator by denominator generating such strings than those generating neat resolutions such as 3/4 equaling .75. Concededly, all such strings are simply self-repeating after a point, and so provide no additional information themselves, but at least their infiniteness makes them useful for modifying long strings of non-self-repeating numbers. And indeed, in addition to common fractions, there are any number of other numerical functions which generate infinite strings of generally non-repeating numbers, such as true pi, Euler's number, and the vast majority of square roots. Because pi itself is a non-repeating number, it would appear that a work of a hundred thousand or a million or a billion characters could be encoded into pi if only a fortuitous string of numbers could be found within it, or within some convenient multiple of it; if, for example, having converted an encyclopedia into a string of numbers, we could determine that the entire string existed somewhere within pi (or pi times 3/17 or the like), perhaps starting at the 1,375,712,006th decimal place and ending at the 3,441,255,102nd decimal place. It must be conceded that the chances of finding a single useful string of such length even within an infinite series of numbers are slim. Pi is not a random number (like the type we might expect to find if an infinite number of monkeys pounded out nothing but numbers on an infinite number of keypads), and even were the possibility of the string existing substantial enough to consider, it may lie hundreds of trillions or billions of trillions of digits in, making it unlikely that the computing power will ever exist for it to actually be found.
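
On a very small scale the search can actually be tried. The Python sketch below generates decimal digits of pi with a widely circulated spigot routine (usually credited to Gibbons) and looks for a digit string among the first thousand or so places. The function names and the search depth are choices made for this example, and it says nothing about whether a string of encyclopedic length exists anywhere in pi.

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi one at a time: 3, 1, 4, 1, 5, ..."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

def find_in_pi(target, n_decimals=1000):
    """Return the 1-based decimal place where target starts, or None."""
    digits = "".join(str(d) for d in islice(pi_digits(), n_decimals + 1))
    decimals = digits[1:]  # drop the leading 3
    index = decimals.find(target)
    return None if index < 0 else index + 1

print(find_in_pi("14159"))  # starts at the first decimal place
print(find_in_pi("26535"))  # starts at the sixth decimal place
```

Short strings turn up quickly, but each extra digit in the target makes the expected search depth about ten times greater, so the position of a match tends to need about as many digits to write down as the string it points to.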

But all hope is not yet lost, for we don't actually require that single billion-digit string. We need only find, in pi or some other irrational decimal, or in some multiplication of such terms (as in 3 times pi times the square root of 7), a string of correctly ordered numbers to give us the few hundred characters needed to allow us to provide instructions of how to calculate the next set of decimals and fractions capable of yielding an even longer string, culminating in one being the length we need. This function is recursive to a certain degree of simplicity. The calculation used to generate a billion-digit string might be set forth in an instruction containing a few hundred thousand characters -- picture a typical book like the first Harry Potter book (which is just under 77,000 words, so probably around 400,000 characters), but containing nothing but instructions to generate a longer series of numbers -- describing where to pluck the first string of a few thousand-odd digits, then perhaps the next string of twelve hundred and some, and the next string of a few thousand more.

But once we have described the operations to be performed, we can convert that description itself into a string of a few hundred thousand numbers, and create a string of a few thousand numbers which provides the instruction of how to generate that few hundred thousand. And this sort of condensation may be continued until we reach the shortest easily generated string capable of telling us how to generate a longer string. Which might be something as simple as telling us "3/5•pi, 2,100 digits from 1,035,721st decimal." And if that instruction can be fractionally encoded on a toothpick, there is no actual limit to what can be encoded, provided a computer exists capable of calculating out and converting each successively expanding string of instructions at the other end, until the final instruction yields the billions of digits into which the entire encyclopedia has been encoded.
