Computer Parts

This lesson deals with the parts of the computer outside the CPU. This is a bit bizarre, because we have not exactly said what the CPU does yet. But the fact is that the business of the CPU (computation) is a rather arcane one, while the peripherals and storage do well-defined, rather understandable things.

Bits and Bytes

Note: many things in this lesson will make sense only if you have a clear idea of what a bit and a byte are.
For our purposes a bit is "something" that can be turned on or off. It has no other states, nor any intermediate ones. It is either on or off. Aut aut. Traditionally, these states are called "on" and "off", or 1 and 0. They could be called "foo" and "bar" or "jack" and "jill". Do not attach any mystical significance to it.
One bit only distinguishes between two states. But we generally want to speak about more than two states; for example, a man can be single, married, divorced or widowed. This is already four states (more or less desirable).
By the artifice of ganging together two bits, we can distinguish among four states, namely 00 01 10 11 (the order is not important for now). We can then establish a conventional mapping such as:

00 single
01 married
10 divorced
11 widowed

You will have noticed that by adding one bit to the representation, we have multiplied by two the number of states. If we added another bit, we would have a total of eight states, namely 000 001 010 011 100 101 110 111. These would be three-bit words, BTW. In other words,

a sequence of n bits has 2ⁿ states
a word of n bits represents 2ⁿ symbols
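
The doubling rule can be checked by brute force. What follows is only a minimal sketch: the helper name bit_patterns is made up for this illustration, and the mapping is the marital status table given earlier.

```python
from itertools import product

def bit_patterns(n):
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

# The two-bit marital status mapping from the text.
MARITAL_STATUS = dict(
    zip(bit_patterns(2), ["single", "married", "divorced", "widowed"])
)

# Each extra bit doubles the number of patterns: 2, 4, 8, ...
for n in (1, 2, 3):
    patterns = bit_patterns(n)
    print(n, "bits:", len(patterns), "states:", " ".join(patterns))

print("10 means", MARITAL_STATUS["10"])
```

Calling bit_patterns(8) gives the 256 patterns that one 8-bit word can hold.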

This is fine and dandy if what we want to represent is the marital status of humans; but what happens if we want to represent something very dear to us, that's to say text? Then we need to distinguish among characters; A B C ... a ... z ... 0 ... 9 * & # ^ ( @ and sundry. The number of characters necessary (including space, carriage return, tab and other strange and wonderful things) is quite large but not enormous.
In fact 7 bits (for a total of 2⁷ = 128 symbols) would be enough. For historical reasons (and also because we really like powers of 2) we use 8 bits. And what do we call a sequence of 8 bits?

one byte is 8 bits

at least in current usage. Computers are organized around bits and bytes. In many situations, bytes are combined in groups of 2 or 4 (that's to say, 2 bytes = 16 bits and 4 bytes = 32 bits). The names "word", "long word", "double word" are variously used - but there is no standard meaning; they have different definitions depending on the specific computer being discussed.
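
To make this concrete, here is a small sketch of text turned into bytes, and of bytes ganged together into larger words. The function names to_bits and combine are invented for this example, and putting the most significant byte first is an assumption: real machines differ on byte order.

```python
def to_bits(text):
    """Encode text as ASCII and return one 8-bit string per character."""
    return [format(byte, "08b") for byte in text.encode("ascii")]

def combine(byte_values):
    """Gang several bytes into one number.

    The most significant byte comes first; this order is an assumption
    made for the example, not a universal rule.
    """
    return int.from_bytes(bytes(byte_values), byteorder="big")

# One byte per character.
for ch, bits in zip("Hi!", to_bits("Hi!")):
    print(repr(ch), bits)

print(combine([0x01, 0x00]))              # 2 bytes = 16 bits
print(combine([0x00, 0x01, 0x00, 0x00]))  # 4 bytes = 32 bits
```

Note that every ASCII character has a leading 0 in its byte: this is the eighth bit that 7-bit text does not strictly need.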

Storage

Q: WHAT DOES IT DO?
A: IT REMEMBERS THINGS

From the earliest computers onwards, storage has always been an issue of pressing importance. What is storage? It is basically where the computer puts the data that is not sitting in one of the registers of the processor; it is also the place where the program that is being executed lives. This data must reside somewhere, since a processor usually has a very limited number of registers (normally fewer than 10), insufficient for even the smallest useful programs.
So, having determined that we need some place outside the CPU for storing data and programs, where do we find it? We find it in the storage pyramid.

The storage pyramid

In engineering the tradeoff is one of the most basic concepts. Tradeoffs and market pressure explain why we have so many different technologies and standards in the field.
Tradeoffs also explain the coexistence of different storage technologies; in the case of storage, the tradeoff is usually between access speed and cost per megabyte. Access speed should be considered as a compound of latency and throughput.
Another important element is random access as opposed to sequential.

In discussing storage technologies, it should also be remembered that the channels between one layer and the other are increasingly smart; techniques based on locality (like "read ahead" in disks) try to reduce the performance hit incurred in going from one layer to the other.
Storage can be thought of as forming a pyramid or a hierarchy. The technologies at the top have high "speed", availability and price, and they are closer to the sancta sanctorum - the CPU.
The ones at the bottom have the lowest price and are the "slowest" (I am using quotes because the concept of "speed" conflates many different components that a finer analysis must separate). They are also farthest from the CPU.
Again, it should be noticed that in many systems one or more of these layers are missing.

CPU registers: a few bytes; access in the order of nanoseconds (one processor cycle).
The premium form of storage, well exploited only by optimizing compilers or by extremely well crafted assembler programming.

CPU cache: (AKA L1 cache, L2 cache and other commercial names). A chunk of RAM, 16 to 128 kilobytes in size, placed on the same silicon die with the CPU; access in the order of tens of nanoseconds. It is a good idea to have your loops fit completely within this cache (see L1 cache for more info). Code that exhibits good locality benefits greatly from this cache.

System RAM: order of megabytes, placed on the motherboard in PCs, connected to the CPU (and occasionally to peripherals as well; see DMA) by a high speed bus. Memory access is in the order of hundreds of nanoseconds, with bandwidths in the order of 100 Mbyte/s. Memory price is less than 1 USD/Megabyte. Burst modes introduce elements of sequentiality, but the device is essentially random access.

Hard Disk: order of gigabytes. Placed in the computer enclosure (occasionally outside), connected to the motherboard via a bus like SCSI or EIDE (in some cases a serial interface, like FireWire or Fibre Channel). Hard disks are electromechanical devices with moving parts; this puts their access times in the order of milliseconds. Bandwidth is in the order of some Mbyte/s at best. Access here is still random, although with elements of sequentiality. The price is in the order of 10 USD/Gigabyte (depending on quality and maker).

Removable spinning media: CDROM, CD-R, CD-RW, Zip, DVD, Jaz: capacities between 1/2 and 7 Gigabytes, access times in the order of many milliseconds (depending on device state), bandwidth in the order of 1 Megabyte/s at best.

Removable tape media: still the lowest cost per gigabyte, capacities in the order of a hundred gigabytes per unit, often organized in tape libraries and robot-managed tape vaults (multi-terabyte capacity). Access is sequential (seconds and even minutes of latency); the bandwidth can be close to that of the previous category.
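
Since access speed is a compound of latency and throughput, a rough way to compare the layers is to add the wait for the first byte to the time needed to stream the rest. The sketch below does just that. The numbers are illustrative assumptions picked from the orders of magnitude quoted in the list above, not measurements, and the names TIERS, transfer_time and report are invented for this example.

```python
# Illustrative figures only, chosen from the orders of magnitude in the text.
TIERS = {
    "system RAM":     {"latency_s": 200e-9, "bandwidth_mb_s": 100.0},
    "hard disk":      {"latency_s": 10e-3,  "bandwidth_mb_s": 5.0},
    "removable disk": {"latency_s": 100e-3, "bandwidth_mb_s": 1.0},
    "tape":           {"latency_s": 60.0,   "bandwidth_mb_s": 1.0},
}

def transfer_time(latency_s, bandwidth_mb_s, size_mb):
    """Seconds to fetch size_mb megabytes: wait for the first byte, then stream."""
    return latency_s + size_mb / bandwidth_mb_s

def report(size_mb):
    """Print the estimated fetch time of size_mb megabytes for every tier."""
    for name, tier in TIERS.items():
        seconds = transfer_time(tier["latency_s"], tier["bandwidth_mb_s"], size_mb)
        print(f"{name:15s} {size_mb:8.3f} MB in {seconds:12.6f} s")

report(0.001)  # a tiny request: latency dominates
report(100.0)  # a large request: bandwidth dominates
```

With these assumed figures, tape is hopeless for small requests but not far from the removable disk on large ones, which is one way to see why latency and bandwidth must be kept separate.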

The quick and the dead

It is also instructive to look at the dead storage technologies, and try to understand why they died.

core memory, more precisely magnetic core memory. Died because silicon RAM is easier/cheaper to produce with lithography techniques. It would probably have been difficult to miniaturize it very much.
drum memory: hard disks have less spinning metal. Also, hard disks can be stacked in disk packs; drums cannot. Their one technology advantage is that the whole surface moves at a constant linear speed, while the disk has constant angular speed; this means that disks must either accept inefficiency or use different-sized tracks.
punch cards: very bulky (every card is one 80 character line of text). Non-reusable (this is not necessarily a defect).
paper tape: much less dense than magnetic tape. Has the advantage of being readable by humans (with considerable training). Again, very bulky.
audio cassette: used as storage media in home computers. Rather unreliable, limited to sequential access. The huge potential capacity per unit and improvements in speed loaders did not allow it to survive against the floppy disk (in this case the 5 1/4" floppy disk).
3 1/2" floppy disk (not dead but people are trying to kill it). 1.44 megabytes are not big enough for the bloated file formats of nowadays. Not enough for most multimedia. Unbearable as installation media for programs. On the plus side, they are very cheap, but CD-R is approaching their price per unit, at a vastly cheaper cost per megabyte.

There are also storage technologies fighting to get themselves established, for example:

memory stick/memory card. They are essentially the same thing; the first is a standard mandated by Sony, the second is used by everybody else. Market phenomena will probably decide who wins. Watch for growing integrations of these two standards into mainstream portable and desktop machines.

Evaluating storage technology

In the past the factors that determined whether a storage technology would survive (or at least occupy a significant part of the market) were probably:

cost, dollars/megabyte
sequential vs. random access
removable vs. fixed
latency (time between the request and the arrival of the first data), in milliseconds
bandwidth, in megabits per second (why megabits and not megabytes? Why, why?)
maximum size of one unit, in megabytes

I would say that nowadays some other important factors are:

Power consumption (important for portable applications and, to some extent, for high density rackmount systems). The average power draw in watts is interesting, but one should also take into account the power pattern: a steady drain is harder on batteries than occasional surges. Some technologies can enter "sleep" modes where they consume very little power.
Density (expressable in megabytes per unit of volume or also in megabytes per gram). Critical in portable and embedded applications, important for practical application in desktop units. Not so important in large systems.
Openness: a proprietary system will have a harder life, because users will be reluctant to commit data to media/standards that may not be around in the future.
Longevity: we already know that some storage media degrade faster than others (for example, recovering historical recordings from old audio archives can be extremely difficult). It is possible to make a guess as to the typical lifetime of a given medium.

Exercise: analyze a storage technology along these parameters (for example: writable DVDs)
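
One way to start the exercise is to write the parameters down as a record and let the machine do the unit conversions. This is only a sketch: the class StorageTech and its field names are invented here, one gigabyte is assumed to be 1024 megabytes, and the sample device is made up, not data about any real product.

```python
from dataclasses import dataclass

@dataclass
class StorageTech:
    name: str
    usd_per_gigabyte: float
    random_access: bool
    removable: bool
    latency_ms: float
    bandwidth_mbit_s: float  # megabits per second, as vendors like to quote it
    unit_size_mb: float      # maximum size of one unit, in megabytes
    power_watts: float       # average draw while transferring

    def usd_per_megabyte(self):
        """Cost per megabyte, assuming 1024 megabytes per gigabyte."""
        return self.usd_per_gigabyte / 1024

    def bandwidth_mbyte_s(self):
        """Megabits to megabytes: one byte is 8 bits."""
        return self.bandwidth_mbit_s / 8

    def seconds_to_fill(self):
        """Time to write one whole unit at full bandwidth."""
        return self.unit_size_mb / self.bandwidth_mbyte_s()

    def watt_hours_to_fill(self):
        """Energy used while filling one unit (watts times hours)."""
        return self.power_watts * self.seconds_to_fill() / 3600

# A made-up device, only to show the arithmetic.
example = StorageTech(
    name="hypothetical disc",
    usd_per_gigabyte=2.0,
    random_access=True,
    removable=True,
    latency_ms=150.0,
    bandwidth_mbit_s=80.0,
    unit_size_mb=4000.0,
    power_watts=9.0,
)

print(example.bandwidth_mbyte_s(), "Mbyte/s")
print(example.seconds_to_fill(), "s to fill one unit")
print(example.watt_hours_to_fill(), "Wh to fill one unit")
```

Openness and longevity do not reduce to a number so easily; they would need a free-text field and some judgment.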

The virtuous circle of application and technology

If there is no popular application that needs a given technology (no problem to solve, as it were) that technology will remain at best a niche phenomenon (like CCM or VR tech).
For example, large scale storage used to be a specialty technology used by remote sensing applications (satellites are a veritable data fountain, spewing gigabytes every day). The Windows OS and its ever-bulkier applications made multi-gigabyte hard disks a necessity for the PC (this means cheap megabytes).
In turn, these cheap megabytes coupled with widespread Internet technologies make content sharing networks (like Napster, Gnutella, Filetopia and others) possible at the PC level.

Notice that content sharing is nothing new; the alt.binaries hierarchy on USENET is at least 15 years old. But it was born at a time when only large shared systems had gigabytes of storage and Internet connectivity. Thus the failure to make the cultural impact that (for example) Napster had.

Of course, it took one brilliant idea to make Napster possible. But it also took the technologies that were already in place; those technologies were there for entirely different reasons.

I/O

Q: WHAT DOES IT DO?
A: IT MOVES DATA IN AND OUT OF THE SYSTEM

I/O, short for Input/Output, is the set of functions and devices that move data across the boundaries that separate the computer system from the external world. Usually networking is considered separately, so for now we will consider I/O as concerning communication between the computer and humans or the physical world.

Output: printing

Printers: historically, one of the major functions of computers is putting ink on paper. This has remained so; the computer revolution has never produced a paperless office. Our paper may be recycled, and certainly the fonts look nicer. But it is paper nonetheless. The very important business of staining this paper is done with two classes of technologies, impact printing and non-impact printing (a rather ad hoc distinction, but useful).

Impact printing

In this type of printing something physical smashes on the paper, transferring ink to it (usually from an ink tape). The teletypes printed like this; daisywheel, dot matrix, line printers and the IBM Selectric are all impact technologies.
Impact is good if you want cheap and fast printing. It is also mandatory if you want to make carbon copies of what you print (a surprisingly frequent legal requirement).

Non-impact printing

In non-impact technologies ink (or something else) is transferred to paper by:

shooting it out of a tiny nozzle: ink jet in its various forms. Ink jet is currently the cheapest way to get good reproduction of continuous tone images.
static electricity: laser printers use a laser beam to draw an electrostatic image on a photosensitive drum. The toner sticks to the drum where the image was drawn and is subsequently transferred to the paper. A heat source melts the toner and makes it stick to the paper.
sublimation and condensation on the paper, as in dye sublimation printers.
scribbling: plotters use actual pens to write on paper. Very good for line art, and also for reaching extreme sizes like A0. One really cute one was the Commodore 1520; big serious plotters are made by Schlumberger and Roland.
melting and solidifying, in thermal transfer printers like the Apple Scribe.

One more non-impact technology is thermal paper printing. It is prevalent in small applications like cash registers, where the requirement of specially prepared thermal paper is offset by small size and simplicity of design. The print head is just a set of resistors that heat the paper in spots. The paper reacts by changing color. There is no tape and no ink, and the print head is really simple in design. The ZX printer used this technology.

Then there are all sorts of specialized printing technologies, like the Fuji Frontier (laser on photographic paper). Read about these beautiful freaks at http://www.large-format-printers.org/ - but they are all extremely expensive.

Output: display

Once upon a time there were two display technologies, vector displays and raster displays. Now raster has won, and vector displays are mostly historical curios (for reference, the old Atari Battlezone simulator used a vector display).
For a long time, raster displays were only VDUs, large vacuum devices that work by shooting an electron beam at a screen coated with fluorescent material.
Just like valves, VDUs are bulky and use large amounts of power (order of magnitude: 100 W).
The other widespread display technology is LCD, which allows one to make flat displays. Initially LCDs were monochrome, but now we have high quality color LCDs. LCDs can also be backlit.
Their defects are: the angle of view is limited, making them very large is extremely expensive, and their color rendition is generally worse than that of VDUs. They come in the passive matrix (cheaper) and TFT AKA active matrix (more expensive, better looking) flavors.

There are also plasma displays. Quite expensive to make, they can reach great sizes. Gas plasma was an older technology, which produced very sharp monochrome (usually orange) images.

Usually not considered in the display category are LEDs; by building a large array of them, you can create enormous (many square meters) screens, useful for displaying images on sides of buildings.
In the same category are those displays where, by electromechanical means, a little "flap" is flipped between a black side and a colored one (like displays in many airports and train stations).

Output: sound

Modern computers tend to integrate synthesizers and specialized hardware to reproduce and create sounds.
The ancestor of the modern sound card was probably the SID chip on the Commodore 64.

Input

Human input devices: keyboards, mouse, trackball... Data input devices: normally based on some sampler (A/D converter); digitizers for sound, sensors...

this section to be expanded

Obsolescence/Deletion

A word to the wise guy: as users of slide film have painfully discovered, the permanence of media is a very serious issue. Many of the current printing technologies have very little information available about their permanence. Some of that information is very negative.
For example, standard ink jet inks fade in some years. Most thermal paper erases in a matter of years. Laser prints are supposed to be quite stable, at least as stable as the paper that they are printed on.
Issues of conservation of media are thorny. Should one use the latest and greatest at the risk of impermanence, or should one use older established technologies that are known to work?

Photographers face these dilemmas frequently; consider the Kodachrome/E6 conservation quandary. Both of them are slide films. The first lasts very long if kept in the dark, but fades fast in light. The second fades slowly in all conditions. Ready to invest in that nitrogen-filled vault?

If you think that media decay is bad, you will really hate media obsolescence. And yet, consider the amount of data that you have on CDs today. Do you think that 20 years from now you will be using CDs? Do you think that you will take the trouble to convert all that stuff into the new and improved storage format?
Of course, this pales in comparison to file format obsolescence (often forced) - even though insisting on open file formats can alleviate the problem (again, SGML is our friend, and so is plaintext).
It is clear that propelling our data into the future will turn into a major human activity. It is either that, or oblivion.

There is a very clear writeup under digital durability.

Class structure

Questions about the former class (The Evolution of the Computer)
The above material, with as much sidetracking as is required by the matter, the students and the phase of the moon.
break: I have a cup of coffee and the students scutter madly to prepare their machines for showing the class their assignment. Students will mail their stuff to a designated and unlucky student who will connect his machine to the video projector and host the programs. Assignments were "visualize prime numbers (suggestion: the Square of Eratosthenes)" or "show me a mouse trail (suggestion: use the <array> facility)".
Design by Numbers; additional hammering and chiseling of the programming block.
I know that some students have difficulties with the basic concepts of programming (variable, array, loop...) so we will go over them in great and painful detail, until all doubt has been eliminated. And then we will march hand in hand towards a glorious future, where everything is clear and beautiful, and all functions have no side effects. Maybe.
The other assignment was a technology meditation assignment: "5 specific technologies that influenced the shape of your world; 5 minutes to explain why you chose them." To be presented in class.
Assignment: Show me in how many ways you can draw a checkerboard. Try to find non-obvious algorithms. Can you do it using only two repeat loops? Maybe using just one?

For additional fun: Civilization (http://www.lilback.com/civilization/ and http://www.freeciv.org/). Don't conflate concepts. Fear structured procrastination.
