computers against spacetime

The computer against Space/Time

sixth or seventh (depending on the way you look at it) in Computer Science for Smart People

The life of a computer scientist can at times be pretty boring. Yet it reserves the ultimate excitement: the eternal fight against space/time. In the previous classes, we saw that programs take up memory; memory takes up physical space in a computer case (it also uses energy, but so does computation itself). It is also very common to think of computer memory and storage in terms of space and physical things; we speak of an address space, of the disk being "full" or even of a "memory arena". Lack of memory space is, ultimately, one of the two big limiting factors, the other being lack of time.
Computationally difficult problems (like the travelling salesman problem) are hard because they take an amount of computation (and thus, time) that depends exponentially on the size of the problem. That is to say, if a given problem can be solved in ten seconds, when it gets twice as big it takes one minute; and if it gets three times as big, it may take a whole day.
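
To get a feel for this kind of growth, here is a minimal sketch of a brute-force solver for the travelling salesman problem. It is only an illustration: the function names (tour_length, brute_force_tsp, tours_examined) and the sample points are made up for this example, and brute force is merely the most naive approach. Strictly speaking its cost grows factorially, which is even worse than exponentially.

```python
from itertools import permutations
from math import dist, factorial


def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def brute_force_tsp(points):
    """Try every tour that starts at city 0 and keep the shortest one."""
    best = None
    for rest in permutations(range(1, len(points))):
        order = (0,) + rest
        length = tour_length(points, order)
        if best is None or length < best[0]:
            best = (length, order)
    return best


def tours_examined(n):
    """Number of orderings the loop above goes through for n cities."""
    return factorial(n - 1)


# Hypothetical sample data: four cities on the corners of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(brute_force_tsp(square))
for n in (5, 10, 15, 20):
    print(n, "cities:", tours_examined(n), "tours to check")
```

Going from 10 to 20 cities does not double the work; it multiplies it by a factor with twelve digits.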

There is also another sense in which we fight against space/time, and this is more germane to the subject of this block of content that we humorously call a class. Consider your own computer, and consider mine; they are not the same computer (they would be if you were me, but I like to suppose that I am not writing this strictly for my own amusement).
The fact that your computer and my computer are not the same one, boring as it may be, implies that the data on my disk is not immediately accessible to you, or to the programs that you run on your own machine. Suppose, for example, that you want to use Winamp or iTunes or xmms to play my copy of stairwaytoheaven.mp3; the program must run on your own machine, to drive your speakers, but the file lives on my machine.
And this, in a sense, is the space barrier; the data that we are interested in usually does not live in one big host somewhere. It is divided among different machines. Because of this, your CPU cannot just read my files. By the way, the idea of putting everything relevant in one large machine is the idea behind the mainframe style of design; all the company data lives in one system. If it is all there, it is easier to keep it consistent; if it is one machine, it is easier to enforce security and to maintain the system properly. The mainframe is accessed through terminals, usually fairly simple-minded machines.

Contrast this with the (for lack of a better term) peer-based or "personal computing" style: scads of smaller machines, each one with one user. The necessity of communication between these individual machines is readily apparent, but even the mainframe must solve the problem of getting the information to its terminals.

Even if the system you are interested in is an isolated system, it is often the case that the computer itself (in a box) must communicate with a peripheral (in another box), for example a printer. Since we can't just plop the printer on the system bus, we need some sort of communication setup.

A classification of distances

The final limiting factor for connecting computers is the speed of light (for the purpose of this discussion, 3 * 10⁸ m/s, in other words 30 centimeters in one nanosecond). It is the speed of light (and, by consequence, the speed of electrical signals in a medium) that dictates many elements of the design of computer networks; space/time again.
By this point, you may have the impression that CS is all about classification; let us then classify communication technologies according to space and time:

micron to millimeter scale; propagation delays under the nanosecond (10^-9 seconds). At this scale, the problems are solved by digital design engineers inside the IC. This is the scale of the single component inside the computer.
millimeter to meter scale; propagation delays of the order of magnitude of one nanosecond. This begins to be relevant; for example, if you have a memory bus running at 100 MHz, every bit is 10 nanoseconds "long". If the bus is longer than three meters, then you can't suppose the bus line to be all in the same state at any given moment. This is the scale of the computer "case".
meter to kilometer scale; propagation delays of the microsecond order of magnitude. Consider a printer that is 100 meters away from your computer; by the time the bits reach the printer, more than 300 nanoseconds have passed, which means that your 1 GHz Pentium processor has already gone through more than 300 machine cycles. This is the scale of the LAN.
kilometer to thousands of kilometers scale; also known as the WAN and geographical network scale. At this scale, propagation delay becomes milliseconds; this is compounded with processing delays along the line (the signal usually goes through a lot of apparatus), and leads to round-trip times on the scale of seconds. To your CPU this is eternity.
Again, imagine that you are connecting to a site on the other side of the ocean via a satellite bounce. A geostationary satellite is at approximately 36,000 kilometers from earth. This alone means over 0.4 seconds of round-trip time.
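
The arithmetic behind these figures fits in a few lines. The sketch below assumes the round figure of 3 * 10⁸ m/s used above (real signals in copper or fiber are slower); the function names are invented for the illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 3e8  # round figure used in the text, vacuum speed


def propagation_delay_s(distance_m):
    """One-way travel time of a signal over the given distance."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def cycles_elapsed(distance_m, clock_hz):
    """CPU cycles that go by while the signal is still travelling."""
    return propagation_delay_s(distance_m) * clock_hz


def bit_length_m(clock_hz):
    """Physical length of one bit on a line clocked at clock_hz."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz


# The 100 MHz memory bus: one bit is about three meters long.
print(bit_length_m(100e6))

# The printer 100 meters away, seen from a 1 GHz processor.
print(cycles_elapsed(100, 1e9))

# Satellite bounce: up and down for the request, up and down for the reply.
print(propagation_delay_s(4 * 36_000e3))
```

The last line comes out at roughly 0.48 seconds, before any processing delay along the way is counted.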

So, up to this point we have established that computers in an organization (a term that can be interpreted quite loosely) need to be connected. But how do we connect them? Let us start with the simplest problem, the point-to-point link. That's to say, for now we will concern ourselves with how to connect just 2 computers.

A classification of topologies

Topology is a branch of mathematics concerned with certain aspects of space. In particular, topology is about connecting points with links (normally a point is called a vertex and a link is called an arc). Topology is only interested in which vertices are connected by which arcs, not in how long the arcs are, or in where the vertices actually are.
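
As a small illustration of "only the connections matter", here is a sketch that stores a topology as a set of arcs and nothing else. The helper names (topology, neighbors) and the example networks are hypothetical.

```python
def topology(links):
    """Reduce a list of (vertex, vertex) links to an unordered set of arcs."""
    return frozenset(frozenset(link) for link in links)


def neighbors(topo, vertex):
    """Vertices directly connected to the given vertex."""
    return {
        other
        for arc in topo
        if vertex in arc
        for other in arc
        if other != vertex
    }


# Same connections, listed differently; cable lengths and positions of the
# machines are not recorded at all, so the two are the same topology.
small_office = topology([("A", "B"), ("B", "C")])
across_the_ocean = topology([("C", "B"), ("B", "A")])
print(small_office == across_the_ocean)

# A different topology over the machines A, B, C: everything goes via a hub.
star = topology([("hub", "A"), ("hub", "B"), ("hub", "C")])
print(neighbors(small_office, "B"), neighbors(star, "A"))
```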

Point to point connections

More or less the same technologies are used to connect one computer to another (e.g. for file transfer purposes) and a computer to a peripheral (a printer, a modem, a scanner).

The parallel interface

In a parallel interface, the cable is made up of a bunch of wires; several bits (usually one whole byte) are transmitted at the same time by putting each of them on its own wire. The old Centronics interface for printers was parallel. This style of interface is falling out of favor - its main advantage was the simplicity of design. The disadvantage is that it requires a large bundle of wires and a big connector. Crosstalk between the wires was a problem as well.

The serial interface

When we talk about a serial interface in a PC system, we are practically always talking about RS-232 or some later variation like RS-422. Serial interfaces transmit one bit at a time, chopping up bytes into their component bits. This clearly allows a "thinner" cable (fewer individual wires) and potentially smaller connectors.
RS-232 manages to move up to 115 kbps (kilobits per second). Traditionally RS-232 uses a D connector, but on Macs a mini-DIN connector was also used. The Fibre Channel interface (employed for high-end hard disks and networking) is also serial; it runs at up to 1 Gbit/s and it can be configured as point-to-point.
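
"Chopping up bytes into their component bits" can be sketched as follows. The framing used here (one start bit, eight data bits sent least significant bit first, one stop bit, commonly written 8N1) is an assumption brought in for the example, as is the 115,200 bit/s line rate; the function names are made up.

```python
def frame_byte(value):
    """Turn one byte into the sequence of bits put on the wire (8N1 assumed)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("not a byte")
    data_bits = [(value >> i) & 1 for i in range(8)]  # least significant first
    return [0] + data_bits + [1]  # start bit, data, stop bit


def deframe(bits):
    """Rebuild the byte on the receiving end."""
    if len(bits) != 10 or bits[0] != 0 or bits[-1] != 1:
        raise ValueError("bad frame")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


def bytes_per_second(bit_rate):
    """Payload rate when every byte costs ten bits on the wire."""
    return bit_rate / 10


print(frame_byte(ord("A")))
print(chr(deframe(frame_byte(ord("A")))))
print(bytes_per_second(115_200))
```

A parallel interface would instead put the eight data bits on eight wires in the same instant, which is exactly why it needs the fat cable.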

Bus

In the computer world, a bus is a bundle of conductors that a set of peripherals can share. This clearly saves cabling - picture the peripherals as the stations of a railway, and the electrical bus as the rails; the stations are all connected via a shared medium.

There are buses inside most computer systems; PCI, ISA, EISA, MCA, NuBus... are all standards for buses that live inside the case and connect (for example) the CPU to the hard disks and to the video card. But these buses, because of their design, are not really meant to reach outside the computer case. To do that we need another breed of buses, designed for rapid insertion and removal of devices, and hardy enough to withstand casual manipulation by the user.
The first of these buses to reach popularity was SCSI, still in use and made famous by the Macintosh. SCSI (which at the time seemed unspeakably fast) allows 8 devices to be present at the same time on the bus. Each device must have a unique address, set mechanically. SCSI had complex issues with termination; another problem was that the cables and the connectors were bulky and expensive (they had to be; the standard has 50 lines plus shielding).

Other contemporary buses, still in expansion, are USB and FireWire. The first one is meant to connect things like keyboards and mice (even if some Macs already feature irritating USB speakers, and there are USB external drives), while the second was designed for devices that move large amounts of data, like DV video cameras or (again) hard disks.
These buses, conceptually, are sons of the Apple Desktop Bus (aka ADB), and have features like auto-addressing (devices on the bus are smart enough to identify themselves automatically) and hot plug, which means that devices can suddenly appear or disappear without the bus taking offense (if you try that with SCSI, for example, there is a good chance of the bus getting horribly stuck and taking the whole system down).
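
The difference between hand-set addresses and auto-addressing can be shown with a toy model. This is not how SCSI, ADB, USB or FireWire actually negotiate addresses; the SharedBus class and its methods are hypothetical and only capture the bookkeeping.

```python
class SharedBus:
    """Toy model of a bus with a hard limit on the number of devices."""

    def __init__(self, max_devices):
        self.max_devices = max_devices
        self.devices = {}  # address -> device name

    def attach_with_fixed_address(self, name, address):
        """Old style: a human picked the address with a switch or jumper."""
        if not 0 <= address < self.max_devices:
            raise ValueError("address out of range")
        if address in self.devices:
            raise ValueError("address conflict with " + self.devices[address])
        self.devices[address] = name

    def attach_auto(self, name):
        """Auto-addressing: the bus hands out the first free address."""
        for address in range(self.max_devices):
            if address not in self.devices:
                self.devices[address] = name
                return address
        raise RuntimeError("bus is full")

    def detach(self, address):
        """Hot unplug: the device vanishes and its address is free again."""
        self.devices.pop(address, None)


bus = SharedBus(max_devices=8)
bus.attach_with_fixed_address("hard disk", 0)
print(bus.attach_auto("scanner"))
bus.detach(0)
print(bus.attach_auto("cd burner"))
```

With fixed addresses, two devices set to the same number are the user's problem; with auto-addressing the bookkeeping moves into the bus.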

Network

All the systems presented under bus share some characteristics: limited distance (some meters) and limited scalability (there is usually a hard limit on the number of devices that you can connect at the same time). For connecting computers in large numbers and over typical office distances you need something different. Something that deals well with the problem of space and time.

What you need is a network architecture. A network architecture tries to solve all the problems from the scale of meters to the scale of thousands of kilometers; it usually consists of a set of protocols and specifications. A network architecture can be difficult to grasp at the beginning because of abstraction; the idea is (as is usual in CS) that a difficult problem is broken down into a layered set of solutions to smaller problems.

The network problem: making a machine A communicate with a machine B, regardless of the distance between A and B.
The network solution: a hierarchy of well defined protocols with well defined interfaces between the layers.

One brilliant (and, by now, well documented) way of looking at networks is the ISO OSI model (read it up under OSI Reference Model and The OSI Reference Model and TCP/IP). It would be rather pointless for me to rewrite all that. Let me just remark on some important aspects of the OSI model (they would probably apply to other views of networking, like SNA):

you can stop at any layer, and think that you are talking to the homologue layer on the other end system. In other words, your Web browser really believes that it is talking directly to another program running on the server. It knows nothing about modems, Ethernet or satellites.
one key point of dividing networking in layers is sanity of the implementer.
the other big point is flexibility. For example, when IP was designed they did not have ISDN; yet you can run IP on ISDN. This is obtained by having each layer talk only to the layer underneath it.
what is not well documented will be implemented in a thousand different ways, all incompatible. Fear the loose, complacent protocol. Love the fascist standard.
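
The remarks on layering can be sketched in code. The three layer names and the bracketed text "headers" below are a simplification invented for this example (the OSI model has seven layers and real headers are binary); the point is only that each layer wraps what it gets from above and unwraps what it gets from below.

```python
# Simplified, assumed stack: listed from just below the application downwards.
LOWER_LAYERS = ["transport", "network", "link"]


def wrap(layer, data):
    """A layer adds its own header and knows nothing about the contents."""
    return f"[{layer}]".encode() + data


def unwrap(layer, data):
    """The homologue layer on the other side removes that same header."""
    header = f"[{layer}]".encode()
    if not data.startswith(header):
        raise ValueError("not a " + layer + " frame")
    return data[len(header):]


def send(message):
    """Go down the stack: each layer talks only to the one underneath it."""
    frame = message
    for layer in LOWER_LAYERS:
        frame = wrap(layer, frame)
    return frame


def receive(frame):
    """Go up the stack on the other end system."""
    data = frame
    for layer in reversed(LOWER_LAYERS):
        data = unwrap(layer, data)
    return data


on_the_wire = send(b"GET /index.html")
print(on_the_wire)
print(receive(on_the_wire))
```

Swapping the "link" entry for a different technology changes what goes over the wire, but the application at either end sees exactly the same message.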

Networking is one of the most fascinating and complex and frustrating themes in Computer Science. There will be classes dedicated specifically to it. For now, let me just remark that the Macintosh (once again) was the first personal computer to have integrated networking hardware, in the form of the AppleTalk protocol stack; with some LocalTalk cables you could build a really cheap LAN.
The most influential networking protocol currently, TCP/IP, is on the other hand originally a UNIX idea, and so are most of the services associated with it, like the DNS.

Class structure

I explain the above material and its relevance to the eschaton.
after a necessary coffee break, revision of the assignment (Make a puzzle in Design by Numbers)
assignment for next time: using the axonometry mini-library you can
1. either do something fun with it
2. add another primitive (like a cube, a sphere or whatever tickles your fancy)
additional reading for your entertainment: wheel of reincarnation - think about Ethernet.

Notice that I finally found the time for writing a Design by Numbers language reference. I also recently ranted on axonometry.

