The distinction between processes and threads (in the computing/operating system sense of these terms) is one of those points that seems to puzzle a lot of technical people. Personally, I like to explain the two concepts using the following analogy/thought experiment:
Imagine a room with whiteboards on all of the walls. There are a number of telephones in the room and one of the walls has a long series of instructions on it. There's nobody in the room right now and the room's lights are turned off. This room is a place where things might happen although there's nothing happening there right now.

Now imagine that someone enters the room, turns on the lights, and starts to follow the instructions on the wall. These instructions guide the person through a task which involves communicating with "the outside world" via telephone calls. The instructions also tell the person to keep track of the progress of their task using the whiteboards. When they're done, the instructions tell the person to turn off the lights and leave.

A process is like the room. It's a place where something can happen. Like the room, a process contains memory (i.e. the whiteboards) and has a program (i.e. the instructions on the wall) associated with it. Also like the room, a process has ways to communicate with the outside world.

A thread is like the person. It's the entity that does things in the room. It's what makes whatever happens in the room happen. Like the person, a thread follows the instructions in the process's program (i.e. the instructions on the wall), updates the process's memory (i.e. the whiteboards) and communicates with the outside world.

Most importantly (and also like the room), without a thread, a process is just a "place" where something could happen but nothing is happening.
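
To make the "person in the room" a little more concrete, here is a minimal Python sketch (Python and the helper name describe_current_thread are my choices for illustration, not something the analogy depends on). It simply asks which thread is following the instructions right now. Even the simplest program has one.

```python
import threading


def describe_current_thread():
    """Return (name, is_main) for whichever thread calls this function."""
    current = threading.current_thread()
    return current.name, current is threading.main_thread()


# The program itself (the instructions on the wall) is being followed by a thread.
name, is_main = describe_current_thread()
print(f"running in thread {name!r}; main thread: {is_main}")
print(f"threads alive in this process: {threading.active_count()}")
```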

Let's resume the analogy/thought experiment:

Imagine a different room. It's much the same as the first room but the instructions on the wall are different. The room is currently empty and the lights are out.

Now imagine that someone enters the room, turns on the lights, and starts to follow the instructions starting with the first step. At some point in the instructions, the person is told to get help. Help arrives in the form of another person. There are now two people in the room. Fortunately, it isn't a small room and there's lots of floor space and whiteboard space. The original person and the new arrival proceed to follow the instructions on the wall (the new person was told to start at some step other than the first step). The two people are so focused on their work that they are, for all practical purposes, totally unaware of each other although they are able to see what the other writes on the whiteboard. As the two people work through the instructions, they're (independently) told to communicate with the outside world and update the whiteboard with their progress.

Eventually, one of the people reaches a step in the instructions telling them to leave the room (which they do). Sometime later, the other person reaches a step telling them to leave the room. Since this person is the last person to leave the room, the person turns off the lights on the way out the door.

In many ways, this room is very similar to the first room. In fact, the only meaningful difference between the two rooms before the first person arrives is that the two rooms have different instructions on the wall. What makes this second room interesting is that there are, at least for a while, two people working in the room at the same time.

This is analogous to a process with more than one thread. Just like the second room, a process capable of having multiple threads must have a sequence of instructions telling an existing thread to request a new thread. Just like in the second room, the new thread starts working at a specified place in the program (i.e. not step 1). Also like the second room, the two threads then work together and must share the process's memory (i.e. whiteboard space).
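
A rough sketch of the second room, assuming Python's threading module (the names whiteboard, helper_task and main_task are made up for this example). The existing thread requests a new thread and says where it should start, and both threads then write on the same whiteboard.

```python
import threading

whiteboard = {}  # shared memory: visible to every thread in the process


def helper_task():
    # The new thread starts here, not at the top of the program.
    whiteboard["helper"] = threading.current_thread().name


def main_task():
    # An existing thread asks for a new thread and says where it should start.
    helper = threading.Thread(target=helper_task, name="helper-thread")
    helper.start()
    whiteboard["main"] = threading.current_thread().name
    helper.join()  # wait until the helper has finished its work
    return dict(whiteboard)


result = main_task()
print(result)
```

Note that the two threads write to different keys, so they never touch the same place on the whiteboard.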

Time to return to the analogy/thought experiment:

Now, imagine what would happen in the second room if the instructions being followed by the two people instructed them to both use the same part of the whiteboard at the same time. Clearly, things could get ugly as the two people "bump into each other" as they try to follow the instructions. It's quite possible that they'll both try to write something at the same place on the whiteboard at the same time. Assuming that they somehow manage to do this without "bumping elbows", the contents of that place on the whiteboard are likely to be little more than utter nonsense.

The same thing can happen in a process which has multiple threads: two or more of the threads can be "instructed" to manipulate the same place in memory at the same time. The result is that the "place" is almost certainly left containing "nonsense".
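
Here is a small Python sketch of one common form of this problem, a "lost update". In a real program the unlucky timing happens by chance. To make it happen every time, this sketch cheats by using a barrier (an artificial device, purely my assumption for the demonstration) that makes both threads read the value before either one writes it back.

```python
import threading

shared = {"value": 0}                  # one shared "place" on the whiteboard
both_have_read = threading.Barrier(2)  # forces the unlucky timing every time


def unsafe_increment():
    seen = shared["value"]        # step 1: read the current value
    both_have_read.wait()         # both threads have now read the same value
    shared["value"] = seen + 1    # step 2: write back, clobbering the other update


workers = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(shared["value"])  # prints 1, although two increments were performed
```

Two people each added one to the number on the whiteboard, yet the number only went up by one.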

There are a number of solutions to this problem although they all come down to one of three approaches:

  1. make sure that the instructions don't request that the same place be updated by different people/threads (i.e. avoid the problem entirely by having each person/thread work in a separate part of the whiteboard/memory)

  2. make sure that it isn't possible for two or more people/threads to try to work on the same place at even approximately the same time (i.e. write the instructions such that two or more people/threads simply never even come close to manipulating the same place at the same time)

  3. implement some mechanism so that when a person/thread wants to manipulate a place which some other person/thread might already be updating or be about to update, the person/thread can ensure that no other person/thread is actually manipulating the place right now

The first two approaches almost certainly work best, as they completely avoid the need to be careful about updating the contents of a place (one could argue that the three approaches are merely different points on a range of approaches; one could but we won't).
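
As an illustration of the first approach, here is a minimal Python sketch (the names sum_range and partial_sums are invented for the example). Each thread gets its own slot to write in, and the results are combined only after both threads have finished, which is a small taste of the second approach as well.

```python
import threading


def sum_range(start, stop, results, slot):
    # Each thread writes only to its own slot, so no two threads share a place.
    results[slot] = sum(range(start, stop))


partial_sums = [0, 0]  # one private slot per thread
workers = [
    threading.Thread(target=sum_range, args=(0, 500, partial_sums, 0)),
    threading.Thread(target=sum_range, args=(500, 1000, partial_sums, 1)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # nobody reads the slots until both writers are done

total = partial_sums[0] + partial_sums[1]
print(total)  # prints 499500, the sum of 0 through 999
```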

The third approach becomes necessary when neither of the first two approaches is feasible. In the analogy/thought experiment context, the people would be instructed to somehow (e.g. verbally) ensure that they "reserve" the place before manipulating it. In the operating system context, the threads would be instructed (i.e. guided by instructions in the program) to "reserve" the place before manipulating it. In operating system terminology, such a mechanism is colloquially called "obtaining a lock" or "obtaining a semaphore"[1].
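
A minimal sketch of "reserving the place", assuming Python's threading.Lock as the locking mechanism (the names total_count, count_lock and add_many are mine). Four threads all update the same counter, but each one obtains the lock before touching it.

```python
import threading

total_count = 0
count_lock = threading.Lock()


def add_many(times):
    global total_count
    for _ in range(times):
        with count_lock:      # "reserve" the place; released when the block ends
            total_count += 1  # only one thread at a time can be in here


workers = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(total_count)  # prints 40000: no update is lost
```

The price is that threads sometimes have to wait for each other, which is why the first two approaches are preferable when they're feasible.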

If done correctly, it's possible to have dozens if not hundreds of threads operating within the same process (just like, if the room was big enough, it would be possible to have dozens if not hundreds of people in the second room). Obviously, things get "unpleasant" if the coordination of access to shared places isn't done correctly (in either context).


This writeup has been an attempt to explain the operating system concepts of processes, threads and semaphores using a (somewhat artificial) "real world" analogy. Like all analogies, the "room" analogy doesn't exactly capture the full meaning of the process, thread and semaphore concepts. Let's have a (somewhat more technical) look at the operating system concepts of processes, threads and semaphores:

  • in most operating systems, a process is created at the same moment as its first thread. In addition, if a process ever reaches a point where it has no threads in it, the process is deleted. In other words, unlike the room in the analogy, a process is never without threads (and there is no light switch: the lights, which don't really exist in a process, are "always on").

  • unlike people, threads don't enter or leave processes. They are created in the process, either when the process is created or as a result of a request from an existing thread in the process. They also simply "cease to exist" when they're no longer needed (i.e. they don't "leave the process" but rather just "vanish").

  • one solution in the room context to the problem of coordinating (nearly) simultaneous attempts to manipulate the same location is to have the people communicate with each other verbally. Threads are simply not capable of communicating or even seeing each other in any way (they can see most of the results of what other threads do but they can't see the other threads). Consequently, coordinating access to a "shared" location by communicating directly between the threads (i.e. talking) just isn't possible. As a result, the threads ask the operating system kernel (i.e. the building manager in the room context) to mediate the access using a semaphore.

  • there are other coordination mechanisms available on some operating systems. These mechanisms, which go by names like mutexes, condition variables, barriers and such, are all conceptually equivalent to semaphores although they're dressed up in fancy clothes and, in some cases, are slightly easier to use for certain problems.

There is, of course, much more to the concepts of processes, threads and semaphores but the discussion above covers, I trust, the key aspects of these concepts.
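
For the curious, here is what a counting semaphore might look like in practice. This is a sketch using Python's threading.Semaphore, which is a library object (how much the kernel is involved underneath depends on the platform). The scenario is hypothetical: six people want to use the room's telephones, but there are only two.

```python
import threading
import time

MAX_CALLERS = 2                            # hypothetical: only two telephones
phones = threading.Semaphore(MAX_CALLERS)
stats_lock = threading.Lock()              # protects the two counters below
in_use = 0
peak_in_use = 0


def make_call():
    global in_use, peak_in_use
    with phones:                           # waits while both phones are taken
        with stats_lock:
            in_use += 1
            peak_in_use = max(peak_in_use, in_use)
        time.sleep(0.01)                   # pretend to talk on the phone
        with stats_lock:
            in_use -= 1


callers = [threading.Thread(target=make_call) for _ in range(6)]
for caller in callers:
    caller.start()
for caller in callers:
    caller.join()

print(f"most phones in use at once: {peak_in_use}")
```

A semaphore with a count of one behaves much like the simple lock described earlier, which is one way to see why the fancier mechanisms are conceptually equivalent.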

Finally, it should be pointed out that the analogy/thought experiment works even if there's only one copy or a limited number of copies of the instructions in the room and they're written on something that can't be shared between people. The people in the room then need to take turns (i.e. it isn't necessary that the people be able to actually work simultaneously as long as they take turns at points in time which are random compared to where they might happen to be in the instructions).

In other words, there's nothing, absolutely nothing, about the notion of threads which requires the ability to have more than one thread actually active at any given moment in time. In practice, a computer is typically capable of having only one or a small number of threads active at any moment. The computer forces the threads which would like to be active to take turns so rapidly that neither the threads nor any human observer is able to realize that the threads aren't actually all active simultaneously.
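
A quick way to see this for yourself, sketched in Python (the thread count of 50 and the names work and slot_results are arbitrary choices of mine): start far more threads than the machine is likely to have processors for, and observe that they all finish their work anyway.

```python
import os
import threading

THREAD_COUNT = 50  # deliberately more threads than most machines have cores


def work(results, slot):
    # Each thread does a little arithmetic and records it in its own slot.
    results[slot] = sum(range(10_000))


slot_results = [None] * THREAD_COUNT
workers = [
    threading.Thread(target=work, args=(slot_results, slot))
    for slot in range(THREAD_COUNT)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

finished = sum(1 for value in slot_results if value is not None)
print(f"{finished} threads finished; this machine reports {os.cpu_count()} core(s)")
```

Nothing in the program says how the threads took turns. That is decided elsewhere (by the operating system and, in the case of Python, partly by the interpreter), and the threads themselves never notice.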


[1] The "semaphore" concept didn't get mentioned in the title of this node as I didn't want to scare people away with such an ugly word. I suppose that I could also argue, but I won't, that I was trying to save the "semaphore" concept as a surprise.


Source

I've developed this analogy/thought experiment over the years as I've taught various classes and courses on operating system internals. It seems to work well when I use it in a classroom context. Hopefully, you found it useful as well.

P.S. It's my experience that computing concepts that computer geeks seem to think are "too complicated for mere mortals to comprehend", like virtual memory, threads, processes and such, are often the easiest concepts to explain to "mere mortals" if done correctly. The concepts which I've found the most difficult to explain to "mere mortals" are ones that geeks are so familiar with that they forget that they exist (e.g. the notion that the computer follows EXACTLY the sequence of instructions given, the notion that the computer isn't actually "intelligent" even though geeks keep using words like "the computer decides" and such, and the simple fact that a computer is so stupid that it actually does what we tell it to do and that it, i.e. the computer, rarely makes a mistake (rather, it is the programmer (i.e. the human) who makes the mistakes)).
