The new Pentium 4, running at 3 GHz, has Hyperthreading (HT). What in the world is Hyperthreading, and does it actually do anything? Basically, it is a new technology developed by Intel that brings simultaneous multithreading (the ability to run two threads at the same time on one processor) to consumer processors.

Consistent breakthroughs in the technological realm are tough to maintain, yet Wall Street requires consistent growth and consistent improvement. Intel has answered the pressure from AMD (its rival chip company) not only with higher clock speeds in its processors but also with this new technology. Hyperthreading yields a performance increase of anywhere from 0-25% by using more of the CPU at once, instead of simply raising the clock speed as usual.

How it works
Hyperthreading (HT), a.k.a. Jackson Technology, is Intel's version of simultaneous multithreading (SMT). Multithreading involves running more than one thread at the same time. A thread is a stream of instructions, a portion of a program that the operating system tells the CPU to run. A Pentium 4 processor with HT enabled appears to the operating system as two "virtual" processors. The operating system can then seemingly run both CPUs at once, scheduling two independent threads at the same time; underneath, however, there is still only one processor. The processor takes both threads and runs them at the same time, using as much of its hardware at once as possible. This makes the processor more effective because more of its parts are busy at once: instructions from the second thread fill execution slots that the limited Instruction Level Parallelism (ILP) of a single thread would leave idle.
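
One way to see the "virtual" processors from a program is to ask the operating system how many processors it currently has online. The following is a minimal sketch, assuming a Unix-like system whose C library provides the _SC_NPROCESSORS_ONLN query (Linux and several other systems do); the function name logical_cpu_count is made up for this example.

```c
#include <unistd.h>

/* Number of processors the operating system currently has online,
   or -1 if the query fails. The OS counts logical processors, so it
   cannot tell a "virtual" HT processor from a physical one here.
   Assumption: the platform supports _SC_NPROCESSORS_ONLN. */
long logical_cpu_count(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Called from your own main, this would be expected to report 2 on a single Pentium 4 with HT enabled and an operating system that supports it, and 1 with HT disabled.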

Another large benefit of Hyperthreading is that it is fairly cheap to implement. Because the additional thread runs on the same CPU elements (FPU, ALU, etc.), the only additions needed are in the initial scheduling stages, plus a second copy of the per-thread state. The CPU has to have two sets of basic registers, such as the Instruction Pointer and the advanced programmable interrupt controller (APIC) registers, so that, as far as the operating system is concerned, the CPU state space contains two CPUs. The operating system must support at least dual processors, or be specifically designed for Hyperthreading, in order to use it. Windows XP, Linux 2.4.12, and Windows 2000 (multiprocessor only) support Hyperthreading.
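
Software can also ask the processor itself. On x86, CPUID leaf 1 reports an HTT flag in bit 28 of EDX. The sketch below assumes a GCC or Clang toolchain, whose <cpuid.h> header provides __get_cpuid; the function name cpu_reports_htt is hypothetical. Note that the flag only says the package may expose more than one logical processor. It does not prove that Hyperthreading is switched on, and it can be set on chips that have several cores instead.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (toolchain assumption) */
#endif

/* Returns 1 if CPUID leaf 1 sets the HTT flag (EDX bit 28), 0 if it
   does not, and -1 if the query is unavailable (non-x86 build, or the
   processor does not support leaf 1). */
int cpu_reports_htt(void) {
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return -1;
    }
    return (int)((edx >> 28) & 1u);
#else
    return -1;
#endif
}
```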

Intel additionally wanted the processor to be able to run one thread normally if the other happens to stall (because of a memory access, an interrupt, or a branch misprediction). This means that at each stage there are two sets of registers to store the current instruction, which gives the normal CPU pipeline the flexibility to run one thread at full speed or two threads at the same time. Because of this, there is practically no downside to Hyperthreading: a single thread running alone will run just as fast (or with negligible slowdown) as on a regular non-Hyperthreading processor.

                         
     Pipeline            Instruction Scheduling 
                        
    +----------+         THREAD1         THREAD2
    | Fetch    |          \ AA \         / BB /
    +----------+           \ AA \       / BB /  
        ||                  \ AA \ CPU / BB /  
  Queue1  Queue2             \ AA \   / BB /
        ||                  ----------------   |
    +----------+            |  A  A  A  B  |   |
    | Decode   |            |  B  A  B  B  |   |
    +----------+            |  A  B  A  B  |   |
        ||                  ----------------   V
  Queue1  Queue2            / AA /     \ BB \
        ||                 / AA /       \ BB \
    +----------+          / AA /         \ BB \
    |Cache Fill|         / AA /           \ BB \
    +----------+        THREAD1           THREAD2        
        ||  
  Queue1  Queue2
       
      Fig. 1                     Fig. 2  
Figure 1 shows the beginning of the instruction pipeline. As the uOps (the micro-operations that the Pentium 4 decodes instructions into) come down the pipeline, they are temporarily stored in the queue that corresponds to the virtual processor they came from. The core processor stages are shared, but each virtual processor has its own pipeline queue.

Figure 2 is an ASCII art representation of a common Intel graphic used to illustrate Hyperthreading. The instructions "AA" and "BB" come from two different threads. They move down the illustration into the processor, where they run at the same time, and are sorted back into their threads at the end. This makes better use of the processor than would otherwise be possible. A normal processor has several different "components", such as an FPU, an ALU, and a memory access interface, and usually they are not all in use at the same time. If instruction "AA" consists only of floating point operations while instruction "BB" consists of memory accesses and integer operations, then both can execute at the same time. This keeps more execution units busy and shortens overall execution time. CPU utilization can rise from about 35% without HT to around 50% with Hyperthreading.
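
The idea behind Figure 2 can be sketched with a toy model. This is not how the Pentium 4 schedules uOps; it is a deliberately simplified illustration with invented names. Each thread is a string of one-cycle operations ('F' for the FPU, 'A' for the integer ALU, 'M' for the memory port), the core has one of each unit, and each thread can offer only its next operation per cycle.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: cycles needed to run stream a and then stream b
   on a core that issues one operation per cycle. */
size_t cycles_serial(const char *a, const char *b) {
    return strlen(a) + strlen(b);
}

/* Toy model only: cycles needed when both streams share the core.
   Each cycle both threads offer their next operation. If the two
   operations need different units, both issue; if they need the same
   unit, only one issues and the priority alternates between threads. */
size_t cycles_smt(const char *a, const char *b) {
    size_t ia = 0, ib = 0, cycles = 0;
    size_t na = strlen(a), nb = strlen(b);
    int a_first = 1;                      /* which thread wins a conflict */

    while (ia < na || ib < nb) {
        if (ia < na && ib < nb) {
            if (a[ia] != b[ib]) {         /* different units: both issue */
                ia++;
                ib++;
            } else if (a_first) {         /* same unit: thread A issues */
                ia++;
                a_first = 0;
            } else {                      /* same unit: thread B issues */
                ib++;
                a_first = 1;
            }
        } else if (ia < na) {             /* only thread A has work left */
            ia++;
        } else {                          /* only thread B has work left */
            ib++;
        }
        cycles++;
    }
    return cycles;
}
```

In this model, a floating point stream "FFFF" paired with a memory and integer stream "MAMA" finishes in 4 cycles instead of 8, while two "FFFF" streams still take 8 cycles because they fight over the same unit. Real workloads fall somewhere between these two extremes.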

Performance
Hyperthreading makes your single processor look like two separate processors. This, however, will not make your computer run twice as fast. There is still only one processor, but it is being used more efficiently. Typically, on multithreaded applications, or when running more than one program at a time, the end user will see roughly a 2-30% speedup (depending on where you get your information). Applications will open much faster, and if you like doing more than one thing at a time, you will see a vast improvement.

Hyperthreading will not, however, make your 3D game deliver more frames per second (FPS) or make a single-threaded benchmark run significantly faster. When running single-threaded applications (which do not use multiple threads at the same time and therefore cannot take advantage of the second "virtual CPU"), you might see a small increase, but the vast majority of the time you will see no difference. This is in itself a great achievement: the added overhead of scheduling and handling two different threads at the same time could be expected to make single-threaded applications run slower, but on the newest implementation of Hyperthreading it does not. (Of the roughly 50-100 benchmarks I surveyed, only one or two were slower with Hyperthreading enabled.)

Utilizing HT
To make applications and programs use this new technology, you have to make them able to run in parallel. A common system for doing this is OpenMP (suggested by Intel, but anything that parallelizes the code will work). With OpenMP you insert compiler directives (pragmas) into C or C++ code that tell the compiler how to split your program into multiple threads, which the operating system then schedules. Here is a short code example; see the write-up for more information:


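/* Parallelize the iterations of a for loop using OpenMP.
   array1, y, and limit are assumed to be declared earlier.
   private(x) gives each thread its own copy of the loop index. */

#pragma omp parallel for private(x)
  for (x = 0; x < limit; x++) {
     array1[x] = x * y;
  }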

This particular code example may not speed up on an HT-enabled machine, but it does show how to parallelize code using OpenMP. Note that if you implement parallelism like this, you must pay attention to dependencies, to make sure that each thread has the data it needs to run correctly (see OpenMP and data dependency for more information).
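
To make the point about dependencies concrete, here is a small sketch of three loops, with function names invented for this example. The first has independent iterations, the second shares one variable across iterations and uses OpenMP's reduction clause, and the third has a loop-carried dependency and is left serial. If the compiler is run without OpenMP support, the pragmas are ignored and the code runs serially with the same results.

```c
/* Independent iterations: out[i] depends only on i and y,
   so the iterations can safely be split across threads. */
void fill_scaled(long *out, int n, long y) {
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        out[i] = i * y;
    }
}

/* Every iteration updates sum, so the threads share data.
   The reduction clause gives each thread a private partial sum
   and adds the partial sums together at the end. */
long sum_array(const long *a, int n) {
    long sum = 0;
    int i;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}

/* Loop-carried dependency: a[i] needs the finished a[i-1].
   A plain "parallel for" would give wrong answers here,
   so this loop is intentionally left serial. */
void prefix_sum(long *a, int n) {
    int i;
    for (i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}
```

With GCC, OpenMP is typically enabled with the -fopenmp flag.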

Conclusion
As technology pushes forward and large multinational corporations struggle to consistently gain market share, it is hard to separate marketing hype from true technological improvements. Hyperthreading is not merely a marketing buzzword like NetBurst (Intel) or QuantiSpeed (AMD). It is a significant architectural improvement that will yield noticeable improvements in execution time. The technology was refined on the Xeon processor before being released on the Pentium 4, so that single-threaded applications would not be hurt at all. Intel has lately pulled well ahead of AMD in performance and clock speed. AMD, however, has not been sitting idly by while Intel gains back lost market share: the release of AMD's Opteron (32- and 64-bit processing) may be the big hit that AMD needs. For now, however, Intel has the lead with the fastest CPU in the desktop market.


Sources :
De Gelas, Johan. "3.06 GHz Pentium 4 and HyperThreading," Ace's Hardware. 14 November 2002. <http://www.aceshardware.com/read.jsp?id=50000319> (Nov 2002)

"Intel Pentium 4 Processor with Hyperthreading Technology," Intel Home Computing. Feb 2002. <http://www.intel.com/home/desktop/pentium4/hyperthreading.htm> (Nov 2002)

Marr, D.; Binns, F.; Hill, D.; Hinton, G.; Koufaty, D.; Miller, J.; Upton, M. "Hyper-Threading Technology Architecture and Microarchitecture: A Hypertext History," Intel Technology Journal. Feb 2002. <http://developer.intel.com/technology/itj/2002/volume06issue01/> (Nov 2002)

Völkel, F.; Töpelt, B.; Scheffel, U.; "Single CPU in Dual Operation: P4 3.06 GHz with Hyper-Threading Technology," Tom's Hardware Guide. 14 November 2002. <http://www.tomshardware.com/cpu/02q4/021114/index.html> (Nov 2002)
