Simultaneous Multithreading (SMT) is another evolution in processor architecture that allows the CPU to process a greater number of instructions per clock cycle.
Out-of-order processors can execute instructions in an order other than program order, as long as data dependencies are respected, with some instructions executing in parallel. Only one process can run at a time, however, and a pipeline flush has to occur to switch to another thread. Often the hardware can extract only a small amount of parallelism from a single thread.
SMT combines out-of-order processing capability with the ability to run multiple processes or threads "at the same time." Since the threads don't use the same registers or memory space(*), there are no data dependencies between instructions from different threads, so the processor can run many more instructions in parallel than it could with a single thread. The additional complexity added to the processor is not trivial, but the performance increase can be very large when a single thread would otherwise leave execution units idle.
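The effect of mixing independent threads can be illustrated with a toy model. This is only a sketch under made-up assumptions, not a description of any real processor: every instruction takes one cycle, the issue width of 4 is arbitrary, and the names `cycles_to_finish` and `chain` are hypothetical.

```python
# Toy model, not real hardware. A "thread" is a list in which entry i names
# the index of the earlier instruction that instruction i depends on, or None.
# Assumption: every dependency index is smaller than the instruction's own index.
ISSUE_WIDTH = 4  # made-up number of instructions that may issue per cycle


def cycles_to_finish(threads, issue_width=ISSUE_WIDTH):
    """Count the cycles needed to issue every instruction of every thread."""
    done = [set() for _ in threads]  # per-thread indices of finished instructions
    total = sum(len(t) for t in threads)
    finished = 0
    cycles = 0
    while finished < total:
        issued = []  # (thread id, instruction index) pairs chosen this cycle
        for tid, thread in enumerate(threads):
            for idx, dep in enumerate(thread):
                if len(issued) == issue_width:
                    break
                if idx in done[tid]:
                    continue
                # Ready only if its producer finished in an earlier cycle.
                if dep is None or dep in done[tid]:
                    issued.append((tid, idx))
        # Results become visible to dependents only from the next cycle on.
        for tid, idx in issued:
            done[tid].add(idx)
        finished += len(issued)
        cycles += 1
    return cycles


# Eight instructions, each depending on the one before it.
chain = [None, 0, 1, 2, 3, 4, 5, 6]
one_thread = cycles_to_finish([chain])
two_threads = cycles_to_finish([chain, chain])
print(one_thread, two_threads)
```

In this model a single dependent chain of 8 instructions needs 8 cycles, and two such chains issued together also finish in 8 cycles, because the second thread's instructions fill issue slots the first thread cannot use. Run one after the other, the two chains would need 16 cycles.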
Intel's version of SMT is called Hyperthreading, and can run 2 threads or processes simultaneously. The processor keeps the instructions from both threads in the same buffers, while giving each thread its own register set and a few other separate buffers.
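Because each hardware thread is presented to the operating system as a processor of its own, software sees logical processors rather than physical cores. A minimal sketch of how a program might query that count, using Python's standard `os.cpu_count()`; the variable name `logical` is just illustrative:

```python
import os

# os.cpu_count() reports logical CPUs, so on a machine with 2-way SMT enabled
# it is typically twice the number of physical cores. It may return None if
# the count cannot be determined.
logical = os.cpu_count()
print(f"logical processors visible to the OS: {logical}")
```

The value alone does not say whether SMT is active; that would need platform-specific information not covered here.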
The original Pentium 4 (P4) contained a full SMT implementation that actually worked, but in a few corner cases it slowed the entire processor to a crawl. Intel decided to release the processor with "Hyperthreading" turned off until they fixed the performance issues in these exceptional cases.
(*) The SMT architecture appears as two separate processors to the operating system, so it's like having two processing units, two sets of registers, two memory spaces, etc. In reality, the two threads share the same pool of physical registers and processing units. Program A's register 1 is still not the same physical register as program B's register 1, so any instruction from program A and any instruction from program B can be run simultaneously even if they use the 'same' architectural registers (with some minor exceptions regarding lock access, etc). This is not as difficult as it may seem, since out-of-order processors, which have been in use for many years, already rename the registers to a shared pool of physical registers as instructions come into the processor.
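The renaming idea in the footnote can be sketched as a small table keyed by thread and architectural register. This is a simplified illustration, not how any particular processor implements it: the class `RenameTable`, its method names, and the pool size of 8 are all hypothetical, and the recycling of physical registers when instructions retire is left out.

```python
class RenameTable:
    """Maps (thread, architectural register) to a shared physical register."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))  # shared pool of physical registers
        self.mapping = {}                      # (thread, arch reg) -> physical reg

    def rename_dest(self, thread, arch_reg):
        """Allocate a fresh physical register for a write to arch_reg."""
        if not self.free:
            # Simplification: real hardware would stall until a register frees up.
            raise RuntimeError("no free physical registers")
        phys = self.free.pop(0)
        self.mapping[(thread, arch_reg)] = phys
        return phys

    def rename_src(self, thread, arch_reg):
        """Find the physical register currently holding arch_reg for this thread."""
        return self.mapping[(thread, arch_reg)]


table = RenameTable(num_physical=8)
a = table.rename_dest("A", "r1")  # program A writes its register 1
b = table.rename_dest("B", "r1")  # program B writes its register 1
print(a, b)
```

Both programs name `r1`, but the two writes land in different physical registers, so neither instruction has to wait for the other.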