Of all the bits of jargon and terminology used in computer architecture,
the term 'bubble' is one of the more whimsical, as well as being
wonderfully descriptive and almost poetically succinct. The word is
used to refer to the phenomenon of idle stages in a processor pipeline,
and the name, by analogy, tells us a little about the behaviour of such
idle units.
Let's illustrate this with an example. Consider a classic 5-stage
RISC pipeline, like that of an early MIPS processor, or the ARM
processor in your mobile phone, and a small program fragment:
Pipeline:
Fetch -> Decode -> ALU -> Memory -> Writeback
...and our program fragment, a simple implementation of the strlen() function:
mov r1, r0 ; save pointer
mov r0, #0 ; initialise count
loop: ldrb r2, [r1], #1 ; load byte from memory at r1, then increment r1.
cmp r2, #0 ; compare byte with '\0'.
addne r0, r0, #1 ; increment string length if the byte wasn't '\0'
bne loop ; ...and loop back round for the next byte
mov pc, lr ; return from function
(This is obviously some handwritten assembly code rather than the
output of a compiler. You can tell because of the comments, and
because any compiler worth its salt would have eliminated that 'addne'
in the loop in favour of a subtraction outside the loop using the
exit value of the induction variable...)
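For reference, here's roughly what the loop computes, sketched as a
hypothetical Python transliteration (each local variable stands in for
one of the registers; none of this is from the original assembly):

```python
def strlen(buf: bytes) -> int:
    """Count the bytes before the first NUL, mirroring the assembly:
    i plays the part of r1 (the moving pointer), length of r0 (the
    count), and b of r2 (the byte just loaded)."""
    i = 0        # mov r1, r0  -- pointer starts at the string
    length = 0   # mov r0, #0  -- initialise count
    while True:
        b = buf[i]        # ldrb r2, [r1], #1
        i += 1
        if b == 0:        # cmp r2, #0
            return length # mov pc, lr
        length += 1       # addne r0, r0, #1

print(strlen(b"hello\x00"))  # 5
```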
Instructions flow through the pipeline in the obvious order: they're
fetched from memory in the Fetch stage. They're decoded in the Decode
stage, where the processor works out what else it needs to do in order
to execute the instruction. In the ALU stage, simple arithmetic
instructions are executed, and the addresses for memory instructions
are calculated. In the Memory stage, loads and stores access memory
using the addresses they previously calculated in the ALU, and in
Writeback, assuming any memory accesses went off without a hitch, the
results of the instruction are written back to the register file.
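As a rough sketch of that flow (assuming one instruction enters the
pipeline per cycle and, for now, that nothing ever stalls), the five
stages behave like a shift register; the instruction names below are
just labels:

```python
STAGES = ["Fetch", "Decode", "ALU", "Memory", "Writeback"]

def run(program, cycles):
    """Advance every instruction one stage per cycle: Fetch takes the
    next instruction from the program, and whatever leaves Writeback
    has finished executing."""
    pipe = [None] * len(STAGES)   # pipe[i] holds the instruction in STAGES[i]
    for cycle in range(cycles):
        nxt = program[cycle] if cycle < len(program) else None
        pipe = [nxt] + pipe[:-1]  # everything shifts one stage along
        print(" | ".join(f"{stage}: {instr or '-'}"
                         for stage, instr in zip(STAGES, pipe)))
    return pipe

run(["mov r1,r0", "mov r0,#0", "ldrb", "cmp", "addne"], 5)
```

Each printed row is one cycle; with no stalls, the pipeline fills up
until every slot is busy with a different instruction.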
So the instruction sequence above can execute, with the help of a
little register bypassing, without much problem for the first few
cycles, each instruction following hot on the heels of the previous
one, and each pipeline stage busy doing something useful in every
cycle. Until, that is, the pipeline hits the following state:
Fetch -> Decode -> ALU -> Memory -> Writeback
addne    cmp    ldrb    mov r0,#0    mov r1,r0
What happens in the next cycle? At first glance it seems obvious: the
mov moves to the Writeback stage, the ldrb moves into the Memory
stage, and the cmp instruction moves to the ALU. But there's a
problem. The cmp instruction needs the value loaded into r2 by the
ldrb instruction. But the load instruction hasn't yet entered the
Memory stage. The value loaded from memory will only be available at
the end of the next cycle, but it's needed as an input to the ALU at
the beginning of that cycle.
So the comparison instruction must wait in Decode for a cycle while
the load instruction moves into the Memory stage and completes its
memory access.
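The interlock logic that enforces this wait is simple in principle:
hold the instruction in Decode whenever the instruction ahead of it in
the ALU stage is a load whose destination register is one of the
Decode instruction's sources. A minimal sketch, with a made-up
instruction record rather than any real microarchitecture's logic:

```python
from dataclasses import dataclass

@dataclass
class Instr:
    op: str
    dest: str = ""      # register written, if any
    srcs: tuple = ()    # registers read

def must_stall(in_decode, in_alu):
    """Load-use hazard: the ALU stage holds a load whose result won't
    be ready until the end of the Memory stage, but the instruction in
    Decode wants it as an ALU input in the very next cycle."""
    return (in_alu is not None
            and in_alu.op == "ldrb"
            and in_alu.dest in in_decode.srcs)

ldrb = Instr("ldrb", dest="r2", srcs=("r1",))
cmp_ = Instr("cmp", srcs=("r2",))
print(must_stall(cmp_, ldrb))  # True
```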
So during the next cycle, the pipeline looks a little like this:
Fetch -> Decode -> ALU -> Memory -> Writeback
addne    cmp    (empty)    ldrb    mov r0,#0
During this cycle, the ALU pipeline stage is idle waiting on the
result from the load in the Memory stage. It's a gap in an otherwise
full, efficiently-operating pipeline.
In the next cycle, though, the memory result is available, and the
pipeline can go about its business. Every instruction is free to move
on to the next pipe stage:
Fetch -> Decode -> ALU -> Memory -> Writeback
bne addne cmp (empty) ldrb
But what's this? Not only have the instructions moved on, but
so has the idle pipeline stage. Since there was no
instruction in the ALU stage in the last cycle, there is now no
instruction to move into the Memory stage. It's an idle pipeline
stage, but it moves through the pipeline in a way that makes it look
almost like a real instruction.
It's this behaviour that leads us to call these phenomena
'bubbles'. Much like bubbles in a narrow pipe, these gaps move along the
pipeline, reducing the efficiency of the processor.
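This whole dance, the stall, the bubble's injection, and its drift
down the pipe, can be captured in a toy simulation. This is a minimal
sketch assuming single-cycle stages, only the load-use interlock
described above, and no branch handling at all; the tuple encoding of
instructions is invented for illustration:

```python
def simulate(program, cycles):
    """program: a list of (name, dest, srcs, is_load) tuples. Decode
    holds its instruction whenever the ALU stage contains a load whose
    destination the Decode instruction reads; a bubble (None) enters
    the ALU stage in its place and drifts down the pipeline."""
    pipe = [None] * 5     # Fetch, Decode, ALU, Memory, Writeback
    pc = 0
    history = []
    for _ in range(cycles):
        fetch, decode, alu, mem, wb = pipe
        load_use = (decode is not None and alu is not None
                    and alu[3]                 # ALU holds a load...
                    and alu[1] in decode[2])   # ...whose dest Decode reads
        if load_use:
            pipe = [fetch, decode, None, alu, mem]  # bubble enters ALU
        else:
            nxt = program[pc] if pc < len(program) else None
            pc += 1
            pipe = [nxt, fetch, decode, alu, mem]
        history.append([p[0] if p else "(empty)" for p in pipe])
    return history

prog = [("mov r1,r0", "r1", ("r0",), False),
        ("mov r0,#0", "r0", (),      False),
        ("ldrb",      "r2", ("r1",), True),
        ("cmp",       "",   ("r2",), False),
        ("addne",     "r0", ("r0",), False),
        ("bne",       "",   (),      False)]
for row in simulate(prog, 7):
    print("  ".join(f"{slot:10}" for slot in row))
```

Printed cycle by cycle, the last three rows reproduce the three
pipeline states shown above: the stall leaves '(empty)' in the ALU
slot, and a cycle later that same '(empty)' has moved on to the
Memory slot, just like a real instruction would.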
Pipeline bubbles occur any time a pipeline must stall one
instruction while waiting for the result of another, or when a
branch is mispredicted, or when a cache miss holds things up. In
simpler pipelines (those without register renaming), they'll also
occur on write-after-write and write-after-read hazards, and things
like that.
So! Next time you think your computer is going too slow, you know what
to do: check it for bubbles!