The Velocity Engine is a Single Instruction, Multiple Data (SIMD) processor contained on the G4 CPU (1). It is a 128-bit vector execution unit that operates concurrently with the existing integer and floating-point units (2). All data paths and execution units are 128 bits wide (5). Only Apple Computer calls the AltiVec processing functions of the G4 series CPUs the "Velocity Engine"; Motorola, which created the technology, simply refers to it as "AltiVec". It is worth discussing as a functional unit in its own right, even though it is made up of several units, because it is essentially grafted onto the architecture, which could function without it (though without SIMD capability). The only significant change to the base architecture needed to enable AltiVec is an upgraded memory controller.

The Velocity Engine
The PowerPC G4 with Velocity Engine works with the PowerPC architecture to accelerate the data-intensive processing required by next-generation video, voice and graphics applications. Among the G4 key features is a single instruction multiple data (SIMD) function capable of processing several calculations in the same instruction. You can think of a vector as a multicomponent number, such as a = {1, 3, 2, 5} or b = {2, 12, 4, 40}. To add (a + b) the Velocity Engine adds all components in the same cycle, with a result of {3, 15, 6, 45}, greatly enhancing performance of complex calculations. These vector processing advantages give the PowerPC G4 a significant edge when it comes to visualization. Making the PowerPC G4 perfect for everything from digital video, graphics and 3D games to astronomy, the biosciences and predictive modeling.

G4 Processors, Apple Computer Website (3)
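
The lane-wise addition described in the quoted passage can be sketched in a few lines of Python. This is only a scalar model of the idea, not AltiVec code: the names `lanes` and `simd_add` are hypothetical, and the wraparound on overflow assumes the modulo form of vector integer add rather than the saturating form.

```python
# Scalar model of a SIMD add on a 128-bit register (illustration only).
VECTOR_BITS = 128  # register width stated in the text


def lanes(element_bits):
    """Number of elements of the given width that fit in one 128-bit register."""
    return VECTOR_BITS // element_bits


def simd_add(a, b, element_bits=32):
    """Add two vectors lane by lane, wrapping each lane modulo 2**element_bits."""
    if len(a) != lanes(element_bits) or len(b) != lanes(element_bits):
        raise ValueError("operands must fill a 128-bit register exactly")
    mask = (1 << element_bits) - 1
    # On the real hardware every lane is added by the same instruction;
    # this loop only models the result.
    return [(x + y) & mask for x, y in zip(a, b)]


if __name__ == "__main__":
    a = [1, 3, 2, 5]
    b = [2, 12, 4, 40]
    print(simd_add(a, b))  # [3, 15, 6, 45]
```

With 32-bit elements a register holds four lanes, which matches the four-component example in the quote; 16-bit and 8-bit elements give eight and sixteen lanes respectively.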

In the PowerPC G4 family (7xxx) processors, "Velocity Engine" is the name given to the four vector units that process SIMD operations. They are:

  • VPU, the Vector Permute Unit, which executes permutation instructions such as pack, unpack, merge, splat, and permute on vector operands;
  • VIU1, or Vector Integer Unit 1, which handles short-latency AltiVec integer instructions such as addition;
  • VIU2, or Vector Integer Unit 2, which handles longer-latency AltiVec integer instructions such as multiplication;
  • VFPU, or Vector Floating-point Unit, which handles floating point operations.
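
To give a feel for the kind of work the VPU does, the following Python sketch models three of the permutation operations named above: splat, merge, and permute. It is a scalar illustration under stated assumptions, not AltiVec code. The function names are hypothetical, vectors are plain lists with element 0 as the high-order element, and `permute` assumes the byte-select behaviour of AltiVec's `vperm`, in which the low 5 bits of each control byte index into the 32 bytes formed by the two source vectors.

```python
def splat(v, index):
    """Replicate element `index` of v into every lane."""
    return [v[index]] * len(v)


def merge_high(a, b):
    """Interleave the first (high-order) halves of a and b: a0, b0, a1, b1, ..."""
    half = len(a) // 2
    out = []
    for x, y in zip(a[:half], b[:half]):
        out.extend([x, y])
    return out


def permute(a, b, control):
    """Byte permute: each control byte selects one byte from the 32-byte a||b."""
    source = list(a) + list(b)
    # Only the low 5 bits of each control byte are used as the index.
    return [source[c & 0x1F] for c in control]


if __name__ == "__main__":
    print(splat([1, 2, 3, 4], 2))                 # [3, 3, 3, 3]
    print(merge_high([1, 2, 3, 4], [5, 6, 7, 8]))  # [1, 5, 2, 6]
    a = list(range(16))
    b = list(range(16, 32))
    print(permute(a, b, list(range(15, -1, -1))))  # bytes of a in reverse order
```

Because the control vector is itself data, a single permute can express byte reversal, arbitrary shuffles, or table lookups over 32 bytes.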

Additional support for AltiVec is provided by the G4's Load/Store Unit (LSU), which contains a four-entry vector touch queue. All of this is in addition to the four integer units and the floating-point unit (FPU). Non-vector integer math is handled by three identical units (IU1a, IU1b, and IU1c), which process all integer instructions except multiply, divide, and move to/from special-purpose register instructions, and by a fourth unit (IU2), which executes the balance of the integer operations. The G4 processors can execute vector operations and non-vector operations simultaneously (in superscalar fashion).
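
The division of labour among the execution units described above can be summarized as a small lookup. The Python sketch below is only a restatement of the text, not a model of the G4's actual dispatch logic: the unit names come from the description, while the category strings and the `execution_units` function are hypothetical labels chosen for illustration.

```python
# Unit names come from the text; the category strings are hypothetical labels.
VECTOR_UNITS = {
    "vector_permute": ["VPU"],
    "vector_int_short": ["VIU1"],   # short-latency, e.g. addition
    "vector_int_long": ["VIU2"],    # longer-latency, e.g. multiplication
    "vector_float": ["VFPU"],
}

# Integer instruction classes that the three IU1 units do not handle.
IU2_ONLY = {"multiply", "divide", "move_spr"}


def execution_units(category):
    """Return the units that could execute an instruction of this category."""
    if category in VECTOR_UNITS:
        return VECTOR_UNITS[category]
    if category in IU2_ONLY:
        return ["IU2"]
    if category == "scalar_float":
        return ["FPU"]
    if category == "load_store":
        return ["LSU"]
    # Assumption: anything else is an ordinary integer instruction.
    return ["IU1a", "IU1b", "IU1c"]


if __name__ == "__main__":
    for kind in ("integer_add", "multiply", "vector_permute", "scalar_float"):
        print(kind, "->", execution_units(kind))
```

Because the vector units are separate from the integer and floating-point units, a vector instruction and a scalar instruction from this table can be in flight at the same time.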


References:

  1. Velocity Engine. Apple Computer, 2002. (http://developer.apple.com/samplecode/Sample_Code/Devices_and_Hardware/Velocity_Engine.htm)
  2. Apple's AltiVec Home Page. Apple Computer, 2002. (http://developer.apple.com/hardware/ve/)
  3. G4 Processors. Apple Computer, 2002. (http://www.apple.com/powermac/processor.html)
  4. MPC7455 RISC Microprocessor Hardware Specifications. Motorola, February 2002. (http://e-www.motorola.com/brdata/PDFDB/docs/MPC7455EC.pdf)
  5. MPC7450 RISC Microprocessor Family User’s Manual. Motorola, December 2001. (http://e-www.motorola.com/brdata/PDFDB/docs/MPC7450UM.pdf)