Processor that trades potential performance for backward compatibility with dinosaur programs. Apparently Intel feels there is something wrong with having a few people simply recompile code. If they streamlined everything, it would be killer speed. Instead it wastes CPU cycles on CISC code and managing the stone-age ISA bus. (Although I must confess, my modem is ISA, due to all the PCI shit being winmodems and all.)

I wonder how much of an improvement IA-64 really is. Probably much more bloated than the current x86's.

Zorin: Nice. It should be noted that the Pentium Pro up to Pentium III are all the same processor family: i686.

The Intel manufactured processors in the x86 line:

  • i8088 - 16 bit processor with an 8 bit external bus to reduce cost; it powered the original IBM PC and so started the PC line. Oddly enough, many were manufactured by future competitor AMD.
  • i8086 - 16 bit processor that actually started the architecture, a year before the 8088. Had a full 16 bit external bus. Not used much in PC's.
  • i80286 - 16 bit processor. Introduced the 286 protected mode, which failed because it was rather clunky.
  • i80386(DX) - Intel's first true 32 bit processor. Introduced 386 protected mode, which had all the proper memory management features any full fledged CPU should have. The "DX" nomenclature was tacked on when the 386SX (see below) was introduced.
  • i80386SX - Identical to the 386 in function, but had a 16 bit external bus. This saved money on lower end systems.
  • i80486(DX) - Basically, a 386 with a few extra instructions and an onboard cache. This cache improved performance vastly. It also integrated the x87 math coprocessor into the CPU for the first time. The "DX" nomenclature was tacked on when the 486SX (see below) was introduced.
  • i80486SX - A 486 with a broken or disabled math coprocessor.
  • Pentium Classic, or just plain "Pentium" - Intel's first x86 chip to use multiple instruction pipelines.
  • Pentium Pro - A heavily redesigned successor to the Pentium (the out-of-order P6 core) with a large in-package L2 cache. Cache and core ran at the same speed. Failed in the mainstream due to high production costs: if the cache chip OR the die were bad, the whole package had to be junked.
  • Pentium MMX - Added the MMX instruction set to the Pentium Classic processor. All Intel CPU's from this point on have MMX instructions available.
  • Pentium II - The answer to the Pentium Pro cache problem: two separate packages on a PC board that plugs into a slot, known as Slot One.
  • Pentium Xeon processors - Improved versions of the Pentium II, III, and 4, containing larger caches, and the ability to use more than two (or in the case of Pentium 4, one) processors in SMP mode. Intended to be used in servers.
  • Celeron - A name for any number of stripped down Pentium II or III chips. Usually they have a smaller (or no) L2 cache, or run at a slower front side bus speed.
  • Pentium III Katmai - Improved Pentium II. Has some extra instructions and runs at higher clock speeds.
  • Pentium III Coppermine - An improved version of the Pentium III that has an on-die 256K cache and runs at higher clock speeds. Contrary to its name, it does not actually use copper interconnect technology.
  • Pentium M - Mobile processor, based on the Pentium III with power saving tweaks. Actually does more work per clock-cycle than the Pentium 4, and requires less cooling, which has led to some people using it in fanless desktop systems.
  • Pentium 4 - A partial redesign of the Pentium III, simplified in some aspects so that it can run at higher clock speeds. This is a marketing gimmick, of course; the CPU can run at around 3GHz but it doesn't do nearly as much per clock cycle as the older Pentium III.

The line starts to blur as you get into the Pentium II/III and Celeron series. There are many different versions of the Celeron, Pentium III, and so on. Please note that this list is in vague order of introduction; the "DX" versions of the 386 and 486 came before the "SX" versions. Also, /msg me if you find any errors.

This is a timeline of milestones in the history of Intel microprocessors. Only the first models of each line are listed, as Intel usually releases updated models with improved clock speeds within a few years of a line's initial launch.

  • i8086 - June 8, 1978
  • i8088 - June 1, 1979
  • i80286 - February 1982
  • i80386DX - October 17, 1985
  • i80386SX - June 16, 1988
  • i80486DX - April 10, 1989
  • i80486SX - April 22, 1991
  • Pentium Classic 60/66 MHz - March 23, 1993
  • Pentium Pro (200, 180, 166, 150 MHz) - November 1, 1995
  • Pentium MMX (200, 166 MHz) - January 8, 1997
  • Pentium II (300, 266, 233 MHz) - May 7, 1997
  • Mobile Pentium MMX (200/233 MHz) - September 8, 1997
  • Mobile Pentium II (233/266 MHz) - April 2, 1998
  • Celeron 266 MHz - April 15, 1998
  • Pentium II Xeon 400 MHz - June 29, 1998
  • Pentium III (450 and 500 MHz) - February 26, 1999
  • Pentium III Xeon 500/550 MHz - March 17, 1999
  • Mobile Celeron 333 MHz - April 5, 1999
  • Mobile Pentium III 400/450/500 MHz - October 25, 1999
  • Pentium 4 1.4/1.5 GHz - November 20, 2000
x86 is both the generic name given to the family of microprocessors based upon (and backwards compatible with) the Intel 8086, and the instruction set architecture (ISA) implemented by these CPUs. The x86 ISA has been extended many times over the years and today barely resembles the 16-bit architecture introduced in 1978. It is by far the most dominant ISA for personal computers, and has also made inroads in the areas of high-performance computing and embedded systems. 

The initial success of x86 was tied to its use in the original IBM PC 5150 in the form of the Intel 8088.  The unexpected success of the IBM PC and its eventual series of (compatible) imitators brought the 8088 and its ISA along with them, establishing x86 as one of the prominent microcomputer architectures of the 1980s. Its main competition came in the form of the Motorola 68000 and its successors, whose minicomputer-derived architecture was widely considered to be superior to x86 and its descent from the grubby 8-bit processor architectures of the 1970s. The 68000 architecture would go on to power the main competitors to the PC architecture, including the Macintosh and the Amiga, until its replacement by PowerPC in the early 1990s.

In the end, what powered x86 past its competition was not any intrinsic advantage in the processor itself, but the PC-AT system architecture and its accompanying MS-DOS and Windows operating systems. The freely-available hardware specifications and promiscuously-licensed OS fueled a broad, competitive hardware marketplace that drove prices down and availability up, regardless of (and often despite) the limitations of the 8088 and its successor, the 80286. The single-vendor competitor systems ultimately could not keep up.

The x86 architecture was updated to overcome many of these limitations with the introduction of the Intel 80386 in 1985. The 386 extended the ISA to 32 bits and, in its (backwards incompatible) 32-bit mode, did away with several of the most egregious limitations of the original x86 ISA. Most important of these was the awkward memory addressing system of the 8088 and 80286, familiar to MS-DOS users through the distinctions between 'conventional memory', 'expanded memory', and 'extended memory'. 32-bit mode replaced this with a straightforward 'flat' address space similar to the 68000's. Though coprocessors for floating point math had been available since the early days of the PC, the 386 era saw the math coprocessor become much more common; with its integration onto the main CPU in the 486, these 'x87' coprocessors added their own (ugly) instructions to the x86 architecture.

With a power struggle over the replacement of the ISA system bus removing IBM from central guidance of the PC architecture and the death of all competitors except the now PowerPC-based Apple Macintosh, the x86 architecture consolidated its hold on the general computing market in the early 1990s. The rising Windows operating system carried x86 with it wherever it went, despite ports of Windows NT to PowerPC and Alpha machines, and with IBM no longer the central touchstone of new PC hardware, Intel became the face of the core PC architecture. The seeming double monopoly of Intel on the hardware side and Microsoft on the software side led the PC platform to be redubbed 'Wintel', a term used equally as a neutral description by its proponents and as a derogatory term by its detractors.

The continuing ugliness of x86 and unreliability of Windows led many to predict/hope for a new, more modern platform to supplant the Wintel hegemony. By this point, x86 was the only processor architecture remaining from before the 'RISC revolution', where it was found that simplifying the set of processor instructions allowed better hardware and software efficiency. Aesthetically, the RISC designs that competed with x86, with their 'less is more' philosophy, were much better embodiments of good engineering practice and thus deserved to win the day.

However, x86 processors were re-architected to benefit from the strength of RISC processor design, scrapping the original micro-architecture for the combination of a RISC-like execution core with a translation unit that converts the ugly CISC x86 instruction stream into a series of clean, simple RISC-style instructions. This allows the ugliness of the x86 instruction set to only have a constant effect on the overall performance of the processor, rather than requiring the main processor core to be complicated by its requirements. This constant penalty was originally quite large, but as Moore's Law has provided ever greater amounts of computing power, it has dwindled to near meaninglessness. 
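The translation scheme described above can be illustrated with a toy decoder. Everything here is invented for illustration (the 'load/store' micro-op mnemonics and the tuple instruction representation are not any real CPU's internals); it only shows the shape of the idea: one memory-operand CISC-style instruction becomes a short sequence of simple load/compute/store operations, so the complexity stays in the decoder rather than the execution core.

```python
def decode(insn):
    """Toy CISC-to-micro-op translation sketch.

    insn is a tuple (op, dst, src). If the destination is a memory
    operand, the instruction is cracked into a load, a simple
    register-to-register operation, and a store; otherwise it maps
    to a single micro-op. 't0' is an internal temporary register.
    """
    op, dst, src = insn
    if dst.startswith('mem'):
        return [
            f'load t0, {dst}',        # bring the memory operand in
            f'{op} t0, t0, {src}',    # simple RISC-style compute
            f'store {dst}, t0',       # write the result back
        ]
    return [f'{op} {dst}, {dst}, {src}']

# A read-modify-write 'add [0x10], ax' cracks into three micro-ops:
assert decode(('add', 'mem[0x10]', 'ax')) == [
    'load t0, mem[0x10]', 'add t0, t0, ax', 'store mem[0x10], t0']
```

A real decoder also handles prefixes, variable-length encodings, and far more operand forms, which is exactly the "constant penalty" the text describes.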

The Wintel monopoly at the core of the otherwise highly-competitive PC marketplace was broken in the late 1990s and early 2000s. In 1999, AMD, a longtime manufacturer of Intel-compatible processors, released the original Athlon, an x86-compatible processor comparable, for the first time, to Intel's best and most expensive processors. The Athlon entered a long period of competition with Intel's Pentium III and later Pentium 4 processors, with the end result that AMD was comfortable introducing the next major step for the x86 ISA: x86-64, which extends the ISA to 64 bits while eliminating more of its limitations.

In the meantime, the upstart Linux operating system had mostly displaced the older proprietary UNIX systems and positioned itself as the strongest competition to Windows on Windows' own hardware platform. Also, in 2006, Apple abandoned PowerPC and rebuilt the Macintosh on top of a PC-architecture Intel-based machine. While the Mac OS will still only run on custom Apple hardware, it is a testament to the power of the commodity x86 architecture that the Mac made the difficult transition.

Overall, the focus on backwards compatibility in both hardware and software has allowed x86 to become the standard architecture that it has. Change in the computer market, despite the manifold claims of computer companies, is generally gradual, with backward compatibility at every stage allowing people to adopt the new without discarding the old. This is true both for hardware, where the ISA bus lived long past its replacement by PCI, and in the world of proprietary software, where simple recompilation for a new architecture just isn't so simple. The PC architecture and the x86 processor at its core have evolved into something vastly different from what they initially were, while retaining backward compatibility so that old hardware and software can be kept, if not forever, then at least for a little longer. Having defeated all competition in the personal computer market, x86 is now moving up to challenge SPARC and Itanium in large-scale servers, and down to challenge the ubiquitous ARM microcontrollers in the embedded device market.

Intel: from 8086 to 80486 - an emotional story

Of course, one of the best processors of the 70's is the 8086, along with its cheaper near-twin, the 8088. The architecture of these processors is pleasantly distinguished by an absence of mechanical borrowings and adherence to abstract theories, by its thoughtfulness and balance, and by its steadiness and focus on further development. As for drawbacks, the x86 architecture can be called a bit cumbersome and prone to an extensive growth in the number of instructions.

One of the brilliant constructive solutions in the 8086 was the invention of segment registers. This achieved two goals at once: "free" relocatability for program code up to 64 KB in size (a decent amount of memory for one program even into the mid-80's), and addressability of up to 1 MB of address space. Note also that the 8086, like the 8080 or Z80, has a separate address space for I/O ports, 64 KB of it (256 bytes on the 8080 and 8085). There are only four segment registers: one for code, one for the stack, and two for data. Thus 4 * 64 = 256 KB of memory is available for quick use, which was a great deal even in the mid-80's. In fact, code size poses no problem, since long subroutine calls can load and store a full two-register address; there is only a 64 KB limit on the size of one subroutine, which is enough even for many modern applications. Some problem is created by the impossibility of fast addressing of data arrays larger than 64 KB: when using such arrays, a segment register and the address itself must be loaded on each access, which slows work with such large arrays several times over.
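A tiny Python model makes the segment arithmetic concrete (the function name is mine, not Intel notation): the 16-bit segment value is shifted left four bits and added to the 16-bit offset, producing a 20-bit physical address.

```python
def phys(seg, off):
    """Model the 8086's real-mode address calculation: the 16-bit
    segment value is shifted left 4 bits and added to the 16-bit
    offset, giving a 20-bit physical address (wrapping at 1 MB)."""
    return ((seg << 4) + off) & 0xFFFFF

# Different segment:offset pairs can name the same physical byte:
assert phys(0x1000, 0x0010) == phys(0x1001, 0x0000) == 0x10010
# The classic top-of-memory wraparound (source of the A20-line quirk):
assert phys(0xFFFF, 0x0010) == 0x00000
```

The overlap every 16 bytes is exactly what makes code relocation "free": a program can be loaded at any paragraph boundary just by changing its segment values.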

Segment registers are implemented in such a way that their presence is almost invisible in the machine code, so when the time came, it was easy to abandon them.

The architecture of the 8086 retained its proximity to that of the 8080, which made it relatively easy to port programs from the 8080 to the 8086, especially if the source code was available.
The 8086's instructions are not very fast, but they are comparable to competitors such as the Motorola 68000, which appeared a year later. One of the innovations that somewhat accelerated the rather slow 8086 was the instruction queue.
The 8086 has eight 16-bit general purpose registers, some of which can be used as pairs of one-byte registers and some as index registers. The 8086's registers are thus somewhat heterogeneous, but the scheme is well balanced and the registers are very convenient to use. This heterogeneity, incidentally, allows for denser code. The 8086 uses the same flags as the 8080, plus a few new ones; for example, a flag for step-by-step execution appeared, typical of the PDP-11 architecture.

The 8086 allows very interesting addressing modes: an address can be the sum of two registers and a constant 16-bit offset, on which the value of one of the segment registers is then superimposed; the sum can also be reduced to two or even one term. No single PDP-11 instruction can do that. Most 8086 instructions do not allow both operands to be in memory; one of the operands must be a register. But there are string instructions that work memory-to-memory. String instructions allow fast block copying (17 cycles per byte or word), search, fill, load, and compare, and they can also be used with I/O ports. Very interesting is the idea of instruction prefixes, which add often very useful functionality without significantly complicating the instruction encoding scheme.
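A Python sketch of the two ideas above (the function names are illustrative; segments and the direction flag are omitted for brevity): the addressing-mode sum, and the basic REP MOVSB copy loop.

```python
def effective_addr(base=0, index=0, disp=0):
    """Model an 8086 effective-address calculation: any of a base
    register (BX or BP), an index register (SI or DI), and a 16-bit
    displacement may be summed; each term is optional."""
    return (base + index + disp) & 0xFFFF

def rep_movsb(mem, si, di, cx):
    """Sketch of REP MOVSB: copy CX bytes from [SI] to [DI],
    incrementing both pointers (segments DS/ES not modeled)."""
    for _ in range(cx):
        mem[di] = mem[si]
        si += 1
        di += 1
    return si, di

assert effective_addr(0x1000, 0x0020, 0x0004) == 0x1024

mem = bytearray(b"hello world.....")
rep_movsb(mem, 0, 6, 5)          # copy "hello" over "world"
assert mem[6:11] == b"hello"
```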

The 8086 has one of the best stack designs among all computer systems. Using only two registers (BP and SP), it can solve all the problems of organizing subroutine calls with parameters.
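A minimal sketch of the BP/SP calling convention, modeled in Python (the model's list grows upward while the real x86 stack grows downward, and the values are illustrative):

```python
stack = []                       # toy stack; real x86 grows downward
def push(v):
    stack.append(v & 0xFFFF)

push(41)        # caller pushes the argument
push(0x1234)    # CALL pushes the return address
push(0)         # callee prologue: PUSH BP (saved caller BP; 0 here)
bp = len(stack) - 1              # MOV BP, SP: BP now marks the frame

# In real 16-bit code the first argument sits at [BP+4], past the
# saved BP at [BP] and the return address at [BP+2]. In this
# upward-growing list model, that is index bp - 2:
assert stack[bp - 2] == 41
```

With BP fixed for the life of the call, arguments and locals are always at constant offsets from it, no matter how SP moves.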

Among the instructions there are signed and unsigned multiplication and division; there are even unique decimal-correction instructions to accompany multiplication and division. It's hard to say that anything is clearly missing from the 8086 instruction set. Quite the contrary. Dividing a 32-bit dividend by a 16-bit divisor to obtain a 16-bit quotient and 16-bit remainder may take up to 300 clock cycles: not particularly fast, but several times faster than such a division on any 8-bit processor (except the 6309), and comparable in speed to the 68000. Division on x86 has one unexpected feature: it corrupts all the arithmetic flags.
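The division's size constraints can be modeled in Python. This is a behavioral sketch of the documented rule: the 32-bit dividend lives in the DX:AX register pair, and a quotient too large for 16 bits raises the same divide-error exception as division by zero.

```python
def div_32_by_16(dividend, divisor):
    """Model unsigned DIV with a 32-bit dividend (DX:AX) and a
    16-bit divisor: quotient goes to AX, remainder to DX. A zero
    divisor or a quotient over 0xFFFF raises a divide error (#DE)
    on the real CPU."""
    if divisor == 0:
        raise ZeroDivisionError("divide error (#DE)")
    q, r = divmod(dividend, divisor)
    if q > 0xFFFF:
        raise OverflowError("divide error (#DE): quotient overflow")
    return q, r

assert div_32_by_16(0x0001_0000, 2) == (0x8000, 0)
```

Note the quotient-overflow case: it is why naive 16-bit code that divides a large DX:AX by a small number can crash rather than merely truncate.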

It's worth adding that the x86 architecture improved the XCHG instruction inherited from the 8080. In addition, later processors added the instructions XADD, CMPXCHG and CMPXCHG8B, which can also perform atomic exchange of their arguments. Such instructions are a distinctive feature of x86; they are hard to find on processors of other architectures.
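A behavioral sketch of CMPXCHG in Python (the atomicity that the LOCK prefix provides on real hardware is, of course, not modeled here; the memory is just a dict):

```python
def cmpxchg(mem, addr, expected, new):
    """Model of CMPXCHG: compare 'expected' (the accumulator) with
    the destination; if equal, store 'new' and report ZF=1,
    otherwise report ZF=0 and return the value actually seen.
    With a LOCK prefix this is atomic on real hardware, making it
    a building block for spinlocks and lock-free structures."""
    current = mem[addr]
    if current == expected:
        mem[addr] = new
        return True, current      # ZF=1: exchange happened
    return False, current         # ZF=0: caller sees current value

mem = {0x100: 0}
ok, _ = cmpxchg(mem, 0x100, 0, 1)     # acquire a toy lock: 0 -> 1
assert ok and mem[0x100] == 1
ok, seen = cmpxchg(mem, 0x100, 0, 1)  # second attempt fails
assert not ok and seen == 1
```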

To summarize: the 8086 is a very good processor that combines ease of programming with the memory-size constraints of its time. The 8086 itself was used comparatively rarely, yielding to the cheaper 8088 the honor of being the first processor of the mainstream PC. The 8088's 8-bit data bus made it somewhat slower, but allowed systems built on it to be more affordable to customers.

The 80186 and 80286 both appeared in 1982, so it may be assumed that Intel had two almost independent development teams. The 80186 is an 8086 improved by several instructions and shortened timings, plus several support chips typical of x86 systems integrated on-die: a clock generator, timers, DMA, an interrupt controller, a delay generator, etc. Such a processor could greatly simplify the production of computers based on it, but for some unclear reason it was almost never used in PC's. The author knows only of the BBC Master 512, based on the BBC Micro, which did not use the built-in circuits (not even the timer); there were several other systems using the 80186. The 80186's address space remained 1 MB, the same as the 8086's.

The 80286 had even better timings than the 80186; a standout is its just fantastic division (32/16=16,16) in 22 clock cycles - no one has learned to divide faster since! The 80286 supports all the new instructions of the 80186 plus many instructions for working in the new protected mode. It became the first processor with built-in support for protected mode, which made possible memory protection, proper use of privileged instructions, and virtual memory. Although the new mode created many problems (it was rather unsuccessful) and was relatively rarely used, it was a big breakthrough. In this new mode the segment registers acquired a new quality, allowing up to 16 MB of addressable memory and up to 1 GB of virtual memory per task. A big problem with the 80286 was the inability to switch from protected mode back to real mode, in which most programs worked. Using the "secret" undocumented instruction LOADALL, it was possible to use 16 MB of memory while staying in real mode.

In the 80286, the calculation of an operand address became a separate circuit and stopped slowing down instruction execution. This added interesting possibilities: for example, the instruction LEA AX,[BX+SI+4000] performs two additions and a move into the AX register in just 3 cycles!
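The LEA trick can be modeled in Python: the instruction computes an effective address without touching the flags, which is exactly what makes it usable as a cheap three-operand add (the function name and register values here are illustrative).

```python
def lea(base=0, index=0, disp=0):
    """Model LEA: compute base + index + disp, 16-bit wrapped,
    without affecting any flags. Compilers still use LEA for cheap
    three-operand additions (and, with the 386's scaled-index
    modes, for small multiplications)."""
    return (base + index + disp) & 0xFFFF

bx, si = 0x0100, 0x0020
assert lea(bx, si, 0x4000) == 0x4120   # LEA AX,[BX+SI+4000h]
```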

The number of manufacturers and of specific systems using the 80286 is huge, but the first computers were indeed the IBM PC AT, with speed figures that were almost fantastic for a personal computer. With these computers, memory began to lag behind the speed of the processor and wait states appeared, though at the time this still seemed temporary.

The protected mode of the 80286 was extremely inconvenient: it divided all memory into segments of no more than 64 KB and required complicated software support for virtual memory. The 80386, which appeared in 1985, made work in protected mode quite comfortable, allowed up to 4 GB of addressable memory, and made it easy to switch between modes. In addition, virtual 8086 mode was added to support multitasking of programs written for the 8086. For virtual memory, a relatively easy-to-manage paging mode became available. For all its innovations, the 80386 remained fully compatible with programs written for the 80286. Among its innovations you can also find the extension of the registers to 32 bits and the addition of two new segment registers. The timings changed, but ambiguously. A barrel shifter was added, allowing multi-bit shifts with a timing of one cycle; however, for some reason this innovation greatly slowed the rotate instructions. Multiplication became slightly slower than on the 80286. Memory access, on the contrary, became a little faster, though this does not apply to the string instructions, which remained faster on the 80286. The author has often encountered the view that in real mode, with 16-bit code, the 80286 ends up still a little faster.
Several new instructions were added to the 80386, most of which simply gave new ways to work with data, duplicating (with optimization) instructions already present. For example, the following were added:

* instructions to test, set, and reset a bit by number, similar to those on the Z80;
* bit scans, BSF and BSR;
* copying a value with sign or zero extension, MOVSX and MOVZX;
* setting a value depending on the operation flags, SETcc;
* double-width shifts, SHLD and SHRD.

Before the 80386, x86 processors could use only short conditional jumps, with a one-byte offset, which was often not enough. The 80386 made offsets of two or four bytes possible, and although the new jump encodings became two or three times longer, their execution time remained the same as the earlier short jumps.

Debugging support was radically improved by the introduction of 4 hardware breakpoints; using them, it became possible to stop programs even at addresses in memory that cannot be modified.

Protected mode became much easier to manage than on the 80286, which turned a number of inherited instructions into unnecessary rudiments. In the main protected mode, the so-called flat mode, segments up to 4 GB in size are used, which turns all the segment registers into an unobtrusive formality. A semi-documented unreal mode even allowed all memory to be used as in flat mode, but from real mode, which is easy to set up and control.

Since the 80386, Intel has refused to share its technology, becoming in fact the monopoly processor manufacturer for the IBM PC architecture and, with the weakening of Motorola's position, for other personal computer architectures as well. Systems based on the 80386 were very expensive until the early 90's, when they finally became available to mass consumers at frequencies from 25 to 40 MHz. From the 80386 on, IBM began to lose its position as the leading manufacturer of IBM PC compatible computers; this showed, in particular, in the fact that the first PC based on the 80386 was made in 1986 by Compaq.

It's hard to hold back admiration for the volume of work done by the creators of the 80386 and for its results. I even dare to suggest that the 80386 contains more achievements than all the technological achievements of mankind before 1970, and maybe even before 1980.

Quite interesting is the topic of bugs in the 80386; I will describe two. The first chips had some instructions that later disappeared from the manuals for this processor and stopped executing on later chips: the instructions IBTS and XBTS. And all 80386DX/SX chips, whether produced by AMD or Intel (which reveals their curious internal identity), have a very strange and unpleasant bug: the value of the EAX register is destroyed if, after pushing all registers to the stack or popping them back with PUSHAD or POPAD, an instruction is used that addresses memory via the BX register. In some situations the processor could even hang. A nightmare of a bug, and a very widespread one, yet Wikipedia still does not even mention it. There were other bugs too, indeed.

The emergence of ARM changed the situation in the world of computer technology. Despite its problems, ARM processors continued to develop. Intel's answer was the 80486. In the struggle for speed and for first place in the world of advanced technologies, Intel even decided to use a cooling fan, which spoils the look of the PC to the present day.

In the 80486, timings were improved for most instructions, and some began to execute in one clock, as on ARM processors; multiplication and division, though, for some reason became slightly slower. There is a built-in cache, quite big for those years, of 8 KB. There were also new instructions, for example CMPXCHG, which took the place of the quietly vanished IBTS and XBTS (interestingly, this instruction was already available, as a secret, on late 80386 chips). There are very few new instructions, only six, of which it is worth mentioning BSWAP, a very useful instruction for changing the byte order in a 32-bit word. A big and useful innovation was the on-chip arithmetic coprocessor: nobody else had done that yet.

The first systems based on the 80486 were incredibly expensive. Unusually, the first computers based on the 80486, the VX FT models, were made by the English firm Apricot: their price in 1989 ran from 18 to 40 thousand dollars, and the system unit weighed over 60 kg! IBM released its first 80486-based computer in 1990, the PS/2 Model 90, costing $17,000.

It's hard to imagine Intel processors without secret, officially undocumented features, some of them hidden from users since the very first 8086. For example, there is the albeit useless fact that the second byte of the decimal-correction instructions AAD and AAM matters and can be other than ten, i.e. non-decimal (this was documented only for the Pentium, 15 years later!). More unpleasant was the silence about the shortened AND/OR/XOR instructions with a byte-constant operand, for example AND BX,7 with a three-byte opcode (83 E3 07). These instructions, which make code more compact (especially important on the first PCs), were quietly added to the documentation only with the 80386. Interestingly, Intel's manuals for the 8086 and 80286 hint at these instructions, but give no opcodes for them, unlike the similar ADD/ADC/SBB/SUB instructions, for which full information was provided. This, in particular, led to many assemblers (all?) being unable to produce the shorter codes. Another group of secrets is rather strange: a number of instructions have two opcodes. For example, SAL/SHL has opcodes D0 E0 and D0 F0 (or D1 E0 and D1 F0). Usually, and maybe always, only the first opcode is used; the second, secret one is used almost never. One can only wonder why Intel so carefully preserves these superfluous duplicate instructions, cluttering the opcode space. The SALC instruction waited for official documentation until 1995, almost 20 years! The debugging instruction ICEBP officially did not exist for 10 years, from 1985 to 1995. Most of all has been written about the secret instructions LOADALL and LOADALLD; they will remain secret forever, as they could be used for easy access to large amounts of memory only on the 80286 and 80386, respectively.
Until recently there was an intrigue around the UD1 instruction (0F B9), which unofficially served as an example of an invalid opcode. The informal has recently become official.
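The long-undocumented second byte of AAM and AAD mentioned above can be modeled in Python (a behavioral sketch of the eventually documented semantics: the immediate byte is the base, 10 by default):

```python
def aam(al, base=10):
    """Model AAM imm8: AH = AL // base, AL = AL % base.
    Returns (AH, AL). With the default base 10 this splits a binary
    value into two decimal digits; any other base works too."""
    return (al // base) & 0xFF, al % base

def aad(ah, al, base=10):
    """Model AAD imm8: AL = (AH * base + AL) & 0xFF, AH = 0.
    Returns (AH, AL), the inverse of AAM for in-range digits."""
    return 0, (ah * base + al) & 0xFF

assert aam(59) == (5, 9)        # ordinary decimal split
assert aam(59, 16) == (3, 11)   # 'AAM 16' splits into nibbles
assert aad(5, 9) == (0, 59)     # AAD undoes AAM
```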

In the USSR, production of clones of the 8088 and 8086 was mastered, but the 80286 could never be fully reproduced.

This is a copy of https://litwr.livejournal.com/436.html
