Analysis of the 30-Year-Old Pentium FDIV Bug and Intel's $475 Million Recall

In 1994, Intel faced its first major recall because of the FDIV bug in the original Pentium processor, at a cost of approximately $475 million USD. The bug traced back to missing transistors in the programmable logic array (PLA) that holds the lookup table used for division. Hardware historian and reverse-engineer Ken Shirriff examined the Pentium's CPU die under a microscope and identified the exact locations in the PLA responsible for the error. The Pentium, Intel's first CPU based on the P5 architecture, was manufactured on an 800nm process and contained around 3.1 million transistors. Today's processors pack in tens of billions of transistors, but on the Pentium the individual transistor grids are large enough to see under a microscope, which is what allowed Shirriff to pinpoint the flaw.

The FDIV bug was a mathematical error in the floating-point unit (FPU) of the Pentium processor, caused by incorrect entries in the PLA used during division. The Pentium's FPU used the SRT division algorithm, which let it generate two quotient bits per clock cycle instead of the single bit per cycle of its predecessor. To do this, the FPU consulted a 2,048-cell table on the die, holding values from -2 to 2 and implemented as 112 rows in the PLA. Whether a given value appeared in the table depended on whether transistors were present at specific grid points. Because the transistors for five table entries were missing, certain floating-point divisions returned incorrect results. Intel initially estimated that a typical user would encounter the error once every 27,000 years. IBM disputed that figure, reporting that the bug could show up as often as every 24 days, and halted sales of its Pentium systems. Under mounting financial and public pressure, Intel decided to replace all affected processors to preserve market trust.
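The digit-by-digit idea behind SRT division can be illustrated with a short sketch. This is not the Pentium's circuit: the hardware picks each quotient digit from the lookup table using only a few leading bits of the partial remainder and the divisor, whereas this hypothetical `srt_radix4_divide` function chooses the digit by exact rounding. It only shows how digits in the range -2 to 2 produce two quotient bits per step.

```python
from fractions import Fraction


def srt_radix4_divide(x, d, steps=16):
    """Approximate x / d with a radix-4 digit recurrence (digits -2..2).

    Illustrative assumption: the digit is chosen by exact rounding, not by
    the table lookup that real SRT hardware uses.
    """
    x, d = Fraction(x), Fraction(d)
    if steps < 1 or d <= 0 or abs(x) > Fraction(8, 3) * d:
        raise ValueError("sketch assumes steps >= 1, d > 0, |x| <= (8/3) * d")

    remainder = x / 4  # pre-scale so that |remainder| <= (2/3) * d
    quotient = 0
    digits = []
    for _ in range(steps):
        remainder *= 4  # shift by one radix-4 digit (two bits)
        # Pick the nearest digit, clamped to the allowed set -2..2.
        digit = max(-2, min(2, round(remainder / d)))
        remainder -= digit * d  # |remainder| stays <= (2/3) * d
        quotient = quotient * 4 + digit
        digits.append(digit)

    # Undo the pre-scaling: the digits approximate x / (4 * d).
    return Fraction(quotient, 4 ** (steps - 1)), digits


if __name__ == "__main__":
    approx, digits = srt_radix4_divide(4195835, 3145727, steps=16)
    print(float(approx), digits)
```

Each pass through the loop settles one radix-4 digit, which is why a wrong digit from a bad table entry can corrupt the final quotient.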

Shirriff's examination also showed that the flaw in the table was larger than originally described: the PLA was missing 16 entries rather than the five that had been recognized. According to his analysis, only five of those missing entries could actually produce wrong results, so the remaining 11 went unnoticed. Intel's fix was to fill every unused entry in the PLA with the value 2, which eliminated the errors without requiring major changes to the processor's design. The change also simplified the PLA, saving die space in later Pentium revisions.
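For readers curious how the bug looked from software, a widely circulated check from the period divided 4195835 by 3145727 and multiplied the result back. On a correct FPU the residual is essentially zero, while flawed Pentiums were reported to return 256. The function names below are hypothetical, and the threshold of 1.0 is an assumption chosen only to separate rounding noise from a gross error.

```python
def fdiv_residual(x: float = 4195835.0, y: float = 3145727.0) -> float:
    """Return x - (x / y) * y, which should be close to zero."""
    return x - (x / y) * y


def division_looks_flawed(threshold: float = 1.0) -> bool:
    """Flag a residual far larger than ordinary rounding error.

    The threshold is an illustrative assumption, not a documented value.
    """
    return abs(fdiv_residual()) > threshold


if __name__ == "__main__":
    print("residual:", fdiv_residual())
    print("flawed division suspected:", division_looks_flawed())
```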


Source: Ken Shirriff via Tom's Hardware
