Although more than a dozen companies make microprocessors, any new personal computer you buy will likely be based on a chip from one of only three companies: Advanced Micro Devices, Intel, or Transmeta. Intel, the largest semiconductor-maker in the world, makes the majority of computer processors—about 80 percent in the first quarter of the year 2002, according to Mercury Research. In the same period, AMD sold 18.2 percent of the chips destined for personal computers.
Intel earned its enviable position not only by inventing the microprocessor but also through a quirk of fate: IBM chose one of its chips for its first Personal Computer in 1981, the machine that all modern personal computers have been patterned after. Computers must use Intel chips or chips designed to match the Intel Architecture to be able to run today's most popular software. Because microprocessors are so complicated, designing and building them requires a huge investment, which prevents new competitors from edging into the market.
Currently Intel sells microprocessors under several brand names. The most popular is the Pentium 4, which Intel claims as its best-selling microprocessor, ever. But the Pentium 4 is not a single chip design. Rather, it's a trade name. Intel has marketed two distinctly different chip designs as Pentium 4. More confusing still, Intel markets the same core logic under more than one name. At one time, the Pentium, Celeron, and Xeon all used essentially the same internal design. The names designated the market segment Intel hoped to carve for the respective chips—Celeron for the price-sensitive low end of the personal computer marketplace, Pentium for the mainstream, and Xeon for the pricey, high-end server marketplace. Although Intel did tailor some features to justify its market positioning of the chips, they nevertheless shared the same circuitry deep inside.
Today, that situation has changed. Intel now designs Xeon chips separately, and Celerons often retain older designs and technologies longer than mainstream Pentiums. The market positions assigned to the microprocessor names remain the same. Intel offers Celeron chips for the budget conscious; they sacrifice the last bit of performance to make computers more affordable. Pentium processors are meant for the mainstream (most computer purchasers) and deliver full performance for single-user computers. Xeons are specialized microprocessors designed primarily for high-powered server systems. Intel has added a further name, Itanium, to its trademark lineup. The Itanium uses a new architecture (usually termed IA64, shorthand for 64-bit Intel Architecture) that is not directly compatible with software meant for other Intel chips.
The important lesson is that the names you see on the market are only brand names and do not reflect what's inside a chip. Some people avoid confusion by using the code name the manufacturer gave a particular microprocessor design during its development. These code names allow consumers to distinguish the Northwood processor from the Willamette, both of which are sold as Pentium 4. The Northwood is a newer design with higher-speed potential. Table 5.7 lists many of Intel's microprocessor code names.
Announced in January and officially released on February 26, 1999, the Pentium III was the swan song for Intel's P6 processor core, developed for the Pentium Pro. Code-named Katmai during its development, the Pentium III chip is most notable for adding SSE to the Intel microprocessor repertory. SSE is a compound acronym for Streaming SIMD Extensions (SIMD itself being an acronym for Single Instruction, Multiple Data). SIMD technology allows one microprocessor instruction to operate across several bytes or words (or even larger blocks of data). In the Pentium III, SSE (formerly known as the Katmai New Instructions or KNI) is a set of 70 new SIMD codes for microprocessor instructions that allows programs to specify elaborate three-dimensional processing functions with a single command.
Unlike the MMX extensions, which added no new registers to the basic Pentium design and instead simply redesignated the floating-point unit registers for multimedia functions, Intel's Streaming SIMD Extensions add new registers to Intel architecture, pushing the total number of transistors inside the core logic of the chip above 9.5 million.
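The SIMD idea can be sketched in a few lines of Python. This is only an illustration of one operation applied across several packed values at once. The lane count, lane width, and function names are assumptions made for the sketch, and it does not reproduce the actual SSE register layout or instruction encoding.

```python
# Illustrative only: emulate a packed add on several lanes held in one wide
# value, the way a single SIMD instruction treats a wide register.
LANES = 4          # assumed lane count for this sketch
LANE_BITS = 16     # assumed lane width for this sketch
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into one wide integer, lane 0 lowest."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a wide integer back into its LANES lane values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """One 'instruction': add each lane of a to the matching lane of b.

    Every lane wraps around on overflow and never carries into its neighbor.
    """
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 65535])
b = pack([10, 20, 30, 1])
result = unpack(packed_add(a, b))
print(result)
```

A single call to packed_add stands in for the single instruction; without SIMD, a program would issue one add per value.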
At heart, the Pentium III uses the same core logic as its Pentium II and Pentium Pro forebears, amended to handle its larger instruction set. The enhancements chiefly involve the floating-point unit, which does double-duty processing multimedia instructions. In other words, the Pentium III does not mark a new generation of microprocessor technology or performance. Even Intel noted that on programs that do not take advantage of the Streaming SIMD Extensions of the Pentium III, the chips deliver performance that's about the same as a Pentium II.
Although the initial fabrication used 0.25-micron technology, Intel rapidly shifted to 0.18-micron technology with the Coppermine design. The result is that the circuitry of the chip takes less silicon (making fabrication less expensive), requires less power, and is able to operate at higher speeds. The initial Pentium III releases, and all versions through at least the 600MHz chip, operate with a 100MHz memory bus. Many of the new chips using the Coppermine core ratchet the maximum memory speed to 133MHz using Rambus memory technology, although some retain a 100MHz maximum memory speed.
In going to the new Coppermine design, Intel replaced earlier Pentium III chips with 0.25-micron design features, in particular those at the 450, 533, 550, and 600MHz speeds. The newer chips are designated with the suffix E. In addition, to distinguish chips with 133MHz front-side bus capability (when 100MHz versions of the chip were once offered), Intel added a B suffix to the designation of 533 and 600MHz chips that are capable of running their memory buses at the higher 133MHz speed.
The Pentium III was the first Intel processor to cross the line at a 1GHz clock speed with a chip released on March 8, 2000. The series ended its development run at 1.13GHz on July 31, 2000, although Intel continued to manufacture the chip into 2002.
The Pentium III was designed to plug into the same Slot 1 as the Pentium II; however, the Pentium III now comes in three distinct packages. One, the SEC cartridge, is familiar from the Pentium II. Pentium III is also available in the SEC cartridge 2 (or SECC2). In all aspects, the SECC2 is identical to the SEC cartridge and plugs into the same slot (Slot 1, which Intel has renamed SC242), but the SECC2 package lacks the thermal plate of the earlier design. Instead, the SECC2 is designed to mate with an external heatsink, and, because of the lack of the thermal plate, it will make a better thermal connection for more effective cooling. (Well, it makes sense when Intel explains it.) In addition, the 450 and 550E are available in a new package design termed FC-PGA (for Flip-Chip Pin Grid Array), which is more compact and less expensive than the cartridge design.
As with the Pentium II, the 0.25-micron versions of the Pentium III have a 32KB integral primary cache and include a 512KB secondary cache on the same substrate but not in the same hermetic package as the core CPU. The secondary cache runs at half the chip speed. The newer 0.18-micron versions have a smaller, 256KB secondary cache but operate it at full chip speed and locate it on the same silicon as the core logic. In addition, Intel has broadened the data path between the core logic and the cache to enhance performance (Intel calls this Advanced Transfer Cache technology). According to Intel, these improvements give the newer Coppermine-based (0.18-micron) Pentium III chips a 25-percent performance advantage over older Pentium III chips operating at the same clock speed. The entire Pentium III line supports multiprocessing with up to two chips.
The most controversial aspect of the Pentium III lineup is its internal serial number. Hard-coded into the chip, this number is unique to each individual microprocessor. Originally Intel foresaw that a single command, including a query from a distant Web site, would cause the chip to send out its serial number for positive identification (of the chip, of the computer it is in, and of the person owning or using the computer). Intel believed the feature would improve Internet security, not to mention allowing the company to track its products and detect counterfeits. Consumer groups saw the "feature" as an invasion of privacy, and under threat of boycott Intel changed its policy. Where formerly the Pentium III would default to making the identification information available, after the first production run of the new chip, the identification would default to off and require a specific software command to make the serial number accessible. Whether the chip serial number is available becomes a setup feature of the BIOS in PCs using the Pentium III chip, although a software command can override that setting. In other words, someone can always interrogate your PC to discover your Pentium III's serial number. Therefore, you might want to watch what you say online when you run with the Pentium III. Table 5.8 summarizes the history of the Pentium III.
Pentium III Xeon
To add its Streaming SIMD Extensions to its server products, on March 17, 1999, Intel introduced the Pentium III Xeon. As with the Pentium III itself, the new instructions are the chief change, but they are complemented by a shift to finer technology. As a result, the initial new Xeons start with a speed of 500MHz. At this speed, Intel offers the chip with either a 512KB, 1MB, or 2MB integral secondary cache operating at core speed. The new Slot 2 chips also incorporate the hardware serial number feature of the Pentium III chip.
Developed under the code name Tanner, the Pentium III Xeon improved upon the original (Pentium II) Xeon with additions to the core logic to handle Intel's Streaming SIMD Extensions. Aimed at the same workstation market as the original Xeon, the Pentium III Xeon is distinguished from the ordinary Pentium III by its larger integral cache, its Slot 2 packaging, and its wider multiprocessor support—the Pentium III Xeon design allows for servers with up to eight processors.
When Intel introduced its Coppermine 0.18-micron technology on October 25, 1999, it unveiled three new Pentium III Xeon versions with speed ratings up to 733MHz. Except for packaging, however, these new Xeons differed little from the ordinary Pentium III line. As with the mainstream processors, the Xeons supported a maximum of two processors per system and had cache designs identical to the ordinary Pentium III with a 256KB secondary cache operating at full processor speed using wide-bus Advanced Transfer Cache technology. In May of 2000, Intel added Xeons with larger, 1MB and 2MB caches as well as higher-speed models with 256KB caches. Table 5.9 lists the characteristics of all of Intel's Pentium III Xeon chips.
Intel Pentium 4
Officially introduced on November 20, 2000, the Pentium 4 is Intel's newest and most powerful microprocessor core for personal computers. According to Intel, the key advance made by the new chip is its use of NetBurst micro-architecture, which can be roughly explained as a better way of translating program instructions into the micro-ops that the chip actually carries out. NetBurst is the first truly new Intel core logic design since the introduction of the P6 (Pentium Pro) in 1995.
Part of the innovation is an enhancement to the instruction set; another part is an improvement to the underlying hardware. All told, the first Pentium 4 design required the equivalent of 42 million transistors.
Chips designated Pentium 4 actually use one of two designs. Intel code-named the early chips Willamette and used 0.18-micron design rules in their fabrication. At initial release, these chips operated at 1.4GHz and 1.5GHz, but Intel soon upped their clock speeds. At the time, Intel and AMD were in a horserace for the fastest microprocessor, and the title shifted between the Athlon and Pentium 4 with each new chip release.
In 2002, Intel shifted to 0.13-micron design rules with a new processor core developed under the code name Northwood. This shift resulted in a physically smaller chip that also allows more space for cache memory: whereas the Willamette chips boast 256KB of on-chip Level Two cache operating at full core speed, the Northwood design doubles that to 512KB. The difference is in size only. Both chips have a 256-bit-wide connection with the caches, which use an eight-way set-associative design.
Of particular note, the Pentium 4 uses a different system bus from that of the Pentium III chip. As a practical matter, that means the Pentium 4 requires different chipsets and motherboards from those of the Pentium III. Although this is ordinarily the concern of the computer manufacturer, the new design has important benefits. It adds extra speed to the system (memory) bus by shifting data up to four times faster than older designs using a technology Intel calls Source-Synchronous Transfer. In effect, this signaling system packs four bits of information into each clock cycle, so a bus with a 133MHz nominal clock speed can shift data at an effective rate of 533MHz. The address bus is double-clocked, signaling twice in each clock cycle, yielding an effective rate of 266MHz. In that the Pentium 4, like earlier Pentiums, has a 64-bit-wide data bus, that speed allows the Pentium 4 to move information at a peak rate of 4.3GBps (that is, 8 bytes times 533MHz).
Only Northwood chips rated at 2.26GHz, 2.4GHz, and 2.53GHz have 533MHz system buses. Other Northwood chips, as well as all Willamette versions, use a 400MHz system bus (that is, a quadruple-clocked 100MHz bus). Note that chips operating at 2.4GHz may have either a 400MHz or 533MHz system bus. The system bus speed is set in the system design and cannot be varied through hardware or software.
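The bus arithmetic above is easy to check with a short Python sketch. The function names are made up for illustration, and the sketch assumes that the nominal "133MHz" clock is really 133.33MHz, which is why four transfers per cycle are quoted as 533MHz rather than 532MHz.

```python
def effective_rate_mhz(base_clock_mhz, transfers_per_clock):
    """Effective transfer rate of a bus that signals several times per clock."""
    return base_clock_mhz * transfers_per_clock


def peak_bandwidth_gb_per_s(base_clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak data rate in GBps (decimal), given the bus width in bits."""
    bytes_per_transfer = bus_width_bits / 8
    megabytes_per_s = effective_rate_mhz(base_clock_mhz, transfers_per_clock) * bytes_per_transfer
    return megabytes_per_s / 1000


# Quad-pumped 64-bit data bus at the two base clocks the text mentions.
# 133.33 is an assumed exact value for the nominal 133MHz clock.
for base in (100.0, 133.33):
    rate = effective_rate_mhz(base, 4)
    bandwidth = peak_bandwidth_gb_per_s(base, 4, 64)
    print(f"{base:.2f}MHz base: {rate:.0f}MHz effective, {bandwidth:.1f}GBps peak")
```

These are peak figures only; sustained throughput depends on the memory and chipset behind the bus.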
The Pentium 4 has three execution units. The two integer arithmetic/logic units (ALUs) comprise what Intel calls a rapid execution engine. They are "rapid" because they operate at twice the speed of the rest of the core logic (that is, 5.06GHz in a 2.53GHz chip), executing up to two instructions in each clock cycle. The registers in each ALU are 32 bits wide.
Unlike previous Intel floating-point units, the registers in the Pentium 4 FPU are 128 bits wide. The chief benefit of these wider registers is in carrying out multimedia instructions using a further enhancement on Intel's Streaming SIMD Extensions (SSE), a set of 144 new instructions (mainly aimed at moving bytes in and out of the 128-bit registers but also including double-precision floating-point and memory-management instructions) called SSE2.
Intel lengthened the pipelines pumping instructions into the execution units to 20 stages, the longest of any microprocessor currently in production. Intel calls this design hyperpipelined technology.
One way Intel pushes more performance from the Pentium 4 is by double-clocking the integer units in the chip. They operate at twice the external clock frequency applied to the chip (that is, at 3GHz in the 1.5GHz Pentium 4 chip). Balancing the increased speed is a 400MHz system bus throughput as well as an improved primary cache and integral 256KB secondary cache. Intel connects this secondary cache to the rest of the chip through a new 256-bit-wide bus, double the size of those in previous chips.
Intel also coins the term hyperpipelining to describe the Pentium 4. The term refers to Intel's doubling the depth of the instruction pipeline, as compared to the previous line-leader, the Pentium III. One (but not all) of the pipelines in the Pentium 4 stretches out for 20 stages. Intel claims that the new NetBurst micro-architecture enabled the successful development of the long pipeline because it minimizes the penalties associated with mispredicting instruction branches.
The Pentium 4 recognizes the same instruction set as previous Intel microprocessors, including the Streaming SIMD Extensions introduced with the Pentium III, but the Pentium 4 adds 144 more instructions to the list. The result is termed by Intel SSE2. The chip has the same basic data and address bus structure as the Pentium III, allowing it to access up to 64GB of physical memory eight bytes (64-bits) at a time.
Table 5.10 summarizes the Intel Pentium 4 line.
Xeon (Pentium 4)
On May 21, 2001, Intel released the first of its Xeon microprocessors built using the NetBurst core logic of the Pentium 4 microprocessor. To optimize the chip for use in computer servers, the company increased the secondary cache size of the chip, up to 2MB of on-chip cache. For multiprocessor systems, Intel later derived a separate chip, the Xeon MP processor, for servers with two to four microprocessors. The chief difference between the MP chip and the base Xeon chip is the former chip's caching—a three-level design. The primary cache is 8KB, the secondary cache is 256KB, and the tertiary cache is either 512KB or 1MB.
Table 5.11 summarizes the Pentium 4 Xeon, including Xeon MP processors.
The diversity of models that Intel puts on the desktop is exceeded only by the number of microprocessors it makes for portable computers. The current lineup includes four major models, each of which includes chips meant to operate on three different voltage levels, in a wide range of frequencies. The choices include Mobile Celeron, Mobile Pentium III (soon to fall from the product line, as of this writing), Mobile Pentium III-M, and Mobile Pentium 4-M. Each has its target market, with Celeron at the low end, Pentium 4 at the highest, and the various Pentium III chips for everything in between. Each chip shares essentially the same core as the desktop chip bearing the same designation. But mobile chips have added circuitry for power management and, in some models, different packaging.
Microprocessors for portable computers differ from those meant for desktop systems in three ways: operating power, power management, and performance. The last is a result of the first and, in normal operation, the second.
To help portable computers run longer from a single charge of their batteries, Intel and other microprocessor manufacturers reduce the voltage at which their chips operate. Intel, in fact, produces chips in three voltage ranges, which it calls very (or sometimes, ultra) low voltage, low voltage, and, for its standard mobile chips, nothing at all. Very-low-voltage chips operate as low as 0.95 volts. Low-voltage chips dip down to about 1.1 volts. Low-voltage operation necessitates lower-speed operation, which limits performance. Of the chips that Intel produces in all three power levels, the very-low-voltage chips inevitably have the lowest megahertz rating.
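Why a small voltage cut matters can be seen from a common first-order approximation that the text does not state: the dynamic power of a CMOS chip scales roughly with the square of its voltage times its clock frequency. The Python sketch below applies that assumed relation to the two voltages mentioned above. It ignores leakage and every other real-world effect, and the function name is hypothetical.

```python
def relative_dynamic_power(voltage, frequency, ref_voltage, ref_frequency):
    """Power relative to a reference chip, assuming P is proportional to V^2 * f.

    This is a first-order approximation (an assumption of this sketch),
    not a figure published by Intel.
    """
    return (voltage / ref_voltage) ** 2 * (frequency / ref_frequency)


# Compare a 0.95-volt chip with a 1.1-volt chip at the same clock speed.
ratio = relative_dynamic_power(0.95, 1.0, 1.1, 1.0)
print(f"Relative dynamic power at equal clock speed: {ratio:.2f}")
```

Because lower-voltage chips also run at lower clock speeds, the frequency term reduces the figure further.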
Power management aids in the same end—prolonging battery life—and incidentally prevents your laptop computer from singeing you should you operate it on your lap. Some machines still get uncomfortably hot.
Intel aims its Mobile Celeron at the budget market. The chips are not only restricted to lower clock speeds than their Pentium siblings, but most Mobile Celeron chips lack Intel's SpeedStep power-saving technology. The Mobile Celeron perpetually lags Intel's desktop processors in other performance indexes. For example, much like the desktop Celerons, the mobile chips usually are one step out-of-date in front-side bus speed. Although Intel has bumped its top memory bus speed to 533MHz, it only endows the latest versions of the Mobile Celeron with the older quad-pumped 400MHz rating. Most Mobile Celerons still have 100MHz front-side buses.
Much like the desktop line, the first Mobile Celeron was a modest alteration of the Pentium II. Introduced on January 25, 1999, it used the same core logic but with a different cache design, a 128KB cache on the chip substrate operating at full core speed. Otherwise, the chip followed the Pentium II design and followed it up in speed, from 266MHz up to 466MHz, using the same 66MHz front-side bus as the desktop chips and the same 0.25-micron design rules. It differed chiefly by operating at a lower, power-saving voltage, 1.6 volts.
On February 14, 2000, Intel revised the Mobile Celeron design to take advantage of the Pentium III core and its 0.18-micron design rules. The newer Mobile Celerons gained two advantages: the higher, 100MHz front-side bus speed and the Streaming SIMD Extensions to the instruction set. In addition to a faster chip (at 500MHz), Intel also introduced a 450MHz chip, slower than the quickest of the old Mobile Celeron design but able to take advantage of the higher bus speed. Intel continued to upgrade this core up to a speed of 933MHz, introduced on October 1, 2001.
When Intel moved the Mobile Celeron line to 0.13-micron technology, the company cut the core voltage of the chips down to 1.45 volts while pushing up the top clock speed to 1.2GHz. The smaller design rules left more space on the chip's silicon, which Intel used for an enhanced secondary cache, pushing it to 256KB.
On June 24, 2002, Intel switched the Mobile Celeron core once again, bringing in a design derived from the Pentium 4. The new core allowed Intel to trim the chip's operating voltage once again, down to 1.3 volts. In addition, new versions of the Mobile Celeron boast the quad-pumped 400MHz front-side bus speed of the Pentium 4 as well as its enhanced Streaming SIMD Extensions 2 instruction set.
Table 5.12 summarizes the life history of Intel's Mobile Celeron product line.
Mobile Pentium III
For about a year, Intel's most powerful mobile chips wore the Pentium III designation. As the name implies, they were derived from the desktop series with several features added to optimize them for mobile applications and, incidentally, to bring the number of transistors on their single slice of silicon to 28 million. Table 5.13 summarizes the Mobile Pentium III product line.
Mobile Pentium III-M
When Intel shifted to new fabrication, the company altered the core logic design of the Pentium III. Although the basic design remained the same, the tighter design rules allowed for higher-speed operation. In addition, Intel improved the power management of the Mobile Pentium III with Enhanced SpeedStep technology, which allows the chip to shift down in speed in increments to conserve power. Table 5.14 summarizes the Mobile Pentium III-M lineup.
Mobile Pentium 4-M
Intel's highest performance mobile chip is the Mobile Pentium 4-M. Based on the same core logic as the line-leading Pentium 4, the mobile chip is enhanced with additional power-management features and lower-voltage operation. Table 5.15 summarizes the Pentium 4-M lineup.
For use in computers with strict power budgets—either because the manufacturer decided to devote little space to batteries or because the maker opted for extremely long runtimes—Intel has developed several lines of low-voltage microprocessors. These mobile chips have operating voltages substantially lower than the mainstream chips. Such low-voltage chips have been produced in three major mobile processor lines: the Mobile Celeron, the Mobile Pentium III, and the Mobile Pentium III-M.
For systems in which power consumption is absolutely critical, Intel has offered versions of its various mobile chips designed for ultra-low-voltage operation. Ultra, like beauty, is in the eye of the beholder—these chips often operate at voltages only a fraction lower than ordinary low-voltage chips. The lower operating voltage limits the top speed of these chips; they are substantially slower than the ordinary low-voltage chips at the time of their introduction. But again, they are meant for systems where long battery life is more important than performance. Table 5.16 summarizes Intel's ultra-low-voltage microprocessors.
Intel's Itanium microprocessor line marks an extreme shift for Intel, entirely breaking with the Intel Architecture of the past—which means that the Itanium cannot run programs or operating systems designed for other Intel chips. Instead of using the old Intel design dating back to the 4004, the Itanium introduces Intel's Explicitly Parallel Instruction Computing architecture, which is based on the Precision Architecture originally developed by Hewlett-Packard Corporation for its line of RISC chips. In short, that means everything you know about Intel processors doesn't apply to the Itanium, especially the performance you should expect from the megahertz ratings of the chips. Itanium chips look slow on paper but perform fast in computers.
The original Itanium was code-named Merced and was introduced in mid-2001, with an announcement from Intel on May 29, 2001, that systems soon would be shipping. The original Itanium was sold in two speeds: 733MHz and 800MHz. Equipped with a 266MHz system bus (double-clocked 133MHz), the Itanium further enhanced performance with a three-level on-chip cache design with 32KB in its primary cache, 96KB in its secondary cache, and 2MB or 4MB in its tertiary cache (depending on chip model). All aspects of the chip feature a full 64-bit bus width, both data and address lines. Meant for high-performance servers, the Itanium allows for up to 512 processors in a single computer.
Introduced on July 8, 2002, the Itanium 2 (code-named McKinley) pushed up the clock speed of the same basic design as the original Itanium by shifting down the design rules to 0.13 micron. Intel increased the secondary cache of the Itanium 2 to 256KB and allowed for tertiary caches of 1.5MB or 3MB. The refined design also increased the system bus speed to 400MHz (actually, a double-clocked 200MHz bus) with a bus width of 128 bits. Initially, Itanium 2 chips were offered at speeds of 900MHz and 1.0GHz. Table 5.17 summarizes the Itanium line.
Advanced Micro Devices Microprocessors
Advanced Micro Devices currently fields two lines of microprocessor. Duron chips correspond to Intel's Celeron line, targeting the budget-minded consumer. Athlon chips are mainstream, full-performance microprocessors. Around the end of the year 2002, AMD will add the Opteron name to its product line. Meant to compete with the performance of Intel's Itanium, the Opteron will differ with a design meant to run today's software as well as or better than current Intel processors. Opteron will become the new top end of the AMD lineup.
AMD's answer to the Pentium III and its P6 core was the Athlon. The Athlon is built on a RISC core with three integer pipelines (compared to the two inside the Pentium III), three floating-point pipelines (versus one in the Pentium III), and three instruction decoders (compared to one in the Pentium III). The design permits the Athlon to achieve up to nine operations per clock cycle, compared to five for the Pentium III. AMD designed the floating-point units specifically for multimedia and endowed them with both the MMX (under Intel license) and 3DNow! instruction sets.
Program code being what it is (not very amenable to superscalar processing), the Athlon's advantage proved more modest in reality. Most people gave the Athlon a slight edge on the Pentium III, megahertz for megahertz. The Athlon chip was more than powerful enough to challenge Intel for leadership in processing power. For more than a year, AMD and Intel ran a speed race for the fastest processor, with AMD occasionally edging ahead even in pure megahertz.
The Athlon has several other features that help to boost its performance. It has both primary (L1) and secondary (L2) caches on-chip, operating at the full speed rating of the core logic. A full 128KB is devoted to the primary cache, half for instructions and half for data, and 256KB is devoted to the secondary cache, for a total (the figure that AMD usually quotes) of 384KB of cache. The secondary cache connects to the chip through a 64-bit back-side bus operating at the core speed of the chip.
The system bus of the Athlon also edged past that used by Intel for the Pentium III. At introduction, the Athlon allowed for a 200MHz system bus (and 133MHz memory). Later, in March, 2001, the system bus interface was bumped up to 266MHz. This bus operates asynchronously with the core logic, so AMD never bothered with some of the odd speeds Intel used for its chips. The instruction set of the Athlon includes an enhanced form of AMD's 3DNow! The Athlon recognizes 45 3D instructions, compared to 21 for AMD's previous-generation K6-III chip.
The design of the Athlon requires more than 22 million transistors. As with other chips in its generation, it has registers 32 bits wide but connects to its primary cache through a 128-bit bus and to the system through a 64-bit data bus. It can directly address up to 8TB of memory through an address bus that's effectively 43 bits wide.
AMD has introduced several variations on the Athlon name: the basic Athlon, the Athlon 4 (to parallel the introduction of the Pentium 4), and the Athlon XP (paralleling Microsoft's introduction of Windows XP). The difference between the Athlon and Athlon 4 is in name alone. The basic core of all these Athlons is the same. Only the speed rating has increased with time. With the XP designation, however, AMD added Intel's Streaming SIMD Extensions to the instruction set of the chip, giving it better multimedia performance.
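The link between address-bus width and addressable memory is simple powers-of-two arithmetic, sketched below in Python. The helper names are hypothetical. The 43-bit and 8TB figures come from the text, and the sketch also works backward from the 64GB limit quoted earlier for the Pentium 4 to the number of address bits that limit implies.

```python
GB = 1 << 30  # binary gigabyte
TB = 1 << 40  # binary terabyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address bus of this width can form."""
    return 1 << address_bits


def address_bits_needed(memory_bytes):
    """Smallest address width able to reach every byte of the given memory."""
    return (memory_bytes - 1).bit_length()


athlon_tb = addressable_bytes(43) // TB
pentium_bits = address_bits_needed(64 * GB)
print(f"43 address bits reach {athlon_tb}TB")
print(f"64GB requires {pentium_bits} address bits")
```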
The Athlon comes in cartridge form and slides into AMD's Slot A. Based on the EV6 bus design developed by Digital Equipment Corporation (now part of Compaq) for the Alpha chip (a microprocessor originally meant for minicomputers but now being phased out), the new socket is physically the same as Intel's Slot 1, but the signals are different and the AMD chip is incompatible with slots for Intel processors.
AMD fabricated its initial Athlon chips using 0.25-micron design rules. In November, 1999, the company shifted to new fabrication facilities that enabled it to build the Athlon with 0.18-micron design rules.
Table 5.18 summarizes the features of the AMD Athlon line.
For multiprocessor applications, AMD adapted the core logic of the Athlon chip with bus control circuitry meant for high-bandwidth transfers. These chips are specifically aimed at servers rather than desktop computers. Table 5.19 summarizes the AMD offerings.
To take on Intel's budget-priced Celeron chips, AMD slimmed down the Athlon to make a lower-priced product. Although based on the same logic core as the Athlon, the Duron skimps on cache. Although it retains the same 128KB primary cache, split with half handling data and half instructions, the secondary cache is cut to 64KB. As with the Athlon, however, both caches operate at full core logic speed. The smaller secondary cache reduces the size of the silicon die required to make the chip, allowing more Durons than Athlons to be fabricated from each silicon wafer, thus cutting manufacturing cost.
The basic architecture of the Duron core matches the Athlon with three integer pipelines, three floating-point pipelines (which also process both 3DNow! and Intel's MMX instruction sets), and three instruction/address decoders. Duron chips even share the same 0.18-micron technology used by the higher-priced Athlon. For now, however, Durons are restricted to lower speeds than the Athlon line and have not benefited from AMD's higher-speed 266MHz system bus. All Durons use a 200MHz bus.
During development AMD used the code name Spitfire for the Duron. The company explains the official name of the chip as "derived from the Latin root durare, meaning 'to last' and on, meaning 'unit.'" The root is the same as the English word durability. Table 5.20 summarizes the characteristics of the AMD Duron line.
As with its desktop processors, AMD has two lines of chips for portable computers, the Athlon and Duron, for the high and low ends of the market, respectively. Unlike Intel, AMD puts essentially the same processors as used on the desktop in mobile packages. The AMD chips operate at the same low voltages as chips specifically designed for mobile applications, and AMD's desktop (and therefore, mobile) products all use its power-saving PowerNow! technology.
The one difference: AMD shifted to 0.13-micron technology for its portable Athlon XP while the desktop chip stuck with 0.18-micron technology. Table 5.21 summarizes AMD's Mobile Athlon product line.
As with desktop chips, the chief difference between AMD's Mobile Athlon and Mobile Duron is the size of the secondary cache—only 64KB in the Duron chips. Table 5.22 summarizes the Mobile Duron line.
AMD chose the name Opteron for what it calls its eighth generation of microprocessors, which carried the code name Hammer during development. The Opteron represents the first 64-bit implementation of Intel Architecture, something Intel has neglected to develop.
The Opteron design extends the registers of Pentium-style computers to a full 64 bits wide. It's a forthright extension of the current Intel Architecture, and AMD makes the transition the same way Intel extended the original 16-bit registers of the 8086-style chips to 32 bits for the 386 series. The new, wide registers are a superset of the 32-bit registers. In the Opteron's compatibility mode, 16-bit instructions simply use the least significant 16 bits of the wide registers, 32-bit instructions use the least significant 32 bits, and 64-bit instructions use the entire register width. As a result, the Opteron can run any Intel code at any time without the need for emulators or coprocessors. Taking advantage of the full 64-bit power of the Opteron will, of course, require new programs written with the new 64-bit instructions.
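The superset relationship among the register widths can be sketched in a few lines of Python. This is only an illustrative model, not AMD's implementation: the register is treated as a plain integer, and the helper name low_bits is invented for the example.

```python
# Illustrative model only: a 64-bit register is represented as a Python int,
# and narrower instructions see just its least significant bits.

def low_bits(register_value, width_bits):
    """Return the least significant `width_bits` bits of a register value.

    `low_bits` is a hypothetical helper for this sketch, not a real API.
    """
    mask = (1 << width_bits) - 1
    return register_value & mask

# An arbitrary 64-bit value chosen so each byte is easy to spot.
reg = 0x1122334455667788

view16 = low_bits(reg, 16)  # what 16-bit code would see
view32 = low_bits(reg, 32)  # what 32-bit code would see
view64 = low_bits(reg, 64)  # what 64-bit code would see

print(hex(view16))  # 0x7788
print(hex(view32))  # 0x55667788
print(hex(view64))  # 0x1122334455667788
```

Each narrower view is contained in the wider one, which is why older code can keep working on the same physical register.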
The Opteron design also changes the structure of the processor core, rearranging the pipelines and processing units. The Opteron design uses three separate decode pipelines that feed a packing stage that links all three pipelines to more efficiently divide operations between them. The pipelines then feed into another stage of decoding, then eight stages of scheduling. At that point, the pipelines route integer and floating-point operations to individual processors. AMD quotes a total pipeline of 12 stages for integers and 17 for floating-point operations. The floating-point unit understands everything from MMX through 3DNow! to Intel's latest SSE2.
As important as the core logic is, AMD has made vast improvements to the I/O of the Opteron. Major changes come in two areas. AMD builds the memory controller into the Opteron, so the chip requires no separate memory control hub. The interface uses DDR memory through two 128-bit-wide channels. Each channel can handle four memory modules, initially those rated for PC1600 operation, although the Opteron design allows for memory as fast as PC2700. According to AMD, building the memory interface into the Opteron reduces latency (waiting time), an advantage that increases with every step up in clock speed.
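The PC1600 and PC2700 ratings name the approximate peak transfer rate of a module in megabytes per second. The arithmetic behind them can be sketched as follows, assuming the standard 64-bit module width, two transfers per clock, and memory clocks of 100MHz and about 166.67MHz. Those figures come from the usual DDR naming scheme rather than from the text above, and the function name is invented for the example.

```python
# A rough sketch of DDR peak-bandwidth arithmetic. The clock rates and the
# 64-bit module width are assumptions taken from the standard DDR naming
# scheme, not figures quoted by AMD.

def ddr_peak_mb_per_s(clock_mhz, bus_bits=64):
    """Peak transfer rate in MB/s for a double-data-rate memory bus."""
    transfers_per_clock = 2          # DDR moves data on both clock edges
    bytes_per_transfer = bus_bits / 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer

pc1600 = ddr_peak_mb_per_s(100)      # 1600.0 MB/s, hence "PC1600"
pc2700 = ddr_peak_mb_per_s(166.67)   # about 2667 MB/s, marketed as "PC2700"

# The same formula applied to a 128-bit-wide path at PC1600 timings.
wide_path = ddr_peak_mb_per_s(100, bus_bits=128)

print(pc1600, round(pc2700), wide_path)
```

These are theoretical peaks. Sustained throughput depends on access patterns and latency, which is where AMD expects the on-chip controller to help.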
Strictly speaking, the Crusoe processors from Transmeta Corporation are not Intel Architecture chips. They use an entirely different instruction set from Intel chips, and by themselves couldn't run a Windows program on a dare. Transmeta's not-so-secret weapon is what it calls Code Morphing software, a program that runs on the Crusoe chip and translates Intel's instruction set into its own. In effect, the Crusoe chip is the core logic of a modern Intel Architecture chip stripped of its hardware code translation.
The core is a very long instruction word (VLIW) processor, one that uses instructions that can be either 64 or 128 bits long. The core has two pipelines: an integer pipeline with seven stages and a floating-point pipeline with 10. Transmeta keeps the core's control logic simple. It does not allow out-of-order execution, and instruction scheduling is handled by software.
Transmeta provides both a 64KB primary instruction cache and a 64KB primary data cache. The Crusoe comes with either of two sizes of secondary cache. The TM5500 uses a 256KB secondary cache, and the TM5800 has a 512KB secondary cache. At the time this was written, the chips were available with speed ratings of 667, 700, 733, 800, 867, and 900MHz.
To help the chip mimic Intel processors, the Crusoe family has a translation look-aside buffer that uses the same protection bits and address-mapping as Intel processors. The Crusoe hardware generates the same condition codes as Intel chips, and their floating-point units use the same 80-bit format as Intel's basic FPU design (but not the 128-bit registers used by SSE2 instructions).
The result of this design is a very compact microprocessor that does what it does very quickly while using very little power. Transmeta has concentrated its marketing on the low power needs of the Crusoe chips, and they are used almost exclusively in portable computers. A less charitable way of looking at the Crusoe is that its smaller silicon needs make for a chip that's far less expensive to manufacture and easier to design. That's not quite fair because developing the Code Morphing software is as expensive as designing silicon logic. Moreover, the current Crusoe chips take advantage of the small silicon needs of their logic cores by adding more features onto the same die. The current Crusoe models include the north bridge circuits of a conventional chipset on the same silicon as the core logic. The Crusoe chip includes the system bus, memory, and PCI bus interfaces, making portable computer designs potentially more compact. Current Crusoe versions support both SDR and DDR memory with system bus speeds up to 133MHz.
Another way to look at Code Morphing is to consider it as a software emulator, a program that runs on one chip to mimic another. Emulators are often used at the system level to allow programs meant for one computer to run on another. The chief distinctions between Code Morphing and traditional emulation are that Code Morphing works at the chip level and that the Crusoe chip keeps the necessary translation routines in firmware stored in read-only memory (ROM) chips.
According to Transmeta, Code Morphing also helps the Crusoe chip to be faster, enabling it to keep up with modern superscalar chips. The Code Morphing software doesn't translate each Intel instruction on the fly. Instead, it translates a series of instructions, potentially even full subroutines. It retains the results as if in a cache so that if it encounters the same set of Intel instructions again, it can look up the code to use rather than translating it again. The effect doesn't become apparent, according to Transmeta, until the Intel-based routine has been executed several times. The tasks typically involved with running a modern computer—the Windows graphic routines, software drivers, and so on—should benefit greatly from this technology. In reality, Crusoe processors don't test well, but they deliver adequate performance for the sub-notebook computers that are their primary application.
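The translate-once, reuse-many-times idea described above can be illustrated with a small Python sketch. It is a toy model, not Transmeta's software: the class TranslationCache, the function toy_translate, and the string-based "instructions" are all invented for the example.

```python
# Toy model of a translation cache. Real Code Morphing software works on
# machine code; here a "block" is just a list of instruction strings.

def toy_translate(block):
    """Hypothetical stand-in for the expensive translation step."""
    return [f"native({insn})" for insn in block]

class TranslationCache:
    """Translate each distinct block once and reuse the stored result."""

    def __init__(self, translate):
        self._translate = translate
        self._cache = {}
        self.translations = 0   # counts how often real translation ran

    def run_block(self, block):
        key = tuple(block)      # lists are not hashable, tuples are
        native = self._cache.get(key)
        if native is None:
            # First encounter: pay the translation cost and remember it.
            native = self._translate(block)
            self._cache[key] = native
            self.translations += 1
        return native

cache = TranslationCache(toy_translate)
block = ["mov eax, 1", "add eax, 2"]

for _ in range(3):
    cache.run_block(block)

print(cache.translations)  # 1: translated once, looked up twice
```

The first pass through a block is slower than direct execution would be, which matches Transmeta's point that the benefit appears only after a routine has run several times.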