Sustaining Moore's Law - 10 Years of the CPU

Over the years, we have seen the central processing unit or CPU of the computer shrink from the size of a house to that of a stamp while its raw processing power has increased exponentially. We take a look at how far CPUs have progressed in the last 10 years and the major highlights from 1998 to 2008.

Moore's Law

In early 1998, Intel released its low-end budget processor, the Celeron, which was then based on the Pentium II. It ran at a mere 266MHz and was manufactured on a 250nm process. When Intel introduced its latest iteration of the Celeron brand ten years later in January 2008, the Celeron had two processing cores, ran at 2.0GHz and was built on a 65nm process.

As anyone involved in the field of computing will tell you, Moore's Law is probably one of the defining statements of the industry, if not the definitive one. Intel's co-founder, Gordon E. Moore, noted in 1965 that the number of transistors placed on an integrated circuit was doubling at a regular pace, a rate he revised in 1975 to roughly every two years, and this has remained more or less true ever since. Implicit in this observation is that computing power grows exponentially, and even as we entered the 21st century, futurists were predicting that this trend would continue for some years to come.
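To put the doubling rule into numbers, here is a minimal sketch of the arithmetic. The function name is our own, and the comparison with the Celeron figures quoted earlier uses clock speed times core count, which is only a crude stand-in for the transistor counts that Moore's Law actually describes.

```python
def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor predicted by one doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)

# Ten years at one doubling every two years is five doublings.
print(moores_law_factor(10))  # 32.0

# Celeron figures quoted earlier: 266MHz single core (1998) vs 2.0GHz dual core (2008).
clock_ratio = 2000 / 266
print(round(clock_ratio, 2))      # 7.52
print(round(clock_ratio * 2, 2))  # 15.04 (clock ratio x two cores, a rough proxy only)
```

The gap between the two results is a reminder that the law speaks of transistor counts, not of clock speeds.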

Image courtesy of Wikipedia.


While Moore's Law has since been expanded to include the exponential growth of all aspects of computing hardware, the original statement referred specifically to the semiconductor industry. Hence we felt that it was very appropriate to cite this as we look back on the CPU developments of the past decade.

So far, computer scientists and engineers have succeeded at maintaining this remarkable trend. We'll be highlighting some of the significant milestones during this period in more detail later and you can see for yourself the fruits of their labor. Since the scope for processors can be extremely broad, we are limiting our discussion to the x86 platform and its main players, with the occasional digression.

In our opinion, the last ten years of the CPU industry can be summarized in a single sentence:

"The race for clock speeds, i.e. the Megahertz/Gigahertz race, has evolved into one between multiprocessors."

In 1998, Intel's flagship Pentium II processor was running at a maximum clock speed of 450MHz. It would be supplanted within a year by the Pentium III, which, despite the new name, did not differ that much from its predecessor. It even started at the same 450MHz clock as the Pentium II. However, the Pentium III had Intel's first implementation of SSE (Streaming SIMD Extensions) instructions, which let a single instruction operate on several data elements at once, reducing the number of instructions needed to process a data set and improving efficiency. With its new registers and floating point support, the SSE instruction set has been widely adopted by both AMD and Intel for their microprocessors, and the latest proposed extension is SSE5, announced by AMD in 2007.
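The idea behind SIMD can be illustrated with a toy model. This is not real SSE code, just a sketch that counts how many add "instructions" are needed when each one handles a single value versus a group of four (an SSE register holds four single-precision floats). The function names and the instruction counting are our own illustration.

```python
def scalar_add(a, b):
    """Add two lists element by element: one 'instruction' per element."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions

def packed_add(a, b, lanes=4):
    """Add two lists in groups of `lanes`: one 'instruction' per group."""
    result, instructions = [], 0
    for i in range(0, len(a), lanes):
        result.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return result, instructions

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [0.5] * 8

print(scalar_add(a, b)[1])  # 8
print(packed_add(a, b)[1])  # 2
```

Both functions produce the same sums. Only the number of steps differs, which is where the efficiency gain comes from when software is written to use the packed form.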

That same year, 1999, would also see the debut of AMD's K7, known as the Athlon, which would become the company's most successful CPU. Featuring a next generation x86 micro-architecture, the Athlon would seriously challenge Intel's dominance of the x86 market. Among other notable technological feats found in the Athlon was a new triple-issue floating point unit (FPU) that turned AMD's traditional FPU weakness into a strength, such that enthusiasts would be talking about AMD's FPU performance lead for years to come.

The original Athlon K7 500MHz processor as tested by HardwareZone.


While the K7 heralded the beginning of an era where the incumbent market leader Intel faced serious competition for the first time in a long while, the company continued to ramp up the megahertz with further iterations of the Pentium III, including the 180nm Coppermine Pentium III, which launched at speeds of up to 733MHz. Unfortunately, production woes plagued the transition to 180nm, and although the new Coppermine processors were significantly faster than the original Pentium III thanks to their full-speed 256KB L2 cache, the Athlon proved to be very compelling, particularly in terms of pricing.

Climbing to Higher Frequencies

With such a competitive landscape, both AMD and Intel started launching newer grades of their processors with increasing frequencies, all within a short time frame between 1999 and 2000. Not surprisingly, the significant 1GHz mark was soon breached and AMD's Athlon claimed that honor. AMD's star was clearly in the ascendancy, and its next revision of the Athlon, the Thunderbird, would improve on the original by having a faster and better cache design. These were heady days for the company, and with the Pentium III falling behind in benchmarks, AMD was quickly gaining market share at the expense of Intel. To match its aspirations, the company started ramping up its manufacturing capacity with new fabrication plants, though it remained far behind Intel.

This clock speed race between the two major microprocessor firms raged on as we entered the 21st century. Intel came back with a new micro-architecture, NetBurst, that could be scaled up to extremely high clock speeds thanks to the 20-stage instruction pipeline first used on the original Pentium 4. This meant that Intel was soon launching CPUs with clock speeds that were higher than anything from AMD, though they were not necessarily faster in benchmark performance. Less informed consumers, who had always relied on processor clock speeds as a rough gauge of performance, were hence inclined to favor Intel's Pentium 4 products.

This prompted AMD to introduce its PR (performance rating) system of marketing its processors, which pegged each processor's performance relative to a baseline system. This was first seen with the third Athlon revision, the Palomino, also known as the Athlon XP. AMD also sought to highlight the 'myth' of the clock speed with advertising efforts and played down the true clock speeds of its processors (shown only within the BIOS) in favor of the PR system, as those clock speeds were often lower than those of Intel's higher clocked Pentium 4.
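The reasoning behind the 'megahertz myth' can be sketched with simple arithmetic: performance depends on both the clock speed and the amount of work done per clock. The function name and the figures below are purely hypothetical and are not measurements of any actual Athlon or Pentium 4.

```python
def relative_performance(clock_mhz: float, work_per_clock: float) -> float:
    """Crude performance estimate: clock speed times work done per clock cycle."""
    return clock_mhz * work_per_clock

# Hypothetical chips: A has the lower clock but does more work per cycle.
chip_a = relative_performance(clock_mhz=1500, work_per_clock=1.25)
chip_b = relative_performance(clock_mhz=1800, work_per_clock=1.0)

print(chip_a)           # 1875.0
print(chip_b)           # 1800.0
print(chip_a > chip_b)  # True
```

A rating scheme like AMD's PR system is, in effect, an attempt to advertise the product of the two factors rather than the clock speed alone.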

Of course, there came a time when Intel's Pentium 4 started encroaching on the Athlon's performance lead. The introduction in 2001 of the new Intel 845 chipset, which used cheaper SDRAM instead of the RAMBUS memory of Intel's ill-conceived earlier venture, helped bring the Pentium 4 to the mainstream consumer. Meanwhile, Intel continued to scale up its micro-architecture with newer cores that soon ventured into 2GHz territory and beyond. Intel was aided by its transition to a 130nm process, allowing the company to increase transistor count and clock speeds. These newer Northwood cores introduced in 2002 also came with Intel's Hyper-Threading technology, a form of pseudo-multiprocessing that presented a single physical processor as two logical processors so that a supporting operating system could schedule two threads or processes at once.
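From the operating system's point of view, Hyper-Threading simply shows up as extra processors to schedule on. A minimal sketch of that bookkeeping follows. The helper function is our own illustration, while `os.cpu_count()` is the standard Python call that reports how many logical processors the system exposes.

```python
import os

def logical_processors(physical_cores: int, threads_per_core: int = 2) -> int:
    """Number of processors the operating system sees (hypothetical helper)."""
    return physical_cores * threads_per_core

# A single-core Pentium 4 with Hyper-Threading appears as two processors.
print(logical_processors(1))  # 2

# The count reported on the machine running this script (may be None if unknown).
print(os.cpu_count())
```

Note that the two logical processors share one set of execution resources, so the gain is far smaller than that of a true second core.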

Meanwhile, AMD's own switch to 130nm did not yield the higher clock speeds needed to overcome Intel's new Pentium 4. Other improvements to the design like a larger cache and higher FSB also failed to raise the performance sufficiently. It was time for AMD to come up with a new micro-architecture.

This new K8 micro-architecture would be found first in a server oriented processor, the Opteron. Representing AMD's hopes of making inroads against Intel's server class Xeon processors, the Opteron was launched in April 2003 and was unique in being able to run legacy 32-bit applications without any performance penalty, despite the fact that it was actually a 64-bit processor. Enabling this was AMD's x86-64 instruction set (AMD64), which would be duplicated by Intel (Intel 64) a year later. However, this meant that when the Opteron debuted, Intel had no quick response to this additional feature. As a result, AMD started posting impressive growth figures for its server processors and some major vendors like Sun and HP would eventually offer Opteron powered workstations and servers.

The original AMD Opteron from the 2003-era designed for the Socket-940.


AMD would follow up the server version with the desktop version, dubbed the Athlon 64, to leave consumers no doubts about its 64-bit pedigree. With performance that restored competitiveness with the faster Pentium 4 processors in the market, AMD had a moderately successful product on its hands but it could only produce very limited quantities. Core revisions in the following year led to various improvements like a faster HyperTransport bus and a dual channel memory controller but before this, Intel already had its newest 90nm Pentium 4 revision, Prescott (with an even deeper pipeline), available for a couple of months.

By then, the clock speeds of the faster Prescott Pentium 4 were already over the 3GHz mark. AMD's powerful FX-55 processor was not too shabby either in the clock department at 2.6GHz, especially given that it was built on a 130nm process. While AMD was also to migrate to a 90nm process for its next core revision, the writing was on the wall for Intel. The Prescotts were running warmer than consumers would have liked, along with a corresponding increase in power consumption. Intel's earlier predictions of hitting 5GHz and higher seemed like fantasy as even 4GHz Prescotts looked difficult to achieve. The era of the frenetic clock race was coming to an end.

Counting Cores

What followed was that the clock speed race turned into a race to fit as many processing cores on a single die as possible. Intel went back to the drawing board and took a different route from NetBurst and the clock obsessed Pentium 4, returning to the Pentium III and, in particular, the Pentium M chips that had been derived from the last Pentium III core, Tualatin, and were found in the company's mobile Centrino platform. Intel's Israeli outfit spearheaded the development of the new architecture, which would feature dual-cores and a return to a less complex 14-stage instruction pipeline. Most importantly, this new Core architecture would have significantly lower power consumption compared to the Pentium 4.



AMD obviously was on the same track, though it was building on its existing K8 series while working on its next generation architecture. It followed up its Athlon 64 with the Athlon 64 X2, which had two Athlon 64 cores on the same die. This was in late 2005. Intel, however, had a stopgap measure available earlier that year in the dual-core Pentium D (Smithfield), which was still based on the NetBurst micro-architecture and was basically two 64-bit Prescott cores side by side on the same package.



The main event, however, was yet to come. In July 2006, Intel finally lifted the wraps off its new Core 2 processors. Before the official product launch, there had been quite a lot of buzz about the new Core micro-architecture, and its introduction at IDF Spring 2006 had enthusiasts awaiting Intel's return to form. The new processors, consisting of five Core 2 Duo CPUs - ranging from 1.86GHz to 2.67GHz, with a Core 2 Extreme at 2.93GHz - were all based on the 65nm Conroe core, with between 2MB and 4MB of L2 cache and, significantly, a maximum TDP rating of only 65W (75W for the Extreme version). Suffice it to say, their performance more than lived up to the billing, and with lower power consumption than any competing AMD processor, the Core 2 brand was off to a great start.

Back when it was new, the E6300 model of the Core 2 Duo series as shown here was the most famous of the lot because of its high overclocking potential. We even dedicated an overclocking article to it.


This lead that the Core 2 processors had over their market rival would be maintained up to the present iteration. Intel wasted no time in expanding its initial desktop range, with a lower-end Allendale core (lower clock speeds and less L2 cache), the single-core Conroe-L for the Celeron brand and naturally, given its power efficiency, the Merom core for the Centrino platform. The server space meanwhile had Woodcrest to aid the Xeon against AMD's Opteron. Further highlighting its resurgence, Intel launched its first quad-core processor, Kentsfield, at the end of the year. This quad-core was made up of two Core 2 Duo dies in a single package, rather than four cores on one die. While performance benefits varied depending on the nature of the applications, power consumption was roughly double that of a Core 2 Duo. Overheads and bandwidth issues also made it less than ideal, but AMD's processors were unfortunately still on 90nm for most of 2006 and were hence unable to compete in terms of power efficiency and performance.



Those who favored AMD however were anticipating the company's next micro-architecture, the K10. These would be quad-core processors that would be available in the middle of 2007 for the Opteron server market before the consumer versions later in the year. AMD was in dire need of these reinforcements, since the Core micro-architecture based Woodcrest was making a great impression and looked capable of quickly retaking the gains made by Opteron in the past few years.



It was also in 2007 that Intel started to implement its tick-tock model of microprocessor development. Also known as Silicon Cadence, this was a rigorous schedule of following every architectural revision with a shrinking of the process technology. In this case, tick referred to the shrink while tock was a new micro-architecture, with one or the other taking place every year. Such an ambitious time frame was probably only possible for a company with Intel's resources, and so far it has been executing this on time. For instance, following the 65nm Core 2 processors in mid-2006, Intel was gearing up for its transition to 45nm and was able to give previews of these new 45nm Penryn processors at IDF Spring 2007.



AMD's Barcelona Core Arrives but Core 2 still Ahead



In September 2007, AMD at last unveiled its latest Barcelona processors for the Opteron line, featuring what AMD touts as a native quad-core design. Some of the benefits of the new processors include greater power efficiency (maximum TDP of 68 - 95W) from independent clock domains and power management for each core, internal core and cache enhancements, and HyperTransport 3.0. However, AMD's initial Opterons were at a modest 2GHz compared with 3GHz processors from Intel. Even clock for clock, the new Opterons trailed Intel's processors in terms of both performance and power efficiency. Barcelona had obvious improvements over the previous micro-architecture, but they looked inadequate against Intel's Core micro-architecture.



The desktop K10 variants, the Phenom, were to follow the Opterons in November 2007. Dubbed the Phenom X4 for their four cores, these desktop processors hit a snag in the form of a TLB (translation lookaside buffer) bug shortly after launch. This was quickly worked around via a BIOS update that unfortunately also carried a performance penalty. AMD had to go back to the factory and, months later in March 2008, released a new B3 stepping of the Phenom that solved the issue. However, time waits for no man and certainly not for AMD. By then, Intel's 45nm Wolfdale and Yorkfield processors were already available in the market. Their process technology edge meant that Intel only extended its lead in power efficiency, while in performance benchmarks, the quad-core Phenom X4 found itself facing higher clocked dual-core rivals from Intel.



New Forays and What's to Come



2008 also saw Intel going back to its roots in a big way, with a new line of low power processors known as the Atom that takes its architectural inspiration from Intel's older Pentium micro-architecture. Emphasizing a low TDP rating of 4W and less, the Atom has helped fuel a growing interest in portable low-power computing devices and the entry of the chip giant has seen a chorus of new products utilizing the Atom processor. Possible competitors have already sprung up in the form of NVIDIA's Tegra and VIA's Isaiah and it's still too early to tell if Intel will come to dominate this segment too.



As the year drew to an end, AMD's 45nm process shrink for its K10 processors came to fruition ahead of schedule. The Shanghai core Opterons were introduced, with dual-core versions of its Phenom processors to follow. The consumer version of the Shanghai, dubbed Phenom II will also debut next year at CES 2009 and like the 45nm Opterons, bring the advantages of the die shrink along with other core and cache enhancements that will narrow the gap slightly between AMD and Intel.



Intel meanwhile is going full steam on its next generation micro-architecture, Nehalem, which is to succeed Core and which we have already had a first look at. With up to 8 processing cores and a new integrated memory controller supporting DDR3, Nehalem shows that core count will remain the next frontier for the x86 platform for the near future. Whether that is sufficient to extend Moore's Law remains to be seen.

The World not According to AMD and Intel

While we have kept most of our discussion to the x86 platform and the two major players left in the industry now, AMD and Intel, the microprocessor world is much broader than just these two companies. However, we are also not going to digress too much into alternate platforms, though we'll be highlighting some of the important happenings in the past ten years.

First, IBM's PowerPC architecture suffered the loss of a high profile name when Apple announced that it was shifting over to Intel processors. Apple's computers had been using PowerPC chips since 1994, and while it may not be the largest of PC vendors, Apple does have a strong brand and image. The move was quickly completed, and by 2006, most of Apple's product lineup had become Intel based. This was probably inevitable, given how IBM and Motorola, the main movers behind PowerPC, were facing manufacturing problems while clock speeds seemed to have stagnated. IBM too was increasingly distracted by its business of making PowerPC variants for game consoles. Although the PowerPC architecture is still relevant, it is now mostly found in embedded computers and high performance computing applications.

Meanwhile, what was distracting IBM was its own initiative with Sony and Toshiba to develop a new micro-architecture for the PlayStation 3. While based on the Power architecture, this new multi-core Cell processor is less a general purpose processor like existing x86 processors and more oriented towards the specialized parallel processing approach favored by graphics chipmakers like ATI and NVIDIA. Its main processing elements are eight Synergistic Processing Units that can execute threads in parallel and are heavily optimized for single precision floating point computation, much like some of your graphics cards.

Besides its implementation in Sony's PlayStation 3 in 2006, both IBM and Toshiba have plans for the Cell processor in applications ranging from high-performance computing and mainframes to home entertainment devices. Currently, it is best known for its role in the PlayStation 3's impressive performance in distributed computing projects like Folding@Home.

The Cell processor's most famous role is to power the PS3, currently the most powerful console in hardware capabilities.


A Timeline of the Industry

Having presented a good overview of the last 10 years in CPU evolution, we now detail a timeline of the key events in the CPU industry, year by year. We'll start off with 1998, the year in which HardwareZone was also started.

1998

Intel's budget processor offering, the Celeron, was launched in April 1998 and would grow into a distinct brand representing the most affordable (read: low-end) of Intel's desktop processors, despite undergoing various micro-architectural changes over the past ten years. The first Celerons were stripped down Pentium II processors in disguise with less cache memory, and this approach of stripping a mainstream desktop CPU of certain features and cache came to define the brand. Intel intended the Celeron to compete against lower end products from other companies like AMD and Cyrix at that time, and though the original versions (those without L2 cache) failed to live up to expectations, some of the subsequent versions provided excellent value, especially when overclocked. The most popular of the lot was the beloved Celeron 300A, which easily clocked to 450MHz and faster with hardly any effort.

While not the most high profile of Intel's products, the Celeron enjoyed quite a few moments of glory, especially in the hands of overclocking enthusiasts who pushed these inexpensive chips to higher clocks to compensate for its disabled features and lesser cache. (Image shown is a Celeron 300A in a SEPP package, courtesy of Wikipedia)


AMD's excellent Athlon processors released the following year prompted the company to develop a low-end brand of its own known as the Duron in order to preserve the premium Athlon brand name. Following a similar principle as the Celeron, the Duron would also become a recognizable rival to its Intel counterpart and just like the Celeron, there were as many hits as misses in its history, depending on which micro-architecture it was based on.

A HardwareZone article in 2000 by Dr Jimmy Tang gave a clear advantage to the Duron, with a 750MHz version receiving a glowing five-star review at the expense of its competition, a Celeron at 900MHz. A prescient quote from the article: "this teaches us one thing, never judge a processor by the MHz as a Duron at a lower speed rating could actually outperform a Celeron at high speed ratings."

A couple of years later, AMD was to counter Intel's NetBurst processors and their high clock speeds with a major marketing campaign that carried a similar message of not judging processor performance solely on clock speeds.

1999

For AMD, 1999 would be a significant year. It was the year that its new K7 micro-architecture was launched to critical acclaim from hardware enthusiasts and reviewers around the world. We even got hold of the very first version for testing and review - the AMD Athlon K7 500MHz. Featuring a new RISC, out-of-order CPU that had a double data rate memory bus and a super-pipelined triple-issue floating point unit among other innovations, the Athlon was clearly faster than Intel's Pentium III models of the time. As a result, sales of the Athlon were strong and AMD was soon on the lips of enthusiasts.

AMD showed that it was no longer content to follow Intel's lead with the Athlon Classic.


While we were equally enthusiastic about the Athlon when it was released, we also took the time to explore if its performance was affected by a change in L2 cache speed, a measure which was implemented by AMD with certain of its models due to the cost and the technological limitations of the memory used for that L2 cache. As our HardwareZone article by Dr Jimmy Tang pointed out, there is indeed a slight performance difference, though "the effect of a slower L2 cache is very small, especially in normal office applications. However, if you're using CPU intensive software (e.g. graphics, multimedia or simulation), the performance would be affected."



Obviously, this was a minor issue that did not adversely affect the sales of the Athlon, which grew further in popularity with the debut of the Thunderbird core the following year. Intel's Pentium III was unable to compete, even with the newer Coppermine variants. The undisputed advantage held by AMD was to last until 2001 when the Intel Pentium 4 became available.

2000

With its Pentium III processors struggling, Intel turned its eyes to a new micro-architecture known as NetBurst, which featured a very deep instruction pipeline and was supposedly capable of scaling to very high clock speeds. This touted scalability was expected to help Intel overcome the threat of AMD, though when the first processors based on this micro-architecture, the Pentium 4 (Willamette), were launched, they were still lagging behind their Athlon competitors. In fact, they were arguably not much of an improvement over the Pentium III. However, SSE2 was added for the Pentium 4, following up on the original SSE that was present on the Pentium III, and these additional instructions made some difference with the proper application support.

The Pentium 4 represented a new micro-architecture from Intel and in hindsight, it was probably not the best of decisions.


Overall, the early Pentium 4 processors that we saw in 2000 were not at all worth their premium price, high temperatures and power consumption. However, even the early Willamette cores had clock speeds of at least 1.5GHz, which meant that they had a numerical, albeit false, advantage compared to the 1.4GHz maximum managed by the Athlon Thunderbird (which was actually the better performer). This was to result in AMD's attempts to counter any wrong perceptions created by absolute processor clock speeds with concerted marketing efforts in the next couple of years.

2001

This was the year when Intel's Pentium 4 started to make some inroads into the Athlon's performance superiority. However, this was not yet the case when the year started, as our first major high-end processor shootout confirmed, even though the Pentium 4 was not included due to it being relatively new and hardly any faster than the Pentium III. In any case, our consensus then was that "the value and performance offered by the Athlon processor is simply unbeatable. With amazing performance, affordable price and a scalable architecture, you can't deny that the AMD Athlon is fast becoming the processor of choice for power users all over the world. The AMD Athlon is clearly the winner in this category."

AMD's high point came with the Athlon Thunderbird and our benchmarks showed that it was a deserving winner.


Mirroring the high-end result was our subsequent low-end processor shootout, which found the AMD Duron predictably clinching the award against Intel's Celeron. As succinctly expressed by us, "the Duron is a 'Natural Born Winner' and it just beats the living daylights out of an Intel Celeron."

Taking the low-end segment as well was AMD's Duron in our Hardware Zone Awards.


While the K7 micro-architecture certainly deserved its time in the spotlight, its undisputed superiority would be coming to an end in the next couple of months. A new Pentium III core, the Tualatin, was also released, as the chip giant sought to improve on its power efficiency. Our review found that with "performance matching the Athlon, the Tualatin could very well give users a good alternative to using AMD's power hungry Athlon processor." However, it was not widely available locally. Meanwhile, Intel was ramping up the clock speed and production of its Pentium 4 and though not yet dominant, it would soon make for a more interesting and competitive environment.

2002

While AMD had sought to counter the high clocks on the Pentium 4 with a new version of the Athlon featuring the Palomino core in late 2001, 2002 was the year that the rivalry between the two companies erupted into a full blown contest as both the Pentium 4 and the Palomino became widespread in retail. AMD was finally forced to respond to Intel's Pentium 4 and its burgeoning clock speeds. Hence, AMD's Palomino core, which was dubbed the Athlon XP, came with a new PR (Performance Rating) that removed any overt mention of clock speeds from the Athlon's model name. Instead, the new rating compared the relative performance of the Athlon XP against the 1.4GHz Thunderbird. This was to counter Intel's marketing of its higher clock speeds and the implication that its processors were therefore faster. This PR system has been used by AMD ever since, even though the various micro-architectural changes since have made it quite irrelevant.



The Palomino had greater power savings than the warmer Thunderbird core and performance obviously was another step up, particularly as AMD started to include Intel's SSE alongside its own 3DNow! instruction set. It was also available in an MP version, which officially supported multi-processing (dual in this case) with the appropriate motherboard of course. This led to an interesting modding experiment by us, where we tried to see if the Athlon XP could work as an MP version on an AMD 760MPX motherboard. Although the BIOS would try to lock this 'feature' on detecting our Athlon XP processor, we found that "modifying the CPU to run in a dual configuration is much simpler than unlocking the multipliers. Just use conductive paint to connect the two pads together." And voila! We could get the motherboard to run our Athlon XP processors in a dual processor configuration successfully. (If we could only do that now and unlock the Phenom X3 into a full fledged Phenom X4.)



On the Intel front, 2002 was the year the Pentium 4 started to break clock speed records, with its Northwood revision breaking first the 2GHz mark and then the 3GHz mark. This was aided in part by Intel moving to a 130nm process technology, and the company also introduced Hyper-Threading to the consumer arena with a 3.06GHz model. Our first glimpse of a Northwood core came with a 2.53GHz version that left "no doubt that the Pentium 4 is now holding the crown of performance."



As we mentioned, the 3.06GHz Pentium 4 was the first consumer processor with Hyper-Threading, which aided in the execution of multiple threads by fooling the operating system into thinking that there were two processors onboard, thereby allowing the scheduling of more than one thread. HardwareZone took a long look at this new technology, previously available only on Xeon servers, and our conclusion was very positive:

Our first experience with Hyper-Threading technology came with the Intel Pentium 4 processor at 3.06GHz.


"In our individually concocted tests, we managed to show an appreciable gain in performance under heavy multi-tasking environments. This is indeed a very powerful testament of the capabilities that Hyper-Threading offers. Certainly, we're very sure that this would give users a more enjoyable and productive use of their PC, especially users who frequently need to perform multiple tasks at one time.

"Hyper-Threading technology is possibly the best thing that has happened to the desktop processor since the introduction of the Pentium series. For that, we're giving it our Most Innovative Product award."

Since its heyday with the Pentium 4, the importance of Hyper-Threading has declined with the emergence of dual and quad-core processors. However, Intel's new Atom processor for the low-power mobile devices segment saw a return of Hyper-Threading, since the Atom is a single-core processor. This technology is also set for a major revival with Intel's next generation Nehalem processors, which have up to eight processing cores and, with Hyper-Threading, are expected to execute many more threads in parallel.

2003

The year dawned with AMD releasing the final revisions to its Athlon processors. With a larger amount of L2 cache (512KB), these new Barton cores were on a 130nm process and were given PR ratings of up to 3200+. Unfortunately, the clock speeds were not much higher than previous Athlons and were capped at 2.33GHz. When we last saw the Athlon XP (based on the Thoroughbred core) in 2002, it had already "lost the GHz race, and has lost the performance battle".

Our review of the 3000+ Barton found it to be a "rather attractive processor for users who want the best performance out of their current Athlon-based system ... and managed to beat the 3.06GHz P4 in certain benchmarks. The final say was that the Athlon XP 3000+ processor does offer compelling value", and it looked like AMD had restored some parity to the scene. However, even the company knew that this was the final roll of the dice for its K7 micro-architecture. Later that year, it would first introduce a serious server competitor to Intel's Xeon chips and then a consumer version, both based on the new K8 micro-architecture.

We were not yet done with the Barton though. Following our successful modding of the Athlon XP into an Athlon MP, we attempted the same with the Barton cores and had another success. With that, we were quite convinced that "whether it's a Morgan, Palomino, Thoroughbred or Barton, there's no reason why you cannot mod them to operate in dual processor mode."

Of course, the big event for AMD in 2003 was the introduction of its new K8 micro-architecture, which we first saw in the server-oriented Opteron processors. A 64-bit processor whose new AMD64 ISA extensions let it run legacy 32-bit software with hardly any performance penalty, the Opteron introduced many innovations, such as HyperTransport, silicon-on-insulator technology and an integrated memory controller. It prompted us to speculate that the "Opteron could be the next big thing in the modern computing era."

Computing enters the 64-bit era with the Opteron and its K8 micro-architecture. Intel would follow suit in the following year with its own EM64T.

The Opteron has since emerged as a viable alternative to Intel's server class processors and, in its heyday, was able to secure major deals with workstation and server vendors like HP and Sun. It managed to chalk up a market share of around 25% in quite a short period of time. The delays and performance of the newest iteration, based on the K10 micro-architecture, have however dimmed its appeal. Since the server market is relatively static, expect the Opteron to remain in contention and in service in many companies for the near future. The upcoming 45nm versions may also inject a much needed boost to its competitiveness.

The consumer version of the K8, the Athlon 64, followed the Opteron in September 2003. This was something that AMD had needed for quite a while, as our massive CPU shootout that year had found even the newer Bartons unable to surmount the performance of the Pentium 4. This was especially true for the higher end models. Even after we factored in the price of the platform, we concluded that the "low-end Athlon XP processors still offer compelling value especially to those who are on a tight budget ... comfortably settle for a faster Pentium 4 2.8C with a lot of spare cash left."

The new Athlon 64 meanwhile brought the same innovations that we already saw with the Opteron and restored competitiveness with the Pentium 4. Our own review of the top end FX version saw us declare that the Pentium 4 Extreme Edition and the Athlon 64 FX-51 "are pretty much tied in the top position. But, if you are looking for a processor just for gaming, we had to give the Athlon 64 FX-51 our two thumbs up for its excellent delivery of frame rates in majority of the 3D games tested in this review." It was a quick comeback for AMD and bringing the industry into the 64-bit era impressed us enough for it to receive our Most Innovative Product award.

Targeted at hardware enthusiasts, the FX series represents the highest end consumer Athlon and usually has unlocked multipliers for overclocking.

2004

Early 2004 saw Intel upping the stakes with a new 90nm core, Prescott. A new addition to the Pentium 4 family, this core doubled the transistor count of the previous 130nm Northwood core while actually having a smaller die, thanks chiefly to the die shrink. Compared to the older core, the newcomer featured more L1 and L2 cache, the longest instruction pipeline ever on the Pentium 4 at 31 stages, the SSE3 instruction set extensions and, in the manufacturing process particularly, some nifty new silicon enhancements. On the other hand, the Prescott also significantly increased the power consumption of the Pentium 4. Despite the enhancements, the cost and performance of the new core did not impress us initially, as they were "more or less within the same performance range" as the Northwood Pentium 4s.
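As a rough back-of-the-envelope check on how twice the transistors can fit on a smaller die, the sketch below applies ideal area scaling, where the area of a transistor shrinks with the square of the process size. This is a simplifying assumption of ours, not Intel's figures: real dies do not scale this cleanly, since cache, logic and I/O shrink differently. The function name `ideal_area_scale` is hypothetical.

```python
def ideal_area_scale(old_nm, new_nm):
    """Ideal per-transistor area ratio after a process shrink (assumed model)."""
    return (new_nm / old_nm) ** 2


# Moving from 130nm (Northwood) to 90nm (Prescott): about 0.48x the area each.
scale = ideal_area_scale(130, 90)

# Doubling the transistor count at that per-transistor area.
relative_die_area = 2 * scale

print(f"Per-transistor area scale: {scale:.2f}")
print(f"Relative die area with 2x transistors: {relative_die_area:.2f}")
```

Under this idealized model, the doubled transistor count lands at slightly under the original die area, which is consistent with the smaller die described above.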

Intel's first 90nm chip, the Prescott, got quite bad press at launch and acquired a reputation for being too warm and power hungry.

The impression that the Prescott Pentium 4 was not an upgrade over the Northwood core was cemented by our CPU shootout in 2004, which saw the Pentium 4 Extreme Edition outshine the newer, albeit lower clocked Prescott Pentium 4. AMD also had a more aggressively clocked Athlon 64 FX-53 to counter Intel and the result found that "the Athlon 64 FX-53 and Intel Pentium 4 3.4GHz Extreme Edition processors are probably the fastest desktop processors that money can buy today."

2005

Multi-core processors, in particular dual-core CPUs from AMD and Intel, came to prominence in 2005. Even before that, Intel had been facing a dilemma with its increasingly power hungry Pentium 4 cores. Though the company was working on its next generation micro-architecture to resolve this issue, it also had a trick up its sleeve. This was the introduction of the dual-core Pentium 4, known by a new series name, Pentium D. Internally, these new cores had the codename Smithfield and were released in May 2005. In our own words three years ago, the Smithfield is "almost like two Prescott cores glued into a single die. Both of these cores share the same characteristics as a Pentium 4 processor, with each core featuring a 1MB L2 cache. Other new features include Intel's EM64T...", which is Intel's 'clone' of the 64-bit instruction set architecture that AMD had implemented with the Athlon 64.

At 230 million transistors on board, the Pentium D has almost twice the number of transistors as the Prescott Pentium 4, which only makes sense given that it is made of two Pentium 4 cores side by side on the same package.

With a lower FSB than existing Prescott processors, the Pentium D had less bandwidth than the comparable Pentium 4 and, like many of Intel's processor releases then, did not impress us too much in our performance testing. In terms of marketing, having double the cores was at least a sure-fire way to garner attention and attract less informed consumers, but in real-world scenarios, the limited number of multi-threaded applications at the time hindered its performance. In the end, it is perhaps more notable for the technologies and innovations behind its conception than for the end product that shipped.

AMD was to have its own dual-core Athlon processors later that same year. The Athlon 64 X2 was similar to the Pentium D in that it consisted of two Athlon 64 cores on the same die. However, the K8 micro-architecture was far better suited to a multi-processor environment, and features like its integrated memory controller and HyperTransport links went some way towards alleviating the issues found on Intel's dual-core. Despite these advantages, the Athlon 64 X2 remained significantly behind Intel when it came to raw clock speeds. This was clearly seen in our comparison article introducing the X2, where the 4800+ model was clocked at a modest 2.4GHz compared to 3.2GHz on the fastest dual-core Pentium D.

AMD's dual-core may have been released later than Intel's but it was worth the wait.

The lower clock speeds were no stumbling block however, as we found that "the performance of the AMD Athlon 64 X2 4800+ is outstanding and what you're really getting is a 'dualie' packed into a single silicon die." In more than a few benchmarks, the results showed commendable performance scaling for the dual-core Athlon, and not only was it less expensive than Intel's Pentium D 840, it also had the better numbers. Chalk up another win for AMD, but it would be a short-lived triumph.

The multi-core era for consumer CPUs started with the Intel Pentium D and was quickly followed by AMD's Athlon 64 X2. These early processors left much to be desired and were not helped by the lack of multi-threaded applications. Nevertheless, it was born out of necessity, as both AMD and Intel were facing serious difficulties in increasing the clock speeds for their respective micro-architectures despite improvements in process technologies.

2006

Without a doubt, 2006 will be best remembered for Intel's next generation Core micro-architecture. IDF Spring 2006 was the venue where the new micro-architecture was unveiled to the public, and the theme of the keynote was the power efficiency and savings of the new 65nm Core based processors over the Pentium D. The presentation also revealed the change in direction in Intel's micro-architecture towards multi-core. The reasoning was that having several more energy efficient and lower clocked processor cores on a single die can match the performance of a single, power hungry processor at high clock speeds. Of course, the Core micro-architecture was not just about energy efficiency. Improvements in the core design resulted in performance gains even with more modest clocks. This was achieved by optimizing the execution of instructions through technologies like Intel Wide Dynamic Execution.

We were finally treated to the final product in the form of desktop Core 2 Duo processors featuring the Conroe core, and they did not disappoint after all the hype. According to our review of the Conroe, they "practically ripped to shreds any and all previous desktop performance records with scores that were mostly through the roof." AMD's highest end processors were left behind, and the fact that they had much higher TDP ratings only made it worse. The Core micro-architecture was taking over and more woe was to come for AMD, as these dual-cores were only the beginning. Intel had quad-core processors slated for the end of the year. While these echoed the Pentium D's approach, in that each quad-core processor was really two Core 2 Duo dies in the same package, they were better than anything AMD had on the market.

The innovations that enabled the Core 2 Duo to make such a major leap in performance and energy efficiency are neatly summarized here.

Given the astounding performance of the Core micro-architecture, it was not surprising that there was a lot of interest in it. Hence, HWZ had a follow-up in the form of an analysis of the thermal and power characteristics of these new processors. The tests bore out Intel's claims, showing that "they have come out triumphant on all accounts" when it came to performance, thermals and power.

More love for this new micro-architecture was to come in November 2006, when the quad-core Kentsfield Core 2 Extreme QX6700 found its way into our testing lab. It was the first consumer quad-core processor and our benchmarks confirmed its superiority in those applications which could take advantage of its quad-cores. For enthusiasts who are into gaming, "there is nary a game right now that will scale with the QX6700 ... the biggest hurdle is still GPU limitations." This led us to conclude that the quad-cores would only get better with time, when more applications become multi-threaded to take advantage of the increasingly multi-core computing landscape.

The two Core 2 Duo dies are evident from this image of the Core 2 Extreme QX6700.

Almost two years have passed since our judgment of the Core 2 Extreme, but to this day, it remains difficult to find enough multi-threaded applications, particularly games, that can fully unlock the potential of these processors. GPU limitations are inevitable even for the newest quad-core processors nowadays and NVIDIA has been quite relentless in pounding home this point in its marketing.
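Why extra cores help so little when software is not multi-threaded can be illustrated with Amdahl's law, a standard rule of thumb that we did not use in our original reviews. In the minimal Python sketch below, `parallel_fraction` is the share of a program's work that can be spread across cores. The function name and the sample fractions are our own illustrative assumptions, not measurements from any benchmark.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative fractions only: a mostly serial game versus a well-threaded encoder.
for fraction in (0.2, 0.5, 0.9):
    dual = amdahl_speedup(fraction, 2)
    quad = amdahl_speedup(fraction, 4)
    print(f"parallel={fraction:.0%}  dual-core={dual:.2f}x  quad-core={quad:.2f}x")
```

With only a fifth of the work parallelized, going from two cores to four barely moves the result, which matches the experience of games that do not scale on a quad-core processor.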

2007

Following a year where its competitiveness was seriously eroded by Intel's Core micro-architecture, all eyes were on AMD and its upcoming K10 micro-architecture. But before that was released, Intel's new 'tick-tock' strategy was illustrated when the company showed off its 45nm Penryn processors, which effectively shrank the Core 2 from 65nm to 45nm, together with an increase in cache size and naturally greater energy savings. Needless to say, we were quite excited by Intel's upcoming processors. In fact, we did not have to wait till 2008 as predicted to see our first 45nm processor, as Intel delivered a Core 2 Extreme QX9650 in October, right before AMD's scheduled launch of its new K10 micro-architecture. The new quad-core was indicative of the strong position that Intel had in manufacturing and process technologies. AMD needed a perfect launch of its K10 architecture to have any hope of staging a comeback.

Unfortunately, AMD's 'Barcelona' cores that formed the basis of the K10 micro-architecture did not appear to be the answer. The Opteron versions of the new micro-architecture were only launched in September and while these were indeed the native quad-core processors that we had been promised, the new cores were hardly a match for Intel's latest in either performance or power efficiency. Our verdict then was that the Barcelona was not without its strengths, especially when it came to the memory subsystem, but Intel was overall still the leader to beat in the important areas.

More trouble was to follow for AMD. When the desktop version, the Phenom X4, was released in November, it was found to have a TLB bug that affected its stability in certain scenarios. AMD hemmed and hawed about the issue before releasing a BIOS fix that was found to reduce the performance of the processors by around 10%. Eventually, the company would go back and release a new B3 stepping of the Phenom X4 that was bug-free. This naturally affected our review of the Phenom X4 and we would finally publish the results months later, after the B3 steppings were at last available in March 2008.

2008

As we had mentioned, our first published results of the Phenom X4 came only in April 2008, after the new B3 steppings were made available. The delay had served to illustrate the lead Intel now held in the market. Newer Core 2 models had entered the market since the Phenom was supposed to be launched, while older Core 2 models had fallen in price. This severely squeezed the price range in which the Phenom X4 could exist and still be attractive to enthusiasts. In short, it was hard to choose the Phenom X4 unless you were a die-hard AMD fan.

Around the same time, AMD would try the old trick of salvaging quad-core chips that failed to make the mark, recycling them as triple-core processors sold under the Phenom X3 brand. Given its track record, we weren't expecting any miracles from these processors and found that "the newcomer is not the answer either." Of course, Intel was not about to give its competitor any breathing space, and its price cuts were timed to maximize their impact on the launches of both the Phenom X4 and X3.

The 'Black Edition' of the Phenom X4 is unlocked and the company's high-end enthusiast product, but its price and performance made it a poor match for Intel's entry level quad-core processor.

The middle of the year saw the launch at Computex of a new line of Intel processors aimed at the low-power, mobile device segment. The Atom processor was supposed to herald a new era of computing, which was in a way triggered by the popularity of sub-notebooks like the ASUS Eee PC. Obviously, Intel had grander ambitions for this segment, and 'Netbooks' was its name for these mobile computing devices.

These devices would be paired with a mobile 945GM chipset and an ICH-7M Southbridge, which are existing chipset components from Intel. The Atom itself takes its architectural cues from older Intel micro-architectures and is only a single core, though it does see a return of Hyper-Threading and, most importantly, has an attractively low TDP rating of 4W.

Can Intel succeed in the lower end segment where Microsoft has floundered in the past?

As the year drew to an end, AMD's 45nm process shrink for its K10 processors came to fruition ahead of schedule in the form of the 'Shanghai' core Opterons. The consumer version of the Shanghai, dubbed Phenom II, will debut next year at CES 2009 and, like the 45nm Opterons, bring the advantages of the die shrink along with other core and cache enhancements that will narrow the gap slightly between AMD and Intel.

Intel meanwhile is going full steam on its next generation micro-architecture, Nehalem, which is to succeed the Core. Known in the consumer retail scene as the Core i7, it presents up to eight logical processors (four physical cores, each handling two threads through Hyper-Threading) and has a new integrated memory controller supporting DDR3. The Nehalem micro-architecture shows that core count will remain the next frontier for the x86 platform for the near future. True six and eight-core variants are already being planned for release later in 2009 for the server arena.
