Voodoo Beginnings - 10 Years of GPU Development
The graphics card has certainly come a long way. Join us as we take a trip down memory lane to the humble beginnings of the 3D graphics card, to the days when 3dfx still existed and NVIDIA cards still had a sensible nomenclature.
By Kenny Yeo
Voodoo Beginnings
Unless you are an ardent gamer, you might not think much of the graphics card sitting right now on your computer's motherboard. You might even mistake it for an unnecessary piece of equipment and think that it is only used by gamers. However, graphics cards today are used for more than just gaming. In fact, they are used for a variety of tasks such as accelerating HD video playback, video transcoding, and speeding up the rendering of PDF documents and images, among others. Truly, they have come a long way from where they were 10 years ago.
Just slightly more than a decade ago, the first commercially successful 3D graphics card was released by 3dfx - the Voodoo Graphics card. It was so all-conquering that, along with its successor, the Voodoo2, 3dfx was able to dominate the graphics card upgrade market for the next few years. In fact, the best graphics subsystem back then was an STB Lightspeed 128 paired with a Voodoo for the best in 2D and 3D capabilities. The more professional folks would have chosen the Matrox equivalent for better 2D quality.
This was probably one of the earliest cards we've ever reviewed - The Canopus Pure3D II 12 MB. It was also one of the fastest Voodoo 2 cards around.
With the 3D fever gripping strong, long-time graphics suppliers like S3, Trident, Matrox and many others found it hard to keep up with the sudden change in market demand and gradually dropped out of the heated competition. It soon became a three-way contest between 3dfx, ATI and NVIDIA. Sadly, 3dfx was also showing signs of waning, and in 2000, it was eventually acquired by its arch-rival NVIDIA.
There were many reasons for the demise of 3dfx. For instance, instead of choosing short development cycles like ATI and NVIDIA, 3dfx pursued lengthy and ambitious ones. This strategy eventually backfired as they were unable to keep up with the rapid advances made by their rivals. Also, their ambitious development cycles meant that they neglected certain segments of the market, namely the mid and low-end segments. NVIDIA, especially, was able to capitalize on this large segment with their affordable yet powerful GeForce 2 MX, and ATI with their Radeon VE.
The eventual collapse of 3dfx meant that ATI and NVIDIA were the only dominant players left in the market, and the honor of having the fastest solution has swung consistently back and forth between the two ever since.
From PCI to AGP to PCI Express
While all this was happening, there were also other major changes. One such change was the way graphics cards communicated with motherboards. Over the last 10 years, in order to satisfy our need for greater bandwidth, we have moved through three major interfaces - from PCI to AGP and finally, PCI Express.
As graphics cards got speedier, a quicker, more efficient link was needed to take full advantage of them and prevent the interface from becoming the bottleneck. It came to a point, some time in 1997, when the humble PCI interface could no longer provide the bandwidth that was needed. To address this problem, Intel came up with the AGP (Accelerated Graphics Port) slot, which at its peak could offer a bandwidth of up to 2GB/s. This was a dramatic improvement over PCI, which could only manage a maximum of 133MB/s. However, considering the rate at which graphics cards were evolving, it soon became evident that AGP wasn't future proof.
As a result, PCI Express was introduced by Intel in 2004. PCI Express was markedly different from its predecessors in that it was structured around pairs of serial (1-bit), unidirectional point-to-point connections known as "lanes". In the first iteration of PCI Express (PCIe 1.1), each lane could send data at a rate of 250MB/s in each direction. The total bandwidth of a PCI Express slot was then determined by the number of lanes it had. A PCIe x1 slot would have a total bandwidth of 250MB/s, whereas a PCIe x16 slot (with 16 lanes) would have a total bandwidth of 4GB/s. The PCIe x16 variant soon became the de facto interface for graphics cards.
However, with ever faster graphics subsystems, revisions would be needed for PCI Express to remain viable, and thankfully the PCI-SIG consortium ensured that PCIe was designed with the future in mind and was extensible. In January 2007, the second-generation PCI Express 2.0 (PCIe 2.0) was born. The key advantage that PCIe 2.0 had over PCIe 1.1 was that data could now be sent at double the rate, meaning 500MB/s in each direction per lane. This meant that an x16 slot could now transmit data at an amazing 8GB/s, roughly four times greater than the fastest AGP iteration.
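To make the arithmetic behind these figures concrete, here is a minimal Python sketch of our own (an illustration only, not anything from the PCI-SIG specification documents) that reproduces the bandwidth numbers quoted in this article for PCI, AGP and the two PCIe generations:

    # Rough effective bandwidth figures as quoted in this article, in MB/s.
    PCI_BANDWIDTH = 133                          # classic 32-bit/33MHz PCI
    AGP_8X_BANDWIDTH = 2100                      # AGP at its peak (8X)
    PCIE_PER_LANE = {"1.1": 250, "2.0": 500}     # per lane, per direction

    def pcie_bandwidth(generation, lanes):
        """Total one-way bandwidth of a PCIe slot in MB/s."""
        return PCIE_PER_LANE[generation] * lanes

    for gen in ("1.1", "2.0"):
        for lanes in (1, 16):
            bw = pcie_bandwidth(gen, lanes)
            print(f"PCIe {gen} x{lanes}: {bw} MB/s ({bw / 1000:.1f} GB/s)")

    # PCIe 2.0 x16 (8GB/s) versus AGP at its peak (~2.1GB/s): roughly 4x.
    print(f"PCIe 2.0 x16 vs AGP 8X: {pcie_bandwidth('2.0', 16) / AGP_8X_BANDWIDTH:.1f}x")
    # AGP 8X versus plain PCI: roughly the 15x figure quoted for 1997.
    print(f"AGP 8X vs PCI: {AGP_8X_BANDWIDTH / PCI_BANDWIDTH:.1f}x")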
More recently, technologies such as CrossFire and SLI have taken advantage of PCI Express to give gamers that extra boost in performance. Thanks to PCI Express, multiple graphics cards riding on the same motherboard are now possible because, being a point-to-point connection, it doesn't have to wait for a shared connection to free up, nor does it require complex handshaking protocols. As such, multiple graphics cards can now communicate simultaneously among themselves and with the processor, which in turn results in much higher frame rates and a more immersive gaming experience.
SLI Cometh! With SLI, gamers could now stack two graphics cards together for added graphics processing power.
DirectX, GP-GPU and the Future
However, the development of graphics cards is not solely about sheer speed and power. All this time, there were also changes taking place beneath the surface - specifically in the Application Programming Interface (API). Without delving into details, know that most games initially made use of the popular OpenGL API, until Microsoft came along with DirectX. DirectX was born out of Microsoft's intention to establish a 3D gaming API of choice with a fixed set of standards at each iteration, so that game developers would be unified under a single API, making game design easier.
It took a while, but eventually DirectX established itself as the de facto 3D gaming API, and Microsoft continually worked on implementing new features that would benefit developers and gamers. DirectX 7.0, for instance, was a leap in 3D gaming because it introduced hardware support for Transform & Lighting (T&L) functions, which were previously handled by the CPU. DirectX 7.0, coupled with NVIDIA's now legendary GeForce 256 - the first card to support hardware T&L - helped push the immersion level of 3D gaming to the next notch. With T&L functions handled by the graphics processing unit, developers could create more realistic games with more complex scenes without worrying about overburdening the CPU.
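For readers wondering what T&L actually computes, here is a purely illustrative Python sketch (a toy model of our own, not any vendor's actual pipeline) of the per-vertex work - a 4x4 matrix transform followed by simple diffuse lighting - that hardware T&L moved off the CPU:

    def transform(matrix, vertex):
        """Apply a 4x4 transform matrix to a vertex given as (x, y, z, 1)."""
        return tuple(sum(matrix[row][col] * vertex[col] for col in range(4))
                     for row in range(4))

    def diffuse_light(normal, light_dir, intensity=1.0):
        """Simple Lambertian (diffuse) lighting term for one vertex normal."""
        dot = sum(n * l for n, l in zip(normal, light_dir))
        return max(0.0, dot) * intensity

    # A trivial example: identity transform, light shining straight down the z-axis.
    identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    print(transform(identity, (1.0, 2.0, 3.0, 1.0)))        # -> (1.0, 2.0, 3.0, 1.0)
    print(diffuse_light((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))  # -> 1.0

A GPU with hardware T&L performs this kind of math for every vertex in a scene, every frame, which is why moving it off the CPU freed up so much headroom.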
The real milestone moment in 3D gaming would be the introduction of DirectX 8.0. This revision implemented Programmable Shading, which allowed for custom transform and lighting and more effects at the pixel level, thereby increasing the flexibility and graphics quality churned out. This was also codified as Microsoft's Shader Model 1.0 standard. DX8 was first embraced by NVIDIA in their GeForce 3 series of cards, with ATI's Radeon 8500 series following suit.
However, it wasn't till the entry of DirectX 9 and the Shader Model 2.0 standard that game developers adopted programmable shading routines more liberally, as this new DirectX standard extended the capabilities of DX8 by leaps and bounds, with even more flexibility and more complex programming tools to yield the required effects. The legendary Radeon 9700 series was the first to support DX9 and was the only one to do so for a long while to come.
We're sure the gamers out there will remember this baby, the all-conquering Radeon 9700. It was so powerful, it could even handle games that came three years after its release.
These standards evolved yet again with the DX9.0c version, which embraced Shader Model 3.0 and is now the minimum standard for graphics cards and games design. Features such as High Dynamic Range (HDR) lighting, realistic shadows, instancing and more came to be supported in this revision, and it brought about more realistic gameplay. NVIDIA's GeForce 6800 series was first to support the SM3.0 model, and the tables turned as ATI wasn't able to offer an equivalent solution till the Radeon X1K series much later.
Yet another key moment was the introduction of DirectX 10, which brought about a unified shader programming model. This was once again first implemented by NVIDIA in their GeForce 8 series of cards, which not only supported the unified shader programming model, but also physically had a Unified Shader architecture. This model was revolutionary because it broke down the limitations of having specific types of shaders with the introduction of general-purpose shaders in the GPU core.
Traditionally, GPUs had dedicated units for different types of operations in the rendering pipeline, such as vertex processing and pixel shading, but in a graphics card with a unified shader architecture, such processes can be handled by any of the standard shader processing units. What this means is that in scenes where there is a heavier pixel workload than vertex workload, more resources can be dynamically allocated to run these pixel shader instructions. The end result is greater flexibility, performance and efficiency, and more significantly, it opens the door for GPU computing. ATI managed to catch up more than half a year later with similar support on their Radeon HD 2000 series.
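As a rough illustration of why the unified approach helps (a deliberately simplified model of our own, not how any real GPU scheduler works), compare a fixed split of 8 vertex and 16 pixel units against a unified pool of 24 units on a pixel-heavy scene:

    def frame_time_fixed(vertex_work, pixel_work, vertex_units, pixel_units):
        """Each unit handles one work item per tick; the slower queue sets the frame time."""
        return max(vertex_work / vertex_units, pixel_work / pixel_units)

    def frame_time_unified(vertex_work, pixel_work, total_units):
        """A unified pool splits its units in proportion to the actual workload."""
        return (vertex_work + pixel_work) / total_units

    # A pixel-heavy scene: 100 vertex items, 2300 pixel items.
    vertex_work, pixel_work = 100, 2300

    print(f"Fixed split : {frame_time_fixed(vertex_work, pixel_work, 8, 16):.1f} ticks")
    print(f"Unified pool: {frame_time_unified(vertex_work, pixel_work, 24):.1f} ticks")

With the fixed split, the pixel units become the bottleneck (about 144 ticks) while the vertex units sit idle; the unified pool finishes the same work in 100 ticks because every unit stays busy.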
NVIDIA's 8-series of cards were the first to embrace DirectX 10.0. They also employed a Unified Shader Architecture, which allowed for superior performance over their rivals.
Today, graphics cards continue to evolve and improve while even more interesting developments are taking place. One of the more exciting things that has been discussed is general-purpose computing on graphics processing units (GP-GPU), which involves GPUs taking on general computing tasks, thus increasing the overall performance and efficiency of the system.
This has been a challenge for engineers and programmers thus far because GPUs, as powerful as they are, excel only at certain floating point operations and lack the flexibility and precision to take on tasks that CPUs traditionally do. Modern GPUs have worked around this by employing many smaller, general-purpose 'stream' processors, and with the development of an open compute language to bridge the architectural and hardware differences between ATI and NVIDIA, this is an exciting area of growth.
To put the raw power of a GPU into perspective: ATI's latest Radeon HD 4800 series of cards is capable of achieving in excess of 1 teraFLOPS, while the fastest of Intel's processors - the quad-core QX9775 - can only manage around 51 gigaFLOPS. Already, GPUs have proven that they are far more capable than CPUs at accelerating video decoding, and likewise in video transcoding tasks, where the CPU could take many hours to do what the GPU can finish off in the span of a lunch break.
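Where does a teraFLOPS figure like that come from? A back-of-the-envelope Python sketch using the commonly cited Radeon HD 4870 specifications (800 stream processors at 750MHz, each able to issue a multiply-add, i.e. two floating point operations, per clock) - treat these as peak theoretical throughput, not real-world performance:

    # Peak throughput = number of units x clock speed x operations per unit per clock.
    stream_processors = 800     # Radeon HD 4870
    core_clock_ghz = 0.75       # 750MHz
    flops_per_clock = 2         # one multiply-add counts as two operations

    gpu_gflops = stream_processors * core_clock_ghz * flops_per_clock
    cpu_gflops = 51             # the QX9775 figure quoted above

    print(f"Radeon HD 4870 peak: {gpu_gflops:.0f} GFLOPS ({gpu_gflops / 1000:.1f} TFLOPS)")
    print(f"That is roughly {gpu_gflops / cpu_gflops:.0f}x the quoted CPU figure.")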
The latest cards from ATI are reportedly capable of achieving over 1 teraFLOPS, much more than what the fastest of quad-core processors can achieve.
There is also much buzz from ATI and NVIDIA about creating the ultimate visual experience. What does this exactly mean? Simply put, think of combining the Wii's interactivity with movie-quality digital renders. It's all about putting the player at the forefront of the action. To get a better idea, we suggest you read what David Kirk, NVIDIA's Chief Scientist, and John Taylor, AMD's Director for Product and Strategic Communications, had to say in our interviews with them.
Clearly, these are exciting times for graphics cards. Faster and more powerful cards mean more realistic-looking games (think near movie-quality), and GP-GPU, if tackled correctly, could potentially unleash tremendous amounts of computing power. With so much in store, we can't wait to see where the next 10 years will take us. For now, we present a timeline of the last 10 years of GPU progression, up next following the jump.
1997
- This was when 3D graphics cards and accompanying game development took off in a big way. 3dfx released the first true 3D graphics chip, the now legendary Voodoo. It was powerful and included several fundamental 3D effects processing capabilities. This marked the beginning of 3dfx's domination over the graphics card industry (at least before the Y2K era).
What Sound Blaster did for PC sound, 3dfx did for PC graphics.
- We also saw the introduction of AGP as the new interface between the graphics card and the motherboard. Intended to replace the older PCI interface, AGP offered as much as 15 times the bandwidth of PCI, and would remain dominant for the next few years until the birth of PCI Express.
- At this point, the main players in the graphics card industry were down to three - 3dfx, ATI and NVIDIA. The sudden interest spike for 3D graphics cards meant that the traditional graphics vendors had to evolve fast to avoid elimination from the market. The three mentioned vendors were able to do just that, and their dominance meant that other well-known graphics vendors, such as S3, 3Dlabs and Rendition, had little chance in the mainstream, performance and enthusiast markets. Matrox, though, was still in with a shout with its Millennium series of cards, which consistently offered superior 2D speed and graphics quality for those who were serious about professional work.
1998
- Rendition, left bruised and battered by 3dfx, ATI and NVIDIA, was eventually acquired by Micron, a semiconductor company. Micron kept Rendition around in hopes that it could work on embedded graphics for its own line of motherboards. Unfortunately, nothing happened and Rendition just faded away.
- 3dfx sought to secure its hold on the 3D graphics card market by introducing the Voodoo 2. It was technologically superior to its competitors as it allowed two textures to be drawn in a single pass, making it vastly faster, especially in games that used lots of textures and/or multi-texturing.
The Canopus Pure3D II 12MB was one of many Voodoo 2 cards we reviewed, and we were completely blown away by its performance. This Canopus card was one of the fastest of the Voodoo 2 bunch, and this was in no small part due to its high quality Silicon Magic 100MHz 25ns EDO DRAM.
One of the earliest cards we've ever reviewed and one of the fastest as well. During its time, the Canopus Pure3D II was peerless.
- To promote the use of AGP, Intel introduced its own graphics chipset, the i740. We reviewed the ASUS AGP-V2740TV Intel i740 and found its performance to be average, though good for video editing and playback of VCDs/DVDs for output to a television. Sadly, the i740 chipset didn't take off and it ended up being Intel's only foray into the dedicated graphics card market thus far (though Larrabee, currently in development, might change that).
Intel's only foray into the graphics chipset market thus far, the ill-fated i740. It was good only for light multimedia work and could not handle games as well as the Voodoo cards could.
- This year also saw the release of Matrox's much-hyped G200 chipset. It combined Matrox's renowned 2D performance with a fully-featured 3D accelerator. We managed to get our hands on a Matrox Millennium G200 AGP and thought that it was a competent all-rounder, providing excellent graphics quality and decent frame rates. This, by the way, was also our Editor Mr. Vijay's first-ever review for HardwareZone.
- Towards the later part of 1998, 3dfx introduced its first ever 2D/3D chipset - the Banshee. We reviewed the Creative 3D Blaster Banshee and found it to be a good but ultimately flawed product. It had good 2D performance and was decent in games, but the lack of features like 32-bit color and multi-texture support meant that it wasn't on par with many of the newer cards appearing then, and it would soon be behind the curve.
- Around the same time, NVIDIA launched their RIVA TNT chipset. The RIVA TNT was the first time anyone seriously challenged the Voodoo 2 for the mantle of fastest graphics chipset. It was almost as fast as the Voodoo 2, and in addition, it supported 32-bit color and had 2D acceleration - something the Voodoo 2 cards didn't have.
We tested the Canopus Spectra2500 AGP and were thoroughly impressed. The RIVA TNT probably marked the beginning of the end of 3dfx's reign.
Canopus' TNT-based Spectra2500. This was one of the few cards that could ever hope to challenge the Voodoo 2. Not only was it fast, it supported 32-bit color and had 2D acceleration.
1999
- 3dfx finally announced the much anticipated Voodoo 3 chipset, which was based heavily on the earlier Banshee and Voodoo 2 chipsets. As we noted while testing the Voodoo 3 2D/3D card, it was fast, especially in games that were optimized for the Glide API, but sadly it still lacked 32-bit color rendering. Looking back, the card didn't offer much improvement over the earlier Voodoo 2 and was eventually completely outclassed by NVIDIA's GeForce 256 and ATI's Radeon.
- NVIDIA later improved on the success of the RIVA TNT chipset by introducing the TNT2. The TNT2 was mostly similar to its predecessor, but included support for AGP 4X and up to 32MB of video RAM. Additionally, the TNT2 was manufactured on a smaller, more advanced process technology than the older TNT and could hit much higher clock speeds. We had the Canopus Spectra 5400 Premium Edition AGP, a really high-end RIVA TNT2 card, in our labs and were absolutely thrilled with its performance. Its price, however, was just as thrilling, but in a different way - S$550! That was a lot of dough for a graphics card in those days.
TNT2 arrives! It continued to offer competitive 3D performance and 32-bit color support, and in our tests, we found that its graphics were of higher quality. 3dfx was now really feeling the heat.
- In that same year, Matrox released its G400 chipset, which was essentially a refined and more powerful version of the earlier G200. It included multiple monitor output support and had a new 3D feature known as Environmental Mapped Bump Mapping. We reviewed the Matrox Millennium G400 Dualhead, and although it provided average 3D performance, we were absolutely delighted by its 2D performance and the quality of its graphics, as you would expect from a Matrox card.
- In mid-1999, NVIDIA landed the killer blow to 3dfx by announcing its new GeForce 256 chipset. Along with Microsoft's DirectX 7.0 standard, it ushered in a new era in 3D gaming as Transform & Lighting (T&L) functions were now handled by the GPU directly, and it was many times faster than what a CPU could process back then. This provided a tremendous boost in the quality of graphics as well as frame rates. We reviewed Creative's 3D Blaster Annihilator, and were impressed with the quality of its graphics and frame rates.
The GeForce 256, a true legend amongst graphics cards. Hardware support for T&L brought about unprecedented gains in performance and image quality.
2000
- To combat the threat that was the awesome GeForce 256, ATI came up with the radical ATI Rage Fury MAXX, which was probably the first card ever to feature two GPUs (Rage Fury chips) on a single PCB. It employed a technique called Alternate Frame Rendering (AFR) and was fast enough to match the GeForce 256 cards using SDRAM. However, its lack of T&L support ultimately meant that it wasn't a card for the future, and it had operating system compatibility issues outside of Windows 98. These were severe drawbacks by themselves, made worse by the fact that a single GeForce 256 graphics card outfitted with DDR graphics memory was able to outclass the Rage Fury MAXX. Nonetheless, this dual-GPU graphics card will go down in history just because it was the first of its kind, and we're fortunate to still have one of them in our labs - which is now a showpiece of course!
- It didn't matter in the end, because later in 2000, ATI unleashed the Radeon. We had the ATI Radeon 64MB DDR VIVO AGP in our labs and found it to be quite a capable card. At this point, things were really looking bad for 3dfx and they had to respond. Fast.
While NVIDIA had the GeForce 256, ATI, on the other hand, had the Radeon. Together, they would bring 3dfx to its knees.
The Voodoo 5. This was to be 3dfx's last graphics card. The twin-threat that was the GeForce 256 and Radeon proved to be too much for the ailing graphics card company to handle.
- 3dfx finally released the eagerly anticipated Voodoo 5. Looking back, the Voodoo 5 was too little too late, while some of its features were ahead of their time. At that time, however, we thought that despite its shortcomings, the Voodoo 5 was still a good card and could even possibly herald the comeback of 3dfx. Ever present in the minds of techies was the legendary Voodoo 5 6000, which had four GPU cores, was powered by an external power brick, and never saw the light of day in retail. Even today, no vendor dares to make a graphics card with more than two GPUs - the complexity of the board design and short time-to-market needs simply make it infeasible.
Ironically, later that year, 3dfx declared bankruptcy and was eventually bought over by NVIDIA. From this point on, the graphics card market was dominated by ATI and NVIDIA.
- In 2000, NVIDIA also built on the success of their GeForce 2 line by introducing the GeForce 2 MX GPU. The MX denoted that the chipset was for the more budget-conscious. It was much more affordable than its higher-end siblings and was therefore extremely popular, especially amongst OEM system builders, who now had a low-cost 3D solution.
Despite being targeted at the budget-minded, it was still a capable performer, as evidenced by the Asus AGP-V7100/Pure 32MB SDRAM.
2001
- This year, Microsoft introduced DirectX 8.0, which implemented Programmable Shading. This allowed non-standard shading routines to be applied at the pixel level for interesting effects (more than just T&L functions), resulting in more realistic graphics.
The follow-up to the immensely successful GeForce 2 - the GeForce 3. It was also the first card to fully support DirectX 8.0, and hence, Programmable Shading.
- In response to this move by Microsoft, NVIDIA launched the GeForce 3, the first card to fully support DirectX 8.0. It wasn't groundbreaking in the same way NVIDIA's previous cards were and the GeForce 3 could even, in some cases, be outperformed by the older GeForce 2 Ultra.
- ATI, on the other hand, unleashed the Radeon 8500. Unfortunately, its launch was marred by problems with its drivers. Once that was sorted out, however, it proved to be a competitive card.
- S3, now known as SONICblue, sold its graphics business to VIA, a chipset provider, for over US$300 million, choosing instead to concentrate on digital media. And although whatever technology VIA inherited from S3 was not powerful enough to compete with modern-day GPUs, its low cost made it an ideal integrated solution.
2002
- Early in the year, NVIDIA was the undisputed speed king once more with the launch of its GeForce 4 Titanium series of cards. It was similar to the earlier GeForce 3 GPU, but with a few additional features, such as higher core and memory clock rates, an improved memory controller and the introduction of the nFiniteFX II Engine.
We had an NVIDIA GeForce4 Ti 4600 128MB DDR in our labs and were surprised at how fast it was. Although it offered top-notch performance, it was also very expensive.
NVIDIA reclaimed the honor of having the fastest card with the Ti 4600. It was very fast, but also very expensive.
- To ensure they stayed competitive, NVIDIA would later expand the lineup by introducing the cheaper Ti 4200 GPU. This GPU was extremely popular and we managed to secure a Leadtek WinFast A250 LE TD 64MB for testing. Although it was considerably cheaper, it still provided great performance and this particular model was extremely overclockable.
The Ti4200 was the card of choice amongst most mainstream gamers. It was considerably cheaper than the top-of-the-range Ti4600, yet offered good enough performance.
- By now, AGP was the de facto interface for graphics cards, and this year saw the introduction of AGP 8X.
Theoretically, AGP 8X would double the rate of data transfer from 1.06GB/s to 2.1GB/s, and so we sought to find out how much performance you would get from this increase in bandwidth. You can read the full test here, which was then a world exclusive.
In our tests, we took a SiS648 reference board as our Universal AGP 3.0 platform, and accompanying it was a SiS Xabre400 graphics card, which had support for both AGP 4X and AGP 8X transfers. Since the reference board did not allow us to set the AGP transfer speed in the BIOS, we had to modify the graphics card to force it to operate in AGP 4X (Mode 2.0). This allowed us to compare the performance of the same graphics card using different AGP transfer speeds.
Despite using only a SiS Xabre400 graphics card, we found AGP 8X to give us about a 4.7% boost in frame rates, and we believed that we would see greater gains with a higher-end graphics card.
However, in a follow-up test with a higher-end GeForce Ti 4200 8X card, we were surprised to find that there was no significant improvement between AGP 4X and AGP 8X. We attributed this to the fact that AGP 8X was still in the early stages of implementation and that there were still higher-end cards that remained untested.
Surprisingly, however, in a third test (this time with a Radeon 9700), we once again found no significant gain in performance. Clearly, AGP 8X provided negligible benefits. We concluded that this could be because both the Radeon 9700 and Ti 4200 had larger frame buffers, and because these cards were more than capable of handling the games of the moment.
- In mid-2002, Creative Technology acquired 3Dlabs, the creator of the Permedia chipsets, thereby signaling their intentions to be a major player in the graphics card market. Sadly, that was not to be, and 3Dlabs was left languishing under Creative's ownership.
- The graphics card market was becoming increasingly crowded and competitive. To stand out, card manufacturers such as Sparkle started looking at packaging their products differently to attract buyers. Sparkle's Platinum GeForce4 Ti 4600 was one of those cards. A card in a tin can? Who would have thought of that?
If we gave out awards for most innovative packaging, this card would have won hands down. Never again have we seen cards coming in tin cans.
- Finally, ATI released the Radeon 9700, which would later go on to achieve legendary, almost godly status. It was so powerful that it trumped the previously fastest card, the GeForce Ti 4600, by a 20% margin. With anti-aliasing and anisotropic filtering turned on, it would beat it by anywhere from 40% to 100%! In fact, the Radeon 9700 was so powerful that it would allow gamers to achieve playable performance on even the latest games three years after its launch. We reviewed the Gigabyte MAYA II GV-R9700 PRO and proclaimed that, in its time, nothing in the market even came close to matching it for sheer performance.
Industry experts would later declare the Radeon 9700 to be one of the most important breakthrough graphics cards in history, alongside NVIDIA's GeForce 256 and 3dfx's Voodoo.
The sight of the Radeon 9700 caused our eyes to well up with tears. This was truly a monstrous card. Absolutely nothing could stand up to it.
2003
The 9500 PRO chipset was ATI's mid-range champion. It was faster, in some instances, than even NVIDIA's Ti 4600. Furthermore, you could even mod it to get 9700-levels of performance.
- The R300 core in the Radeon 9700 proved to be a smash hit, and to build on its success, ATI soon released the Radeon 9500 chipset, aimed at the mid-range market. We got our hands on a Gigabyte MAYA II GV-R9500 Pro 128MB for testing. Considering it was only about 30% slower than the 9700 Pro speed-king, and that it was, in some instances, faster than a Ti 4600, it was an instant winner.
Moreover, there were stories of how a regular 9500 could be modified into a 9700. The basis for this is that the 9500 is essentially the same chip as the 9700, save for four fewer rendering pipelines. The four missing pipelines are actually present on the regular 9500 - they have simply been disabled. So to turn a regular 9500 into a regular 9700, one would in theory just have to enable these four rendering pipelines. Industrious hackers soon found two ways to go about doing this. One is through a simple software hack; the other is to physically modify the card, which, as you can see, requires utmost skill and precision.
- NVIDIA responded to the Radeon 9700 with the FX 5800. We tested the NVIDIA GeForce FX 5800 Ultra which we aptly nicknamed 'The Dustbuster' (no prizes for guessing why). It was a good attempt, but given its astronomical price and lackluster performance, it never caught on.
The Dustbuster debuts, much to everyone's disappointment. It was incredibly expensive, yet wasn't much faster than the 9700 PRO, making it a difficult purchase to justify.
- The FX 5800 was eventually improved on, resulting in the FX 5900. We had an MSI NBox N5900 ULTRA in our labs, and although we weren't exactly blown away by its performance, it redeemed itself with its very comprehensive package. And you must check out its cooling.
Again, the 5900 Ultra from NVIDIA wasn't exactly ground-breaking, but this card was still worthy of a mention because of its radical cooling solution.
Based on the 9600 PRO chipset, not only was this card a capable performer, it had looks to boot as well. We simply adored the all-silver PCB. Very funky.
- In the meantime, ATI strengthened its grip on the mid-range market by releasing the Radeon 9600 PRO. The Triplex REDai RADEON 9600 PRO 128MB was one of the best examples. We were especially fond of its funky cooler and all-silver PCB.
Another excellent example of the 9600 chipset is the GeCube RADEON 9600XT 128MB Extreme Edition. Not only did we give it a full five out of five stars, it also garnered our most overclockable award. The memory on a stock card is clocked at 700MHz, but we managed to get ours all the way up to a nausea-inducing 810MHz!
Needless to say, it achieved stunning results. With the memory overclocked to such levels, it was a whopping 20% faster than most other 9600XT cards and could even go head-to-head with NVIDIA's FX 5800 and FX 5700 Ultra!
This was one of the few cards to ever receive the full 5 stars from us. Not only that, it was also awarded the most overclockable award!
- Inevitably, with increased performance comes increased heat. MSI's FX5600-VTD128-J (A.C.T.) was one of those graphics cards to employ some radical cooling technology called A.C.T. - Aeronautical Cooling Technology. It did away with the need for a fan, but at the cost of onboard real estate.
Not exactly a fast performer, but this card still got our attention thanks to its radical cooling solution. The Silent-Snake amongst graphics cards then?
2004
- As graphics cards got faster, AGP became inadequate and a new interface was needed. It was in this year that Intel introduced PCI-Express, which brought about many improvements; chief among them was greater bandwidth.
- With so many graphics cards flooding the market, we decided to hold a shoot-out, and this 3-Way GeForce FX 5700 Ultra Mini-Roundup was perhaps our first graphics card shoot-out.
This was the year NVIDIA finally got back in the game and reclaimed its position as the fastest graphics card maker. Its weapon? The awesome GeForce 6 series.
This card was so important that we dedicated a massive 28 page article to it, making it one of our most comprehensive and largest articles ever. Written by our Editor Vijay, it provided an extremely detailed look at the new NVIDIA GPU.
We had an NVIDIA GeForce 6800 Ultra to play around with and were absolutely astonished. The increase in performance over the previous generation was more than just substantial - it was phenomenal. With it, NVIDIA was instantly back in the game.
Finally, NVIDIA counters with the 6800 Ultra, landing an uppercut right on ATI's chin! It was the first to fully support Shader Model 3.0, something ATI would only offer a full year later!
- Not wanting to rest on its laurels, NVIDIA introduced their Scalable Link Interface (SLI), a technology for connecting two or more graphics cards together to produce a single output. This meant more processing power for computer graphics.
However, there were a couple of problems. For one, performance was a mixed bag and was very much dependent on whether or not the application was optimized to take advantage of multiple GPUs. Another problem that NVIDIA faced was that few motherboards, at that point in time, had true dual PCIe x16 slots to take full advantage of SLI.
Our conclusion then was that although SLI was promising, it clearly needed more time to mature before it could become an accessible technology - which it eventually did.
- ATI was not one to sit back, and soon they launched their successor to the Radeon 9800 - the X800. We tested the HIS Excalibur X800 PRO IceQ II (256MB), and given its outstanding performance, we thought that a shoot-out was necessary - hence our Q3 2004 High-End GPU/VPU Shootout article.
Here, we noted that ATI's latest GPUs did not offer full support for Shader Model 3.0. As more games began to be optimized for Shader Model 3.0, ATI began to see a drop in performance, as its GPUs could not handle games optimized for Shader Model 3.0 as well as NVIDIA's could.
ATI retaliated with the X800 XT, which inexplicably failed to provide support for Shader Model 3.0. This particular model from HIS had a custom cooling unit from cooling specialists, Arctic Cooling.
- Towards the end of the year, NVIDIA introduced the extremely popular GeForce 6600 GT. We had a Leadtek A6600 GT TDH in our labs and found that it completely obliterated the competition.
Later, we pitted the GeForce 6600 GT against a Radeon X700 XT to find out which would be the best mid-range graphics card. Our findings were recorded in our Performance Midrange GPU/VPU Shootout (PCIe) article, and unsurprisingly, the GeForce 6600 GT came out tops. Interestingly though, the Radeon X700 XT never made it to retail; only the slower PRO version did. And thus began the infamous 'paper launches' that ATI was often caught doing.
NVIDIA scores another home-run with its 6600 GT chipset, targeted at mainstream and casual gamers. It was more than a match for ATI's X700 XT and was therefore, unsurprisingly, the choice of many.
- Now that SLI had been around for some time, we finally had a go at it in our labs. Our MSI GeForce 6600 GT PCIe SLI Performance Review documents our findings. In summary, we found the performance it offered to be a mixed bag. It offered tangible gains in some games, yet in others, there was no difference at all. Nevertheless, it boosted sales of GeForce 6600 GT cards.
This was probably our very first SLI test. 6600 GTs were reasonably priced at that time, and we sought to find whether or not two 6600 GTs would offer any significant gains in performance.
2005
- With SLI slowly catching on, Gigabyte came up with an ingenious solution not seen since the days of ATI's Rage Fury MAXX - two GPUs on a single card. We had the Gigabyte GV-3D1 (Dual GeForce 6600GT) for testing and despite the fact that it was tied down to a single motherboard due to SLI driver constraints, it was nevertheless awarded our 'Most Innovative Product' award.
Not seen since the time of ATI's Rage Fury MAXX - this is Gigabyte's GV-3D1, which put two 6600 GT GPUs on a single PCB. Because of its ingenuity, it was awarded our Most Innovative Product award.
- With 512MB cards slowly flooding the market, we decided to investigate whether or not the extra memory actually helped. We hypothesized that faster-clocked cards would do more good than the extra memory, and we did our tests with a Sapphire Hybrid RADEON X800 XL 512MB (PCIe). Just as we suspected, the faster-clocked Radeon X850 XT card came out tops in most of our tests, proving our hypothesis right.
- Just a year after releasing the awesome GeForce 6 series of cards, NVIDIA was taking it to the next level with the new G70 core, which powered the GeForce 7 series of cards. We were at Computex in Taiwan to get a first look at it. You can look at our coverage here - NVIDIA G70 - A Snapshot Preview.
Seen at Computex 2005, this is a sneak peek at NVIDIA's upcoming G70 core. Set to appear in the flagship 7800 GTX, this was the most complex GPU of its time.
- Also at Computex that year, ATI finally unveiled their response to NVIDIA's SLI - CrossFire - albeit more than half a year later. In our ATI CrossFire Preview, we noted that CrossFire offered users greater flexibility with graphics card configurations, and initial test runs showed that it offered a substantial gain in performance, which was encouraging.
Also at Computex 2005, ATI finally unveiled their own multi-GPU solution, CrossFire. First impressions were good and we were especially excited about its flexible card configurations.
- With SLI becoming increasingly popular, card manufacturers started looking at innovative ways of offering it at low prices. Dual GPUs on a single card was fast becoming an attractive option. We tested the ASUS EN6800GT Dual (6800GT SLi) and found it to be a monster, in more ways than one. Not only was it fast, it was also huge - installation in many casings was a problem.
When it was launched, it was one of the biggest cards we've ever seen, dwarfing everything else in our office.
- Another interesting GeForce 6600 GT card released this year was the MSI NX6600GT-V2TD128E Diamond (GeForce 6600 GT, PCIe). It came with a unique software utility called CoreCell 3D, which allowed users not only to monitor the status of the card, but also to tweak and overclock it. Moreover, it was extremely comprehensive and easy to use.
CoreCell 3D was such an impressive little utility that we decided to give it our 'Most Innovative Product' award. It even made it into our Top 100 products of 2005.
Awarded our Most Innovative Product award, this card from MSI was one of the best 6600 GT money could buy.
- Yet another interesting card based on the GeForce 6600 GT was the ASUS Extreme N6600GT Silencer 256MB (GeForce 6600 GT, PCIe). Although its performance was nothing to shout about, its cooling solution definitely was.
Card manufacturers sought ways to make their products stand out, and implementing radical cooling solutions like this was one of them.
- Midway through the year, NVIDIA finally unleashed the fearsome NVIDIA GeForce 7800 GTX and we rushed to test it. Expectations for it were high and we were not disappointed. At higher resolutions, it completely smashed ATI's flagship GPU, the Radeon X850 XT. And in SLI mode, it churned out phenomenal frame-rates. To say that it was amazing was an understatement.
On a side note, our article on the GeForce 7800 GTX was so comprehensive that it was picked up by many news reporting sites.
The 7800 GTX, based on the new G70 core, was launched to much fanfare, and it didn't disappoint. It obliterated the competition and set a new yardstick by which all cards would now be measured.
- We later followed up on the 7800 GTX by reviewing the Exclusive: ASUS Extreme N7800 GTX TOP (GeForce 7800 GTX). It was the best 7800 GTX by far, because it was overclocked to insane levels. A stock 7800 GTX is clocked at 430/1200MHz, but with its customized cooler from Arctic Cooling, ASUS was able to overclock this card to a dizzying 486/1350MHz. Needless to say, it brought about substantial gains in performance.
This card eventually appeared in our Top 100 products list for 2005, and was our pick for best 7800 GTX card.
As if the 7800 GTX wasn't fast enough, ASUS saw fit to stick a large customized cooler on top of it, and then overclocked it to insane levels.
- The highly-anticipated Quake 4 was finally released. In our Quake 4 Performance Review, we documented the major graphics changes in the game and told users how best they could go about upgrading their systems to ensure the smoothest frame rates and the highest quality graphics.
- SLI had already made it to desktops, but what about notebooks? In our World Exclusive MSI's MXM SLI Card article and MSI Geminium-Go (MXM SLI Card) review, we talked about dual GPUs in notebooks and even demonstrated the feasibility of using mobile GPUs on desktops.
Unfortunately, despite the encouraging results we've seen, these new technologies weren't as popular as we would have liked.
Unique and one of a kind, the MSI Geminium-Go brought the best of laptop graphics to the desktop.
- Later in the year, ATI finally launched their X1000 series of cards, which was drastically redesigned and was also their first-ever series of cards to fully support Shader Model 3.0.
Their high-end cards, such as the X1800 XT, were particularly strong performers, as evidenced by the ASUS EAX1800XT TOP (Radeon X1800 XT 512MB). It was the fastest available card at the time, but was continuously plagued by availability issues and hence didn't pose as much of a threat to NVIDIA as ATI would have liked.
ATI would later improve on this by introducing the X1900 XT. We tested the PowerColor Radeon X1900 XT 512MB and like its predecessor, it was a supremely fast card.
2006
- In the early part of 2006, NVIDIA shot back by releasing the 7900 GT and 7900 GTX chipsets. By this time, the battle for top spot was really heating up and it seemed as if both sides were releasing new GPUs every other week.
The GeForce 7900 GTX was meant to recapture the crown of speed-king for NVIDIA, but the ASUS EN7900GTX (GeForce 7900 GTX 512MB) didn't provide us with any conclusive answers. It seemed, at that moment at least, to be a deadlock between the Radeon X1900 XT and GeForce 7900 GTX.
NVIDIA of course didn't forget about the mid-range market and soon released the 7600 GT GPU. We tested the ASUS EN7600GT (GeForce 7600 GT 256MB) and Leadtek WinFast PX7600 GT TDH Extreme 256MB and found them to be worthy successors to the 6600 GT. The Leadtek, in particular, was our favorite 7600 GT card because of its willingness to be overclocked. So happy with it were we that we gave it a full five stars!
This Leadtek 7600 GT Extreme was one of our favorite 7600 GT cards. Its willingness to be overclocked and its competitive showing were enough for us to give it the full five stars!
- In February 2006, 3Dlabs announced that it would stop developing and selling 3D graphic chips and would instead focus on embedded and mobile media processors.
- The year also saw the debut of the PhysX physics engine. We had a go at the ASUS PhysX P1 GRAW Edition 128MB (PCI), one of the world's first few Physics Processing Units (PPU). PhysX would later go on to be acquired by NVIDIA, and subsequently be integrated into their graphics cards.
We tested the ASUS PPU and found that it provided a somewhat better gaming experience. In games that were optimized for it, objects seemed to move more realistically. Of course, these were early days for PhysX, and it would have to be widely implemented in games for a PPU to make any sense.
This was one of the world's first dedicated physics processing units. Looks like any other low-range graphics card doesn't it?
- Midway through the year, NVIDIA dropped the bomb on ATI by releasing the NVIDIA GeForce 7950 GX2 1GB. Rather than put two GPUs on a single PCB, NVIDIA somehow managed to sandwich two 7900-series cards together (two PCBs) and make them run on a single PCIe x16 slot, giving us the monstrosity that was the GeForce 7950 GX2.
What this meant was that you could essentially put two of these together and end up with quad SLI! And as an added bonus, it was competitively priced. ATI struggled to come up with an answer to this.
Bring your biggest guns, because the NVIDIA 7950 GX2 is in town! It was the reigning speed-king until the debut of the 8800 GTX.
- Towards the end of the year, AMD completed its acquisition of ATI. From this point on, they shared technologies, and there has been an increasing emphasis by AMD on unifying the CPU and GPU so that processes become more seamless.
- Turning our attention back to the mid-range market, we saw a few interesting cards released based on the 7600 GPU. One of them was the ASUS EN7600GS TOP Silent (GeForce 7600 GS 512MB), for its radical-looking cooler; another was the Gigabyte GV-NX76G256HI-RH (GeForce 7600 GS, HDMI), because it was one of the first cards to support HDMI and HDCP, making it a good choice for HTPCs.
We also did a shootout, pitting NVIDIA's 7600 and 7300 series of cards against ATI's X1800 and X1600. And if there were ever any doubts about NVIDIA's superiority, this shootout put them all to rest.
The 7600 GS chipset was targeted at the low- to mid-range market, and this card by Asus featured a really interesting-looking heatsink.
- ATI later released the X1950 XTX, and we tested two cards based on this GPU: the ATI Radeon X1950 XTX 512MB DDR4 and the MSI RX1950XTX-VT2D512E Water Cooled Edition (Radeon X1950 XTX). They were good cards in their own right, a match for the regular 7900 GTX cards, but they could not hope to compete with the fearsome GeForce 7950 GX2.
This was one of the few cards we reviewed that was water-cooled. Very cool-looking, but you might find yourself regretting buying it once the novelty wears off.
- And before ATI had a chance to catch their breath, NVIDIA released their GeForce 8-series of GPUs. These GPUs were the first to be fully DirectX 10 compliant and to incorporate a Unified Shader Architecture. Once again, NVIDIA made big leaps and bounds, leaving ATI looking very shabby.
We tested the flagship NVIDIA GeForce 8800 GTX and were once again blown away by what NVIDIA had managed to achieve. Let's put its sheer power in perspective: up to 70% faster than the older 7900 GTX and up to 30% more powerful than the dual-GPU 7950 GX2 combo card.
NVIDIA continues to pound ATI into submission by releasing the supremely powerful 8800 GTX. Look at how big it is!
2007
- Soon after, we tested a number of cards based on the GeForce 8800 GPU. One in particular was peculiar in that it was one of the few cards to actually overclock its shader units. In fact, they were overclocked to 1350MHz, which is equivalent to that of the 8800 GTX. Hence, despite having only 320MB of memory, it still managed to put in some impressive performances. As such, it even made it into our Top 100 products of the year.
Admittedly, it looks rather plain jane, but it is one of the few cards to feature overclocked shader units.
- The mid-range market, it goes without saying, was not neglected by NVIDIA, and in our shootout, we covered the 12 best mid-range 8600 GTS cards to find out which offered the most bang for the least buck. It was also our biggest graphics card shootout ever!
Another mid-range card worthy of a mention came with a special 'OC Gear' - a separate module that fits into a 5.25-inch bay and allows users to tweak the core clock at the turn of a knob.
The OC Gear is really useful for on-the-fly tweaking of your card's settings. In addition, we think it looks really cool.
- After much waiting, ATI's high-end Radeon HD 2900 series of cards finally entered the market, and we tested the Radeon HD 2900 XT to see if it could reclaim the crown from NVIDIA. Sadly, that was not to be. The GeForce 8800 GTX proved too much for it to handle, but the Radeon HD 2900 XT was more than a match for the less powerful 8800 GTS. For enthusiasts looking for something a notch below the 8800 GTX, the Radeon HD 2900 XT soon became a compelling option.
ATI's latest flagship, the HD 2900 XT, disappoints again. Not as fast as the 8800 GTX, it was saved only by its aggressive pricing.
- By now the market was absolutely flooded with cards. There were so many different GPUs and makers that it was mind-boggling. So to help readers, we took great pains to come up with this.
- Towards the end of the year, NVIDIA decided to refresh its 8800 lineup once more and launched the GeForce 8800 GT. This was an outstanding card because it offered superb performance at a very good price.
We noticed that two GeForce 8800 GT cards would cost as much as a single GeForce 8800 GTX, and so in a follow-up article, we took a pair of cards and put them in SLI to see how they would stack up. Although the pair provided superior performance, we found that the drivers didn't seem to be up to the mark, as we ran into a few issues while testing. Otherwise, a pair of GeForce 8800 GT cards would prove to be a very enticing option.
Given the outstanding value of the GeForce 8800 GT, we decided to hold another shootout - to help buyers pick out the best of the lot. In this shoot-out, we covered no less than ten GeForce 8800 GT cards.
One of the most impressive cards ever, the 8800 GT packed great performance and price in one unbeatable package.
- ATI came back once more with the Radeon HD 3800 series, which was basically a die-shrunk version of the GPU used in the Radeon HD 2900 XT. We received the less powerful Radeon HD 3850 for testing, and were slightly disappointed by its performance. It was not the 8800 GT beater that we had hoped it would be, and we speculated that even the more powerful Radeon HD 3870 would have difficulties matching up to the GeForce 8800 GT. If there was anything to take solace in, it was its competitive pricing.
2008
- NVIDIA kicked off 2008 with a bang as it announced its Hybrid SLI solution. What this did was allow a discrete graphics card to work in tandem with an integrated graphics solution in much the same way two discrete graphics cards would in the more traditional SLI formation (though it had a few pairing restrictions).
- By this time, it became clear that ATI couldn't hope to match NVIDIA on pure performance alone. In light of this, card manufacturers sought different ways to improve and differentiate their products from the reference design and one popular way was to overclock the cards.
We received one such card for testing, and found it to be good value for money. It had good overclocking potential and, if tweaked properly, could offer almost 8800 GT levels of performance.
One of the most impressive cards based on the Radeon HD 3870 GPU, however, was the Radeon HD 3870 X2. As the 'X2' in the name suggests, it is a dual-GPU card, but unlike other recent dual-GPU cards, it featured two GPUs on a single PCB - something not seen in a long time. As expected, it was quick, and we found it hard not to recommend it to anyone with deep enough pockets.
Ah, yet another dual-GPU card - something we haven't seen in a while. Although this card was as big as the 8800 GTX, it was much heavier, tipping the scales at 1.1kg.
- With HD content becoming more prevalent on the web, we decided to investigate ATI's and NVIDIA's hardware solutions for HD decoding in a dedicated article.
All in all, we found both technologies to be effective at reducing CPU utilization when playing HD content, to the point where even an old P4 system could handle HD content comfortably. There were, however, some interesting discrepancies. Chief among them was that NVIDIA's PureVideo HD seemed to work less effectively on their lower-end cards.
Later, we decided to put ATI's new mainstream GPUs - the HD 3650 and HD 3450 - under further scrutiny, to see how they compared to the cards in our earlier HD decoding tests. Considering that these new cards were basically shrunken versions of their older incarnations, it was unsurprising to find that their performance was largely similar.
- In response to the dual-GPU HD 3870 X2, NVIDIA released its much-anticipated GeForce 9800 GX2, which we happily tested in our labs. Needless to say, it was quick as lightning and completely pummeled the HD 3870 X2. However, its high price meant that it was almost exclusively a card for the hardest of the hardcore enthusiasts.
NVIDIA, never one to sit back, went on the offensive with the 9800 GX2. However, rather than put two GPUs on a single PCB, they went the way of the 7950 GX2 instead, sandwiching two cards together.
- Not long after, NVIDIA launched their new flagship, the GeForce GTX 280. As usual, we had it in our labs and put it to the test to see just how good it was. Unsurprisingly, it was the fastest single card we had ever tested, and could handle all but the most demanding games with ease. However, it did, in some tests, lose out to the older dual-GPU 9800 GX2, and considering its stratospheric price (US$649 at launch), plus the fact that a less powerful variant, the GTX 260, was available for much less, the GTX 280 was something most would definitely think twice about before buying.
- In response, ATI introduced their new HD 4800 series of mainstream performance cards. The launch of these cards clearly signified a change in strategy for ATI, which decided to focus on the mainstream segment instead of the high-end, flagship GPU.
We tested both the Radeon HD 4850 and the Radeon HD 4870, and were delighted by the performance they offered. By themselves they were competent performers, and should you need extra juice, you could just put them in CrossFireX mode. In fact, the HD 4870 was so good that in CrossFireX, it could trump the GTX 280. And to rub salt into the wound, each HD 4870 retailed at only US$299, making it cheaper to get two HD 4870s than a single GTX 280. In response, NVIDIA had no choice but to slash prices of their GTX 200 series of cards. After being out in the wilderness for so long, ATI was finally back in contention.
- Not satisfied with the advantage gained with their new Radeon HD 4800 series of cards, ATI went on the offensive, and for the first time in recent memory, reclaimed the speed crown thanks to its dual-GPU monster, the Radeon HD 4870 X2. The GPU itself doesn't differ too much from the 3000 series, but the 4000 series has a much more capable integrated audio controller to process HD audio streams for HDMI output, plus the core graphics crunching horsepower was greatly bolstered with more processing units and a more efficient memory controller.
- Recently there has been increased attention and discussion about general-purpose computing on graphics processing units (GP-GPU). However, this was not without its problems. While GPUs might be inherently powerful, they were designed specifically to tackle graphics and though capable of general computing, writing data parallel programs to suit their multiple processing units was not an easy task.
NVIDIA has been very vocal about the prospects of GP-GPU, going so far as to say that GPUs will one day render CPUs redundant. To back this claim, they have touted their set of development tools called Compute Unified Device Architecture (CUDA), first released two years back, which allows developers to code and optimize programs for execution on GPUs. This is still very much in its infancy, but the move towards GPU computing took a giant step with the recent introduction of OpenCL, an open API for GPU compute that is supported by many companies and will hopefully bring about more developments in this area. The next-generation DirectX 11 will also bring more support for GP-GPU initiatives, so it's just a matter of time before the GPU is fully unleashed beyond its current normal functions.
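To give a flavor of what 'writing data parallel programs' means, here is a conceptual Python sketch of our own (real CUDA or OpenCL kernels are written in a C-like language and launched across thousands of hardware threads): the work is expressed as a small kernel applied independently to every element, which is exactly the structure that lets a GPU spread a job across its many stream processors.

    from multiprocessing import Pool

    def saxpy_kernel(args):
        """One element's worth of work: a*x + y, the classic SAXPY example."""
        a, x, y = args
        return a * x + y

    def saxpy(a, xs, ys, workers=4):
        """Apply the kernel to every element independently - the data-parallel pattern."""
        with Pool(workers) as pool:
            return pool.map(saxpy_kernel, [(a, x, y) for x, y in zip(xs, ys)])

    if __name__ == "__main__":
        xs = [1.0, 2.0, 3.0, 4.0]
        ys = [10.0, 20.0, 30.0, 40.0]
        print(saxpy(2.0, xs, ys))   # -> [12.0, 24.0, 36.0, 48.0]

The hard part, as noted above, is restructuring an existing CPU algorithm so that it decomposes into thousands of such independent pieces in the first place.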
- Intel, too, is waiting to enter the game and is developing a GPU of its own under the codename "Larrabee". Larrabee, according to Intel, is designed mainly with GP-GPU optimization and high-performance users in mind. They expect a working sample to be completed by the end of 2008, after which it will be released to the public in late 2009 or 2010.