NVIDIA GeForce 7800 GTX (G70)
NVIDIA's G70 GPU architecture has finally materialized today in the form of the GeForce 7800 GTX, the company's new flagship and heir to the performance crown. Find out all the tech details, its performance, our views, and whether you should splurge on one, if not two, of these.
By Vijay Anand -
Prelude To The Showdown
Just over a year back in 2004, ATI and NVIDIA released their then next generation graphics parts, and for a good half-year following the launch, both vendors were merely filling out their aging lineups with toned down variants of the R420 and NV40 respectively in the mid-range and low-end segments. There wasn't any real tussle to dethrone each other in the high-end for quite a while, until ATI refreshed their RADEON X800 series with the much-needed RADEON X850 series, which brought more refined silicon and was even faster. That allowed ATI to ramp up high-end graphics card production, which had been greatly held back during the days of the RADEON X800 XT series that was plagued by yield and production volume issues.
This new high-end series from ATI was launched just before the turn of 2005 and had steady volume in retail by the end of January. That posed a real challenge to NVIDIA's GeForce 6800 Ultra, which was already tailing the RADEON X800 XT Platinum Edition, let alone the even speedier RADEON X850 XT Platinum Edition. This was only an issue for single board performance though, as NVIDIA already had SLI capable platforms and graphics cards by then. With dual GeForce 6800 GT or Ultra cards, performance scaled really well for games that were previously pixel shading and fill rate limited (basically, GPU limited rather than local memory bandwidth limited or CPU-bound). Hence, NVIDIA's option for those wanting even faster performance than a GeForce 6800 Ultra was to adopt the SLI route, which became the ultimate performance king if one had US$1,000 to spend (with dual GeForce 6800 GT cards being the relatively cheaper and more sought after option). Though the returns were not always twice that of a single card (even where SLI worked perfectly), it was still more than 50% faster than any single card solution, which was justification enough for many hardcore gamers. After all, there wasn't anything to challenge the dual graphics card solution until the very recently announced ATI CrossFire series, which would likely only make it to retail sometime in July.
Building Up The Momentum
Back in the single graphics card arena, tensions were running high as both ATI and NVIDIA tried to outclass one another in quick succession around CeBIT Hannover 2005, held in March. Just days before the show opened, ATI quietly unleashed the 512MB version of the RADEON X850 XT graphics card to a limited pool of online media, and NVIDIA countered with an official release during CeBIT itself of a huge 512MB variant of the GeForce 6800 Ultra, which was running in their booth. ATI's RADEON X850 XT 512MB never made it out officially and remained only an option for their partners, but they did seed the RADEON X800 XL 512MB officially into retail at the turn of May 2005. As we have shown our readers in a string of 512MB graphics card reviews, the returns were just not there for gamers to invest in them. In fact, when these monster half-a-gigabyte graphics cards debuted, most consumers were actually anticipating the launch of the next generation parts instead. In our view, the spate of salvos by both of the visualization leaders was definitely a build-up of momentum towards launching their next generation solutions, and indeed it was, with codenames such as "G70" and "R520" plastered all over online debates and news tidbits. Finally, one of them has come to pass, as 22nd June marks the day NVIDIA not only releases their G70, but also makes it available to end-users that very day. Today, we welcome the newly christened GeForce 7800 GTX, which utilizes the G70 graphics architecture.
NVIDIA's previous generation GeForce 6800 Ultra in SLI set new performance levels and helped developers and consumers alike get a feel for what next generation graphics processors are capable of. In fact, SLI setups have helped game houses tremendously in modeling their next generation game engines as well as pushing the limits of game realism. With NVIDIA hyping that a single GeForce 7800 GTX would be equivalent to its previous generation's SLI setup, it's time we took a look under the hood to see how it has evolved from its predecessor and, of course, bring our readers the full gamut of results for you to judge NVIDIA's new heir to the throne.
GeForce 7800 GTX Technical Specifications
Graphics Engine | NVIDIA GeForce 7800 GTX (G70), 430MHz core clock, 24 pixel pipelines, 8 vertex shader units, 16 ROPs |
Graphics Memory | 256MB Graphics-DDR3 at 1.2GHz (600MHz DDR) on a 256-bit interface |
Display Capabilities | Dual 400MHz RAMDACs, HDTV output |
Connectors | Two DVI-I outputs, mini-DIN (TV/video-in via the onboard video decoder), one 6-pin PCI Express power connector |
Operating Systems / API Support | Microsoft DirectX 9.0c (Shader Model 3.0), OpenGL 2.0 |
Other Information | Single slot cooler, PCI Express x16 interface, US$599 suggested retail price |
What The GeForce 7800 GTX Brings To The Table
Unlike previous occasions, NVIDIA is initially launching only one version of their GeForce 7 series, and that is the GeForce 7800 GTX. Previously, with the GeForce 5 and 6 series, 'Ultra' was used to denote the highest performing version, but not this time. According to NVIDIA, the new 'GTX' nomenclature now signifies their highest performing part. Whether this is true, or just an excuse to reserve the 'Ultra' tag should ATI's future R520 part outdo the GeForce 7800 GTX, remains to be seen. NVIDIA has also declined to comment on newer, less powerful spin-offs of the GeForce 7800 GTX, but as history has shown, they will follow in future and the only two questions are when they will arrive and what toned down specs they will feature. Future possibilities aside, the GeForce 7800 GTX is NVIDIA's highest performing GPU right now.
For those of you wondering why NVIDIA's internal GPU codenames now differ from the previous NVxx naming scheme, they figured that G70 for the GeForce 7xxx series is just plain easier to remember. We would agree too, though some hardware aficionados might feel that the first numeral is no longer in tandem with ATI's codenames. Well, that's just a side observation, so let's put that away and bring on the main hardware specs:-
8 Vertex Shader Units and 24 Pixel Shader Pipelines
If we were to simplify the G70 architecture in a single line, it's basically an NV40 with internal redesigning to make it more efficient, increased firepower with additional pipelines and a few new features. It might as well be a turbocharged GeForce 6800 Ultra, but the actual performance differential and under the hood changes are vast enough that it warrants a new series by itself. Hence we have the GeForce 7 series, with the 7800 GTX model seated at the very pinnacle.
The GeForce 7800 GTX has 33% more vertex shader units and 50% more pixel shader pipelines than the highest performing GeForce 6 model, which had 'only' 6 vertex shader units and 16 pixel shader pipelines. The basic internal structure of each unit remains similar to NV40's vertex and pixel pipeline, but they have been re-engineered for higher throughput. We'll touch a little more on them on the following page, as these are part of the changes that form the new NVIDIA CineFX standard in the G70 architecture. What you can be sure of is that the increased number of processing units and pipelines will greatly contribute to new performance levels for a single graphics card. Perhaps the one letdown from the initial rumors was the doubling of the pixel shader pipelines from the GeForce 6800 Ultra, which unfortunately did not come to pass. Nevertheless, with more efficient execution that increases the throughput of each of the units, we'll soon see just how much of an advantage the GeForce 7800 GTX maintains over the GeForce 6800 Ultra.
Note that the number of Raster Operation Pipelines (ROPs) remains at 16 in both the GeForce 7800 GTX and the GeForce 6800 Ultra, but the former supports a new feature and improved antialiasing performance thanks to an updated Intellisample engine. We'll touch on that later to let you in on the improvements.
302 Million Transistor GPU @ 430MHz
If you thought the 222 million transistors in the GeForce 6800 Ultra were already mind boggling, then try 302 million transistors in the new GeForce 7800 GTX. That's over a third more transistors than the best of the NV40, but it isn't surprising at all when you consider that the core processing engine has been enlarged with more pipelines and processing units as pointed out above. At 302 million transistors, this is the most complex mass-market GPU out there, and to put things in better perspective, NVIDIA did some math for us. If you were to combine the CPU and graphics processing units of all the current generation consoles (Xbox, PS2 and GameCube) and throw in an Athlon 64 FX-55 processor as well, the combined transistor count stands at only 300.4 million. That's still shy of the 302 million transistors that the GeForce 7800 GTX boasts, and a testament to its complex design and processing prowess.
This complex GPU is manufactured at TSMC's fab on a 0.11-micron process technology. Some of NVIDIA's existing GPUs, such as the GeForce Go 6800, are fabbed there on the same process, so we predict it won't take TSMC and NVIDIA long to refine the process for the GeForce 7800 GTX and reach volume quantities. Perhaps for the long run it would have been great if NVIDIA had already adopted a 90nm process technology like the competitor's upcoming VPU, but that would also mean a longer cycle time before getting settled and attuned to that process. There are pros and cons to adopting such cutting edge silicon manufacturing processes, but looking back at the issues NVIDIA met during their bold moves in the NV30 days, we think it's better for them to churn out GPUs in the needed quantities now at 0.11-micron while slowly preparing for the 90nm transition in the near future, and perhaps re-spin the chip when ready. More likely, we foresee a future variant of the G70 design adopting the newer process technology. In any case, the current decision by NVIDIA is the safer and surer move if you ask us. Here's a comparison of the GeForce 6800 Ultra die and that of the GeForce 7800 GTX:-
The GeForce 6800 Ultra's die measures 16.5 x 18.5mm for a die area of 305 square millimeters. Using Flip-Chip BGA packaging, the entire chip is 40 x 40mm in dimensions.
The all-new GeForce 7800 GTX has an even larger die at 18.5 x 18.5mm for a die area of 342 square millimeters - huge even on a 0.11-micron process. The packaging, however, has shrunk a little to 37.5 x 37.5mm. For a clearer photo without the thermal interface material, check our <a href="http://www.hardwarezone.com.sg/articles/view.php?cid=3&id=1603">snapshot preview</a>.
As for the GPU's operating clock speeds, it hasn't changed much from the 400MHz GeForce 6800 Ultra. The new GeForce 7800 GTX runs at 275MHz in 2D mode and, upon entering any 3D API, throttles up to 430MHz. At this point, some of you might be thinking it's a measly 30MHz improvement. Well, don't forget that this GPU is massively parallel internally, with far more processing engines and pipelines to accomplish tasks a good degree faster than its predecessor even without any clock speed improvement. Historical advances in GPUs have showcased this many times (for example, NV40 versus NV35 versus NV30 and so on) and it's no exception this time either.
On another note about the G70 architecture, Sony's upcoming PlayStation 3 will be using NVIDIA's RSX processor, which is based on the G70 architecture. A few months ago, some NVIDIA representatives mentioned it would be as powerful as the GeForce 7800 GTX. However, during the G70 launch briefing, NVIDIA's CEO, Jen-Hsun Huang, stated that the RSX graphics synthesizer uses the G70 architecture but is one generation ahead. That could mean more features, or more parallel processing pipelines, or a combination of both to give the RSX the power needed to see through its useful lifetime once it's released. Whatever the case, this sounded a lot more logical to us. After all, the GeForce3 was a fresh product when the Xbox was launched, and the console had a GPU more powerful than the desktop GeForce3 products.
1.2GHz Graphics-DDR3 Memory On A 256-bit Memory Interface
Memory bandwidth hasn't improved much over the existing generation of hardware, simply because good shader code that takes advantage of true programmable shading can produce fantastic and realistic image quality, as NVIDIA has showcased in the past using the third generation Unreal Engine demo, while cutting back on the bandwidth consumed. It reduces the memory bandwidth that would otherwise be eaten up by repeatedly writing data to the local frame buffer (multiple passes), as was the case with many complex texture effects executed under DirectX 8 and the early DirectX 9 standards (and hence such effects were used sparingly in those days to avoid overwhelming the graphics card). The newer Microsoft Shader Model 3.0, supported on all GeForce 6 and 7 series GPUs and exposed through DirectX 9.0c, has opened the doors to vastly more complex shader effects with easy programmability. It allows the shader hardware to do complex calculations and branching (provided the code is written to take advantage of SM 3.0) to derive the final output, vastly cutting back on the write-backs to the frame buffer that are required when more 'primitive' code paths are used to accomplish complex tasks. With almost all new graphics hardware supporting SM 3.0, it's only a matter of time before more and more games are designed to take advantage of the more efficient code paths and flexibility, putting the GPU to good use without overly impacting the memory bandwidth required.
This is the reason why the memory interface on the G70 architecture is still 256 bits wide and the GeForce 7800 GTX in its first incarnation ships with 256MB of memory. Of course, as we mentioned in a recent review, game developers will push the limits of realism further, requiring even larger buffers to compute all sorts of effects in addition to the growing size of textures and maps. Hence, more memory and bandwidth will inevitably be needed in future, but SM 3.0 and more efficient processing and effects generation methods have at least postponed the need for bigger frame buffers and higher memory throughput right now (which would have escalated hardware costs exorbitantly, as if they weren't costly enough already).
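For those who like to see where the headline bandwidth figure comes from, here's a quick back-of-the-envelope sketch (purely illustrative, written in Python) that multiplies the effective memory transfer rate by the width of the bus, using only the numbers quoted above:

```python
# Peak theoretical memory bandwidth = effective transfer rate x bus width.
# Figures are taken from the GeForce 7800 GTX specifications quoted above.
effective_rate_mhz = 1200          # 600MHz GDDR3, double data rate
bus_width_bits = 256

bytes_per_transfer = bus_width_bits // 8                  # 32 bytes moved per transfer
bandwidth_mb_s = effective_rate_mhz * bytes_per_transfer  # in MB/s
print(f"Peak memory bandwidth: {bandwidth_mb_s / 1000:.1f} GB/s")  # ~38.4 GB/s
```

That works out to the 38.4GB/s figure you'll also see in the comparison table later in this article.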
NVIDIA CineFX 4.0 Engine
The CineFX engine is the heart and soul of the entire GPU, encompassing the bulk of the 3D processing units that crank out all that we see in modern games. In a simple overview, it consists of the vertex shaders, pixel shaders, triangle setup engine and the texturing engine. Now in its fourth generation, CineFX 4.0 still complies with and supports all the features of Microsoft's Shader Model 3.0 (which encompasses the Vertex Shader 3.0 and Pixel Shader 3.0 standards), support that first appeared in the CineFX 3.0 engine found in the NV40. If you would like to refresh yourself on the standards and features in SM 3.0, we highly recommend our NV40 Preview article, which spells them out in easy to understand terms.
So what's new in CineFX 4.0? NVIDIA has done some re-engineering to wring more performance out of each of the units, improving their arithmetic prowess as well as incorporating even better scheduling and branching techniques to take advantage of the GPU's massively parallel architecture. This is even more crucial on the GeForce 7800 GTX since it features even more vertex units and pixel shader pipelines than its predecessor.
Enter NVIDIA's new mascot promoting the GeForce 7 series, Luna. Her facial expressions are even more realistic than those of Nalu, the mermaid mascot for the GeForce 6 series. Soft 3D shadows give depth and volume to Luna's hair, casting shadows and portraying volume more accurately than shadowing individual strands of hair (as in Nalu), which tends to cast an overly dark shadow when hair density is very high.
The gatekeeper monsters are hideous (but so is the entire backdrop). However, their skin becomes highly translucent in strong lighting, and as a strong beam of plasma (or whatever light source it was - not shown in this picture) came up in the demo, we could make out the monster's innards, such as lungs, arteries and rib cage, from the red light transmitted through the body. Too bad we don't have a screenshot of that, but it was an excellent real-time demo in which sub-surface scattering was showcased in the best form we've seen to date.
Mad Mod Mike is another new addition to the technology demo family. This demo primarily focuses on the heat shimmer and flames of Mike's rocket pack (calculated with PS 3.0), how the lighting from it indirectly illuminates many other objects in the room (including himself) as he moves about, and how that interacts with the existing environmental lighting. Because some of these effects are created in intermediate buffers rather than written directly to the output frame buffer, the snapshot provided by NVIDIA here does not really exhibit the effects we just mentioned. However, you can view them onscreen if you run the demo on an appropriate SM 3.0 compliant NVIDIA graphics card.
Vertex Shader Enhancements
Besides bolstering the GPU with eight vertex shader units (which now feature accelerated geometry processing), the triangle setup engine that follows them has been sped up to cater to the higher throughput.
Pixel Shader Enhancements
Going a rank deeper, we now touch upon the changes to the pixel shader units. Over the many months of developing the G70, NVIDIA carefully studied the kinds of shader programs used across a wide gamut of games and identified which functions are most commonly called upon. With that information, they set out to turbocharge shader performance for these commonly used functions and, of course, those that NVIDIA thinks will have a strong impact on current and forthcoming games. As a result, they were able to crank out more performance per pipeline, per clock.
One of the areas that NVIDIA paid close attention to was Multiply-Add (MADD) operations, each a 32-bit floating point operation, which are commonly found in functions such as transformations, lighting calculations, normal map calculations and others of the sort. These are heavily utilized in today's latest games and will be even more widely used in future titles. As such, NVIDIA has ensured that each shader unit in the G70 is capable of double the MADD throughput of the NV40. The latter can perform four MADDs and four MULs (multiplies) in a single clock cycle, whereas the G70 can perform eight full MADD operations in a single clock - a higher throughput than the NV40 if you count the total number of multiply and add operations individually.
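To put those per-clock figures in context, here's a rough sketch (our own back-of-the-envelope arithmetic, not an NVIDIA specification) that counts a MADD as two floating point operations and a MUL as one, then scales by pipeline count and core clock. It deliberately ignores the vertex units, co-issue restrictions and texture-fetch sharing, so treat the totals as illustrative only:

```python
# Rough pixel shader arithmetic throughput, counting a MADD as two FLOPs
# and a MUL as one. The per-clock figures come from NVIDIA's description
# above; the scaling by pipelines and clock is our own illustration.
def pixel_shader_gflops(pipes, core_mhz, madds_per_clock, muls_per_clock):
    flops_per_pipe_per_clock = madds_per_clock * 2 + muls_per_clock * 1
    return pipes * flops_per_pipe_per_clock * core_mhz / 1000.0

nv40 = pixel_shader_gflops(pipes=16, core_mhz=400, madds_per_clock=4, muls_per_clock=4)
g70  = pixel_shader_gflops(pipes=24, core_mhz=430, madds_per_clock=8, muls_per_clock=0)

print(f"NV40 pixel shaders: {nv40:.0f} GFLOPS")   # ~77 GFLOPS
print(f"G70 pixel shaders : {g70:.0f} GFLOPS")    # ~165 GFLOPS, roughly 2.1x the NV40
```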
Overall, the enhancements taken together allow the G70 architecture to boast twice the floating point shading power of its NV40 predecessor. What can we do with all this extra power? For one, you can expect future games to use even more complex shader programs to recreate even better effects, High Dynamic Range lighting without reservations, as well as effects like sub-surface scattering (such as that in NVIDIA's Luna demo). These are not really new effects, as they were possible even in the GeForce 6 era, but the hardware wasn't powerful enough to use them effectively on a large scale all the time. Perhaps high-end GeForce 6800 products may get away with such effects some of the time, but certainly not all of the GeForce 6 series.
Texture Engine Enhancements
The texture processing engine was overhauled a little to accelerate texture access, a rather immediate enhancement since textures are always used in 3D games. The improved internal cache design for the texture engine also benefits anisotropic filtering.
Designed To Support Microsoft's Longhorn OS
A year earlier, when NVIDIA was launching their then new GeForce 6 series, they invited several key people from various industries who shared with us how their companies were actively involved in shaping upcoming software that takes advantage of 3D hardware. Among them, one of the most important notes was from Microsoft's Chris Donahue, who shared information on Shader Model 3.0 as well as developments in DirectX and the next generation Longhorn operating system.
Longhorn will be somewhat different from existing Microsoft operating systems in its core structure, in the way it interacts with the user through an updated GUI, and in its support for the new tech trends Intel has been pushing, such as virtualization and improved security. Direct3D will form the foundation of this next generation operating system (together with other technologies, forming the Windows Graphics Foundation standard), with most of what's shown onscreen being GPU driven using 3D graphics hardware and updated Direct3D technologies to support desktop 3D graphics, animation and other special visual effects. There will be a few user interface levels to cater for the various graphics hardware in the market at the point of release, but for the best experience of the Avalon interface, you'll require a fairly modern 3D graphics card that is Longhorn Display Driver Model ready.
The CineFX 4.0 engine in the GeForce 7800 GTX fully supports the Windows Graphics Foundation model and has a Composited Desktop Hardware Engine to support the next generation's user interface. This engine facilitates the following effects:-
- Video post-processing
- Real-time desktop compositing
- Seamlessly run multiple 3D applications
- Accelerated antialiased text rendering
- Special effects and animation
Expect other upcoming G70 based graphics cards to feature CineFX 4.0 and hence likely support the Composited Desktop Hardware Engine too, for an enhanced Longhorn operating system experience.
Intellisample Basics
The Intellisample engine is also upgraded to the fourth iteration and it builds upon the third generation found in the GeForce 6 series. We strongly suggest reading up our NV40 Preview article to get yourself acquainted if you haven't a clue on it. For summary's sake, the Intellisample engine is all about NVIDIA's full scene antialiasing (FSAA), anisotropic filtering (AF) and high-resolution compression technology that condenses color, texture and z-data.
Intellisample 3.0 (NV40)
Intellisample 3.0 in the GeForce 6 series furthered the basic Intellisample engine by implementing rotated grid antialiasing, 16x AF and 64-bit texture filtering and blending. The last feature enables photorealistic lighting and shadow effects by representing data in a 64-bit floating-point format, meaning 16 bits are allocated to each of the primary color channels (R, G, B) and an alpha channel. Just to stray off here for a bit: the alpha channel, as in Photoshop, is used to create transparency effects (fog, glass, mist, etc.) when blended with the primary channels. Where once upon a time the industry represented the entire color spectrum in just 16 bits, the 64-bit floating-point format allocates 16 bits to each color channel, greatly increasing the range available for accurate reproduction. This is what High Dynamic Range (HDR) image reproduction is all about, as it is able to replicate the extreme dark and extreme bright values accurately, values which would otherwise be compressed when using the current standard 32-bit RGB pixel color format (an 8-bit integer for each component).
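To make the difference concrete, here's a small illustrative snippet (using numpy's half-precision type, which follows the same 16-bit floating point layout) that contrasts the range of an 8-bit integer channel with a 16-bit floating point channel and tallies the bits per pixel of each format:

```python
import numpy as np

# 8-bit integer channel: 256 discrete steps from 0 to 255.
int8_max = np.iinfo(np.uint8).max

# 16-bit floating point channel (half precision, as numpy implements it).
fp16 = np.finfo(np.float16)

print(f"8-bit integer channel range : 0 .. {int8_max}")
print(f"FP16 channel range          : {fp16.tiny:.1e} .. {fp16.max:.0f}")  # ~6.1e-05 .. 65504

# Bits per pixel for a four-channel (RGBA) pixel format.
print(f"32-bit integer RGBA : {4 * 8} bits/pixel")
print(f"64-bit FP RGBA      : {4 * 16} bits/pixel")
```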
Intellisample 4.0 (G70)
Intellisample 4.0 in the G70 architecture builds upon Intellisample 3.0 by improving rotated grid antialiasing performance, adding support for normal map compression, and introducing new antialiasing modes - transparency adaptive supersampling and transparency adaptive multisampling. Let's tackle these features one at a time.
Intellisample 4.0 - Normal Map Compression
The improved rotated grid antialiasing performance is purely internal tweaking by NVIDIA for better high-resolution antialiasing throughput. Normal map compression, however, is a new feature for the G70 architecture. It was first initiated by ATI via its 3Dc feature (normal map compression using a FOURCC texture compression format), and we've discussed why it came about and how it aids games in this article here. It was available on all of ATI's RADEON X850/800/700 series, but wasn't available on the competing GeForce 6 series in any equivalent form. It looks like developers (including the folks at NVIDIA) liked ATI's 3Dc enough that the new G70 was designed to incorporate an equivalently competent normal map compression technique.
NVIDIA takes a different route, choosing to go with the V8U8 format, but they've revealed to us that there's no cause for compatibility concerns if a game should use ATI's 3Dc. NVIDIA's driver transparently converts ATI's 3Dc maps, if encountered, into the V8U8 format for its hardware's use. So although there's an extra step involved in the process, compatibility is kept intact and neither developers nor consumers should have concerns about game quality being compromised. What we do wonder is whether ATI's hardware can do likewise and accept the format used by NVIDIA should games choose to use it. We'll find that out in due time and update you in a future article. For those techies interested in the pros and cons of each texture storage format, you can easily find discussions and tech articles concerning them online.
Intellisample 4.0 - Transparency Antialiasing
Normal antialiasing techniques work fine for objects of sizable dimensions, but extremely fine objects such as fencing and foliage are usually drawn as alpha-tested textures, whose edges are not polygon edges and therefore go untouched by multisampling's sub-pixel sampling patterns. NVIDIA's transparency antialiasing sets out to overcome this in two modes - transparency adaptive supersampling and transparency adaptive multisampling. Both get the job done, but just like traditional supersampling, transparency supersampling endures a larger performance hit for little improvement over its multisampling variant. As such, we suggest opting for the transparency multisampling option; after all, most of us are accustomed to full scene multisampling antialiasing these days. Transparency antialiasing is an optional feature that can be used in conjunction with normal FSAA. It is only available in the advanced settings view mode, as seen below:-
Transparency antialiasing option with three values:- Off, Transparency Multisampling and Transparency Supersampling.
On average, there shouldn't be more than a 10% performance hit from activating transparency antialiasing, though of course this varies greatly from game to game and scene to scene. Check out the image quality difference we encountered in FarCry when running at a resolution of 1280x960 with 4x FSAA and 8x AF:-
This is a close-up of some foliage in the game without Transparency antialiasing. Click to inspect the entire screen capture.
Now this is the same shot with Transparency antialiasing enabled; what do you know, it works! Check out the smoother and cleaner edges of the leaves. Just like the above shot, click this picture to inspect the entire captured frame and see where else there's been an improvement.
As for performance concerns with transparency antialiasing, we doubt it's going to eat much into a powerful card such as the GeForce 7800 GTX, but here are some results to reassure you:-
NVIDIA PureVideo Updates
NVIDIA's PureVideo technology has matured yet again with the new GeForce 7800 GTX, which brings with it additional processing power. The Spatial Temporal De-Interlacing function has always existed in the mid-to-upper class of the GeForce 6 series, smoothing video and DVD playback for clear reproduction as the video was first intended. Thanks to the GeForce 7800 GTX's immense horsepower, this card features High Definition Spatial Temporal De-Interlacing for videos at resolutions as high as 1920 x 1080i.
Yet another feature is High Definition H.264 acceleration for videos encoded in that format, but this won't be made available until a future driver update, and NVIDIA estimates that it should arrive close to the end of the year. That's still a long way off, which is somewhat disappointing given that NVIDIA already mentioned (non-HD) H.264 acceleration as part of their PureVideo technology when the NV40 debuted. Still, it's never too late, and they are working with industry software vendors such as CyberLink and IVI to incorporate NVIDIA's H.264 decoders first. In the near future, H.264 acceleration would be enabled via a DirectX Video Acceleration (DXVA) layer from Microsoft directly, which should do away with the third party intervention. Until that materializes, NVIDIA will be working with other industry software vendors like DivX and more to push their decoder and add further value for NVIDIA customers.
Since H.264 acceleration is enabled largely through software and drivers, it will be a feature offered even to the GeForce 6 series, but High Definition H.264 acceleration will be the mainstay of the GeForce 7800 GTX powerhouse. While NVIDIA is strongly pushing H.264 video acceleration for the near future, we are quite uncertain just how much of the video content out there will end up using this compression algorithm. Certainly, it has proven itself to generate smaller file sizes than MPEG-2 while maintaining comparable image quality. It is a valuable streaming video format, but the real question is whether it will gain the same popularity as MPEG-2, the de facto video codec. Only if it gains substantial popularity will all this extra effort by NVIDIA and other vendors pay off in the long run. The benefits of the H.264 codec alone are not adequate to move an industry that has long been accustomed to MPEG-2, and it remains to be seen how consumers and the industry in general will evolve as a new media format war looms once more.
H.264 acceleration aside, recall the Inverse Telecine 3:2 pull-down feature in PureVideo, which helps recover the original film frames to present more accurate video playback with slightly enhanced image quality? Since that was only of use for NTSC videos, nations using PAL broadcasts were left hanging for something equivalent. With the new Release 75 series drivers, NVIDIA solves that with a 2:2 pull-down feature. So watch out for the update coming soon on NVIDIA's website.
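For readers unfamiliar with pull-down cadences, the sketch below (purely illustrative and not NVIDIA's implementation) shows the textbook field patterns that inverse telecine has to detect and undo: NTSC 3:2 pull-down spreads 24fps film across roughly 60 fields per second by alternately repeating frames for three fields and then two, while PAL's 2:2 cadence simply splits each (slightly sped-up) 25fps film frame into two fields:

```python
# Illustrative pull-down cadences. Inverse telecine detects these repeating
# patterns in the interlaced fields and reassembles the original film frames.
def pulldown(frames, cadence):
    fields = []
    for i, frame in enumerate(frames):
        fields += [frame] * cadence[i % len(cadence)]
    return fields

film = ["A", "B", "C", "D"]               # four consecutive film frames
print(pulldown(film, cadence=[3, 2]))     # NTSC 3:2 -> 10 fields from 4 frames
print(pulldown(film, cadence=[2]))        # PAL 2:2  -> 8 fields from 4 frames
```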
New ForceWare Suite
Since we were on the topic of software and drivers on the previous page, we'll show you the new 75 series driver panel and other driver related information here.
The general tab displays adapter details.
In 2D mode, the GeForce 7800 GTX idles at about 40 degrees Celsius.
New transparency AA feature available in the advanced view mode.
An updated and more decorated SLI tab.
Several enhancements were made to the Release 75 drivers, so below is a list of the updates incorporated, which should be available for download by the time this article is published.
- More SLI profiles and direct control to alter SLI rendering mode (Single, SFR and AFR).
- 16x FSAA mode in SLI mode (uses 4x Supersampling AA + 4x Multisampling AA)
- Retaining of application profiles through driver updates.
- Support for the new PureVideo features that were discussed earlier.
- Improved Display Wizard for HDTV support and setup; ability to verify HDTV capabilities of the connected display.
- 64-bit driver performance equivalent to 32-bit versions.
- Reduced CPU overhead for driver calls.
- Improved TurboCache performance for products using TurboCache
- Improved SmartDimmer feature for mobile users with GeForce Go GPUs.
- Full OpenGL 2.0 support for hardware
- Graphics card support axed for the GeForce2 and older series to maintain a tweaked and leaner driver library. In essence, non-programmable shader based graphics cards aren't supported anymore.
In addition to these, NVIDIA has even shared details of their ForceWare 80 series with the press, and here's what they plan to tackle:-
- SLI: Independent graphics card overclocking even in SLI mode.
- SLI: Ability to mix cards from different add-in-card brands/partners.
- SLI: TV/HDTV output support
- SLI: Seamless mode switching between SLI and non-SLI configuration (no reboots)
- SLI: Scale even more games and upcoming titles
- Performance: Tweaked for better performance and support for Dual Core CPUs.
NVIDIA's Reference GeForce 7800 GTX - Part 1
The reference card you see here is the design that will most likely be utilized on most other vendors' products. As you can see from the photo below, it's rather long and joins the ranks of the GeForce 6800 Ultra 512MB version, GeForce FX 5700 Ultra and the GeForce4 Ti 4600 as one of the longest consumer cards.
There she is - the reference GeForce 7800 GTX with 256MB of frame buffer.
Our take is that the length is perhaps to cater for 512MB versions of the GeForce 7800 GTX. As we saw when comparing the initial GeForce 6800 Ultra cards that came with 256MB of frame buffer against the 512MB variants, the increased number of memory chips required cleaner, more stable power. To cater to that, the power circuitry on the 512MB GeForce 6800 Ultra took up a sizeable portion of the board and lengthened the card. If you place that card right next to the GeForce 7800 GTX, they are exactly equal in length (and, to a certain extent, so is the PCB area allocated to the power circuitry). How accurate is our speculation? You'll find out in due time if the frame buffer of the 7800 GTX models doubles in the near future.
Take a good look because it has dual DVI-I outputs and is a single slot design!
A more important aspect to many would be the thickness profile of the card, and we are glad to report that the top dog from NVIDIA is a single slot solution! NVIDIA once said that their GeForce 6800 Ultra would come in a single slot form factor, but that never came true. Its closer cousin, the GT version, did make it to a single slot design though. Now, on to the next generation, and with SLI technology in full force, it seems they took great pains to ensure the GeForce 7800 GTX would fit into a single slot thermal profile. If you thought they simply slapped on a strong blower fan, you would be dead wrong. The GeForce 7800 GTX is a very quiet card and is more silent than its predecessors. Now that's wonderful news to hear, literally.
The rear view of the card has RAMsinks and the securing bracket for the GPU heatsink.
Power transistors were bunched together and slapped with a heatsink to cool them collectively. The blower fan's airflow passes over this heatsink and that facilitates cooling further. Note the 6-pin PCI Express power connector requirement.
While NVIDIA managed to deliver the GeForce 7800 GTX in a single slot profile and maintain very low fan noise, operating temperatures were unfortunately rather high. The outcome plotted below was measured during intensive benchmarking, and that's in our 22-degree Celsius air-conditioned lab environment:-
Despite its close-to-scorching operating temperatures, it consumes less power than a GeForce 6800 Ultra. The G70 architecture actually incorporates some facets of NVIDIA's PowerMizer technology found on their mobile graphics parts to utilize and conserve power as efficiently as possible. Hence, NVIDIA places peak power consumption for the GeForce 7800 GTX at about 100 to 110 watts, while its predecessor reaches 120 watts. As for power supply concerns, ensure that you have at least a 350W power supply from a renowned manufacturer and throw out those shoddy units bundled with most generic casings.
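As a rough sanity check on that 350W recommendation, here's a simple power budget sketch. Only the GPU's peak figure comes from NVIDIA; the CPU and rest-of-system numbers are our own assumptions for illustration, not measurements:

```python
# Very rough single-card power budget. Only the ~110W GPU peak is from NVIDIA;
# the CPU and "rest of system" figures below are illustrative assumptions.
gpu_peak_w  = 110   # GeForce 7800 GTX peak power, per NVIDIA
cpu_peak_w  = 90    # assumed: a typical Athlon 64 under load
rest_of_sys = 80    # assumed: motherboard, memory, drives, fans

total = gpu_peak_w + cpu_peak_w + rest_of_sys
headroom = 1.3      # ~30% margin for rail loading and capacitor aging
print(f"Estimated system draw : {total} W")
print(f"Suggested PSU rating  : {total * headroom:.0f} W")  # lands near the 350W mark
```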
Summing it up, it looks like NVIDIA figured out the form factor, noise and power consumption of the GeForce 7800 GTX, but weren't able to maintain the low operating temperatures of the NV40 series. If only that was possible to tackle as well, then that would have solved most of the common issues of transitioning to a new graphics card product.
NVIDIA's Reference GeForce 7800 GTX - Part 2
Here are more photos of the GeForce 7800 GTX, showing the design you can expect many of the graphics card vendors to use too:-
This is what you'll see when the fancy NVIDIA faceplate and airflow guide are detached.
It looks like an aluminum cooler for the most part, but note the integrated heat pipe running over the GPU and some of the memory parts.
Here's another closer view of the heatsink.
An underside shot of the GPU heatsink. The cooler design isn't as extensive as the previous one on the NV40 in our opinion.
Here's the front face of the bare GeForce 7800 GTX graphics card. Note the RAM chip layout.
Here's the rear view with more RAM chips. We figure that NVIDIA needed to spread out the heat output, placing four ICs on the front of the card and the remaining four at the rear. In case you are curious, they are Samsung GC-16 Graphics-DDR3 memory chips rated for 1.6ns operation; nothing really new.
A close up view.
The reference card was fitted with a Philips SA7115 video decoder chip, which allows for video capturing through the card's rear mini-DIN port.
Test Setup
With all the tech details established, let's get down to the results from our testing. We secured a pair of NVIDIA GeForce 7800 GTX 256MB graphics cards and also managed to acquire dual GeForce 6800 Ultra 512MB graphics cards. What do people with dual graphics cards do these days? They SLI them, and so did we, on an ASUS A8N-SLI (nForce4 SLI) motherboard with an Athlon 64 3500+ and 1GB of Corsair DDR400 memory set for dual-channel operation. We would have loved to drop in an Athlon 64 FX-55, but for the sake of comparability with our internal results we stuck with the Athlon 64 3500+. In case you've not been keeping up with our CPU articles, the A64 3500+ is able to keep pace with an Intel 3.2GHz Extreme Edition or better as far as gaming is concerned, so that should settle any doubts about how high we can scale our performance numbers.
Of course, we also involved an ATI RADEON X850 XT Platinum Edition to get a good gauge of how much faster the GeForce 7800 GTX is than the former best performing graphics card. All in all, we have five sets of results for each test iteration, and they are as follows:- GeForce 7800 GTX - SLI, GeForce 7800 GTX, GeForce 6800 Ultra - SLI, GeForce 6800 Ultra and the RADEON X850 XT P.E. The NVIDIA graphics cards used the latest ForceWare 77.62 driver while ATI's sole card was tested with their Catalyst 5.6 driver set. As usual, we list the benchmarks used to gauge all the cards, and following that we have a GPU comparison table for reference purposes:-
- Futuremark 3DMark03 Pro (version 360)
- Futuremark 3DMark05 Pro (version 120)
- Codecult's Codecreatures
- Unreal Tournament 2004 Demo
- Halo: Combat Evolved
- GunMetal Demo
- AquaMark 3 benchmark
- High Dynamic Range (HDR) Lighting demo
- FarCry 1.31
- Doom 3
GPU/VPU | NVIDIA GeForce 7800 GTX | NVIDIA GeForce 6800 Ultra | ATI RADEON X850 XT Platinum Edition |
Core Code | G70 | NV40 | R480 |
Transistor Count | 302 million | 222 million | 160 million |
Manufacturing Process (microns) | 0.11 | 0.13 | 0.13 |
Core Clock | 430MHz | 400MHz | 540MHz |
Vertex Shader Pipelines | 8 | 6 | 6 |
Pixel Shader (Rendering) Pipelines | 24 | 16 | 16 |
Peak Texture Fill Rate (Mtexels/s) | 10,320 | 6,400 | 8,640 |
Raster Operation Pipelines | 16 | 16 | 16 |
Peak Pixel Fill Rate (MPixels/s) | 6,880 | 6,400 | 8,640 |
Memory Clock | 600MHz (1200MHz DDR3) | 550MHz (1100MHz DDR3) | 590MHz (1180MHz DDR3) |
Memory Bus Interface | 256-bit | 256-bit | 256-bit |
Total Memory Bandwidth | 38.4GB/s | 35.2GB/s | 37.8GB/s |
NVIDIA CineFX Engine | 4.0 | 3.0 | - |
NVIDIA Intellisample Technology | 4.0 | 3.0 | - |
NVIDIA UltraShadow Technology | II | II | - |
ATI SMARTSHADER | - | - | HD |
ATI SMOOTHVISION | - | - | HD |
ATI HYPER Z | - | - | HD |
Pixel Shader Support | 3.0 | 3.0 | 2.0 |
Vertex Shader Support | 3.0 | 3.0 | 2.0b |
DirectX Support | Up to DirectX 9.0c | Up to DirectX 9.0c | Up to DirectX 9.0b |
FSAA Mode | Multi Sampling (up to 6x) + Transparency AA option | Multi Sampling (up to 6x) | Multi Sampling (up to 6x) |
Anisotropic filtering Modes | Up to 16x | Up to 16x | Up to 16x |
Other Features | | | |
RAMDAC | Dual 400MHz | Dual 400MHz | Dual 400MHz |
TV Output | Int., HDTV output | Int., 1024x768 | Int., 1024x768 |
TMDS transmitter | dual, integrated | On-board IC | Int. 165MHz |
Graphics Card Interface | PCIe x16 | PCIe x16 | PCIe x16 |
PCIe Power Connectors | 1 | 1 | 1 |
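Incidentally, the peak fill rate figures in the table above are simple products of the core clock and the pipeline or ROP counts. Here's a small sketch (assuming one texture unit per pixel pipeline, which holds for all three GPUs here) that reproduces them:

```python
# Peak fill rates as quoted in the comparison table: core clock x pipes/ROPs.
cards = {
    "GeForce 7800 GTX":   {"core_mhz": 430, "pixel_pipes": 24, "rops": 16},
    "GeForce 6800 Ultra": {"core_mhz": 400, "pixel_pipes": 16, "rops": 16},
    "RADEON X850 XT PE":  {"core_mhz": 540, "pixel_pipes": 16, "rops": 16},
}

for name, c in cards.items():
    texel_rate = c["core_mhz"] * c["pixel_pipes"]  # Mtexels/s (one texture unit per pipe)
    pixel_rate = c["core_mhz"] * c["rops"]         # Mpixels/s
    print(f"{name:18s} {texel_rate:6d} Mtexels/s   {pixel_rate:5d} Mpixels/s")
```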
Results - 3DMark03 Pro (ver.360)
Before the launch, NVIDIA was hyping that the G70 was at about the same performance level as an existing GeForce 6800 Ultra SLI setup. It goes without saying that we had high expectations for the GeForce 7800 GTX, but we also have to pause and consider that a pair of GeForce 6800 Ultra cards actually has far more shader throughput by virtue of its overall 32 pixel shader pipelines. Though that's just a loose overview, since SLI scaling doesn't actually equate to that, it goes to show that it is very difficult for a single 24-pipeline GeForce 7800 GTX, no matter how efficiently it is designed, to outrun the brute force of having more processing pipelines.
Extracting from the results below, we have the GeForce 7800 GTX powering ahead of the GeForce 6800 Ultra by 28%, and comparing the respective SLI setups yields an equivalent advantage. The gap widens in favor of the GeForce 7800 GTX to about 40% when the workload increases with FSAA and AF. All these comparisons are made at a resolution of 1600x1200, since that's where the hardware is most taxed and, given the high results obtained, it's likely what GeForce 7800 GTX owners would want to use.
Results - 3DMark05 Pro (ver.120, No FSAA)
One of the most taxing benchmarks around, 3DMark05 is as close as possible to viewing next generation games performance, though it's more of a graphics functions showcase than anything similar to what future game engines would harness. For now, we'll have to make do with it till we can get Unreal Engine 3 and other exciting upcoming game engine demos.
As the resolutions scale up, the single GeForce 7800 GTX lives up to its architectural advantage by showing a 50% boost in performance over the GeForce 6800 Ultra. In SLI mode, its advantage drops to just over 30%, though as you shall see on the next page with more taxing settings, the numerical benefits look great. If you notice, the scores don't really scale with resolution, and that's because of the super huge 2048 x 2048 depth map used to generate the numerous shadow effects as part of the benchmark's particular shadow rendering technique. You can read more about it in Futuremark's whitepapers should you want to delve into the details.
Taking a closer look at the frame rates of the three game tests that combine to generate the 3DMark05 result, it looks like the Canyon test has enough effects to bog down the graphics cards at the various resolutions, but the first two tests seem to have hit a wall. Looks like we'd have to resort to overclocking the system if we were to see any performance increase in those tests.
Results - 3DMark05 Pro (ver.120, 4x FSAA + 8x AF)
The situation with FSAA looks almost identical, as 3DMark05 doesn't show much of a hit, but SLI performance of the GeForce 7800 GTX has some good benefits. While the performance gain for single graphics cards is good comparing the GeForce 7800 GTX and the GeForce 6800 Ultra, the same cannot be said when you bring the RADEON X850 XT P.E. into the picture as the difference waters down to 'only' 30%.
The breakdown of the results is as follows:-
Results - Codecreatures (Direct3D Benchmark)
Using the Codecreatures demo, which is a very taxing DirectX 8-based engine, the lone GeForce 7800 GTX had a 30% lead over the GeForce 6800 Ultra at 1600x1200 with 4x FSAA and 8x AF, but the equivalent SLI setups only showed a 20% gain. Suffice to say, this is more of an outcome to consider when dealing with an older breed of games (albeit a demanding one), so the performance should not be tallied against results from a newer breed of game engines.
Results - Unreal Tournament 2004 (Direct3D Benchmark)
Unreal Tournament 2004 is a very platform and CPU limited game, and it's not something that showcases the performance of a GeForce 7800 GTX. Nevertheless, we show you the outcome for those who are curious:-
Results - Halo: Combat Evolved (DirectX 9 Benchmark)
This is one of those rare games that really highlighted the performance improvement of NVIDIA's latest incarnation, with gains actually higher than in 3DMark05. At a near 60% speedup over a single GeForce 6800 Ultra, the outcome is fantastic. It is also the only game where the single GeForce 7800 GTX was about on par with the GeForce 6800 Ultra SLI combo. NVIDIA must have been referring to these figures back at Computex when they first mentioned the capability of their next generation cards to the press. Against ATI's current top dog, the lead shrunk to only 30%.
SLI performance gain was less than 20%, which reflects a system constraint more than any inability of the SLI setup to scale further.
Results - AquaMark3 (DirectX 9 Benchmark)
Here again, the single graphics card performance improvement was a good 50% at the higher resolutions, and just like in Halo, SLI scaling wasn't too fantastic, as the results indicate our CPU and system being the limiter rather than SLI's true throughput. Take note that we met problems with the ATI RADEON X850 XT P.E. in this benchmark, hence its exclusion from this test.
Results - HDR Lighting Demo (DirectX 9 Benchmark)
SLI test results were abysmally low in this high dynamic range lighting demo, and this was verified by NVIDIA too. For some reason, the driver couldn't activate the right SLI profile to get the desired results. That aside, we were still disappointed to see that the GeForce 7800 GTX was only marginally faster than the RADEON X850 XT P.E.
Results - FarCry 1.31 (DirectX 9 Benchmark)
At lower resolutions, our FarCry testing became platform limited, but at 1600 x 1200 the limitation switched back to the graphics card's throughput and we were able to obtain reasonable comparisons at that resolution. Before you dismiss the GeForce 7800 GTX for its negligible gain over the RADEON X850 XT Platinum Edition, take note that the FarCry version used here is 1.31, which supports Shader Model 3.0. As such, gaming on the NVIDIA cards gives you better game realism and image quality. So the comparison here isn't entirely direct, and considering the already good performance, you would much rather trade some frame rate for better game quality, which is NVIDIA's argument (and one that we agree with).
Results - Doom 3 (OpenGL Benchmark)
The performance gain in Doom 3 is nothing really spectacular either with the GeForce 7800 GTX at best averaging 25% better than its predecessor. SLI scales very well in this game and we finally see our prediction come true as the GeForce 7800 GTX SLI combo sped past the 80 frames per second mark at 1600 x 1200 with all image quality settings at maximum. Though our prediction in December 2004 was for the next generation graphics card to hit the 80FPS mark, the truth as it seems is just slightly off the mark, requiring a pair of next generation cards to achieve that feat. SLI did bring in massive performance gains, but it wasn't much more powerful than a pair of GeForce 6800 Ultra cards.
Here's more results using the Ultra High Quality mode:-
Overclocking
Unlike the GeForce 6800 Ultra, which had little overhead to scale its clocks further, the new GeForce 7800 GTX had much more generous headroom. From a 430MHz core we went up to 480MHz, and the 1200MHz memory climbed to 1370MHz DDR, for a total performance gain of close to 13% at the highest resolution. If you tally the performance against the percentage clock speed increase, you'll find they scale quite linearly, which is a good sign. The SLI performance advantage in the overclocked state wasn't too encouraging and likely needs a speedier system to net a bigger gain.
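For the record, here's the quick arithmetic behind that 'quite linear' remark, using the clocks from our overclocking run:

```python
# Overclocking headroom versus observed gain, using the figures from our run.
core_stock, core_oc = 430, 480       # MHz
mem_stock,  mem_oc  = 1200, 1370     # MHz effective (DDR)
observed_gain_pct   = 13             # at the highest resolution, from our benchmarks

core_gain = (core_oc / core_stock - 1) * 100
mem_gain  = (mem_oc  / mem_stock  - 1) * 100
print(f"Core clock increase   : {core_gain:.1f}%")   # ~11.6%
print(f"Memory clock increase : {mem_gain:.1f}%")    # ~14.2%
print(f"Observed performance  : {observed_gain_pct}% (roughly linear scaling)")
```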
A New Heir To The Throne
After yet another round of buzz and sneak spotlights, the G70 architecture has finally been made known to the public, and the first graphics part utilizing it is the GeForce 7800 GTX - NVIDIA's new kingpin. In summary, this new monster GPU sports NVIDIA's up-to-date CineFX 4.0 architecture, designed for improved shader performance with twice the floating-point throughput of its predecessor. To tackle the toughest of the next round of game engines, it is also built with 8 vertex engines and 24 pixel pipelines, making for an extremely parallel GPU. Apart from brute processing power, the GeForce 7800 GTX also has a couple of new features, an updated PureVideo engine and hardware acceleration to support the Windows Graphics Foundation standard that will debut with the Longhorn operating system in the near future.
If all that sounds like a massively turbocharged NV40 GPU, then in simple terms for the layman, yes, it certainly is. You won't find a raft of new features on the GeForce 7800 GTX simply because the underlying processing engine still adheres to Microsoft's Shader Model 3.0 standard. Back when the NV40 debuted, it was practically a complete revamp of the uninspiring GeForce FX series, since NVIDIA had to overhaul the GPU design to crank out better performance while also supporting SM 3.0, which meant a whole lot of new features and changes at the time. The NV40 was an excellent overhaul that brought enormous, unheard-of performance gains, as well as being a perfect candidate to evolve further in the future. That brings us to the new GeForce 7800 GTX, which we have scrutinized in complete detail.
The GeForce 7800 GTX based on the G70 architecture looks to reign for a while at least.
While there were numerous rumors hinting that the GeForce 7800 GTX would rival a pair of GeForce 6800 Ultra graphics cards (some encouraged by NVIDIA itself), our testing revealed that wasn't the case most of the time. In fact, if one were to examine the specs and make an educated guess, that would have been asking too much of the GeForce 7800 GTX. On average, we found it to be 30% to 50% speedier than the GeForce 6800 Ultra and trailing the GeForce 6800 Ultra SLI setup by less than 20% (with negligible difference in the best scenarios). However, when compared with the best of the previous generation, the RADEON X850 XT P.E., the gain was only 35% at best, which doesn't seem like a whole lot considering how much more has gone into making the GeForce 7800 GTX. Part of the reason is that the GeForce 7800 GTX is designed to tackle very shader intensive games like those built on the future Unreal Engine 3, where internal testing from NVIDIA places the GeForce 7800 GTX at twice the speed of the GeForce 6800 Ultra. In this light, the current crop of games just isn't the sort that makes the GeForce 7800 GTX break a sweat. Additionally, the GeForce 7800 GTX has become so fast that the processing bottleneck is now back on the system, which was especially evident when we ran two GeForce 7800 GTX cards in SLI and compared the speed bump against a pair of GeForce 6800 Ultra graphics cards.
So it wasn't a groundbreaker like its predecessor, but we've got to admit, the GeForce 7800 GTX is the fastest single graphics card available, and when you combine two of them in an SLI system, you get performance that's simply top notch. Still, we feel the GeForce 7800 GTX has a lot of untapped processing power, simply because there isn't a game out there right now that can really bring it to its knees and test all of its abilities. Performance numbers aside, other noteworthy factors are its single slot profile, extremely quiet operation and slightly lower power consumption than the GeForce 6800 Ultra, all while delivering up to 50% more performance. Here's the best part:- it's available right now for US$599 and it will replace the GeForce 6800 Ultra SKU. This means the rest of NVIDIA's lineup is practically untouched for now and the only change is at the pinnacle of power. With ready availability and a guaranteed credible speed bump, that makes the GeForce 6800 Ultra SLI pair look too expensive, too much of a space hog and too much of a power drain. For an offering that sits between the GeForce 6800 Ultra and a pair of them in terms of performance (with the potential to close the gap on the SLI counterpart), we would say the GeForce 7800 GTX is a welcome addition to NVIDIA's family.
If there was anything we really disliked, it has got to be the high operating temperatures. While a single GeForce 6800 Ultra easily managed sub-50 degrees Celsius operation in our air-conditioned test lab, the GeForce 7800 GTX was idling at those temperatures. Needless to say, it shot up to the 60 to 70 degrees Celsius range during intense benchmarking. If you were to run a pair of these cards in SLI, you ought to have a 500W or higher rated power supply from a reputable brand and ensure that you have excellent casing ventilation. The GeForce 7800 GTX's default cooling system does not use a direct exhaust design, hence we highly stress the need for good case cooling.
How about a pair of GeForce 7800 GTX cards?
Overall, the GeForce 7800 GTX simply rules, just not to the extent that most enthusiasts were hoping for. As we have mentioned, it's best that we reserve judgment on the G70 architecture's full capabilities until a new class of game engines comes along to really showcase next generation graphics power. Hopefully then, we will see even better results compared to its predecessors. For now, what the GeForce 7800 GTX brings to the current crop of games is the ability to run the highest resolutions and maximum detail settings and enjoy gaming with the very best performance and image quality possible. The only requirement is that you be equipped with a decently fast machine in the first place, to avoid the GeForce 7800 GTX waiting on your system.
Currently, the GeForce 7800 GTX is only available for the PCIe interface and NVIDIA has no plans for an AGP counterpart at the time of publishing this article. However, if market demand and their partners call for one, NVIDIA will pursue an appropriate reference design.