Intel Core 2 (Conroe) Performance Review


Core du Conroe

Intel's Core 2 Duo desktop processors will be the first processors based on the new Core microarchitecture to leave Intel's doors. Like the current Pentium D 900-series Presler processors, the Core 2 Duo will be manufactured on a 65nm process. However, Conroe shares more similarities with Intel's mobile processor, the Core Duo (Yonah core). With Conroe, Intel has gone back to packing two execution cores into a single die, but with all the goodness of Yonah's shared L2 cache. Each core comes with a dedicated 64KB L1 cache (32KB instruction, 32KB data), more than double the previous generation Presler's 28KB L1 cache (16KB + 12KB), and features five independent prefetchers (two L2 data, two L1 data and one L1 instruction prefetch units).

The Core 2 will debut with a total of 4MB of L2 cache, with a cheaper 2MB L2 variant based on the Allendale core available as well. The shared cache design gives each core full dynamic access to the entire cache, which will have a substantial impact when running non-SMP-capable applications, where one core is essentially idle. If you have trouble understanding how this helps, remember the performance gains going from 1MB to 2MB of L2 on the Prescott core? Now visualize the processor with a full 4MB L2 cache all to itself. Drool-worthy, yes, we know.
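To put rough numbers on the shared versus split cache argument, here is a minimal Python sketch. The function name `l2_visible_to_core_mb` and the even-split rule for two busy cores are simplifying assumptions made for illustration; real hardware allocates a shared cache dynamically, not by a fixed rule.

```python
def l2_visible_to_core_mb(total_l2_mb, shared, busy_cores, core_count=2):
    """Rough L2 capacity (in MB) that one busy core can use.

    Assumption: a shared cache is split evenly among the busy cores, while
    a private design always gives each core a fixed total / core_count slice.
    """
    if busy_cores < 1 or busy_cores > core_count:
        raise ValueError("busy_cores must be between 1 and core_count")
    if shared:
        return total_l2_mb / busy_cores
    return total_l2_mb / core_count


# Single-threaded application: only one core is doing useful work.
print(l2_visible_to_core_mb(4, shared=True, busy_cores=1))   # shared 4MB design
print(l2_visible_to_core_mb(4, shared=False, busy_cores=1))  # private 2MB x 2 design
```

Under these assumptions, a lone thread sees the full 4MB on the shared design but only 2MB on a split 2MB x 2 design with the same total capacity.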

Surprisingly, although the Conroe core is built on the same 65nm manufacturing process, carries the same amount of L2 cache (4MB) and features more L1 cache than the current Preslers, the die size has shrunk by about 12% from 162mm² to 143mm² and the transistor count by about 22% from 376 million to 291 million. Intel has also managed to drastically improve the thermal performance of the processor through fine-grained power gating techniques, where individual components, registers and cache of each core can be dynamically shut down to save power and brought back up immediately when required, so as not to outwardly affect processor performance. All announced Core 2 Duo processors now carry a cool max TDP of 65W, with the Extreme edition capping off at 75W. That is on average a 50% improvement over the 130W furnace that is the Presler core, and it even bests AMD's new AM2 Athlon 64 X2 and FX processors by 27 - 48%, with the exception of the 35W Low Power versions.
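The percentages quoted above are easy to check. The short Python sketch below recomputes them from the figures given in this article; the helper name `percent_reduction` is ours, and the 125W figure is the Athlon 64 FX-62 TDP listed in the comparison table.

```python
def percent_reduction(old, new):
    """Return how much smaller `new` is than `old`, as a percentage of `old`."""
    return (old - new) / old * 100.0


# (old, new) pairs taken from the figures quoted in this article.
comparisons = {
    "die size, Presler -> Conroe": (162, 143),           # square millimetres
    "transistors, Presler -> Conroe": (376, 291),        # millions
    "max TDP, Presler -> Core 2 Duo": (130, 65),         # watts
    "max TDP, Athlon 64 FX-62 -> Core 2 Duo": (125, 65), # watts
}

for label, (old, new) in comparisons.items():
    print(f"{label}: {percent_reduction(old, new):.1f}% lower")
```

The die size and transistor results round to the "about 12%" and "about 22%" quoted above, and the FX-62 comparison gives the 48% at the top of the AMD range.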


The Fat Blue Pipe

Intel's Core microarchitecture is all about efficiency, and Intel has been working hard to tweak the processor itself for better performance instead of just beating on the MHz stick. The Core 2 will feature a shorter 14-stage pipeline, more than a 50% reduction from the 31 stages in the current NetBurst microarchitecture and just slightly above the 12 stages in AMD's Athlon 64s. This will put a damper on Intel's clock speed ramping, with the lowest-end Core 2 Duo E6300 processor running at only 1.86GHz. However, the shorter pipeline should translate into more instructions per cycle, since fewer stages have to be flushed and refilled when a branch is mispredicted.
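To see why pipeline depth matters for instructions per cycle, here is a toy Python model. It assumes that a mispredicted branch costs roughly one full pipeline refill, and the base CPI, branch fraction and misprediction rate are made-up illustrative values, not measurements of any real processor.

```python
def effective_cpi(pipeline_stages, base_cpi=1.0,
                  branch_fraction=0.20, mispredict_rate=0.05):
    """Toy cycles-per-instruction estimate.

    Assumption: every mispredicted branch wastes about `pipeline_stages`
    cycles while the pipeline is flushed and refilled. The default
    parameters are hypothetical and chosen only for illustration.
    """
    penalty_per_instruction = branch_fraction * mispredict_rate * pipeline_stages
    return base_cpi + penalty_per_instruction


for name, stages in [("NetBurst", 31), ("Core", 14), ("Athlon 64", 12)]:
    cpi = effective_cpi(stages)
    print(f"{name}: {stages} stages, CPI ~{cpi:.2f}, IPC ~{1 / cpi:.2f}")
```

With identical assumptions for every design, the deeper pipeline always ends up with the higher CPI. Real processors differ in branch predictor quality and much else, so treat the output as a direction, not a prediction.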

The Core 2 features a 4-issue wide execution core (per core), up from the 3-issue wide cores of NetBurst processors. This allows the Core 2 Duo to execute more instructions per clock cycle than previous Intel desktop processors. The Core microarchitecture also integrates the Pentium M's Micro-fusion technology and introduces a new feature called Macro-fusion. Modern processors break down certain x86 program instructions into smaller micro-ops for processing, and the Micro-fusion inherited from the Pentium M can identify related micro-op pairs and process them together in a single cycle. Macro-fusion is a similar feature, but works at the higher instruction level, where common x86 instruction pairs such as a CMP or TEST followed by a conditional jump can be combined into a single macro-op, read and decoded in one cycle. Thus, for the Core 2 Duo, a 4-wide execution core can even mean five instructions per clock cycle.
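The Macro-fusion idea can be sketched as a toy decoder in Python. It assumes the fused pair is a CMP or TEST immediately followed by a conditional jump, and it treats every `J*` mnemonic other than `JMP` as a conditional jump. The function names are hypothetical, operands are ignored, and real hardware applies further restrictions that are not modeled here.

```python
FUSIBLE_FIRST = {"CMP", "TEST"}


def is_conditional_jump(mnemonic):
    # Assumption: any J* mnemonic other than the unconditional JMP qualifies.
    return mnemonic.startswith("J") and mnemonic != "JMP"


def macro_fuse(instructions):
    """Merge each CMP/TEST that is directly followed by a conditional jump."""
    fused = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (current in FUSIBLE_FIRST and following is not None
                and is_conditional_jump(following)):
            fused.append(f"{current}+{following}")  # one slot for two instructions
            i += 2
        else:
            fused.append(current)
            i += 1
    return fused


stream = ["MOV", "ADD", "CMP", "JNE", "MOV"]
print(macro_fuse(stream))  # ['MOV', 'ADD', 'CMP+JNE', 'MOV']
```

Five x86 instructions go in and four decode slots come out, which is how a 4-wide core can account for five instructions in one cycle.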

The Core 2 will also feature an enhanced SSE engine with full 128-bit data paths, enabling 128-bit SSE instructions to be executed in a single cycle where older processors, with their 64-bit data paths, would have required two.
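The one-cycle versus two-cycle difference is simple arithmetic. This Python sketch assumes each trip through the execution unit handles one data-path width of a 128-bit operand; the function name is ours, and it models throughput only, not instruction latency.

```python
import math


def passes_per_sse_op(operand_bits=128, datapath_bits=64):
    """Number of passes needed to push one operand through the data path."""
    return math.ceil(operand_bits / datapath_bits)


print(passes_per_sse_op(datapath_bits=64))   # 2 (older 64-bit data path)
print(passes_per_sse_op(datapath_bits=128))  # 1 (full 128-bit data path)
```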

Finally, the Conroe core also reinforces Intel's view that an integrated memory controller is unnecessary. Intel gets around the memory bottleneck associated with an external memory controller by effectively 'hiding' memory access latency. This is done through a technique called memory disambiguation, where the processor scans instruction queues and speculatively performs memory loads before preceding stores have completed, reducing wait times and improving instruction-level parallelism. The five prefetchers per core play a big role in optimizing memory access as well, by intelligently loading data before requests are made.
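A very rough Python sketch of the memory disambiguation idea follows. It assumes every earlier store is still pending when a load is issued, and that a speculatively issued load must be replayed if an earlier store turns out to write the same address. The function name and outcome labels are hypothetical, and the hardware predictor that decides when to speculate is not modeled.

```python
def speculate_loads(ops):
    """Classify each load in a program-ordered list of (kind, address) pairs.

    A load is 'hoisted' if no earlier store targets its address, so running
    it ahead of those stores was safe. It is 'replayed' if an earlier store
    wrote the same address, meaning the speculative result was stale.
    Assumption: all earlier stores are treated as still pending.
    """
    outcomes = []
    earlier_store_addresses = set()
    for index, (kind, address) in enumerate(ops):
        if kind == "store":
            earlier_store_addresses.add(address)
        elif kind == "load":
            conflict = address in earlier_store_addresses
            outcomes.append((index, "replayed" if conflict else "hoisted"))
        else:
            raise ValueError(f"unknown operation kind: {kind!r}")
    return outcomes


program = [
    ("store", 0x100),
    ("load", 0x200),   # independent of the earlier store
    ("store", 0x300),
    ("load", 0x100),   # reads what the first store wrote
]
print(speculate_loads(program))  # [(1, 'hoisted'), (3, 'replayed')]
```

The first load gains time by not waiting on an unrelated store, while the second has to be redone because it depended on one.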

These are the new microarchitectural changes that will make their way into the Core 2, which will also support previous-generation features such as Intel Virtualization Technology, EM64T and Execute Disable Bit. However, Hyper-Threading technology is not supported in any of the announced Core 2 processors (which is perhaps a good thing at this point in time). The following table illustrates the detailed technical differences between the Core 2 CPUs and the latest dual-core processors from Intel and AMD:

High-End Dual-Core CPUs Compared
| Processor Name | Core 2 Extreme | Core 2 Duo | Pentium Extreme Edition | AMD Athlon 64 FX |
|---|---|---|---|---|
| Processor Model | X6800 | E6600, E6700 | 965 | FX-62 |
| Processor Frequency | 2.93GHz | 2.40GHz, 2.67GHz | 3.73GHz | 2.8GHz |
| No. of Cores | 2 | 2 | 2 | 2 |
| Hyper-Threading Technology | No | No | Yes | - |
| No. of Logical Processors | 2 | 2 | 4 | 2 |
| Front Side Bus (MHz) | 1066 | 1066 | 1066 | - |
| HyperTransport Bus | - | - | - | 1GHz (2000MT/s) |
| L1 Cache (data + instruction) | (32KB + 32KB) x 2 | (32KB + 32KB) x 2 | (16KB + 12KB) x 2 | (64KB + 64KB) x 2 |
| L2 Cache | 4MB | 4MB | 2MB x 2 | 1MB x 2 |
| Memory Controller | External Dual Channel (up to DDR2-800) | External Dual Channel (up to DDR2-800) | External Dual Channel (up to DDR2-667) | Integrated Dual Channel (up to DDR2-800) |
| VID (V) | 0.85 - 1.3625 | 0.85 - 1.3625 | 1.20 - 1.3375 | 1.35 - 1.40 |
| Icc (max) (A) | 90 | 75 | 125 | 90.4 |
| TDP (W) | 75 | 65 | 130 | 125 |
| Execute Disable Bit | Yes | Yes | Yes | Yes |
| Intel EM64T / AMD64 | Yes | Yes | Yes | Yes |
| Power Management Technology | Yes - Intel Intelligent Power Capability | Yes - Intel Intelligent Power Capability | No | Yes - AMD Cool 'n' Quiet |
| Virtualization Technology | Yes | Yes | Yes | Yes |
| Packaging | LGA775 | LGA775 | LGA775 | AM2 |
| Process Technology | 65nm | 65nm | 65nm | 90nm SOI |
| Processor Codename | Conroe | Conroe | Presler | Windsor |
| Die Size | 143mm² | 143mm² | 162mm² | 230mm² |
| No. of Transistors | 291 million | 291 million | 376 million | 227.4 million |