
AMD Kabini APU Preview - Combating a Changing Computing Landscape

Improvements: Jaguar, Unified Northbridge & GCN

Jaguar Roars

Named after a river in Southern India (and perhaps telling of which target market AMD wants to strike hard), the new Kabini platform and its APUs feature two major improvements. The first is AMD’s new Jaguar microarchitecture.

Jaguar is best thought of as a refinement of the Bobcat architecture used in the older Brazos platform (the forerunner of the AMD Fusion initiative). The main objectives of Jaguar were to improve instructions per clock, reach higher clock frequencies and improve power efficiency.

To do that, the Jaguar cores received improved prefetchers, improved integer execution units, wider floating point execution units (128-bit, up from 64-bit) and a shared L2 interface. The shared L2 cache is one of the major design additions to Jaguar. In the higher end SKUs - A6-5200 and A4-5000 - the L2 cache is 2MB, made up of four L2D banks of 512KB each. Each core also has its own L2 stream prefetcher, which improves overall bandwidth and instructions per clock. That aside, the 'floor plan' has been mildly tweaked to improve data flow.

Jaguar also features improved instruction set support over Bobcat, adding SSE 4.1, SSE 4.2, AVX, AES and more.
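For readers who want to confirm which of these extensions a given machine exposes, the following is a minimal sketch in Python. It assumes a Linux system, where the kernel lists CPU features on the "flags" line of /proc/cpuinfo; the function names and the sample text are our own and purely illustrative.

```python
# Flag names as they appear on the "flags" line of Linux /proc/cpuinfo.
EXTENSIONS = {
    "SSE 4.1": "sse4_1",
    "SSE 4.2": "sse4_2",
    "AVX": "avx",
    "AES": "aes",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def check_extensions(cpuinfo_text):
    """Map each extension named in the article to True/False."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in EXTENSIONS.items()}


# Hypothetical, shortened cpuinfo excerpt used purely for illustration.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 aes avx\n"

for name, present in check_extensions(SAMPLE).items():
    print(f"{name}: {'yes' if present else 'no'}")
```

On a real Linux machine, the contents of /proc/cpuinfo can be passed to check_extensions in place of SAMPLE.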

 

The Unified Northbridge (UNB)

Taking a cue from the Trinity APUs, the new Kabini APUs will also see the introduction of a Unified Northbridge (UNB). In the new UNB, PCIe replaces the old HyperTransport technology as the interconnect between memory and I/O subsystems. According to AMD, this is more efficient at handling and scheduling memory-related requests from both the CPU and GPU.

As part of the new UNB, the memory controller supports 1.25V, 1.35V and 1.5V SO-DIMM memory modules at speeds of up to DDR3-1600, giving it a maximum total memory bandwidth of around 10.3GB/s. The controller also supports memory P-states, allowing it to adapt to memory speed changes on the fly for better power efficiency. Finally, maximum memory support on the platform is a healthy 32GB.
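To put the quoted bandwidth figure in context, here is a rough back-of-the-envelope calculation in Python. It assumes a single 64-bit memory channel, which the figures above do not state explicitly, so treat the channel count and bus width as our assumptions rather than AMD's specification.

```python
def peak_bandwidth_gb_s(transfers_mt_s, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits and channels are assumptions, not figures from AMD.
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfers_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


peak = peak_bandwidth_gb_s(1600)  # DDR3-1600 performs 1600 million transfers/s
quoted = 10.3                     # figure quoted in the article, in GB/s

print(f"theoretical peak: {peak:.1f} GB/s")  # theoretical peak: 12.8 GB/s
print(f"quoted / peak: {quoted / peak:.0%}")  # quoted / peak: 80%
```

Under these assumptions, the quoted 10.3GB/s works out to roughly 80% of the theoretical peak, which would be consistent with an achievable rather than a theoretical figure, though how the number was derived is not stated.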

 

Graphics Core Next (GCN)

Previously, we lamented that the new Trinity APUs were not getting integrated GPUs built using AMD’s latest Graphics Core Next (GCN) architecture. With Kabini, AMD has finally decided to use GCN-derived Radeon 8000-series integrated GPUs.

The older VLIW architecture is good for graphics work, but poor for compute. It excels when instruction-level parallelism is high, which is the exact opposite of typical compute tasks. As a result, the processing pipeline was rarely fully populated and the core was not used to its full ability. This has been the perennial problem in getting GPUs to take on traditional computing tasks. With the new GCN architecture, however, AMD attempts to address some of the shortcomings of its older VLIW architecture in taking on compute tasks.
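The under-utilization described above can be illustrated with a toy model in Python. This is only a sketch under our own assumptions: a 4-wide instruction bundle, a simple greedy packer and made-up instruction streams. It is not a description of AMD's actual compiler or hardware.

```python
def pack_bundles(deps, width=4):
    """Greedily pack instructions into VLIW-style bundles.

    deps[i] is the set of instruction indices that instruction i depends on.
    An instruction may join a bundle only if everything it depends on was
    issued in an earlier bundle.
    """
    done = set()
    remaining = list(range(len(deps)))
    bundles = []
    while remaining:
        bundle = [i for i in remaining if deps[i] <= done][:width]
        if not bundle:
            raise ValueError("cyclic dependencies")
        bundles.append(bundle)
        done.update(bundle)
        remaining = [i for i in remaining if i not in done]
    return bundles


def utilization(deps, width=4):
    """Fraction of bundle slots that hold an instruction."""
    if not deps:
        return 0.0
    return len(deps) / (len(pack_bundles(deps, width)) * width)


# Graphics-like work: eight operations with no dependencies between them.
independent = [set() for _ in range(8)]
# Compute-like work: each operation needs the result of the previous one.
chain = [set()] + [{i} for i in range(7)]

print(f"independent: {utilization(independent):.0%}")  # independent: 100%
print(f"dependent chain: {utilization(chain):.0%}")    # dependent chain: 25%
```

In this model the independent stream fills every slot, while the dependent chain leaves three of every four slots empty, which is the kind of gap a compiler-scheduled VLIW design struggles to close.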

A GCN compute unit is the basic processing block of AMD’s new generation GPUs, and the new GCN architecture promotes greater independence among the various compute units, thereby relieving bottlenecks. This is achieved by moving scheduling from the compiler to the hardware.

Overall, the move to the new GCN architecture is vital to the performance of the new Kabini APUs, as it allows compatible workloads to tap into the vast raw computing power of GPUs. Crucially, this is also consistent with the GCN architecture of AMD’s discrete offerings, thereby ensuring full compatibility with all AMD GPUs.

That aside, the new Radeon 8000 series integrated GPUs will support up to 4K output resolution via HDMI and DisplayPort, and also in AMD Eyefinity mode.