AMD is currently struggling in the microprocessor market. Its only competitor, Intel, is clearly miles ahead, both in performance and in finances. It has been like this for a few years now. However, in 2011, AMD launched its APUs (Accelerated Processing Units) under its ‘Fusion’ initiative. An AMD APU closely integrates a CPU and a GPU, with the aim of providing balanced performance at low power consumption. It all started in 2006, when AMD acquired ATi and announced the Fusion program. After several years of development, we saw its fruit in the form of the Llano APUs in 2011. Well, actually, the first APUs (codenamed Brazos) were launched at the end of 2010, but those were for laptops only. Llano, on the other hand, includes desktop parts as well.
Neither Brazos nor Llano made any real breakthrough in terms of performance or implementation of technology. Both were comparatively strong on the GPU side, but the CPU side was not up for any kind of competition. Furthermore, AMD faced serious supply issues with Llano desktop parts, which is one of the reasons it failed to capture considerable market share. However, the company sold enough Llano parts to gain some confidence in its Fusion program. In May 2012, AMD finally launched its next generation of APUs: Trinity. Only mobile Trinity parts were launched at that time; desktop parts followed in October 2012. There is no real explanation for the gap between the two launches, but the common belief is that the company wanted to clear its Llano desktop stock.
Trinity is still based on the aging 32nm manufacturing process, but in terms of architecture it brings a lot of new and exciting things to the table. First of all, it’s based on AMD’s latest Piledriver architecture. Piledriver follows the same basic design as Bulldozer, but a lot has changed as well. Bulldozer didn’t do very well in the first-generation FX-series desktop CPUs, but Piledriver in the second-generation FX chips makes things a lot better. Trinity APUs have up to two Piledriver modules, which means up to four cores. We’ve discussed Piledriver and its ‘modules’ in detail in our FX-8350 CPU review, but I would like to give a quick recap here. A single Piledriver module is detected as two cores by the OS and pretty much performs like two cores, but in fact it doesn’t contain two complete cores. However, given the way AMD has implemented it, one can’t really argue with the two-core label.
AMD tweaked Piledriver a bit to fit the APU’s budget and power envelope. To begin with, the cores are tuned for efficiency instead of the absolute raw power targeted by the FX-series processors. Secondly, the L3 cache is omitted from the APUs. There are two reasons behind that. The first is that an L3 cache increases manufacturing costs. The second is that an L3 cache consumes considerably more power than the L2 cache: the L2 cache can be enabled or disabled depending on the usage of the cores, but the L3 cache stays awake no matter what. Furthermore, an L3 cache only increases performance under certain conditions, so apparently AMD decided it wasn’t worth it.
On the GPU side, Trinity APUs come with up to 384 Cayman-class cores from AMD’s Northern Islands GPU family. The clock speed goes up to 800MHz in the flagship model. Exactly like the Cayman-based discrete GPUs, the Trinity GPU is based on the VLIW4 architecture, as opposed to the VLIW5 design that AMD previously used. This technically means that Trinity has fewer graphics cores than Llano, but they are more efficient. With 24 texture units and 8 ROPs, the GPU in Trinity A10-series APUs is exactly a quarter of a Radeon HD 6970 in terms of hardware resources.
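To make the ‘quarter of a Radeon HD 6970’ comparison concrete, here is a small Python sketch of the arithmetic. The Trinity figures come from the paragraph above. The HD 6970 figures (1536 cores, 96 texture units, 32 ROPs) and Llano’s 400 VLIW5 cores are assumptions taken from public spec sheets rather than from this article, and the function names are purely illustrative.

```python
from fractions import Fraction

# Trinity A10 figures are from the text; the HD 6970 figures are an
# assumption based on public spec sheets, not on this article.
TRINITY_A10 = {"cores": 384, "texture_units": 24, "rops": 8}
RADEON_HD_6970 = {"cores": 1536, "texture_units": 96, "rops": 32}


def resource_ratios(part, reference):
    """Return each resource of `part` as an exact fraction of `reference`."""
    return {name: Fraction(part[name], reference[name]) for name in part}


def vliw_units(cores, lanes):
    """Number of VLIW execution units, given total cores and lanes per unit."""
    return cores // lanes


ratios = resource_ratios(TRINITY_A10, RADEON_HD_6970)
for name, ratio in ratios.items():
    print(f"{name}: {ratio} of an HD 6970")

# Fewer cores than Llano, but grouped into narrower VLIW4 units.
# Llano's 400 cores is an assumed figure (see the note above).
trinity_units = vliw_units(384, lanes=4)
llano_units = vliw_units(400, lanes=5)
print(f"Trinity: {trinity_units} VLIW4 units, Llano: {llano_units} VLIW5 units")
```

Under these assumed figures, every resource comes out to the same fraction, and Trinity actually ends up with more (narrower) VLIW units than Llano despite the lower core count.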
AMD also introduced another very important and useful technology with the Trinity APUs: Turbo Core 3.0. In Llano, only the CPU can increase its clocks when needed, while the GPU can only go up to its maximum specified frequency, but this changes with Trinity. The new Turbo Core technology allows both parts of the APU to increase their clocks when needed. Apart from this, if one part of the APU isn’t fully using its TDP quota while the other part is under heavier load, the latter will automatically reach the highest frequency possible, capitalizing on the overall TDP headroom and going beyond its own quota. This helps considerably in putting the horsepower where it is actually needed.
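The budget-sharing idea can be illustrated with a toy model. The sketch below is not AMD’s actual Turbo Core algorithm, which works on clocks and on-die power estimates; it only shows how unused TDP headroom from one part could be handed to the other. The function name `share_tdp`, the quotas and the wattages are all made up for illustration.

```python
def share_tdp(total_tdp, cpu_quota, gpu_quota, cpu_demand, gpu_demand):
    """Toy model of TDP sharing (hypothetical, not AMD's real algorithm).

    Each part first gets what it asks for, capped at its own quota.
    Whatever is left of the total budget is then offered to a part
    that still wants more, letting it exceed its nominal quota.
    Returns (cpu_power, gpu_power) in the same unit as the inputs.
    """
    if cpu_quota + gpu_quota != total_tdp:
        raise ValueError("quotas must add up to the total TDP")

    # Step 1: serve each part within its own quota.
    cpu = min(cpu_demand, cpu_quota)
    gpu = min(gpu_demand, gpu_quota)

    # Step 2: redistribute the unused headroom.
    headroom = total_tdp - cpu - gpu
    extra_cpu = min(cpu_demand - cpu, headroom)
    cpu += extra_cpu
    headroom -= extra_cpu
    gpu += min(gpu_demand - gpu, headroom)

    return cpu, gpu


# Made-up numbers: a lightly loaded CPU and a GPU that wants more
# than its quota, as in a typical game.
gaming = share_tdp(total_tdp=100, cpu_quota=60, gpu_quota=40,
                   cpu_demand=30, gpu_demand=70)
print(gaming)

# Both parts saturated: nobody has headroom to give away.
saturated = share_tdp(total_tdp=100, cpu_quota=60, gpu_quota=40,
                      cpu_demand=80, gpu_demand=80)
print(saturated)
```

In the first call the GPU borrows the 30 units the CPU leaves unused, while in the second call both parts simply stay at their quotas. In neither case does the total exceed the overall TDP.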