One of AMD's goals for APUs going forward is to use the parallel computing power of the integrated graphics processor to assist the CPU cores where possible. Although GPU computing has taken off in specialized sectors like scientific computing and HPC, we are still in the early days of GPU computing for consumer applications.

AMD has been making strides in persuading developers to use OpenCL to accelerate certain classes of applications, though, and it has supplied reviewers with a handful of programs to demonstrate the potential there. These "accelerated" programs fall into several groups. Some of them are just video transcoders that make use of the dedicated encoding hardware built into new CPUs, features like Intel's QuickSync and AMD's HD Media Accelerator. We've recently taken a look at the hardware video encoding options on the PC, so you can read about them if you wish.

However, the more interesting programs in our book don't just use dedicated custom logic; they employ real GPU computing, likely through the OpenCL API, to handle tasks previously reserved for the CPU cores. We tried out accelerated versions of The GIMP image processor and WinZip compression in our review of Trinity's mobile variant, but the program we find most interesting to date is LuxMark, which uses OpenCL to tackle ray-traced rendering. Ray-tracing is a classic "embarrassingly parallel" application, so it's a good test case to demonstrate the potential of data-parallel compute hardware. Also, we've already incorporated LuxMark into our wider CPU suite, which includes a huge selection of chips, so we have ample context for the performance numbers it spits out.

LuxMark should do a nice job of harnessing the capabilities of new CPUs. Since OpenCL code is by nature parallelized and is compiled at run time, it adapts easily to new instructions. For instance, Intel and AMD offer installable client drivers (ICDs) for OpenCL on x86 processors, and they both claim to support AVX. The AMD APP driver even supports Bulldozer's distinctive instructions, FMA4 and XOP.

We'll start with CPU-only results from a broad swath of processors. These results come from the AMD APP driver for OpenCL, since it tends to be faster on both Intel and AMD CPUs, funnily enough. Using their CPU cores alone, the new Trinity APUs are only a smidgen faster than the chip they replace, the Llano-based A8-3850. Why? One reason is that the two "Piledriver" modules in Trinity have only one shared FPU each. Each of Llano's four cores has its own dedicated FPU, so although Trinity benefits from the extra-wide vector math enabled by its support for AVX instructions, it's not much faster than Llano.

Intel's Core i3-3225 is only a dual-core processor, but it has two FPUs and can track and execute four threads via Hyper-Threading, so it is architecturally closer to Trinity than you might think. The Core i3's FPUs support AVX, as well, and they achieve higher throughput than Trinity's, even though they don't use the fused multiply-add instruction. (FMA support is slated for Intel's next-gen Haswell chip.) Without AVX or Hyper-Threading, the Pentium G2120 finishes dead last, well behind the A8-5600K.

Moving the workload over to the IGPs uniformly produces lower performance than the same processors achieve with only their CPU cores. The IGP in AMD's Trinity is substantially faster than Intel's HD 4000 graphics, but neither CPU's IGP can match its x86 cores.

If we invoke both the CPU cores and the IGPs at the same time, we see higher overall performance than with just one type of computing unit engaged, and the A10's combined throughput is ever so slightly higher than the Core i3-3225's. There's a hint of potential here: combined performance is roughly equal to that of the AMD FX-6200, a chip with three Bulldozer modules.

To give you a better sense of the prospects for mixed-mode computing, let's have a look at a much more capable GPU, the Radeon HD 7950, when driven by the various processors we've tested. Moving some workloads over to a fast enough GPU can really pay off. The Radeon HD 7950 achieves more than twice the throughput of the Core i7-3770K's quad CPU cores, regardless of which processor is driving it. (The 7950 is somewhat faster when combined with Intel processors, likely because of their higher single-threaded performance.) Of course, this GPU has its own fast, dedicated memory subsystem, so we're not just adding a whole truckload of FLOPS; we're adding bandwidth in support of those FLOPS.
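
For readers who want a feel for what sharing a data-parallel job between CPU cores and a graphics processor involves, here is a minimal Python sketch. It is not how LuxMark or any OpenCL runtime actually schedules work. The function names (`split_rows`, `shade_row`, `render_mixed`), the toy per-pixel formula, and the 1:2 speed ratio are all hypothetical, and ordinary threads stand in for the two devices. The point is only that each row of an image can be computed independently, so rows can be handed out in proportion to each device's throughput and stitched back together afterward.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(total_rows, relative_speeds):
    """Divide rows among devices in proportion to their relative speeds.

    relative_speeds maps a device label to a hypothetical throughput figure
    and must not be empty. Returns label -> (start_row, end_row), with the
    end row exclusive.
    """
    total_speed = sum(relative_speeds.values())
    ranges = {}
    start = 0
    cumulative = 0.0
    for label, speed in relative_speeds.items():
        cumulative += speed
        # Cumulative rounding keeps the boundaries monotonic and in range.
        end = min(total_rows, round(total_rows * cumulative / total_speed))
        ranges[label] = (start, end)
        start = end
    # Guard against rounding: the final device always reaches the last row.
    last = list(relative_speeds)[-1]
    ranges[last] = (ranges[last][0], total_rows)
    return ranges


def shade_row(y, width):
    """Toy stand-in for tracing one row of pixels; rows share no state."""
    return [(x * x + y * y) % 256 for x in range(width)]


def render_rows(row_range, width):
    """Render a contiguous block of rows on one (simulated) device."""
    start, end = row_range
    return [shade_row(y, width) for y in range(start, end)]


def render_mixed(width, height, relative_speeds):
    """Render the image with one worker thread per simulated device."""
    ranges = split_rows(height, relative_speeds)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        futures = [pool.submit(render_rows, r, width) for r in ranges.values()]
        image = []
        for future in futures:  # ranges are in row order, so append in order
            image.extend(future.result())
    return image


if __name__ == "__main__":
    speeds = {"cpu": 1.0, "gpu": 2.0}  # hypothetical ratio, not a measured result
    print(split_rows(90, speeds))  # {'cpu': (0, 30), 'gpu': (30, 90)}
    # The mixed-mode image matches a single-device render of the same rows.
    print(render_mixed(8, 9, speeds) == render_rows((0, 9), 8))  # True
```

A static split like this only pays off when the speed estimates are accurate. If one device finishes early it sits idle, which is one reason combined throughput tends to fall short of the sum of the individual scores.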