NVIDIA Announces CUDA 6, Updated to Simplify Parallel Programming
NVIDIA has announced CUDA 6, an update to its parallel programming platform designed to make CUDA developers more productive. CUDA 6 also makes it easier to replace existing CPU-based libraries with GPU-accelerated equivalents for efficient parallel processing.
One of the main features is the introduction of a unified memory space for the CPU and the GPU. The idea was first introduced in CUDA 4.0, where Unified Virtual Addressing (UVA) provided a single virtual address space spanning main system memory and GPU memory, making parallel programming quicker and easier. With CUDA 6, this is extended to an actual unified memory system: a single allocation is accessible from both the CPU and the GPU, with the runtime handling data placement rather than requiring explicit copies.
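To illustrate, here is a minimal sketch of what Unified Memory looks like in practice, assuming the `cudaMallocManaged` allocator that CUDA 6 introduces; the kernel and array sizes are illustrative only:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Simple kernel that increments each element of the array.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 256;
    int *data;

    // One managed allocation, visible to both CPU and GPU.
    // No explicit cudaMemcpy between host and device is needed.
    cudaMallocManaged(&data, n * sizeof(int));

    for (int i = 0; i < n; ++i) data[i] = i;  // CPU writes

    increment<<<(n + 127) / 128, 128>>>(data, n);  // GPU reads/writes
    cudaDeviceSynchronize();  // wait before the CPU touches the data again

    printf("data[10] = %d\n", data[10]);  // 10 incremented to 11
    cudaFree(data);
    return 0;
}
```

Contrast this with the pre-CUDA-6 pattern, where the same program would need separate host and device allocations plus `cudaMemcpy` calls in each direction around the kernel launch.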
The other two headline features are as follows:
Drop-in Libraries – Automatically accelerate an application’s BLAS and FFTW calculations by up to 8X, simply by replacing the existing CPU libraries with the GPU-accelerated equivalents.
Multi-GPU Scaling – Redesigned BLAS and FFT GPU libraries automatically scale performance across up to eight GPUs in a single node, delivering over nine teraflops of double-precision performance per node and supporting larger workloads than ever before (up to 512GB). Multi-GPU scaling can also be used with the new BLAS drop-in library.
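The drop-in idea means no source changes at all: the GPU BLAS is substituted at link or load time. One common way this works is preloading NVIDIA's drop-in BLAS shared library ahead of the CPU BLAS; the paths and config file below are assumptions that vary by installation:

```shell
# Hypothetical illustration of drop-in BLAS acceleration.
# The config file tells the GPU BLAS which CPU BLAS to fall back to
# for calls (or sizes) it does not accelerate.
export NVBLAS_CONFIG_FILE=/etc/nvblas.conf

# Preload the GPU-accelerated BLAS in front of the application's
# existing CPU BLAS, then run the unmodified binary.
LD_PRELOAD=/usr/local/cuda-6.0/lib64/libnvblas.so ./my_blas_app
```

Because the substitution happens at load time, the same unmodified binary runs on machines without a GPU by simply omitting the preload.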
For more information, see the official NVIDIA press release.