GPGPU stands for General-purpose computing on graphics processing units. In Linux, there are currently two major GPGPU frameworks: OpenCL and CUDA.
- 1 OpenCL
- 2 CUDA
- 3 List of OpenCL and CUDA accelerated software
- 4 Links and references
OpenCL (Open Computing Language) is an open, royalty-free parallel programming specification developed by the Khronos Group, a non-profit consortium.
The OpenCL specification describes a programming language, a general environment that is required to be present, and a C API to enable programmers to call into this environment.
Arch Linux provides multiple packages for all of these.
To execute programs that use OpenCL, you need to install a runtime compatible with your hardware:
- : execute on your Nvidia GPU (official Nvidia runtime)
- : execute on AMD GPUs using the Mesa drivers (currently under development, your mileage may vary)
- AUR: execute on your AMD GPU (official AMD runtime)
- AUR: execute on your CPU (official Intel runtime, also supports non-Intel CPUs)
- AUR: execute on your CPU (LLVM-based OpenCL implementation)
For OpenCL development, the minimum additional packages required are:
- : OpenCL ICD loader implementation, up to date with the latest OpenCL specification.
- : OpenCL C/C++ API headers.
The vendors' SDKs provide a multitude of tools and support libraries:
- AUR: Intel's OpenCL SDK (old version, new OpenCL SDKs are included in the INDE and Intel Media Server Studio)
- AUR: AMD's OpenCL SDK
- : Nvidia's GPU SDK which includes support for OpenCL 1.1.
OpenCL ICD loader (libOpenCL.so)
The OpenCL ICD loader is supposed to be a platform-agnostic library that provides the means to load device-specific drivers through the OpenCL API. Most OpenCL vendors provide their own implementation of an OpenCL ICD loader, and these should all work with the other vendors' OpenCL implementations. Unfortunately, most vendors do not provide completely up-to-date ICD loaders, and therefore Arch Linux has decided to provide this library from a separate project () which currently provides a functioning implementation of the current OpenCL API.
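As a rough illustration of what "loading drivers through the OpenCL API" means in practice, the following sketch asks whichever ICD loader is installed how many platforms it can see. It is only a sketch: the function name `count_opencl_platforms` and its `library_name` parameter are made up for this example, and it assumes the loader can be found under the name `OpenCL` (falling back to `libOpenCL.so.1`). The only OpenCL entry point used is `clGetPlatformIDs`.

```python
import ctypes
import ctypes.util

CL_SUCCESS = 0
# Status defined by the cl_khr_icd extension: the loader found no registered ICD.
CL_PLATFORM_NOT_FOUND_KHR = -1001


def count_opencl_platforms(library_name=None):
    """Return the number of OpenCL platforms, or None if no ICD loader can be loaded.

    `library_name` is a hypothetical override for this sketch; by default the
    loader is looked up by the name "OpenCL".
    """
    name = library_name or ctypes.util.find_library("OpenCL") or "libOpenCL.so.1"
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None  # no ICD loader installed (or not on the search path)

    get_ids = lib.clGetPlatformIDs
    get_ids.argtypes = [ctypes.c_uint32, ctypes.c_void_p,
                        ctypes.POINTER(ctypes.c_uint32)]
    get_ids.restype = ctypes.c_int32

    count = ctypes.c_uint32(0)
    # Passing 0 entries and a NULL array only queries the platform count.
    status = get_ids(0, None, ctypes.byref(count))
    if status == CL_PLATFORM_NOT_FOUND_KHR:
        return 0
    if status != CL_SUCCESS:
        raise RuntimeError(f"clGetPlatformIDs failed with status {status}")
    return count.value


if __name__ == "__main__":
    print("OpenCL platforms:", count_opencl_platforms())
```

A result of `None` suggests that no ICD loader is installed at all, while `0` suggests that a loader is present but no vendor runtime has been registered with it.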
The other ICD loader libraries are installed as part of each vendor's SDK. If you want to ensure that the ICD loader from the separate project mentioned above is used, you can create a file in /etc/ld.so.conf.d which adds /usr/lib to the dynamic program loader's search directories. This is necessary because all the SDKs add their runtime's lib directories to the search path through their own ld.so.conf.d files.
The available packages containing various OpenCL ICDs are:
- : recommended, most up-to-date
- AUR by AMD. Provides version 2.0 of OpenCL. It is currently distributed by AMD under a restrictive license and therefore cannot be included in the official repositories.
- AUR: Intel's libCL, provides OpenCL 1.2.
To see which OpenCL implementations are currently active on your system, use the following command:
$ ls /etc/OpenCL/vendors
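The directory listing only shows the names of the registered `.icd` files. Each of these files normally contains a single line naming the vendor library that the ICD loader should open, so a small script can show both. The sketch below assumes that one-line layout; the function name `list_opencl_icds` is made up for this example.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the vendor library named on its first line."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}  # no OpenCL implementation registered

    icds = {}
    for icd_file in sorted(directory.glob("*.icd")):
        lines = icd_file.read_text().splitlines()
        # Assumption: the first line holds the library name or path.
        icds[icd_file.name] = lines[0].strip() if lines else ""
    return icds


if __name__ == "__main__":
    for icd, library in list_opencl_icds().items():
        print(f"{icd}: {library}")
```

An empty result means that no implementation is registered, in which case OpenCL programs will typically report that no platform is available.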
The OpenCL implementation from AMD is known as the AMD APP SDK, formerly also known as AMD Stream SDK or ATI Stream.
It can be installed from the AUR. It is installed in /opt/AMDAPP and, apart from the SDK files, also contains a number of code samples (/opt/AMDAPP/SDK/samples/). It also provides the clinfo utility, which lists the OpenCL platforms and devices present in the system and displays detailed information about them.
Because the AMD APP SDK itself contains a CPU OpenCL driver, no extra driver is needed to execute OpenCL on CPU devices (regardless of their vendor). GPU OpenCL drivers are provided by a separate AUR package (an optional dependency).
Code is compiled using a compiler that is pulled in as a dependency.
OpenCL support from Mesa is in development (see http://www.x.org/wiki/GalliumStatus/). AMD Radeon cards are supported by the r600g driver.
Arch Linux ships OpenCL support as a separate package. See http://dri.freedesktop.org/wiki/GalliumCompute/ for usage instructions.
You could also use lordheavy's repo. Install these packages:
Surprisingly, pyrit performs 20% better with radeon+r600g compared to Catalyst 13.11 Beta1 (tested with 7 other CPU cores):
catalyst #1: 'OpenCL-Device 'Barts'': 21840.7 PMKs/s (RTT 2.8)
radeon+r600g #1: 'OpenCL-Device 'AMD BARTS'': 26608.1 PMKs/s (RTT 3.0)
At the time of this writing (30 October 2013), one must apply two patches on top of Mesa commit ac81b6f2be8779022e8641984b09118b57263128 to get this performance improvement. The latest unpatched LLVM trunk was used (SVN rev 193660).
The Nvidia implementation is available from the official repositories. It only supports Nvidia GPUs running the proprietary kernel module (nouveau does not support OpenCL yet).
The Intel implementation, named simply Intel OpenCL SDK, provides optimized OpenCL performance on Intel CPUs (mainly Core and Xeon) and CPUs only. Install it from the AUR. The runtime can be installed with a separate AUR package. OpenCL for integrated graphics hardware is available through another AUR package for Ivy Bridge and newer hardware.
A CPU-only, LLVM-based implementation is also available from the AUR.
The required packages for OpenCL development are listed in the overview. Installation of a full SDK is optional (depending on the runtime implementation, which is often only available as part of a vendor's SDK).
Link your application to
- D: cl4d
- Haskell: OpenCLRaw (AUR)
- Java: JOCL (a part of JogAmp)
- Mono/.NET: Open Toolkit
- Go: OpenCL bindings for Go
- Racket: Racket has a native interface on PLaneT that can be installed via raco.
CUDA (Compute Unified Device Architecture) is Nvidia's proprietary, closed-source parallel computing architecture and framework. It requires an Nvidia GPU. It consists of several components:
- proprietary Nvidia kernel module
- CUDA "driver" and "runtime" libraries
- additional libraries: CUBLAS, CUFFT, CUSPARSE, etc.
- CUDA toolkit, including the
- CUDA SDK, which contains many code samples and examples of CUDA and OpenCL programs
The kernel module and the CUDA "driver" library are shipped in two separate packages. The "runtime" library and the rest of the CUDA toolkit are available in a third package. The library is available only in a 64-bit version.
The toolkit is installed in /opt/cuda. For compiling CUDA code, add /opt/cuda/include to your compiler's include path, for example by adding -I/opt/cuda/include to the compiler flags. To use the gcc wrapper provided by NVIDIA, add /opt/cuda/bin to your path.
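For build scripts, the two settings above (the extra include flag and the extended search path) can be derived from the toolkit location. The following is a minimal sketch under the assumption that the toolkit lives in /opt/cuda; the function name `cuda_build_settings` and its parameters are hypothetical and not part of any CUDA tooling.

```python
import os


def cuda_build_settings(cuda_root="/opt/cuda", env=None):
    """Return (include_flags, new_env) for building against a CUDA toolkit.

    `env` defaults to a copy of the current process environment.
    """
    new_env = dict(os.environ if env is None else env)
    bin_dir = os.path.join(cuda_root, "bin")
    include_dir = os.path.join(cuda_root, "include")

    # Prepend the toolkit's bin directory unless it is already on PATH.
    entries = [p for p in new_env.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in entries:
        entries.insert(0, bin_dir)
    new_env["PATH"] = os.pathsep.join(entries)

    return [f"-I{include_dir}"], new_env


if __name__ == "__main__":
    flags, build_env = cuda_build_settings()
    print("compiler flags:", " ".join(flags))
    print("PATH:", build_env["PATH"])
```

The returned environment can be passed to `subprocess.run(..., env=build_env)` when invoking the compiler, which leaves the environment of the calling shell untouched.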
To check whether the installation was successful and CUDA is up and running, compile the samples installed in /opt/cuda/samples and run the compiled examples. You can simply run make inside the directory, although it is good practice to copy the /opt/cuda/samples directory to your home directory before compiling. A nice way to check the installation is to run one of the compiled examples.
Using CUDA with an older GCC
Since CUDA often does not support the latest GCC version, you might need to install an older GCC to compile CUDA programs.
For CUDA 7.5/GCC 4.9, create the following symlinks, so CUDA will use the old compiler (for CUDA 8.0/GCC 5, replace
# ln -s /usr/bin/gcc-4.9 /opt/cuda/bin/gcc
# ln -s /usr/bin/g++-4.9 /opt/cuda/bin/g++
You might also need to configure your build system to use the same GCC version for compiling host code.
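The symlink step can also be scripted, which makes it easier to review what will be created before touching /opt/cuda/bin. This is a sketch only: the helper name `link_older_gcc` and its parameters are invented for illustration, and it assumes the older compiler binaries follow the `gcc-<version>` and `g++-<version>` naming used in the commands above.

```python
import os


def link_older_gcc(version, cuda_bin="/opt/cuda/bin", gcc_dir="/usr/bin", dry_run=True):
    """Plan, and optionally create, gcc/g++ symlinks inside the CUDA bin directory.

    Returns a list of (target, link) pairs. With dry_run=True nothing is
    written to disk.
    """
    links = []
    for tool in ("gcc", "g++"):
        target = os.path.join(gcc_dir, f"{tool}-{version}")
        link = os.path.join(cuda_bin, tool)
        links.append((target, link))
        if not dry_run:
            # Same effect as `ln -s <target> <link>`; raises if <link> exists.
            os.symlink(target, link)
    return links


if __name__ == "__main__":
    for target, link in link_older_gcc("4.9"):
        print(f"{link} -> {target}")
```

Running it with `dry_run=False` requires root privileges for the default directories, just like the manual commands.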
- Fortran: PGI CUDA Fortran Compiler
- Haskell: The accelerate package lists available CUDA backends
- Java: JCuda
- Mathematica: CUDAlink
- Mono/.NET: CUDA.NET, CUDAfy.NET
- Perl: Kappa, CUDA-Minimal
- Python: or Kappa
- Ruby, Lua: Kappa
It might be necessary to use a legacy driver to resolve permission issues when running CUDA programs on systems with multiple GPUs.
List of OpenCL and CUDA accelerated software
- GIMP (experimental - see )
- - OpenCL feature requires at least 1 GB RAM on GPU and Image support (check output of clinfo command).
- AUR - a GPU memtest. Despite its name, it supports both CUDA and OpenCL.
- CUDA support for Nvidia GPUs and OpenCL support for AMD GPUs. More information here.