Difference between revisions of "GPGPU"

From ArchWiki
(Please don't categorize user / non-public articles.)
m (AMD)
(31 intermediate revisions by 10 users not shown)
Line 1: Line 1:
{{Note|This article is a non-public work in progress as of yet}}
+
[[Category:Development]]
 
+
[[Category:Graphics]]
 
+
{{Out of date|With new versions of OpenCL, the things have changed a little bit.}}
 
{{Article summary start}}
 
{{Article summary start}}
 
{{Article summary text|Installation and usage of OpenCL and CUDA, the two major Linux GPGPU frameworks}}
 
{{Article summary text|Installation and usage of OpenCL and CUDA, the two major Linux GPGPU frameworks}}
Line 12: Line 12:
 
In Linux, there are currently two major GPGPU frameworks: [http://en.wikipedia.org/wiki/OpenCL OpenCL] and [http://en.wikipedia.org/wiki/CUDA CUDA]<br><br>
 
In Linux, there are currently two major GPGPU frameworks: [http://en.wikipedia.org/wiki/OpenCL OpenCL] and [http://en.wikipedia.org/wiki/CUDA CUDA]<br><br>
  
==<br>OpenCL==
+
==OpenCL==
 
===Overview===
 
===Overview===
 
OpenCL (Open Computing Language) is an open, royalty-free parallel programming framework developed by the Khronos Group, a non-profit consortium.
 
OpenCL (Open Computing Language) is an open, royalty-free parallel programming framework developed by the Khronos Group, a non-profit consortium.
  
Distribution of the OpenCL framework generally constists of:
+
Distribution of the OpenCL framework generally consists of:
* Library providing OpenCL API, known as libCL or libOpenCL (<tt>libOpenCL.so</tt> in linux)
+
* Library providing OpenCL API, known as libCL or libOpenCL ({{ic|libOpenCL.so}} in linux)
 
* OpenCL implementation(s), which contain:
 
* OpenCL implementation(s), which contain:
 
** Device drivers
 
** Device drivers
Line 25: Line 25:
 
''* only needed for development''
 
''* only needed for development''
  
===OpenCL libraray===
+
===OpenCL library===
There are several choices for the libCL. In general case, installing {{Package Official|libcl}} from [extra] should do :
+
There are several choices for the libCL. In general case, installing {{Pkg|libcl}} from [extra] should do :
 
  # pacman -S libcl
 
  # pacman -S libcl
 
However, there are situations when another libCL distribution is more suitable. The following paragraph covers this more advanced topic.
 
However, there are situations when another libCL distribution is more suitable. The following paragraph covers this more advanced topic.
Line 37: Line 37:
  
 
Although itself vendor-agnostic, the ICD Loader still has to be provided by someone. In Archlinux, there are currently two options:
 
Although itself vendor-agnostic, the ICD Loader still has to be provided by someone. In Archlinux, there are currently two options:
* extra/{{Package Official|libcl}} by Nvidia. Provides OpenCL version 1.0 and is thus slightly outdated. It's behaviour with OpenCL 1.1 code has not been tested as of yet.
+
* extra/{{Pkg|libcl}} by Nvidia. Provides OpenCL version 1.0 and is thus slightly outdated. Its behaviour with OpenCL 1.1 code has not been tested as of yet.
* unsupported/{{Package AUR|libopencl}} by AMD. Provides up to date version 1.1 of OpenCL. It is currently distributed by AMD under a restrictive license and therefore could not have been pushed into official repo.
+
* unsupported/{{AUR|libopencl}} by AMD. Provides up to date version 1.1 of OpenCL. It is currently distributed by AMD under a restrictive license and therefore could not have been pushed into official repo.
  
(There is also Intel's libCL, this one is currently not provided in a seperate package though.)
+
(There is also Intel's libCL, this one is currently not provided in a separate package though.)
  
{{Note|ICD Loader's vendor is mentioned only to indetify each loader, it is otherwise completely irrelevant. ICD Loaders are vendor-agnostic and may be used interchangeably<br>(as long as they are implemented correctly)}}
+
{{Note|ICD Loader's vendor is mentioned only to identify each loader, it is otherwise completely irrelevant. ICD Loaders are vendor-agnostic and may be used interchangeably<br>(as long as they are implemented correctly)}}
  
For basic usage, extra/libcl is recommended as it's installation and updating is convenient. For advanced usage, libopencl is recommended.  Both libcl and libopencl should still work with all the implementations.
+
For basic usage, extra/libcl is recommended as its installation and updating is convenient. For advanced usage, libopencl is recommended.  Both libcl and libopencl should still work with all the implementations.
  
 
===Implementations===
 
===Implementations===
To see which OpenCL imeplentations are currently active on your system, use the following command:
+
To see which OpenCL implementations are currently active on your system, use the following command:
{{Cli|$ ls /etc/OpenCL/vendors}}
+
{{bc|$ ls /etc/OpenCL/vendors}}
  
 
====AMD====
 
====AMD====
 
OpenCL implementation from AMD is known as [http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx AMD APP SDK], formerly also known as AMD Stream SDK or ATi Stream.
 
OpenCL implementation from AMD is known as [http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx AMD APP SDK], formerly also known as AMD Stream SDK or ATi Stream.
  
For Arch Linux, AMD APP SDK is currently available in AUR as {{Package AUR|amdstream}}.
+
For Arch Linux, AMD APP SDK is currently available in AUR as {{AUR|amdapp-sdk}}.
This package is installed as {{Filename|/opt/amdstream}} and apart from SDK files it also contains a profiler ({{Filename|/opt/amdstream/bin/sprofile}}) and a number of code samples ({{Filename|/opt/amdstream/samples/opencl}}). It also provides the {{Filename|clinfo}} utility which lists OpenCL platforms and devices present in the system and displays detailed information about them.
+
This package is installed as {{ic|/opt/AMDAPP}} and apart from SDK files it also contains a number of code samples ({{ic|/opt/AMDAPP/SDK/samples/}}). It also provides the {{ic|clinfo}} utility which lists OpenCL platforms and devices present in the system and displays detailed information about them.
  
As AMD APP SDK itself contains CPU OpenCL driver, no extra driver is needed to use execute OpenCL on CPU devices (regardless of it's vendor). GPU OpenCL drivers are provided by the {{Package AUR|catalyst}} package (an optional dependency), the open-source driver ({{Package Official|xf86-video-ati}}) does ''not'' support OpenCL.
+
As AMD APP SDK itself contains CPU OpenCL driver, no extra driver is needed to use execute OpenCL on CPU devices (regardless of its vendor). GPU OpenCL drivers are provided by the {{AUR|catalyst}} package (an optional dependency), the open-source driver ({{Pkg|xf86-video-ati}}) does ''not'' support OpenCL.
  
Code is compiled using {{Package Official|llvm}} (dependency).
+
Code is compiled using {{Pkg|llvm}} (dependency).
 +
 
 +
====Mesa (Gallium)====
 +
OpenCL support from Mesa is in development (see http://www.x.org/wiki/GalliumStatus/). AMD Radeon cards are supported by the r600g driver.
 +
 
 +
Arch Linux does currently (October 2013; Mesa 9.2.2; LLVM 3.3) not build Mesa with OpenCL support. See http://dri.freedesktop.org/wiki/GalliumCompute/ for installation instructions (use the development branches of LLVM and Mesa for optimal results).
 +
 
 +
Surprisingly, pyrit performs 20% better with radeon+r600g compared to Catalyst 13.11 Beta1 (tested with 7 other CPU cores):
 +
{{bc|<nowiki>catalyst    #1: 'OpenCL-Device 'Barts'': 21840.7 PMKs/s (RTT 2.8)
 +
radeon+r600g #1: 'OpenCL-Device 'AMD BARTS'': 26608.1 PMKs/s (RTT 3.0)</nowiki>}}
 +
At the time of this writing (30 October 2013), one must apply patches [http://people.freedesktop.org/~tstellar/pyrit-perf/0001-XXX-clover-Calculate-the-optimal-work-group-size.patch] and [http://people.freedesktop.org/~tstellar/pyrit-perf/0001-radeon-llvm-Specify-the-DataLayout-when-running-opti.patch] on top of Mesa commit ac81b6f2be8779022e8641984b09118b57263128 to get this performance improvement. The latest unpatched LLVM trunk was used (SVN rev 193660).
  
 
====Nvidia====
 
====Nvidia====
The Nvidia implementation is available in extra/{{Package Official|opencl-nvidia}}. It only supports Nvidia GPUs running the {{Package Official|nvidia}} kernel module (nouveau does not support OpenCL yet).
+
The Nvidia implementation is available in extra/{{Pkg|opencl-nvidia}}. It only supports Nvidia GPUs running the {{Pkg|nvidia}} kernel module (nouveau does not support OpenCL yet).
  
 
====Intel====
 
====Intel====
 
The Intel implementation, named simply [http://software.intel.com/en-us/articles/opencl-sdk/ Intel OpenCL SDK],  
 
The Intel implementation, named simply [http://software.intel.com/en-us/articles/opencl-sdk/ Intel OpenCL SDK],  
provides optimized OpenCL performance on Intel CPUs (mainly Core and Xeon) and CPUs only. There is no GPU support as Intel GPUs don't support OpenCL/GPGPU. Package is available in AUR: {{Package AUR|intel-opencl-sdk}}.
+
provides optimized OpenCL performance on Intel CPUs (mainly Core and Xeon) and CPUs only. There is no GPU support as Intel GPUs do not support OpenCL/GPGPU. Package is available in AUR: {{AUR|intel-opencl-sdk}}.
  
===Developement===
+
===Development===
For development of OpenCL-capable applications, full installation of the OpenCL framework including implementation, drivers and compiler plus the {{Package Official|opencl-headers}} package is needed. Link your code against <tt>libOpenCL</tt>.
+
For development of OpenCL-capable applications, full installation of the OpenCL framework including implementation, drivers and compiler plus the {{Pkg|opencl-headers}} package is needed. Link your code against {{ic|libOpenCL}}.
  
 +
====Language bindings====
 +
* '''C++''': A binding by Khronos is part of the official specs. It is included in {{Pkg|opencl-headers}}
 +
* '''C++/Qt''': An experimental binding named [http://qt.gitorious.org/qt-labs/opencl QtOpenCL] is in Qt Labs - see [http://labs.qt.nokia.com/2010/04/07/using-opencl-with-qt/ Blog entry] for more information
 +
* '''JavaScript/HTML5''': [http://www.khronos.org/webcl/ WebCL]
 +
* '''[[Python]]''': There are two bindings with the same name: PyOpenCL. One is in [extra]: {{Pkg|python2-pyopencl}}, for the other one see [http://sourceforge.net/projects/pyopencl/ sourceforge]
 +
* '''[[D]]''': [https://bitbucket.org/trass3r/cl4d/wiki/Home cl4d]
 +
* '''Haskell''': The OpenCLRaw package is available in AUR: {{AUR|haskell-openclraw}}
 +
* '''[[Java]]''': [http://jogamp.org/jocl/www/ JOCL] (a part of [http://jogamp.org/ JogAmp])
 +
* '''[[Mono|Mono/.NET]]''': [http://sourceforge.net/projects/opentk/ Open Toolkit]
  
 
==CUDA==
 
==CUDA==
Line 79: Line 98:
 
* optional:
 
* optional:
 
** additional libraries: CUBLAS, CUFFT, CUSPARSE, etc.
 
** additional libraries: CUBLAS, CUFFT, CUSPARSE, etc.
** CUDA toolkit, including the {{Filename|nvcc}} compiler
+
** CUDA toolkit, including the {{ic|nvcc}} compiler
 
** CUDA SDK, which contains many code samples and examples of CUDA and OpenCL programs
 
** CUDA SDK, which contains many code samples and examples of CUDA and OpenCL programs
  
The kernel module and CUDA "driver" library are shipped in extra/{{Package Official|nvidia}} and extra/{{Package Official|opencl-nvidia}}. The "runtime" library and the rest of the CUDA toolkit are available in unsupported/{{Package AUR|cuda-toolkit}}. The SDK has been packaged too ({{Package AUR|cuda-sdk}}), even if it is not required for developing in CUDA.
+
The kernel module and CUDA "driver" library are shipped in extra/{{Pkg|nvidia}} and extra/{{Pkg|opencl-nvidia}}. The "runtime" library and the rest of the CUDA toolkit are available in community/{{Pkg|cuda}}. The library is available [https://projects.archlinux.org/svntogit/community.git/commit/trunk?h=packages/cuda&id=1b62c8bcb9194b2de1b750bd62a8dce1e7e549f5 only in 64-bit version].
 +
 
 +
===Development===
 +
 
 +
When installing {{Pkg|cuda}} package you get the directory /opt/cuda created where all of the components "live". For compiling cuda code add /opt/cuda/include to your include path in the compiler instructions. For example this can be accomplished by adding -I/opt/cuda/include to the compiler flags/options.
 +
 
 +
===Language bindings===
 +
* '''Fortran''': [http://www.hoopoe-cloud.com/Solutions/Fortran/Default.aspx FORTRAN CUDA], [http://www.pgroup.com/resources/cudafortran.htm PGI CUDA Fortran Compiler]
 +
* '''[[Python]]''': In AUR: {{AUR|pycuda}}, also [http://psilambda.com/download/kappa-for-python Kappa]
 +
* '''Perl''': [http://psilambda.com/download/kappa-for-perl Kappa], [https://github.com/run4flat/perl-CUDA-Minimal CUDA-Minimal]
 +
* '''Haskell''': The CUDA package is available in AUR: {{AUR|haskell-cuda}}. There is also [http://hackage.haskell.org/package/accelerate The accelerate package]
 +
* '''Java''': [http://www.hoopoe-cloud.com/Solutions/jCUDA/Default.aspx jCUDA], [http://www.jcuda.org/jcuda/JCuda.html JCuda]
 +
* '''[[Mono|Mono/.NET]]''': [http://www.hoopoe-cloud.com/Solutions/CUDA.NET/Default.aspx CUDA.NET], [http://www.hybriddsp.com/ CUDAfy.NET]
 +
* '''[[Mathematica]]''': [http://reference.wolfram.com/mathematica/CUDALink/tutorial/Overview.html CUDAlink]
 +
* '''[[Ruby]]''', '''Lua''': [http://psilambda.com/products/kappa/ Kappa]
 +
 
 +
===Driver issues===
 +
 
 +
It might be necessary to use the legacy driver {{Pkg|nvidia-304xx}} or {{Pkg|nvidia-304xx-lts}} to resolve permissions issues when running CUDA programs on systems with multiple GPUs.
 +
 
 +
==List of OpenCL and CUDA accelerated software==
 +
{{Expansion}}
 +
* [[Bitcoin]]
 +
* [[GIMP]] (development in progress - see [http://www.phoronix.com/scan.php?page=news_item&px=OTc5OQ this notice])
 +
* {{AUR|Pyrit}}
 +
* {{AUR|aircrack-ng}}
 +
* {{AUR|cuda_memtest}} - a GPU memtest. Despite its name, it supports both CUDA and OpenCL
  
 
==Links and references==
 
==Links and references==

Revision as of 06:47, 17 November 2013

This article or section is out of date.

Reason: With new versions of OpenCL, the things have changed a little bit. (Discuss in Talk:GPGPU#)

GPGPU stands for General-purpose computing on graphics processing units. On Linux, there are currently two major GPGPU frameworks: OpenCL and CUDA.

OpenCL

Overview

OpenCL (Open Computing Language) is an open, royalty-free parallel programming framework developed by the Khronos Group, a non-profit consortium.

A distribution of the OpenCL framework generally consists of:

  • A library providing the OpenCL API, known as libCL or libOpenCL (libOpenCL.so on Linux)
  • OpenCL implementation(s), which contain:
    • Device drivers
    • OpenCL/C code compiler
    • SDK *
  • Header files *

* only needed for development

OpenCL library

There are several choices for libCL. In the general case, installing libcl from [extra] should suffice:

# pacman -S libcl

However, there are situations in which another libCL distribution is more suitable. The following section covers this more advanced topic.

The OpenCL ICD model

OpenCL offers the option to install multiple vendor-specific implementations on the same machine at the same time. In practice, this is implemented using the Installable Client Driver (ICD) model. The centerpiece of this model is the libCL library, which in fact implements the ICD Loader. Through the ICD Loader, an OpenCL application is able to access all platforms and all devices present in the system.

Although itself vendor-agnostic, the ICD Loader still has to be provided by someone. In Arch Linux, there are currently two options:

  • extra/libcl by Nvidia. Provides OpenCL version 1.0 and is thus slightly outdated. Its behaviour with OpenCL 1.1 code has not been tested yet.
  • unsupported/libopencl (AUR) by AMD. Provides the up-to-date version 1.1 of OpenCL. It is currently distributed by AMD under a restrictive license and therefore could not be included in the official repositories.

(There is also Intel's libCL; it is currently not provided in a separate package, though.)

Note: The ICD Loader's vendor is mentioned only to identify each loader; it is otherwise completely irrelevant. ICD Loaders are vendor-agnostic and may be used interchangeably (as long as they are implemented correctly).

For basic usage, extra/libcl is recommended, as installing and updating it is convenient. For advanced usage, libopencl is recommended. Both libcl and libopencl should still work with all the implementations.

Implementations

To see which OpenCL implementations are currently active on your system, use the following command:

$ ls /etc/OpenCL/vendors
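
The same check can be scripted. The following is a minimal sketch, not part of any OpenCL package: it assumes the usual ICD Loader convention that each .icd file in /etc/OpenCL/vendors holds a single line naming the implementation's driver library. The function name list_icd_vendors is hypothetical.

```python
from pathlib import Path


def list_icd_vendors(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the driver library named inside it."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        # No ICD registry directory: no implementations are registered.
        return {}
    vendors = {}
    for icd_file in sorted(directory.glob("*.icd")):
        # Assumption: the first non-empty line is the library name or path.
        text = icd_file.read_text().strip()
        vendors[icd_file.name] = text.splitlines()[0].strip() if text else ""
    return vendors


if __name__ == "__main__":
    for name, library in list_icd_vendors().items():
        print(f"{name}: {library}")
```

An empty result means that no implementation is registered with the ICD Loader, even if a libCL package is installed.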

AMD

The OpenCL implementation from AMD is known as AMD APP SDK, formerly also known as AMD Stream SDK or ATi Stream.

For Arch Linux, the AMD APP SDK is currently available in the AUR as amdapp-sdk. The package is installed in /opt/AMDAPP and, apart from the SDK files, also contains a number of code samples (/opt/AMDAPP/SDK/samples/). It also provides the clinfo utility, which lists the OpenCL platforms and devices present in the system and displays detailed information about them.

As the AMD APP SDK itself contains a CPU OpenCL driver, no extra driver is needed to execute OpenCL on CPU devices (regardless of their vendor). GPU OpenCL drivers are provided by the catalyst package from the AUR (an optional dependency); the open-source driver (xf86-video-ati) does not support OpenCL.

Code is compiled using llvm (dependency).

Mesa (Gallium)

OpenCL support from Mesa is in development (see http://www.x.org/wiki/GalliumStatus/). AMD Radeon cards are supported by the r600g driver.

Arch Linux currently (October 2013; Mesa 9.2.2; LLVM 3.3) does not build Mesa with OpenCL support. See http://dri.freedesktop.org/wiki/GalliumCompute/ for installation instructions (use the development branches of LLVM and Mesa for optimal results).

Surprisingly, pyrit performs about 22% better with radeon+r600g than with Catalyst 13.11 Beta1 (tested with 7 other CPU cores):

catalyst     #1: 'OpenCL-Device 'Barts'': 21840.7 PMKs/s (RTT 2.8)
radeon+r600g #1: 'OpenCL-Device 'AMD BARTS'': 26608.1 PMKs/s (RTT 3.0)

At the time of this writing (30 October 2013), one must apply patches [1] and [2] on top of Mesa commit ac81b6f2be8779022e8641984b09118b57263128 to get this performance improvement. The latest unpatched LLVM trunk was used (SVN rev 193660).
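
To make the comparison explicit, the relative gain can be computed from the two benchmark lines quoted above. This is only a sketch of the arithmetic; the helper names are hypothetical, and the only inputs are the two PMKs/s figures from the pyrit output.

```python
import re


def pmks_per_second(line):
    """Extract the PMKs/s figure from one pyrit benchmark line."""
    match = re.search(r"([\d.]+) PMKs/s", line)
    if match is None:
        raise ValueError(f"no PMKs/s figure in: {line!r}")
    return float(match.group(1))


def relative_gain_percent(baseline, candidate):
    """Percentage by which candidate exceeds baseline."""
    return (candidate / baseline - 1.0) * 100.0


catalyst = "catalyst     #1: 'OpenCL-Device 'Barts'': 21840.7 PMKs/s (RTT 2.8)"
r600g = "radeon+r600g #1: 'OpenCL-Device 'AMD BARTS'': 26608.1 PMKs/s (RTT 3.0)"

gain = relative_gain_percent(pmks_per_second(catalyst), pmks_per_second(r600g))
print(f"radeon+r600g is {gain:.1f}% faster")  # prints: radeon+r600g is 21.8% faster
```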

Nvidia

The Nvidia implementation is available in extra/opencl-nvidia. It only supports Nvidia GPUs running the nvidia kernel module (nouveau does not support OpenCL yet).

Intel

The Intel implementation, named simply Intel OpenCL SDK, provides optimized OpenCL performance on Intel CPUs only (mainly Core and Xeon). There is no GPU support, as Intel GPUs do not support OpenCL/GPGPU. The package is available in the AUR: intel-opencl-sdk.

Development

For development of OpenCL-capable applications, a full installation of the OpenCL framework (including an implementation, drivers and a compiler) plus the opencl-headers package is needed. Link your code against libOpenCL.
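
Before writing any host code, it can be useful to confirm that libOpenCL is loadable and reports at least one platform. The sketch below does this through Python's ctypes by calling clGetPlatformIDs, a function from the OpenCL API, directly. The wrapper name count_opencl_platforms is hypothetical, and returning 0 on any error status is a simplification made here for brevity.

```python
import ctypes
import ctypes.util


def count_opencl_platforms(library_name="OpenCL"):
    """Return the number of OpenCL platforms, or None if the library is unusable."""
    path = ctypes.util.find_library(library_name)
    if path is None:
        return None  # libOpenCL is not installed or not in the linker's search path
    try:
        lib = ctypes.CDLL(path)
        get_platform_ids = lib.clGetPlatformIDs
    except (OSError, AttributeError):
        return None
    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    get_platform_ids.argtypes = [
        ctypes.c_uint32,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint32),
    ]
    get_platform_ids.restype = ctypes.c_int32
    count = ctypes.c_uint32(0)
    status = get_platform_ids(0, None, ctypes.byref(count))
    if status != 0:  # CL_SUCCESS is 0; any error is treated as "no platforms"
        return 0
    return count.value
```

Calling count_opencl_platforms() with no arguments queries the system's libOpenCL. A result of None points to a missing library, while 0 suggests that the loader is present but no implementation is registered.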

Language bindings

  • C++: A binding by Khronos is part of the official specs. It is included in opencl-headers
  • C++/Qt: An experimental binding named QtOpenCL is in Qt Labs - see Blog entry for more information
  • JavaScript/HTML5: WebCL
  • Python: There are two bindings with the same name: PyOpenCL. One is in [extra]: python2-pyopencl, for the other one see sourceforge
  • D: cl4d
  • Haskell: The OpenCLRaw package is available in AUR: haskell-openclraw
  • Java: JOCL (a part of JogAmp)
  • Mono/.NET: Open Toolkit

CUDA

CUDA (Compute Unified Device Architecture) is Nvidia's proprietary, closed-source parallel computing architecture and framework. It is made of several components:

  • required:
    • proprietary Nvidia kernel module
    • CUDA "driver" and "runtime" libraries
  • optional:
    • additional libraries: CUBLAS, CUFFT, CUSPARSE, etc.
    • CUDA toolkit, including the nvcc compiler
    • CUDA SDK, which contains many code samples and examples of CUDA and OpenCL programs

The kernel module and the CUDA "driver" library are shipped in extra/nvidia and extra/opencl-nvidia. The "runtime" library and the rest of the CUDA toolkit are available in community/cuda. The library is available only in a 64-bit version.

Development

Installing the cuda package creates the directory /opt/cuda, where all of the components "live". To compile CUDA code, add /opt/cuda/include to the compiler's include path, for example by adding -I/opt/cuda/include to the compiler flags.
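
As an illustration of the include-path flag, the sketch below assembles a compile command for a build script. The source and output file names are made up for the example, the helper name cuda_compile_command is hypothetical, and it is an assumption that nvcc is reachable through PATH. The script only prints the command and does not run the compiler.

```python
import shlex

CUDA_ROOT = "/opt/cuda"  # install prefix of the cuda package, as described above


def cuda_compile_command(source, output, compiler="nvcc", cuda_root=CUDA_ROOT):
    """Build an argument list that puts the CUDA headers on the include path."""
    return [compiler, f"-I{cuda_root}/include", source, "-o", output]


# Hypothetical file names, used only for illustration.
command = cuda_compile_command("vector_add.cu", "vector_add")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run unchanged; keeping the arguments as a list avoids shell quoting problems with unusual file names.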

Language bindings

  • Fortran: FORTRAN CUDA, PGI CUDA Fortran Compiler
  • Python: In AUR: pycuda, also Kappa
  • Perl: Kappa, CUDA-Minimal
  • Haskell: The CUDA package is available in AUR: haskell-cuda. There is also the accelerate package
  • Java: jCUDA, JCuda
  • Mono/.NET: CUDA.NET, CUDAfy.NET
  • Mathematica: CUDAlink
  • Ruby, Lua: Kappa

Driver issues

It might be necessary to use the legacy driver nvidia-304xx or nvidia-304xx-lts to resolve permissions issues when running CUDA programs on systems with multiple GPUs.

List of OpenCL and CUDA accelerated software

This article or section needs expansion.

  • Bitcoin
  • GIMP (development in progress - see this notice)
  • Pyrit
  • aircrack-ng
  • cuda_memtest - a GPU memtest. Despite its name, it supports both CUDA and OpenCL

Links and references