AMDGPU
AMDGPU is the open source graphics driver for AMD Radeon graphics cards since the Graphics Core Next family.
Selecting the right driver
Depending on the card you have, find the right driver in Xorg#AMD. This driver supports Southern Islands (SI) cards and later. AMD has no plans to support pre-GCN GPUs. Owners of unsupported GPUs may use the open source ATI driver.
Installation
Install the mesa package, which provides both the DRI driver for 3D acceleration and VA-API/VDPAU drivers for accelerated video decoding.
- For 32-bit application support, also install the lib32-mesa package from the multilib repository.
- For the DDX driver (which provides 2D acceleration in Xorg), install the xf86-video-amdgpu package.
- For Vulkan support:
- Test with only vulkan-radeon first: although not appearing as the first provider of vulkan-driver (due to its alphabetical order), it avoids some issues that have repeatedly been reported about amdvlk.
- When the amdvlk package is installed, it sets itself as the default Vulkan driver: see Vulkan#Selecting via environment variable if you need to have both drivers installed (e.g. when having issues with vulkan-radeon).
- Optionally, for 32-bit application support, install the lib32-vulkan-radeon or lib32-amdvlk package to match the native package installed.
Experimental
It may be worthwhile for some users to use the upstream experimental build of mesa.
Install the mesa-gitAUR package, which provides the DRI driver for 3D acceleration.
- For 32-bit application support, also install the lib32-mesa-gitAUR package from the mesa-git repository or the AUR.
- For the DDX driver (which provides 2D acceleration in Xorg), install the xf86-video-amdgpu-gitAUR package.
- For Vulkan support using the mesa-git repository, install the vulkan-radeon-git package. Optionally install the lib32-vulkan-radeon-git package for 32-bit application support. This should not be required if building mesa-gitAUR from the AUR.
Enable Southern Islands (SI) and Sea Islands (CIK) support
The linux package enables AMDGPU support for cards of the Southern Islands (HD 7000 Series, SI, i.e. GCN 1) and Sea Islands (HD 8000 Series, CIK, i.e. GCN 2) families. The amdgpu kernel driver needs to be loaded before the radeon one. You can check which kernel driver is loaded by running lspci -k. It should look like this:
$ lspci -k -d ::03xx
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Curacao PRO [Radeon R7 370 / R9 270/370 OEM]
        Subsystem: Gigabyte Technology Co., Ltd Device 226c
        Kernel driver in use: amdgpu
        Kernel modules: radeon, amdgpu
If the amdgpu driver is not in use, follow the instructions in the next section.
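As a rough sketch, the check above can be scripted by filtering lspci -k output for the driver line. The function name gpu_driver_in_use is hypothetical, and the filter only assumes the output layout shown above.

```shell
# Hypothetical helper: print the value of every "Kernel driver in use:" line
# read from standard input (expects `lspci -k` style output).
gpu_driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Example (needs pciutils and a display controller):
#   lspci -k -d ::03xx | gpu_driver_in_use
```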
Load amdgpu driver
Both the amdgpu and radeon modules accept the module parameters cik_support= and si_support=.
They need to be set as kernel parameters or in a modprobe configuration file, and depend on the card's GCN version.
You can use both parameters if you are unsure which card you have.
The kernel log may indicate the correct parameters to use:
[..] amdgpu 0000:01:00.0: Use radeon.cik_support=0 amdgpu.cik_support=1 to override.
Set module parameters in kernel command line
Set one of the following kernel parameters:
- Southern Islands (SI):
radeon.si_support=0 amdgpu.si_support=1
- Sea Islands (CIK):
radeon.cik_support=0 amdgpu.cik_support=1
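The choice between the parameter sets can be sketched as a small lookup. The function name amdgpu_support_params and its si/cik/both arguments are hypothetical; the printed strings are the parameters listed above, and both simply combines them for the case where the card family is unknown.

```shell
# Hypothetical helper: print the kernel parameters for a GPU family.
# $1 is one of: si, cik, both
amdgpu_support_params() {
    case "$1" in
        si)   printf '%s\n' 'radeon.si_support=0 amdgpu.si_support=1' ;;
        cik)  printf '%s\n' 'radeon.cik_support=0 amdgpu.cik_support=1' ;;
        both) printf '%s\n' 'radeon.si_support=0 amdgpu.si_support=1 radeon.cik_support=0 amdgpu.cik_support=1' ;;
        *)    echo "usage: amdgpu_support_params si|cik|both" >&2; return 1 ;;
    esac
}

# Example: show what to append to the kernel command line for a GCN 2 card.
#   amdgpu_support_params cik
```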
Specify the correct module order
Make sure amdgpu has been set as the first module in the Mkinitcpio#MODULES array, e.g. MODULES=(amdgpu radeon).
Set kernel module parameters
For Southern Islands (SI) use the si_support=1 kernel module parameter, for Sea Islands (CIK) use cik_support=1:
/etc/modprobe.d/amdgpu.conf
options amdgpu si_support=1
options amdgpu cik_support=1
/etc/modprobe.d/radeon.conf
options radeon si_support=0
options radeon cik_support=0
Make sure modconf is in the HOOKS array in /etc/mkinitcpio.conf and regenerate the initramfs.
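A minimal sketch of creating both files in one step follows. The helper name write_amdgpu_modprobe_conf and its optional directory argument are hypothetical (the argument exists only so the helper can be tried against a scratch directory), and like the example above it sets both the SI and CIK options.

```shell
# Hypothetical helper: write the two modprobe snippets into the directory
# given as $1 (defaults to /etc/modprobe.d, which requires root).
write_amdgpu_modprobe_conf() {
    dir="${1:-/etc/modprobe.d}"
    printf '%s\n' 'options amdgpu si_support=1' 'options amdgpu cik_support=1' > "$dir/amdgpu.conf" &&
    printf '%s\n' 'options radeon si_support=0' 'options radeon cik_support=0' > "$dir/radeon.conf"
}

# Example (as root), followed by regenerating all initramfs presets:
#   write_amdgpu_modprobe_conf && mkinitcpio -P
```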
Compile kernel which supports amdgpu driver
When compiling a custom kernel, CONFIG_DRM_AMDGPU_SI=y and/or CONFIG_DRM_AMDGPU_CIK=y should be set in the config.
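To see whether an existing kernel configuration already has these options enabled, a filter along the following lines may help. The function name amdgpu_legacy_config is hypothetical, and the example assumes the running kernel exposes its configuration as /proc/config.gz, which is not the case for every kernel.

```shell
# Hypothetical helper: list the SI/CIK amdgpu options that are enabled (=y)
# in a kernel configuration read from standard input.
amdgpu_legacy_config() {
    grep -E '^CONFIG_DRM_AMDGPU_(SI|CIK)=y$'
}

# Example, assuming /proc/config.gz is available:
#   zcat /proc/config.gz | amdgpu_legacy_config
```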
Disable loading radeon completely at boot
The kernel may still probe and load the radeon module depending on the specific graphics chip involved, but the module does not need to stay loaded once amdgpu is confirmed to work as expected. Reboot between each step to confirm it works before moving to the next step:
- Use the module parameters on the kernel command line method to ensure amdgpu works as expected
- Use the MODULES=(amdgpu) mkinitcpio method but do not add radeon to the configuration
- Test that modprobe -r radeon will remove the kernel module cleanly after logging into the desktop
- Blacklist the radeon module from being probed by the kernel during second stage boot:
/etc/modprobe.d/radeon.conf
blacklist radeon
The output of lsmod and dmesg should now only show the amdgpu driver loading; radeon should not be present. The directory /sys/module/radeon should not exist.
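The final check can be sketched as a one-line test for that directory. The function name radeon_loaded and its optional sysfs-root argument are hypothetical; the argument is there only so the check can be exercised against a scratch directory.

```shell
# Hypothetical helper: succeed if the radeon module directory exists below
# the sysfs root given as $1 (defaults to /sys).
radeon_loaded() {
    [ -d "${1:-/sys}/module/radeon" ]
}

# Example:
#   if radeon_loaded; then echo "radeon is still loaded"; else echo "radeon is not loaded"; fi
```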
ACO compiler
The ACO compiler is an open source shader compiler created and developed by Valve Corporation to directly compete with the LLVM compiler, the AMDVLK drivers, as well as Windows 10. It offers shorter compilation times and better gaming performance than LLVM and AMDVLK.
Some benchmarks can be seen on GitHub and Phoronix (1) (2) (3).
Since mesa version 20.2 ACO is the default shader compiler.
Loading
The amdgpu kernel module is supposed to load automatically on system boot.
If it does not:
- Make sure to #Enable Southern Islands (SI) and Sea Islands (CIK) support when needed.
- Make sure you have the latest linux-firmware package installed. This driver requires the latest firmware for each model to successfully boot.
- Make sure you do not have nomodeset or vga= as a kernel parameter, since amdgpu requires KMS.
- Check that you have not disabled amdgpu by using any kernel module blacklisting.
It is also possible that the module loads, but too late, after the X server already requires it. In this case see Kernel mode setting#Early KMS start.
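The kernel parameter check from the list above can be sketched as follows. The function name cmdline_blocks_kms is hypothetical; it only looks for the two options named above in a command line string passed as its argument.

```shell
# Hypothetical helper: succeed if the kernel command line passed as $1
# contains nomodeset or a vga= option.
cmdline_blocks_kms() {
    case " $1 " in
        *" nomodeset "*|*" vga="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if cmdline_blocks_kms "$(cat /proc/cmdline)"; then
#       echo "remove nomodeset/vga= from the kernel parameters"
#   fi
```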
Xorg configuration
Xorg will automatically load the driver and it will use your monitor's EDID to set the native resolution. Configuration is only required for tuning the driver.
If you want manual configuration, create /etc/X11/xorg.conf.d/20-amdgpu.conf and add the following:
/etc/X11/xorg.conf.d/20-amdgpu.conf
Section "OutputClass"
    Identifier "AMD"
    MatchDriver "amdgpu"
    Driver "amdgpu"
EndSection
Using this section, you can enable features and tweak the driver settings. See amdgpu(4) before setting driver options.
Tear free rendering
TearFree controls tearing prevention using the hardware page flipping mechanism. By default, TearFree is on for rotated outputs, outputs with RandR transforms applied, and RandR 1.4 slave outputs, and off for everything else. You can also configure it to be always on or always off with true or false respectively.
Option "TearFree" "true"
You can also enable TearFree temporarily with xrandr:
$ xrandr --output output --set TearFree on
Where output should look like DisplayPort-0 or HDMI-A-0 and can be acquired by running xrandr -q.
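To avoid typing output names by hand, the connected outputs can be extracted from the xrandr -q listing. This is a sketch only: the function name connected_outputs is hypothetical, and it assumes the usual layout in which the second field of an output line is the word connected.

```shell
# Hypothetical helper: print the names of connected outputs from `xrandr -q`
# style text on standard input.
connected_outputs() {
    awk '$2 == "connected" { print $1 }'
}

# Example: enable TearFree on every connected output for this session.
#   xrandr -q | connected_outputs | while read -r output; do
#       xrandr --output "$output" --set TearFree on
#   done
```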
DRI level
DRI sets the maximum level of DRI to enable. Valid values are 2 for DRI2 or 3 for DRI3. The default is 3 for DRI3 if the Xorg version is >= 1.18.3, otherwise DRI2 is used:
Option "DRI" "3"
Variable refresh rate
10-bit color
Newer AMD cards support 10bpc color, but the default is 24-bit color and 30-bit color must be explicitly enabled. Enabling it can reduce visible banding/artifacts in gradients, provided the applications support it too. To check if your monitor supports it, search for "EDID" in your Xorg log file (e.g. /var/log/Xorg.0.log or ~/.local/share/xorg/Xorg.0.log):
[ 336.695] (II) AMDGPU(0): EDID for output DisplayPort-0
[ 336.695] (II) AMDGPU(0): EDID for output DisplayPort-1
[ 336.695] (II) AMDGPU(0): Manufacturer: DEL Model: a0ec Serial#: 123456789
[ 336.695] (II) AMDGPU(0): Year: 2018 Week: 23
[ 336.695] (II) AMDGPU(0): EDID Version: 1.4
[ 336.695] (II) AMDGPU(0): Digital Display Input
[ 336.695] (II) AMDGPU(0): 10 bits per channel
To check whether it is currently enabled, search for "Depth":
[ 336.618] (**) AMDGPU(0): Depth 30, (--) framebuffer bpp 32
[ 336.618] (II) AMDGPU(0): Pixel depth = 30 bits stored in 4 bytes (32 bpp pixmaps)
With the default configuration it will instead say the depth is 24, with 24 bits stored in 4 bytes.
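The depth lookup can be sketched as a small filter over the log. The function name xorg_depth is hypothetical, and the pattern assumes the "AMDGPU(n): Depth N," line format shown above.

```shell
# Hypothetical helper: print the colour depth reported by the amdgpu DDX
# from Xorg log text on standard input.
xorg_depth() {
    sed -n 's/.*AMDGPU([0-9]*): Depth \([0-9][0-9]*\),.*/\1/p'
}

# Example:
#   xorg_depth < ~/.local/share/xorg/Xorg.0.log
```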
To check whether 10-bit works, exit Xorg if you have it running and run Xorg -retro, which will display a black and white grid. Then press Ctrl-Alt-F1 and Ctrl-C to exit X, and run Xorg -depth 30 -retro. If this works fine, then 10-bit is working.
To launch in 10-bit via startx, use startx -- -depth 30. To permanently enable it, create or add to:
/etc/X11/xorg.conf.d/20-amdgpu.conf
Section "Screen"
    Identifier "asdf"
    DefaultDepth 30
EndSection
Reduce output latency
If you want to minimize latency you can disable page flipping and tear free:
/etc/X11/xorg.conf.d/20-amdgpu.conf
Section "OutputClass"
    Identifier "AMD"
    MatchDriver "amdgpu"
    Driver "amdgpu"
    Option "EnablePageFlip" "off"
    Option "TearFree" "false"
EndSection
See Gaming#Reducing DRI latency to further reduce latency.
Features
Video acceleration
See Hardware video acceleration#AMD/ATI.
Monitoring
Monitoring is commonly used to check the temperature and the P-states of your GPU.
CLI
- amdgpu_top — Tool to display AMDGPU usage
- nvtop — GPUs process monitoring for AMD, Intel and NVIDIA
- radeontop — A GPU utilization viewer, both for the total activity percent and individual blocks
GUI
- amdgpu_top — Tool to display AMDGPU usage
- AmdGuid — A basic fan control GUI fully written in Rust.
- TuxClocker — A Qt5 monitoring and overclocking tool.
Manually
To check your GPU's P-states, execute:
$ cat /sys/class/drm/card0/device/pp_od_clk_voltage
To monitor your GPU, execute:
# watch -n 0.5 cat /sys/kernel/debug/dri/0/amdgpu_pm_info
To check your GPU utilization, execute:
$ cat /sys/class/drm/card0/device/gpu_busy_percent
To check your GPU frequency, execute:
$ cat /sys/class/drm/card0/device/pp_dpm_sclk
To check your GPU temperature, execute:
$ cat /sys/class/drm/card0/device/hwmon/hwmon*/temp1_input
To check your VRAM frequency, execute:
$ cat /sys/class/drm/card0/device/pp_dpm_mclk
To check your VRAM usage, execute:
$ cat /sys/class/drm/card0/device/mem_info_vram_used
To check your VRAM size, execute:
$ cat /sys/class/drm/card0/device/mem_info_vram_total
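The raw sysfs values are not very readable on their own. The sketch below converts them, assuming temp1_input is reported in millidegrees Celsius (the hwmon sysfs convention) and the mem_info_vram_* files are reported in bytes. Both function names are hypothetical.

```shell
# Hypothetical helpers for making raw sysfs readings human-readable.

# Convert a hwmon temperature reading (millidegrees Celsius) to whole degrees.
millideg_to_celsius() {
    echo "$(( $1 / 1000 ))"
}

# Convert a byte count to whole MiB.
bytes_to_mib() {
    echo "$(( $1 / 1024 / 1024 ))"
}

# Example:
#   millideg_to_celsius "$(cat /sys/class/drm/card0/device/hwmon/hwmon*/temp1_input)"
#   bytes_to_mib "$(cat /sys/class/drm/card0/device/mem_info_vram_used)"
```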
Overclocking
Since Linux 4.17, once you have enabled the features at boot below, it is possible to adjust clocks and voltages of the graphics card via /sys/class/drm/card0/device/pp_od_clk_voltage.
Boot parameter
To unlock access to adjusting clocks and voltages in sysfs, append the kernel parameter amdgpu.ppfeaturemask=0xffffffff.
Not all bits are defined, and new features may be added over time. Setting all 32 bits may enable unstable features that cause problems such as screen flicker or broken resume from suspend. It should be sufficient to set the PP_OVERDRIVE_MASK bit, 0x4000, in combination with the default ppfeaturemask. To compute a reasonable parameter for your system, execute:
$ printf 'amdgpu.ppfeaturemask=0x%x\n' "$(($(cat /sys/module/amdgpu/parameters/ppfeaturemask) | 0x4000))"
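After rebooting with the new parameter, you may want to confirm that the overdrive bit is actually set. The sketch below tests the 0x4000 bit mentioned above; the function name ppfeaturemask_has_overdrive is hypothetical, and it accepts the mask in decimal or 0x-prefixed hexadecimal form.

```shell
# Hypothetical helper: succeed if the PP_OVERDRIVE_MASK bit (0x4000) is set
# in the feature mask passed as $1.
ppfeaturemask_has_overdrive() {
    [ "$(( $1 & 0x4000 ))" -ne 0 ]
}

# Example:
#   if ppfeaturemask_has_overdrive "$(cat /sys/module/amdgpu/parameters/ppfeaturemask)"; then
#       echo "overdrive is unlocked"
#   fi
```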
Manual (default)
/sys/class/drm/... are just symlinks and may change between reboots. Persistent locations can be found in /sys/devices/, e.g. /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/. Adjust the commands accordingly for a reliable result.
For in-depth information on all possible options, read the kernel documentation for amdgpu thermal control.
To set the GPU clock for the maximum P-state 7 on e.g. a Polaris GPU to 1209MHz and 900mV voltage, run:
# echo "s 7 1209 900" > /sys/class/drm/card0/device/pp_od_clk_voltage
The same procedure can be applied to the VRAM, e.g. maximum P-state 2 on Polaris 5xx series cards:
# echo "m 2 1850 850" > /sys/class/drm/card0/device/pp_od_clk_voltage
To apply, run:
# echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
To check if it worked out, read out clocks and voltage under 3D load:
# watch -n 0.5 cat /sys/kernel/debug/dri/0/amdgpu_pm_info
You can reset to the default values using:
# echo "r" > /sys/class/drm/card0/device/pp_od_clk_voltage
It is also possible to forbid the driver from switching to certain P-states, e.g. to work around problems with deep power-saving P-states, such as flickering artifacts or stutter. To force the highest VRAM P-state on a card, while still allowing the GPU itself to run with lower clocks, first find the highest possible P-state, then set it:
# cat /sys/class/drm/card0/device/pp_dpm_mclk
0: 96Mhz *
1: 456Mhz
2: 675Mhz
3: 1000Mhz
# echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
# echo "3" > /sys/class/drm/card0/device/pp_dpm_mclk
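Finding the highest and the currently active state can be sketched with two small filters. The function names highest_pstate and current_pstate are hypothetical; they assume the "index: clock" layout shown above, with an asterisk marking the active state and the highest state listed last.

```shell
# Hypothetical helpers that read pp_dpm_sclk / pp_dpm_mclk style text
# from standard input.

# Print the index of the last listed state.
highest_pstate() {
    awk -F: 'NF { last = $1 } END { print last }'
}

# Print the index of the state marked with an asterisk.
current_pstate() {
    awk -F: '/\*/ { print $1 }'
}

# Example (as root):
#   state="$(highest_pstate < /sys/class/drm/card0/device/pp_dpm_mclk)"
#   echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
#   echo "$state" > /sys/class/drm/card0/device/pp_dpm_mclk
```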
Allow only the three highest GPU P-states:
# echo "5 6 7" > /sys/class/drm/card0/device/pp_dpm_sclk
To set the allowed maximum power consumption of the GPU to e.g. 50 Watts, run:
# echo 50000000 > /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
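The value written above is in microwatts, so 50 Watts becomes 50000000. A small sketch of the conversion follows; the function name watts_to_microwatts is hypothetical, and the example uses a glob because the hwmon index is not guaranteed to be 0 on every system.

```shell
# Hypothetical helper: convert whole Watts to the microwatt value
# expected by power1_cap.
watts_to_microwatts() {
    echo "$(( $1 * 1000000 ))"
}

# Example (as root); the glob avoids hard-coding the hwmon index:
#   for cap in /sys/class/drm/card0/device/hwmon/hwmon*/power1_cap; do
#       watts_to_microwatts 50 > "$cap"
#   done
```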
Assisted
If you prefer not to overclock your GPU fully manually, the community offers several tools to assist with overclocking and monitoring your AMD GPU.
CLI tools
- amdgpu-clocks — A script that can be used to monitor and set custom power states for AMD GPUs. It also offers a Systemd service to apply the settings automatically upon boot.
GUI tools
- TuxClocker — A Qt5 monitoring and overclocking tool.
- CoreCtrl — A GUI overclocking tool with a WattMan-like UI that supports per-application profiles.
- LACT — A GTK tool to view information and control your AMD GPU.
Startup on boot
One way to apply your settings automatically upon boot is to use systemd units; see this Reddit thread for how to configure them.
Another way is to use udev rules for some of the values, for example, to set a low performance level to save energy:
/etc/udev/rules.d/30-amdgpu-low-power.rules
SUBSYSTEM=="pci", DRIVER=="amdgpu", ATTR{power_dpm_force_performance_level}="low"
Performance levels
AMDGPU offers several performance levels, selected through the file power_dpm_force_performance_level. The following levels are available:
- auto: dynamically select the optimal power profile for current conditions in the driver.
- low: clocks are forced to the lowest power state.
- high: clocks are forced to the highest power state.
- manual: user can manually adjust which power states are enabled for each clock domain (used for setting #Power profiles)
- profile_standard, profile_min_sclk, profile_min_mclk, profile_peak: clock and power gating are disabled and the clocks are set for different profiling cases. This mode is recommended for profiling specific workloads.
To set the AMDGPU device to use a low performance level, the following command can be executed:
# echo "low" > /sys/class/drm/card0/device/power_dpm_force_performance_level
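Since a mistyped level name is easy to miss, the write can be wrapped in a check against the levels listed above. This is a sketch only: the function names is_valid_perf_level and set_perf_level are hypothetical, as is the optional second argument that overrides the target file.

```shell
# Hypothetical helper: succeed if $1 is one of the levels listed above.
is_valid_perf_level() {
    case "$1" in
        auto|low|high|manual|profile_standard|profile_min_sclk|profile_min_mclk|profile_peak) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: write level $1 to the file $2
# (defaults to the card0 sysfs file, which requires root).
set_perf_level() {
    is_valid_perf_level "$1" || { echo "unknown level: $1" >&2; return 1; }
    printf '%s\n' "$1" > "${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
}

# Example (as root):
#   set_perf_level low
```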
Power profiles
AMDGPU offers several optimizations via power profiles. One of the most commonly used is the compute mode for OpenCL-intensive applications. Available power profiles can be listed with:
$ cat /sys/class/drm/card0/device/pp_power_profile_mode
NUM MODE_NAME        SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
  0 BOOTUP_DEFAULT:             -              -                 -            -              -                 -
  1 3D_FULL_SCREEN:             0            100                30            0            100                10
  2 POWER_SAVING:              10              0                30            -              -                 -
  3 VIDEO:                      -              -                 -           10             16                31
  4 VR:                         0             11                50            0            100                10
  5 COMPUTE *:                  0              5                30           10             60                25
  6 CUSTOM:                     -              -                 -            -              -                 -
card0 identifies a specific GPU in your machine; in case of multiple GPUs be sure to address the right one.
To use a specific power profile you should first enable manual control over them with:
# echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
Then select a power profile by writing the NUM field associated with it, e.g. to enable COMPUTE run:
# echo "5" > /sys/class/drm/card0/device/pp_power_profile_mode
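Because the NUM values can differ between GPUs, looking the number up by name may be more robust than hard-coding it. The function name profile_num is hypothetical, and the filter assumes the column layout of the listing above, where the mode name is the second field and may be followed by a colon or an asterisk.

```shell
# Hypothetical helper: print the NUM of the profile named $1 from
# pp_power_profile_mode style text on standard input.
profile_num() {
    awk -v name="$1" '{ n = $2; sub(/[*:]+$/, "", n); if (n == name) print $1 }'
}

# Example (as root):
#   num="$(profile_num COMPUTE < /sys/class/drm/card0/device/pp_power_profile_mode)"
#   echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
#   echo "$num" > /sys/class/drm/card0/device/pp_power_profile_mode
```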
Enable GPU display scaling
To use the GPU's own scaler instead of the scaler built into the display when not running at the monitor's native resolution, execute:
$ xrandr --output output --set "scaling mode" scaling_mode
Possible values for "scaling mode" are: None, Full, Center, Full aspect.
- To show the available outputs and settings, execute:
$ xrandr --prop
- To set scaling mode = Full aspect for every available output, execute:
$ for output in $(xrandr --prop | grep -E -o -i "^[A-Z\-]+-[0-9]+"); do xrandr --output "$output" --set "scaling mode" "Full aspect"; done
Troubleshooting
Module parameters
The amdgpu module stashes several config parameters (modinfo amdgpu | grep mask) in masks that are only documented in the kernel sources.
Xorg or applications will not start
- "(EE) AMDGPU(0): [DRI2] DRI2SwapBuffers: drawable has no back or front?" error after opening glxgears, can open Xorg server but OpenGL applications crash.
- "(EE) AMDGPU(0): Given depth (32) is not supported by amdgpu driver" error, Xorg will not start.
Setting the screen's depth under Xorg to 16 or 32 will cause problems or crashes. To avoid that, use a standard screen depth of 24 by adding this to your "screen" section:
/etc/X11/xorg.conf.d/10-screen.conf
Section "Screen"
    Identifier "Screen"
    DefaultDepth 24
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
Screen artifacts and frequency problem
Dynamic power management may cause screen artifacts to appear when displaying to monitors at higher frequencies (anything above 60Hz) due to issues in the way GPU clock speeds are managed[1][2].
A workaround [3] is writing high or low to /sys/class/drm/card0/device/power_dpm_force_performance_level.
To make it persistent, you may create a udev rule:
/etc/udev/rules.d/30-amdgpu-pm.rules
KERNEL=="card0", SUBSYSTEM=="drm", DRIVERS=="amdgpu", ATTR{device/power_dpm_force_performance_level}="high"
To determine the KERNEL name, execute:
$ find /sys/class/drm/ -regextype awk -regex '.+/card[0-9]+' -printf '%f\n'
There is also a GUI solution [4] where you can manage the "power_dpm" with radeon-profile-gitAUR and radeon-profile-daemon-gitAUR.
Artifacts in Chromium
If you see artifacts in Chromium, forcing the Vulkan-based backend might help. Go to chrome://flags and enable #ignore-gpu-blocklist and #enable-vulkan.
R9 390 series poor performance and/or instability
If you experience issues [5] with an AMD R9 390 series graphics card, set radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1 amdgpu.dc=1 as kernel parameters to force the use of the amdgpu driver instead of radeon.
If it still does not work, disabling DPM might help: add radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1 to the kernel parameters.
Freezes with "[drm] IP block:gmc_v8_0 is hung!" kernel error
If you experience freezes and kernel crashes during a GPU-intensive task with the kernel error " [drm] IP block:gmc_v8_0 is hung!" [6], a workaround is to set amdgpu.vm_update_mode=3 as a kernel parameter to force the GPUVM page table updates to be done using the CPU. Downsides are listed here [7].
Screen flickering white/gray
When you change resolution or connect to an external monitor, if the screen flickers or stays white, add amdgpu.sg_display=0 as a kernel parameter.
System freeze or crash when gaming on Vega cards
Dynamic power management may cause a complete system freeze whilst gaming due to issues in the way GPU clock speeds are managed. [8] A workaround is to disable dynamic power management, see ATI#Dynamic power management for details.
WebRenderer (Firefox) corruption
Artifacts and other anomalies (e.g. inability to select extension options) may appear when WebRenderer is force-enabled by the user. A workaround is to fall back to OpenGL compositing.
Double-speed or "chipmunk" audio, or no audio when a 4K@60Hz device is connected
This is sometimes caused by a communication issue between an AMDGPU device and a 4K display connected over HDMI. A possible workaround is to enable HDR or "Ultra HD Deep Color" via the display's built-in settings. On many Android-based TVs, this means setting this to "Standard" instead of "Optimal".
Issues with power management / dynamic re-activation of a discrete amdgpu graphics card
If you encounter issues where the kernel driver is loaded, but the discrete graphics card still is not available for games or becomes disabled during use (similar to [9]), you can work around the issue by setting the kernel parameter amdgpu.runpm=0, which prevents the dGPU from being powered down dynamically at runtime.
kfd: amdgpu: TOPAZ not supported in kfd
A critical-level error message may appear in the system journal or the kernel ring buffer:
kfd: amdgpu: TOPAZ not supported in kfd
If you are not planning to use Radeon Open Compute, this can be safely ignored. It is not supported on TOPAZ, as they are old GPUs. [10] [11]
High idle power draw due to MCLK locked at MAX (1000MHz), or MIN (96MHz) causing low game performance (on 6.4 kernel)
At high resolutions and refresh rates, the MCLK (VRAM / memory clock) may be locked at the highest clock rate (1000MHz) [12] [13], causing higher GPU idle power draw. On Linux kernel 6.4.x, MCLK may instead be locked at the lowest clock rate (96MHz), causing low performance in games [14] [15].
This is likely due to a monitor not using Coordinated Video Timings (CVT) with a low V-Blank value for the affected resolutions and refresh rates, see this gist for a workaround.
Failure to suspend to RAM
The amdgpu kernel module tries to buffer VRAM in RAM when the system enters S3, to prevent memory loss through decay of VRAM that is no longer sufficiently refreshed.
If you are using a lot of VRAM and are short on free RAM, this can fail even though sufficient swap space is available, because the I/O subsystem may already have been suspended.
You will see something like:
kernel: systemd-sleep: page allocation failure: order:0, mode:0x100c02(GFP_NOIO|__GFP_HIGHMEM|__GFP_HARDWALL), nodemask=(null),cpuset=/,mems_allowed=0
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x47/0x60
kernel: warn_alloc+0x165/0x1e0
kernel: __alloc_pages_slowpath.constprop.0+0xd7d/0xde0
kernel: __alloc_pages+0x32d/0x350
kernel: ttm_pool_alloc+0x19f/0x600 [ttm 0bd92a9d9dccc3a4f19554535860aaeda76eb4f4]
As a workaround, a userspace service can swap out enough RAM before the system is suspended, so that sufficient free RAM is available for the VRAM to be buffered.
Failure to shut down and to suspend
The hid_sensor_*_3d group of kernel modules can cause system lockups on bootup, shutdown, and suspend. The process list will show multiple instances of udev-worker, which then fail to freeze upon system sleep.
You will see something like:
kernel: PM: suspend entry (deep)
kernel: Filesystems sync: 0.002 seconds
kernel: Freezing user space processes
kernel: Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
kernel: task:(udev-worker) state:D stack:0 pid:479 tgid:479 ppid:422 flags:0x00004006
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3db/0x1520
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
kernel: ? __wake_up_common+0x78/0xa0
kernel: ? srso_alias_return_thunk+0x5/0xfbef5
To work around this problem, blacklist the problematic modules by creating e.g. /etc/modprobe.d/blacklist-hid_sensors.conf:
blacklist hid_sensor_accel_3d
blacklist hid_sensor_gyro_3d
blacklist hid_sensor_magn_3d