Intel GVT-g

From ArchWiki

Intel GVT-g is a technology that provides mediated device passthrough for Intel GPUs (Broadwell and newer). It can be used to virtualize the GPU for multiple guest VMs, effectively providing near-native graphics performance in the VM and still letting your host use the virtualized GPU normally. This is useful if you want accelerated graphics in Windows VMs running on ultrabooks without dedicated GPUs for full device passthrough. (Similar technologies exist for NVIDIA and AMD GPUs, but they're available only in the "professional" GPU lines like Quadro, Radeon Pro and so on.)

There is also a variant of this technology called GVT-d - it is essentially Intel's name for full device passthrough with the vfio-pci driver. With GVT-d, the host cannot use the virtualized GPU.

Prerequisites

You'll have to create a virtual GPU (vGPU) first, then assign it to your VM. The guest with a vGPU sees it as a "regular" GPU - just install the latest native drivers. (The vGPU actually does need specialized drivers to work correctly, but all the required changes are present in the latest upstream Linux/Windows drivers.)

You'll need to:

  • Use at least Linux 4.16 and QEMU 2.12.
  • Enable IOMMU by adding intel_iommu=on to your kernel parameters.
  • Enable kernel modules: kvmgt, vfio-iommu-type1 and vfio-mdev.
  • Set i915 module parameter enable_gvt=1 to enable GPU virtualization.
  • Find the PCI address and domain number of your GPU ($GVT_PCI and $GVT_DOM in the commands below) as they appear in /sys/bus/pci/devices. The address looks like 0000:00:02.0 - you can look it up by running lspci -D -nn and noting the address to the left of VGA compatible controller: Intel Corporation HD Graphics ....
  • Generate the vGPU GUID ($GVT_GUID in commands below) which you'll use to create and assign the vGPU. A single virtual GPU can be assigned only to a single VM - create as many GUIDs as you want vGPUs. (You can do so by running uuidgen.)
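
The last two steps can be sketched in a shell as follows (the lspci pattern is an assumption - adjust it if your controller is listed differently):

```shell
# Grab the domain-qualified PCI address of the Intel iGPU; the awk pattern
# is an assumption - adjust it if your controller is listed differently
GVT_PCI=$(lspci -D -nn | awk '/VGA compatible controller.*Intel/ {print $1; exit}')
# The parent directory in /sys/devices is pci<domain:bus>, e.g. pci0000:00
GVT_DOM=${GVT_PCI%:*}
# One GUID per vGPU (uuidgen is in util-linux); fall back to the kernel's
# random UUID file if uuidgen is unavailable
GVT_GUID=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)
echo "$GVT_PCI $GVT_DOM $GVT_GUID"
```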

After rebooting with the i915.enable_gvt=1 flag, you should be able to create vGPUs. There are multiple vGPU types you can create, which mainly differ in the amount of resources dedicated to the vGPU. You can look up which types are available on your system (and cat the description file inside each type's directory to discover what it is capable of) like this:

# ls /sys/devices/pci${GVT_DOM}/$GVT_PCI/mdev_supported_types
i915-GVTg_V5_1  # Video memory: <512MB, 2048MB>, resolution: up to 1920x1200
i915-GVTg_V5_2  # Video memory: <256MB, 1024MB>, resolution: up to 1920x1200
i915-GVTg_V5_4  # Video memory: <128MB, 512MB>, resolution: up to 1920x1200
i915-GVTg_V5_8  # Video memory: <64MB, 384MB>, resolution: up to 1024x768
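
For example, the name, description and remaining instance count of every available type can be dumped in one go (a sketch that is a no-op on unsupported hardware; available_instances is a standard mdev sysfs attribute):

```shell
# List each vGPU type with its description and how many instances
# can still be created; skips silently if GVT-g is unavailable
for t in /sys/devices/pci${GVT_DOM}/$GVT_PCI/mdev_supported_types/*/; do
    [ -e "${t}description" ] || continue
    echo "== $(basename "$t") =="
    cat "${t}description"
    echo "available instances: $(cat "${t}available_instances")"
done
```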

Pick a type you want to use - we'll refer to it as $GVT_TYPE below. Use the GUID you've created to create a vGPU with a chosen type:

# echo "$GVT_GUID" > "/sys/devices/pci${GVT_DOM}/$GVT_PCI/mdev_supported_types/$GVT_TYPE/create"

You can repeat this as many times as you want with different GUIDs. All created vGPUs will land in /sys/bus/pci/devices/$GVT_PCI/ - if you'd like to remove a vGPU, you can do:

# echo 1 > /sys/bus/pci/devices/$GVT_PCI/$GVT_GUID/remove

libvirt qemu hook

With libvirt, a QEMU hook can be used to automatically create the vGPU when the machine is started, and to remove it when the machine is stopped. Replace the variables with the values you found above, and DOMAIN with the name of the machine.

/etc/libvirt/hooks/qemu
#!/bin/bash
GVT_PCI=<GVT_PCI>
GVT_GUID=<GVT_GUID>
MDEV_TYPE=<GVT_TYPE>
DOMAIN=<DOMAIN name>
if [ $# -ge 3 ]; then
    if [ "$1" = "$DOMAIN" ] && [ "$2" = "prepare" ] && [ "$3" = "begin" ]; then
        echo "$GVT_GUID" > "/sys/bus/pci/devices/$GVT_PCI/mdev_supported_types/$MDEV_TYPE/create"
    elif [ "$1" = "$DOMAIN" ] && [ "$2" = "release" ] && [ "$3" = "end" ]; then
        echo 1 > "/sys/bus/pci/devices/$GVT_PCI/$GVT_GUID/remove"
    fi
fi
Note: If you use a libvirt user session, you need to tweak the script to use a privilege escalation command, such as pkexec or a no-password sudo.
Note: The XML of the domain is fed to the hook script through stdin. You can use xmllint with an XPath expression to extract GVT_GUID from stdin, e.g.:
GVT_GUID="$(xmllint --xpath 'string(/domain/devices/hostdev[@type="mdev"][@display="on"]/source/address/@uuid)' -)"

Assign vGPU to VM

If you run QEMU or libvirtd as a regular user, it may complain that some path /dev/vfio/[number] is not writable. You need to grant the account write access to that path, with chmod or setfacl.
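
A sketch of this with setfacl (from the acl package), assuming $GVT_GUID is the vGPU created earlier, resolves the /dev/vfio node through the vGPU's IOMMU group link:

```shell
# /dev/vfio/<group> is the node QEMU must be able to open; resolve the
# group number from the vGPU's iommu_group symlink (no-op if absent)
GROUP_LINK="/sys/bus/mdev/devices/$GVT_GUID/iommu_group"
if [ -e "$GROUP_LINK" ]; then
    VFIO_GROUP=$(basename "$(readlink "$GROUP_LINK")")
    # Grant the invoking user read/write access to the VFIO node
    setfacl -m "u:$USER:rw" "/dev/vfio/$VFIO_GROUP"
fi
```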

QEMU CLI

Note: KVM must be enabled by -enable-kvm.

To create a VM with the virtualized GPU, add this parameter to the QEMU command line:

-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$GVT_GUID

libvirt

Add the following device to the <devices> element of the VM definition:

$ virsh edit [vmname]
...
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
      <source>
        <address uuid='GVT_GUID'/>
      </source>
    </hostdev>
...

Replace GVT_GUID with the UUID of your vGPU.

Getting vGPU display contents

There are several possible ways to retrieve the display contents from the vGPU.

Using DMA-BUF display

Warning: According to this issue, this method will not work with UEFI guests using (unmodified) OVMF. See below for patches/workarounds.

QEMU CLI

Add display=on,x-igd-opregion=on to the end of -device vfio-pci parameter, e.g.:

 -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$GVT_GUID,display=on,x-igd-opregion=on

libvirt

First, modify the XML schema of the VM definition so that we can use QEMU-specific elements later. Change

$ virsh edit [vmname]
<domain type='kvm'>

to

$ virsh edit [vmname]
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

Then add this configuration to the end of the <domain> element, i.e. insert this text right above the closing </domain> tag:

$ virsh edit [vmname]
...
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev0.x-igd-opregion=on'/>
  </qemu:commandline>
...

Using DMA-BUF with UEFI/OVMF

As stated above, DMA-BUF display will not work with UEFI-based guests using (unmodified) OVMF because it won't create the necessary ACPI OpRegion exposed via QEMU's nonstandard fw_cfg interface. See this OVMF bug for details of this issue.

According to this GitHub comment, the OVMF bug report suggests several solutions to the problem. It is possible to:

  • patch OVMF (details) to add an Intel-specific quirk (most straightforward but non-upstreamable solution);
  • patch the host kernel (details) to automatically provide an option ROM for the vGPU containing basically the same code but in option ROM format;
  • extract the OpROM from the kernel patch (source) and feed it to QEMU as an override.

We will go with the last option because it does not involve patching anything. (Note: if the link and the archive go down, the OpROM can be extracted from the kernel patch by hand.)

Download vbios_gvt_uefi.rom and place it somewhere world-accessible (we will use / to make an example).
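
QEMU CLI

With plain QEMU, the override can be applied by appending romfile= to the -device vfio-pci parameter (assuming the ROM was placed at / as above):

```shell
-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$GVT_GUID,display=on,x-igd-opregion=on,romfile=/vbios_gvt_uefi.rom
```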

libvirt

Then edit the VM definition, appending this configuration to the <qemu:commandline> element we added earlier:

$ virsh edit [vmname]
...
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev0.romfile=/vbios_gvt_uefi.rom'/>
...

Enable RAMFB display (optional)

This should be combined with the above DMA-BUF configuration in order to also display everything that happens before the guest Intel driver is loaded (i.e. POST, the firmware interface, and the guest initialization).

QEMU CLI

Add ramfb=on,driver=vfio-pci-nohotplug to the end of -device vfio-pci parameter, e.g.:

-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$GVT_GUID,display=on,x-igd-opregion=on,ramfb=on,driver=vfio-pci-nohotplug

libvirt

First, follow the first step of this section to modify the XML schema.

Then add this configuration to the end of the <domain> element, i.e. insert this text right above the closing </domain> tag:

$ virsh edit [vmname]
...
 <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev0.ramfb=on'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.hostdev0.driver=vfio-pci-nohotplug'/>
 </qemu:commandline>
...

Display vGPU output

Due to an issue with spice-gtk, the configuration is different depending on the SPICE client EGL implementation.

Output using QEMU display (QEMU CLI only)

Add -display gtk,gl=on to the command line. Disable the QEMU VGA adapter by adding -vga none; otherwise you will have two virtual screens, and the one connected to the QEMU VGA adapter will be blank.

Output using SPICE with MESA EGL

QEMU CLI

Add -display spice-app,gl=on to the command line. virt-viewer must be installed.

libvirt

  1. Ensure the <hostdev> device added above has its display attribute set to 'on'.
  2. Remove all <graphics> and <video> devices.
  3. Add the following devices:
$ virsh edit [vmname]
...
    <graphics type='spice'>
      <listen type='none'/>
      <gl enable='yes'/>
    </graphics>
    <video>
      <model type='none'/>
    </video>
...

There is an optional rendernode attribute in the <gl> tag to specify the renderer, e.g.:

 <gl enable='yes' rendernode='/dev/dri/by-path/pci-0000:00:02.0-render'/>

Output using SPICE with NVIDIA EGL or VNC

libvirt

  1. Ensure the <hostdev> device added above has its display attribute set to 'on'.
  2. Remove all <graphics> and <video> devices.
  3. Add the following devices:
$ virsh edit [vmname]
...
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <graphics type='egl-headless'/>
    <video>
      <model type='none'/>
    </video>
...

The <graphics type='spice'> type can be changed to 'vnc' to use VNC instead.

There is also an optional <gl> tag inside the <graphics type='egl-headless'> element to force a specific renderer. Do not put it inside the 'spice' graphics element, due to the bug mentioned above, e.g.:

   <graphics type='egl-headless'>
     <gl rendernode='/dev/dri/by-path/pci-0000:00:02.0-render'/>
   </graphics>

Disable all outputs

If all outputs are disabled, the only way to see the display output is a software solution such as RDP, VNC or Looking Glass. See Looking Glass for details on streaming the guest screen to the host.

QEMU CLI

In the -device vfio-pci parameter, remove ramfb=on and change display=on to display=off. Add -vga none to disable the QEMU VGA adapter.
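
Continuing the DMA-BUF example above, the resulting parameters would look like:

```shell
-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$GVT_GUID,display=off,x-igd-opregion=on -vga none
```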

libvirt

To ensure no emulated GPU is added, edit the VM configuration and make the following changes:

  1. Remove all <graphics> devices.
  2. Change the <video> device to be type 'none'.
  3. Ensure the <hostdev> device added above has its display attribute set to 'off'.

Troubleshooting

Missing mdev_supported_types directory

If you have followed the instructions and added the i915.enable_gvt=1 kernel parameter, but there is still no /sys/bus/pci/devices/$GVT_PCI/mdev_supported_types directory, your hardware is probably not supported. Check the dmesg log for this message:

$ dmesg | grep -i gvt 
[    4.227468] [drm] Unsupported device. GVT-g is disabled

If that is the case, you may want to check upstream for support plans. For example, for the "Coffee Lake" (CFL) platform support, see https://github.com/intel/gvt-linux/issues/53

Windows hanging with bad memory error

If Windows is hanging with a Bad Memory error, look for more details via dmesg. If the logs show something like rlimit memory exceeded, you may need to increase the maximum amount of memory Linux allows QEMU to lock. Assuming the user running QEMU is in the kvm group, adding the following to /etc/security/limits.conf and restarting the PC should fix the errors (the value is in KiB, so 8388608 KiB is 8 GiB):

   # qemu kvm, need high memlock to allocate mem for vga-passthrough
   @kvm	hard	memlock	8388608
   @kvm	soft	memlock	8388608
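
After restarting (or logging back in), the limit now in effect can be verified, e.g.:

```shell
# Print the locked-memory limit (in KiB) for the current session;
# it should match the limits.conf value after a fresh login
ulimit -l
```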

See also