Talk:PCI passthrough via OVMF

From ArchWiki

Page rewrite

I've fiddled a lot with vfio in the last few months and I've been thinking about restructuring this page based on the information I've gathered over time. Considering that large chunks of the page date all the way back to when it was first written, and that the structure of the page isn't as researched as what you'd find on the rest of the wiki, I think a restructuring could greatly improve the overall understandability of the page. Here's the overall structure I had in mind:

Prerequisites
Setting up IOMMU
Enabling IOMMU
Ensuring that the groups are valid
Common mistakes/Gotchas
Isolating the GPU
Using vfio-pci (Recommended, Linux 4.1+)
Using pci-stub (Legacy)
Common mistakes/Gotchas
Setting up QEMU-kvm
Using libvirt
Using the qemu command line
Troubleshooting
Error 43 in Windows on Nvidia cards
"Invalid ROM content" in dmesg when trying to claim the guest GPU
...
Extras
ACS override patch
Improving performance on the VM

I've already written a good chunk of what's left, but I'd like some feedback on the proposed structure and what's already there before I proceed. Victoire (talk) 20:55, 8 April 2016 (UTC)Reply

Due to a lack of technical background, I can't really comment on completeness of your draft TOC compared to what the article covers, but it reads very structured. I had a look over the last edits you already committed to the article. Well done, I only want to give a couple of hints:
  1. Please take care when moving sections and editing, best split the move into its own commit. It is very difficult for someone else (or yourself later) to follow-up on what has been done (see ArchWiki:Contributing#Do not make complex edits at once).
  2. Point 1 is particularly important if you decide content is outdated and remove it during the restructuring. Usually we put Template:Deletion to the part first. Maybe you can do that as well. However, during a restructure the content might be in the way. You can also use a temporary section e.g. "Obsoleted sections" to move the sections to for the time being. If you delete parts, please do so only per subsection (edit button next to the header). This way it is easier to figure via history what went where.
  3. For the language, simple one: We avoid contractions (Help:Style#Language register)
  4. Looking at the existing article, there are some sections (e.g. PCI passthrough via OVMF#Complete example for QEMU .28CLI-based.29 without libvirtd) with very very long code blocks. Too long for good reading. Two alternatives for those (if you need them in the anticipated structure): Either move the long example code blocks to the end of the article (e.g. the Extras section) and crosslink them from above, quoting only the required excerpts. Or quote as well, but move the actual full code to a non-ad-spoiled gist (e.g. gist.github.com). If you move them outside the wiki, please do the removal/replacement with the gist link in one edit (same reason as for deletions, thanks).
  5. In the other talk items there are a few suggestions, e.g. #Article_Specificity and #Supported_Hardware_Table that might be useful to consider going forward.
That said, I hope your push to restructure and refresh the article gets input and help from other contributors to this article. --Indigo (talk) 18:49, 18 April 2016 (UTC)Reply
Thanks for all of the contributions, please remember to keep the scope of the page inside its original design: to achieve PCI passthrough via OVMF. I would like to have the long blocks of code removed/relocated, especially those that do not necessarily explain the actual process of getting passthrough to work. Thanks!
—This unsigned comment is by Naruni (talk) 20:42, 24 April 2016‎. Please sign your posts with ~~~~!
I think it may have lost some focus because OVMF was not explicitly pointed to in the intro. As suggested in #Article_Specificity, it may be useful to split content unrelated to OVMF into a separate article, e.g. PCI passthrough, and this article may stay standalone or become a subpage of it (e.g. PCI passthrough/OVMF), whatever is best to keep focus, while not duplicating instructions or losing contributions which are not directly OVMF related.
If you look at the above structure Victoire has worked out, does it contain sections you would consider out-of-scope for OVMF?
--Indigo (talk) 13:04, 26 April 2016 (UTC)Reply
So I have reached the point where most of the structure is now in place and most of the work left is either related to addressing specific issues people are likely to encounter or rephrasing some parts to make them more readable, such as the part on setting up the guest OS, which could use some fleshing out.
I have also taken the liberty of adding a Performance Tuning section, which may or may not be out of place considering the scope of the article. I would appreciate some feedback on whether or not this belongs here.
Also, both for the sake of readability and because I couldn't test the instructions myself, I removed most of the QEMU-related (without libvirt) instructions. While it seemed like a good decision at the time, I would like to get some feedback on this, as it is still a removal of potentially useful instructions (although some of them did seem dubious).
Victoire (talk) 14:07, 9 May 2016 (UTC)Reply
I was hoping others with topic experience would chime in with feedback; perhaps that will still happen. Arguably the whole article is about performance tuning, so I would see the section you added as in scope. It is an interesting read even for someone like me, who has not used any of this. For the QEMU point I'm unable to give feedback. --Indigo (talk) 08:02, 30 June 2016 (UTC)Reply
I agree that some of the QEMU scripts removed from the article were too detailed and still not explained well enough, but originally, when I was setting up my passthrough system with the help of this article, I preferred the scripted way instead of libvirt and still do after a long, good experience with the method. At the time the scripts in the article pointed me in the right direction. I'd really like to contribute smaller and simpler script examples to the article, with decent explanations of every core parameter needed for a VM started with QEMU commands, so anyone with an opinion about this please comment before I start working on it. The scripted way is great for many use cases and I really think it needs to be addressed better in this article. Nd43 (talk) 07:28, 20 April 2017 (UTC)Reply

Additional sections

In case I forget, I made a list of things I wanted to add to this article some time ago; I just haven't found the time to write those parts yet.

  • Performance tuning
    • Kernel config (Voluntary preemption, 1000 Hz timer frequency, dynticks, halt_poll_ns, etc.)
    • Advanced CPUs features
      • 1GB hugepages
      • NUMA node assignment
      • Hardware virtual interrupt management (AVIC/APICv)

  • Special Procedures
    • Using identical guest and host GPUs
    • Passing the boot GPU (see here)
    • Bypassing the IOMMU groups (ACS override patch)
  • Additional devices
    • GPU soundcard (MSI interrupts)

As of now, I don't have the sort of hardware that would allow me to test these special cases, so it's a bit hard for me to justify writing those sections, but it might be interesting to add those someday.

Victoire (talk) 02:16, 5 August 2016 (UTC)Reply

EDIT1 : Victoire (talk) 03:10, 9 August 2016 (UTC)Reply

EDIT2 (Might need to revisit some of those now that Ryzen comes with most of these features) : Victoire (talk) 00:09, 7 August 2017 (UTC)Reply

Hotplug GPU

Found something interesting for hotplugging the GPU without having to restart X:

http://arseniyshestakov.com/2016/03/31/how-to-pass-gpu-to-vm-and-back-without-x-restart/ repo: https://gist.github.com/ArseniyShestakov/dc152d080c65ebaa6781

It removes the card from the graphics driver module (here radeon) and adds it to the vfio module.
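At its core, the rebind the linked script performs boils down to a handful of sysfs writes (a rough sketch, run as root; 0000:01:00.0 is a placeholder for the GPU's PCI address and the exact steps in the linked repo differ slightly):

 # detach the card from the host graphics driver (nothing may be using it)
 echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
 # prefer vfio-pci for this device, then ask the kernel to probe it again
 echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
 echo 0000:01:00.0 > /sys/bus/pci/drivers_probe
 # to hand the card back to the host driver, reverse the override and reprobe
 echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
 echo > /sys/bus/pci/devices/0000:01:00.0/driver_override
 echo 0000:01:00.0 > /sys/bus/pci/drivers_probe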

I'm currently testing this. —This unsigned comment is by Xtf (talk) 22:25, 12 August 2016‎. Please sign your posts with ~~~~!

I have been using bumblebee and bbswitch for hot plugging my Nvidia GPU. It used to work flawlessly, but within the last year it broke and now it fails to unload the Nvidia driver properly. My current solution is to use the vfio_pci module and manually unload it when I want to run a native app on the dGPU. The VM seems to be able to unload nvidia if bumblebee is not using it. My setup is a work in progress, but I would like to see it mentioned here. U6bkep (talk) 16:08, 19 February 2019 (UTC)Reply

Inaccurate hugepage advice

The static hugepage section says: "On a VM with a PCI passthrough, however, it is not possible to benefit from transparent huge pages, as IOMMU requires that the guest's memory be allocated and pinned as soon as the VM starts. It is therefore required to allocate huge pages statically in order to benefit from them."

This may have been true previously but in my experience (currently verified on an Ubuntu Trusty box with kernel 4.4) this is no longer accurate:

gpu-hypervisor:~$ uname -a
Linux rcgpudc1r54-07 4.4.0-59-generic #80~14.04.1-Ubuntu SMP Fri Jan 6 18:02:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
gpu-hypervisor:~$ ps auxww | grep qemu | grep -v qemu
libvirt+ 105915  297 95.3 325650548 251763980 ? SLl  Apr10 9725:30 /usr/bin/qemu-system-x86_64 -name instance-00000167 -S -machine pc-i440fx-xenial,accel=kvm,usb=off -cpu host -m 245760 -realtime mlock=off -smp 24,sockets=2,cores=12,threads=1 -object memory-backend-ram,id=ram-node0,size=128849018880,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-11,memdev=ram-node0 -object memory-backend-ram,id=ram-node1,size=128849018880,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=12-23,memdev=ram-node1 -uuid 87681aae-2bc7-4b2e-b17b-f407cf23701e -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=12.0.4,serial=4c4c4544-0059-4710-8036-c3c04f483832,uuid=87681aae-2bc7-4b2e-b17b-f407cf23701e,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-instance-00000167/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/87681aae-2bc7-4b2e-b17b-f407cf23701e/disk,format=raw,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/nova/instances/87681aae-2bc7-4b2e-b17b-f407cf23701e/disk.eph0,format=raw,if=none,id=drive-virtio-disk1,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:5e:21:1c,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/87681aae-2bc7-4b2e-b17b-f407cf23701e/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device vfio-pci,host=05:00.0,id=hostdev0,bus=pci.0,addr=0x6 -device vfio-pci,host=06:00.0,id=hostdev1,bus=pci.0,addr=0x7 -device vfio-pci,host=85:00.0,id=hostdev2,bus=pci.0,addr=0x8 -device vfio-pci,host=86:01.1,id=hostdev3,bus=pci.0,addr=0x9 -device vfio-pci,host=84:00.0,id=hostdev4,bus=pci.0,addr=0xa -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xb -msg timestamp=on
gpu-hypervisor:~$ grep AnonHuge /proc/meminfo
AnonHugePages:  246108160 kB

I don't have an Arch system to verify this on, but I don't expect it to be distro specific. Any objection to removing this?

Have you tested memory performance inside the VM? As in, statically allocated hugepages vs transparent, with VFIO devices bound and actively used inside the VM? DragoonAethis (talk) 11:18, 13 April 2017 (UTC)Reply

Using identical guest and host GPUs - did not work for me.

I had to create the script vfio-pci-override.sh in bin, not sbin, or it would not find the file (and update the modprobe and mkinitcpio configuration accordingly).

Furthermore, I had to change the /sys/devices path to search one directory deeper. The script /bin/vfio-pci-override.sh then looks like this:

#!/bin/sh
  
  # For every GPU that is not the boot GPU (boot_vga reads 0), override the
  # driver of both the GPU and its audio function to vfio-pci.
  for i in /sys/devices/pci*/*/*/boot_vga; do
          if [ $(cat "$i") -eq 0 ]; then
                  GPU="${i%/boot_vga}"
                  # the audio function sits at the same address with the last digit set to 1
                  AUDIO="$(echo "$GPU" | sed -e "s/0$/1/")"
                  echo "vfio-pci" > "$GPU/driver_override"
                  if [ -d "$AUDIO" ]; then
                          echo "vfio-pci" > "$AUDIO/driver_override"
                  fi
          fi
  done
  
  modprobe -i vfio-pci


This situation is the only one that worked for me. Eggz (talk) 19:12, 14 April 2017 (UTC)Reply

Passthrough via OVMF without libvirt

I don't have a graphics card with UEFI firmware to test this but it should work (with the ovmf package and not ovmf-git), please test and add it to the page if it does work.

qemu-system-x86_64 \
    -enable-kvm \
    -m 8192 \
    -M q35 \
    -cpu host \
    -smp 4,sockets=1,cores=4,threads=1 \
    -vga none \
    -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/ovmf_code_x64.bin \
    -drive if=pflash,format=raw,file=/tmp/ovmf/qemu-vm-01/ovmf_vars_x64.bin \
    -device vfio-pci,host=05:00.0,addr=09.0,multifunction=on,x-vga=on \
    -device vfio-pci,host=05:00.1,addr=09.1 \
    -drive file=diskimg.qcow2,format=qcow2

Dhead (talk) 15:58, 8 May 2017 (UTC)Reply

And you should copy the vars first
mkdir -p /tmp/ovmf/qemu-vm-01
cp /usr/share/ovmf/ovmf_vars_x64.bin /tmp/ovmf/qemu-vm-01/ovmf_vars_x64.bin
Dhead (talk) 16:08, 8 May 2017 (UTC)Reply
Oh, indeed the official ovmf package seems to have been updated after a long while to include the variable files; at least I was told they were missing from it before. I'll test this out and do a writeup about it as I'm actually writing a draft for adding the scripted method back to the article, check here: User:Nd43/Scripted QEMU setup without libvirt. It's still incomplete but it's already getting pretty big and detailed, any comments welcomed. Also I'm wondering if it should be a subpage or not when I finally start merging it back into this actual article. Nd43 (talk) 08:19, 9 May 2017 (UTC)Reply
Nice guide, but I don't think the ArchWiki is the right place for it; it's too specific and too detailed to be incorporated into this article, and adding it as a subpage will end with duplicated content which will never be in sync.
The starting point of this article is that the user (an Arch user, not a copy-paste monkey) has already gone through the QEMU article, so everything not directly related to PCI passing, like the systemd service, audio, or putting it all in a script, should not be added to this article. So instead we should just add the very basic command switches related to using OVMF.

I might be digressing a little, but I also think this page should be torn down and reworked. All the performance tuning should be in its own subpage of the main QEMU article. Using OVMF instead of the default SeaBIOS should also be in a subpage (of QEMU). So we will be left with just the PCI passthrough stuff here. Dhead (talk)
How about mentioning just the passthrough switches for QEMU here? The -device vfio-pci,host=... ones, even OVMF isn't *required* for the passthrough to work (although it is easier to work with). Other than that, passthrough can be thrown in for just about any QEMU VM, and it's up to the guest OS to handle these devices somehow. Other QEMU args are specific to that VM. DragoonAethis (talk) 10:09, 9 May 2017 (UTC)Reply
I agree, just adding the related QEMU command switches should be more than enough. P.S. OVMF might be preferred for VGA PCI passthrough, as at least on my system SeaBIOS seems to have some issues with USB input devices on boot/in the bootloader (not an issue when the OS/kernel is running). Dhead (talk) 12:05, 9 May 2017 (UTC)Reply
You're probably right, but I still think many details of my writeup should be easily available for Arch users looking just to deploy a QEMU VM with PCI passthrough. I also agree that the whole article needs a rewrite in some parts, but I really don't want to see the actually relevant information get removed temporarily or buried behind too many internal links to other articles. I've already seen how the article has suffered over the past few years, and currently its structure is partly very unclear. Does anyone have additional suggestions for the information to be added, or should I just start editing with small additions to see how they're received? Nd43 (talk) 18:44, 9 May 2017 (UTC)Reply
Passing through the GPUs to a ready VM is just about adding those two switches and making sure you have some "extras" in if you're using a specific setup (GeForces requiring the spoofed vendor ID, etc.). It might make more sense to let people configure their VM first (= installing the OS under UEFI, installing VirtIO drivers, configuring their base environment, etc.) and THEN explain how to pass through the GPU to that ready VM - this might fit pretty well in this article.
Most of these copy-n-paste guides assume a very specific setup (memory, CPU cores, disks, etc), and there is value in those, since you can peek at complete solutions that are known to work under specific hardware combinations. However I'd rather create a separate VFIO "examples" page, where a brief description of that user's hardware, software (kernel + cmdline, additional VM configuration steps, etc) and the QEMU script/libvirt domain file would be posted. Exact specifics (additional scripts, configuration files for Synergy, PulseAudio and such) would have to be posted on GitHub/GitLab/somewhere else so as not to clutter the wiki beyond reason (older examples would have to be removed or updated over time, since QEMU/libvirt break things once in a while). Does that sound fine? DragoonAethis (talk) 21:55, 10 May 2017 (UTC)Reply
I like your thinking. I made an initial revision edit at PCI passthrough via OVMF#Plain QEMU without libvirt describing the bare minimum for achieving a practical QEMU setup with links to other relevant articles. I'm probably going to continue editing the section later as I want to pay attention to some details and terminology a bit more, but for now I think it will do for provoking people to make additions/changes to reach better guidance on the topic. So, comments and edits highly welcomed. I also removed the old links for the time being, even though I really liked Melvin's script examples, they indeed belong to the "See also" section or somewhere else totally. Feel free to relocate the links to a logical order. Nd43 (talk) 15:32, 13 May 2017 (UTC)Reply
I've created the PCI_passthrough_via_OVMF/Examples page, feel free to contribute your working setups there. There's a template etc available to make adding new entries easier and more consistent. DragoonAethis (talk) 14:37, 19 May 2017 (UTC)Reply

Moving the full tutorial elsewhere?

A number of people on this talk page have already mentioned how having a complete tutorial like this on the Arch Wiki makes some parts of the article somewhat redundant with other pages on the wiki, and that it would probably be better if the article were split into multiple parts and spread across a number of pages. At the same time, having a full tutorial like this here means it can't really cover other distros without breaking the contribution guidelines, which limits the article's reach somewhat.

However, I know a lot of people (myself included) have been convinced to try Arch after seeing a number of quality articles on the official wiki, and from what I've seen elsewhere, this very article has been that for some people. I've seen people on /r/vfio and elsewhere use this article as their primary reference for people who want to set up their machine this way. That's somewhat what I was aiming for when I started reworking this article to try and adapt AW's blog into something that's less intimidating to read, and it's great to see it has managed to evolve into what it is now thanks to all the contributions that have happened since then.

I'd like to know, according to more experienced Arch Wiki contributors than me, whether or not this article actually belongs here, or if the current page shouldn't be moved to the VFIO subreddit and the Arch page torn down and its content split on other QEMU-related pages.

Victoire (talk) 14:56, 10 July 2017 (UTC)Reply

IMO this article is fine here - it's widely linked, kept mostly up-to-date, clearly explained and doesn't require two days of research to even attempt this setup. I would agree making an exception and allowing other distro-specific information to be posted here would be great, but moving the entire post to /r/vfio, where the wiki pages don't work on mobile and limit contributors to the subreddit's moderators or manually added people will limit the article's reach even more and cause it to go out-of-date in a few months. (If the wiki is open to all registered redditors, most more-popular wiki pages *will* be defaced.) DragoonAethis (talk) 12:24, 11 July 2017 (UTC)Reply
Victoire, this is the Arch Wiki; users of other distros will be better served with full tutorials elsewhere, and /r/VFIO has a wiki.
Regarding your choice of words: the page shouldn't be moved anywhere. Copied somewhere else, yes; there you could just have a simple full tutorial from start to finish for setting up host and guest QEMU machines with GPU passthrough, but the contents of this page should stay here so Arch Wiki users and maintainers can change it as needed, which I gather is what you meant by "moved". In fact, from the little I've been seeing in /r/VFIO, there are nice examples and tutorials that are buried and not mentioned or linked in the /r/VFIO wiki.
Regarding the needed updates to this page, as a first step I would say the Performance tuning section needs to go into Tips and tricks in the QEMU page (which deserves its own subpage), and the article name should be changed to PCI passthrough so it reflects the actual content of the page, which is not UEFI specific.
Also, the complete examples page seems superfluous; why not have a wiki page in /r/VFIO with known compatible MB, or MB+GPU setups, as all the rest of the devices (recent CPU, RAM, storage) should not matter. I would not be surprised if the moderators removed it. Dhead (talk) 14:05, 11 July 2017 (UTC)Reply

The one downside of having it here is that users who are not using Arch as their daily driver can't contribute particularly easily because of the captchas. I stopped maintaining my Arch install a few months ago, so I had to have a friend run the `pacman -V|base32|head -1` on my behalf. Obviously, this is normally fine. This is ArchWiki lol. But this page is pretty general, so non-Arch user contributions have more friction than you might otherwise expect. Srutherford (talk) 01:35, 22 October 2019 (UTC)Reply

Set firmware to UEFI

>"In the "Overview" section, set your firmware to "UEFI". If the option is grayed out, make sure that you have correctly specified the location of your firmware in /etc/libvirt/qemu.conf and restart libvirtd.service."

I see the Overview section, but I do not see "firmware" or "UEFI" anywhere on that section. There is nothing "grayed out". I only see:

  • Name
  • UUID
  • Status
  • Title
  • Description
  • Hypervisor
  • Architecture
  • Emulator
  • Chipset (i440FX or Q35 are the options)

This makes me think the information could be outdated. I am not able to proceed and I suspect this to be the cause. Henstepl (talk) 19:57, 3 August 2017 (UTC)Reply

This can be set only during the VM creation - check "Customize installation" on the final creation wizard screen and the relevant option will show up. If you can't select this during the VM creation, install `ovmf-git` or `ovmf`. DragoonAethis (talk) 20:05, 3 August 2017 (UTC)Reply
Yes, I am looking at exactly that, but there is no "Firmware" option at all. It just skips directly from "Emulator" to "Chipset". Nothing greyed out. Simply not there. Henstepl (talk) 23:07, 3 August 2017 (UTC)Reply
Are you sure you've got ovmf or ovmf-git installed? If you've built it yourself, make sure /etc/libvirt/qemu.conf contains the proper OVMF binary path: nvram = [ "/usr/share/ovmf/x64/ovmf_x64.bin:/usr/share/ovmf/x64/ovmf_vars_x64.bin" ] - there should be a section like this in that file, but commented and possibly with different paths. Point it at your OVMF binaries (pacman -Ql ovmf or ovmf-git will show all files in that package). DragoonAethis (talk) 10:20, 4 August 2017 (UTC)Reply
Yes, ovmf has been installed from the start. nvram = ["/usr/share/ovmf/ovmf_code_x64.bin:/usr/share/ovmf/ovmf_vars_x64.bin"], which contains the paths listed by pacman -Ql ovmf, has been in /etc/libvirt/qemu.conf. Thanks for your advice, but it hasn't elucidated anything I've done wrong, and there still is no Firmware option. Henstepl (talk) 04:38, 5 August 2017 (UTC)Reply
Okay, not sure why this option doesn't show up, but there's also a way to manually manipulate the domain XML file to use OVMF. Here are the relevant lines: once you've created the VM, open your terminal, enter sudo EDITOR=your-editor-like-nano-or-vim-or-something virsh edit your-domain-name and make the <os> section there contain the <loader> and <nvram> entries pointing to proper OVMF files. If you're missing the vars file, here it is from my VM. DragoonAethis (talk) 11:39, 5 August 2017 (UTC)Reply
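For reference, the resulting <os> block typically ends up looking roughly like the following (a sketch assuming the ovmf paths quoted above; the nvram path is the usual libvirt default and may differ on your system):

 <os>
   <type arch='x86_64' machine='q35'>hvm</type>
   <loader readonly='yes' type='pflash'>/usr/share/ovmf/ovmf_code_x64.bin</loader>
   <nvram>/var/lib/libvirt/qemu/nvram/your-domain-name_VARS.fd</nvram>
 </os>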
I was hoping that a solution to this would help me avoid the error message I have been getting when I try to start my VM:
Unable to complete install: 'internal error: process exited while connecting to monitor: 2017-08-03T03:20:54.793536Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/2 (label charserial0)

Could not access KVM kernel module: Permission denied failed to initialize KVM: Permission denied'

Traceback (most recent call last):

 File "/usr/share/virt-manager/virtManager/asyncjob.py", line 88, in cb_wrapper
   callback(asyncjob, *args, **kwargs)
 File "/usr/share/virt-manager/virtManager/create.py", line 2288, in _do_async_install
   guest.start_install(meter=meter)
 File "/usr/share/virt-manager/virtinst/guest.py", line 477, in start_install
   doboot, transient)
 File "/usr/share/virt-manager/virtinst/guest.py", line 405, in _create_guest
   self.domain.create()
 File "/usr/lib/python2.7/site-packages/libvirt.py", line 1062, in create
   if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)

libvirtError: internal error: process exited while connecting to monitor: 2017-08-03T03:20:54.793536Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/2 (label charserial0) Could not access KVM kernel module: Permission denied

failed to initialize KVM: Permission denied

Which prevents me from creating a VM with those settings... Welp. Maybe I will just reinstall everything from the start. Henstepl (talk) 05:48, 7 August 2017 (UTC)Reply

I had the same issue, fixed it by forcing the kvm gid to 78; an update of systemd gave the kvm group a dynamic id while libvirtd expects it to be 78. groupmod -g 78 kvm and a reboot fixed it (https://bbs.archlinux.org/viewtopic.php?id=228936) Makz (talk) 22:15, 9 September 2017 (UTC)Reply

Boot GPU Passthrough

I managed to pass the boot GPU (the only one in my system) by:

  • setting the GRUB option GRUB_GFXPAYLOAD_LINUX=text (for example in /etc/default/grub)
  • and adding the kernel parameter earlymodules=vfio-pci (although I'm not sure if that is necessary at all)

That prevents the initramfs from initializing a graphical console on the boot GPU, allowing the VM to take it over directly after GRUB. However, you cannot see your boot status and will have to type your LUKS password blind if you encrypted your root partition.
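For reference, that amounts to something like the following in /etc/default/grub (a sketch; keep whatever other kernel parameters you already use, and regenerate the config with grub-mkconfig -o /boot/grub/grub.cfg afterwards):

 GRUB_GFXPAYLOAD_LINUX=text
 GRUB_CMDLINE_LINUX_DEFAULT="... earlymodules=vfio-pci"

Here "..." stands for your existing parameters.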

The Hardware I used: AMD Ryzen 7 1700 CPU, Mainboard: Biostar X370GTN, Passthrough GPU: Nvidia GeForce GTX 970

-- dasisdormax (talk) 11:33, 8 August 2017 (UTC)Reply

I followed the steps in the guide and added video=efifb:off but then I saw messages like Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff.

I resolved this by putting my main GPU in a secondary slot, putting a spare GPU in the primary slot, disabling all the vfio-pci stuff (i.e. letting nouveau load) and making a copy of the ROM from /sys/devices/pci0000:40/0000:40:01.3/0000:41:00.0/rom. It's worth noting that ROM images from places like TechPowerup don't work because they use a different format, the one used by nvflash.

Once I had the ROM image, I had to add <rom file='/path/to/rom'/> to my domain XML and all was well.
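For anyone repeating this, the dump itself is just a few sysfs writes (a sketch, using the device path mentioned above; run as root while a driver such as nouveau has initialized the card but nothing is actively using it):

 echo 1 > /sys/devices/pci0000:40/0000:40:01.3/0000:41:00.0/rom    # allow reading the ROM
 cat /sys/devices/pci0000:40/0000:40:01.3/0000:41:00.0/rom > /path/to/rom
 echo 0 > /sys/devices/pci0000:40/0000:40:01.3/0000:41:00.0/rom    # lock it again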

I used an AMD Threadripper 1920X, ASUS PRIME X399-A and MSI Gaming X GTX1080Ti.

-- bobobo1618 (talk) 15:43, 10 February 2018 (UTC+1)

Slow boot linked to the RAM amount

> Windows 10 VM with hugepages, 16 GB RAM = 1m40s from start to login screen; same VM reduced to 1 GB RAM = 10 seconds from start to login screen (i7-6900K + 64 GB DDR4-2133 non-ECC), brand new installation

Same setup on a dual-Xeon + 64 GB DDR3-1600 ECC RAM: 16 GB RAM = 13 seconds, 1 GB = 13 seconds

While the VM is booting, all assigned CPUs are working at 100%.

Does the ECC RAM help it boot faster? Is it an OVMF-related issue? Maybe the virtual BIOS is checking the RAM before starting the OS?

While it is loading, I have a black screen; I don't think the GPU is handled by the VM at this moment.

I saw some users on forums having this issue and tried the latest ovmf build, but no fix so far.

Makz (talk) 09:10, 10 September 2017 (UTC)Reply

One year later, I finally found a way to get rid of this slowdown.
In short, this slowdown is because the Arch kernel is preemptive by default; compiling the kernel without CONFIG_PREEMPT=y and with CONFIG_PREEMPT_VOLUNTARY=y fixed my issue.
Boot time is down to ~35 seconds instead of over 3 minutes.
Makz (talk) 14:52, 24 November 2018 (UTC)Reply

Question about virtlogd and iommu=pt

Is enabling virtlogd.socket optional? If so, that should be stated.

Is iommu=pt only for AMD? It looks like it on this page: https://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

Their instructions say:

intel_iommu=on # Intel only

iommu=pt iommu=1 # AMD only

—This unsigned comment is by Limero (talk) 19:59, 16 January 2018‎. Please sign your posts with ~~~~!

virtlogd.socket must be enabled, libvirt will otherwise fail to start. iommu=pt is separate from the intel_iommu=on and amd_iommu=on - iommu=pt enables the IOMMU hardware (on any platform) only for the passthrough devices and lets all the other devices (not passed to the VMs) do whatever they want. The "normal" IOMMU mode is to isolate all devices and make sure all devices behave nicely (no writes out of their address space, DMA only to their assigned section of memory, etc), but this may cause problems on some older platforms. DragoonAethis (talk) 20:27, 16 January 2018 (UTC)Reply
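Put together, a typical kernel command line therefore looks something like this (a sketch; pick the line matching your CPU vendor, and note that on recent kernels the AMD IOMMU is enabled by default when the firmware enables it):

 # Intel
 intel_iommu=on iommu=pt
 # AMD
 iommu=pt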

Slowed down audio pumped through HDMI on the video card

In this section it is mentioned that the "MSI_util" program does not work on Windows 10 64-bit; however, on the page linked earlier[1] there is now a "MSI_util_v2" program available, and this program did work, for me at least. I would love it if other people could test it and report back so I can update the page.

Victorheld (talk) 11:21, 5 June 2018 (UTC)Reply

I used it successfully on Windows 8.1 64-bit too, +1 for replacing it with v2. DragoonAethis (talk) 12:02, 5 June 2018 (UTC)Reply
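For context, both MSI_util versions simply toggle the per-device MSI flag in the registry; the manual equivalent (a sketch, the device instance path is a placeholder that has to be looked up in Device Manager) is:

 HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI\<device instance path>\Device Parameters\Interrupt Management\MessageSignaledInterruptProperties
     "MSISupported" = dword:00000001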

UEFI (OVMF) Compatibility in VBIOS

Sorry if I am using this wrong and for just butting in as a new user, but I spent quite a lot of time getting a VBIOS card without UEFI working, and I feel like this guide could be clearer. I have an HD5970 card currently passing through using OVMF / VFIO (not pci-stub) to Windows 10, without flashing the card, and it would appear that CrossFire is working as well in Windows.

It is simple enough: get the VBIOS file from the card itself, use GOPupD, pass through the updated ROM file in the VM XML using rom file=. Flashing should be a last resort (IMHO) due to the danger of bricking. I was going to just use pci-stub, but I do not have that kernel module for one reason or another, so I had to make VFIO work (I'm using Antergos with the 4.17.2-1-ARCH #1 SMP PREEMPT kernel).

There are always hidden dangers with flashing that can be avoided by using ROM files in the XML. In my case, the HD5970 is dual GPU, and each GPU has a different device ID and, it seems, a slightly different VBIOS file. I did not know this going in (and this is what took me a while). Had I just flashed the same GOPupD VBIOS, I likely would have bricked my card (when I passed the wrong ROM files in the XML to each GPU it crashed the whole system; I would hate to see what it would have done if I had flashed it to the card itself).

I know that 8 year old hardware isn't everybody's concern, but the miracle of Linux is the breath of life it provides to older hardware. I'm using an i7 980x on a P6X58D Premium motherboard with 24GB of RAM ... I really don't feel a need to upgrade my rig when it still kicks butt. Eventually I am putting in a new GPU so I can use newer games and this will all seem very trivial to me, but may help other people unable to upgrade.

Also, with my 5970 each GPU was in its own IOMMU group. I am not sure if all dual GPUs are like that, but perhaps telling people that dual GPUs may be split up may help them. I could not get the card to function until I separated the second GPU on that card, gave it to the VM, and gave it the correct GOPupD ROM file. I don't know where to ask this question, but I do wonder if this card could use one GPU for the host and one for the guest (same card, but it has a different IOMMU group for each GPU and has 2 x DVI and a mini-DP). That would be an interesting project. Dracodeus (talk) 09:53, 27 June 2018 (UTC)Reply

The lack of instructions for this sort of thing is something I fully agree with. The article glosses over the topic of VBIOS and UEFI compatibility a whole bunch of times, but never really addresses it. It seems a lot of people (including myself, back in the day!) run into strange VBIOS-related issues that the article does not address, and a number of sections are there to suggest ways to work around them.
I'll mark those sections for cleanup/removal, this definitely needs to be addressed in the main article. Victoire (talk) 17:28, 28 July 2018 (UTC)Reply
This section has been flagged for removal for years, GPUs with a VBIOS without UEFI support are getting more and more rare: unless someone protests I'll say this is closed and the section will be removed. --Erus Iluvatar (talk) 08:58, 27 August 2022 (UTC)Reply
While all the new GPUs support UEFI GOP just fine, some early GeForce 900 Series have issues with buggy UEFI drivers as well and flashing them with newer VBIOS versions helps. GeForce 7xx is still supported on a LTS driver, too. IMO this should be kept, at least until 9xx kicks the bucket in a ~year or so. The note for removal is more about "make this a proper section" and not being outdated per se - not sure if anyone wants to make it proper if it's about to be thrown out though. DragoonAethis (talk) 10:43, 28 August 2022 (UTC)Reply
Fine for me, this has been flagged since 2018-07 and no one has taken the time to reshape this section, hence my suggestion of a complete deletion. If anyone cleans this up I see no reason to remove it, btw. --Erus Iluvatar (talk) 11:06, 28 August 2022 (UTC)Reply

Change hugepages allocation at runtime

In the "Static huge pages" section, the article states "Static huge pages lock down the allocated amount of memory, making it unavailable for applications that are not configured to use them". This is true. However, the amount of allocated hugepages can be modified in runtime, so the end result is you can allocate those pages just before starting the vm, and free them when it stops. Perhaps we should rewrite the warning or add a note.

The amount of allocated pages can be seen and modified in `/sys/devices/system/node/nodeX/hugepages/hugepages-1048576kB/nr_hugepages`

--Roobre (talk) 08:08, 16 August 2018 (UTC)Reply

Yeah, it can be done like this. I do this in my setup. The biggest problem with this is that after your system has been running for a while, it gets increasingly more difficult to find contiguous free 1 GB blocks of RAM to map as hugepages - in practice, if you're allocating larger amounts, it's possible only shortly after booting up. (Rebooting is an option, too.) While this could be mentioned, it would also have to contain a lot of warnings that this can behave rather erratically if you're not trying to allocate those right after booting, plus the exact behavior might be confusing (allocating hugepages occasionally fails for no user-visible reason, and the only way to check if that happened is just to manually check how many hugepages were allocated). DragoonAethis (talk) 19:09, 16 August 2018 (UTC)Reply
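A minimal sketch of that flow, assuming 1 GiB pages on NUMA node 0 (run as root, and always read the count back, since the kernel may silently allocate fewer pages than requested):

 # try to reserve 16 x 1 GiB pages just before starting the VM
 echo 16 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
 # verify how many were actually allocated
 cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
 # release them again after the VM has shut down
 echo 0 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages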

QEMU 4.0: Unable to load ... install using Q35 - Other solutions

I wrote the original KVM patches for split irqchip back in 2016. It should be possible to work around any issues with split irqchip by using MSIs. For Windows, you can configure this using the registry. If I'm not mistaken, Linux should be pretty light on the IOAPIC. I say this about Linux with some caution since I'm mostly familiar with VMs in the cloud, which have a different set of devices than physical machines (e.g. no USB, which the Debian box I'm currently on actually does put across the IOAPIC).

For the Passthrough VM I was using to play Apex Legends on Windows, I was able to get things working without issue after changing to using MSIs. I considered getting the synthetic interrupt controller/vapic working (which should also skip the potential perf issues with split irqchip), but I wanted to use what I wrote :P. I was also nervous that getting Windows to use the SynIC might clue the nvidia driver in that it's being used in a VM.

Obviously, since most users trust their VMs, using a kernel irqchip is fine. Disabling split irqchip is nowhere near as unsafe as the ACS override (which is also a workaround listed on this page). IOAPIC based interrupts will still have a high cost relative to physical hardware, but back of the envelope estimate is that the IOAPIC takes about 10x as long to talk to when it's up in usermode, which is pretty substantial if it's actually perf critical.

I considered just writing a description of working around split irqchip directly into the topic, but I don't really know the norms for wikis and it seemed like a bad plan to shill my own work in my first post.

If anyone is willing to go test out the SynIC as a work around I'd be pretty curious to hear results. I'd also be curious if someone tested out A/B testing MSIs against kernel irqchip. If anyone has gotten posted interrupts to work with MSIs to cut down the latency to be as low as possible, that'd be pretty cool to hear about as well.

Srutherford (talk) 01:30, 22 October 2019 (UTC)Reply

Is amd_iommu=on, pt still needed?

They're not configured in my grub but dmesg shows the IOMMU is active anyway. Also, is amd_iommu=on valid? It's not listed as an option: https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L286 Beepboo (talk) 20:02, 20 March 2020 (UTC)Reply

I don't think amd_iommu=on is valid anymore. It only takes 3 possible params as of 5.12.1: fullflush, off and force_isolation, so the default behavior is on. Relevant doc: https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html?highlight=amd_iommu Not sure about iommu=pt though. Hexhu (talk) 03:59, 7 May 2021 (UTC)Reply

Error 43?

Windows showed error 43. I tried many things - vbios (not this), moving PCI-e slots (didn't really help). Eventually... I followed this https://mathiashueber.com/fighting-error-43-nvidia-gpu-virtual-machine/ to get mine to work (RTX 2070 Super) - the hidden setting did it (along with the others). Beepboo (talk) 21:32, 20 March 2020 (UTC)Reply

Moving virtio disk section to QEMU

This page has a lot of details on both adding device block emulation with virtio, as well as passing through a SCSI drive. I consider this to be significant info that QEMU#Installing virtio drivers does not have. As far as I can tell, this doesn't necessarily pertain to PCI passthrough, so I feel that PCI passthrough via OVMF#Virtio disk should be merged with QEMU. ~ CodingKoopa (talk) 13:15, 25 March 2020 (UTC)Reply

There is indeed a lot of overlap and offtopic info on this page and much of this info would also make sense on other pages like QEMU or libvirt. However, around two thirds of the PCI Passthrough page is already completely unrelated to passthrough. The PCI Passthrough page has sort of morphed into a "Gaming VM" guide over the years, and it is perhaps the primary page most people look at for setup, tuning, and dealing with virtualization-related quirks. In that context it's good for people to see the info here. It's certainly not the first time someone has brought up reorganizing all this info onto other pages or perhaps making an entirely new topic for optimization & quirks. Aiber (talk) 14:48, 25 March 2020 (UTC)Reply
That's not an argument against the merge, you can just leave a link to the relevant section of the QEMU page. -- Lahwaacz (talk) 15:17, 25 March 2020 (UTC)Reply

vesafb claiming GPU

I spent a lot of time trying to solve [[PCI_passthrough_via_OVMF#"BAR_3:_cannot_reserve_[mem]"_error_in_dmesg_after_starting_VM]], and the solution didn't come from the wiki. I would like to bring some more information about the steps I followed to diagnose the issue, and how I solved the error (basically cat /proc/iomem to detect that vesafb was claiming memory, and kernel parameters different from the video=efifb:off that is proposed in the wiki).

The solution provided by the wiki was not really helpful to me.

As a beginner in wiki editing, and because the section is already marked for merging with VBIOS issues, and also because I don't know if vesafb claiming the GPU is actually related to VBIOS, I would like some feedback or advice before making any changes to this section. So, do you think that I should edit this section to propose a way to diagnose the issue by checking /proc/iomem, and propose some other kernel parameters?

—This unsigned comment is by Mjfcolas (talk) 15:25, 12 July 2020‎. Please sign your posts with ~~~~!

QEMU 5 & Zen2 with host-passthrough

Now that we have added troubleshooting for Zen2 and QEMU 5, a warning about instability has been removed. I have done some testing and the Windows 10 guest crashes in less than half an hour. Tested this again with the 'better' method that has been edited into the page with libvirt 6.5, and it is still too unstable to do anything more useful than run a single benchmark. I know the troubleshooting section for this is important, but do we really not want to warn users about its instability? Fireflower (talk) 16:43, 1 August 2020 (UTC)!Reply

Elevated GPU driver VM detection mitigation from troubleshooting to main section

I've moved hiding the KVM flag and vendor_id to the main sections of this page because I believe that the majority of the readers of this page need to perform this step in any case, since it affects both Nvidia and AMD. As always, feel free to disagree and suggest alternative approaches. --Jyf (talk) 08:20, 23 August 2020 (UTC)Reply
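For reference, the change amounts to the following libvirt domain snippet (a sketch; the vendor_id value is an arbitrary string of up to 12 characters):

 <features>
   <hyperv>
     <vendor_id state='on' value='whatever1234'/>
   </hyperv>
   <kvm>
     <hidden state='on'/>
   </kvm>
 </features>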

I agree, the problem is common enough to be mentioned in about every possible guide, especially since Windows VM gaming / CAD work has become more popular thanks to knowledge being posted about it. I also strongly think any information regarding general stability should be left on the wiki instead of being removed as unnecessary. It saves a lot of time when people don't have to try to figure out why their VM becomes unstable. Fireflower (talk) 16:25, 24 August 2020 (UTC)Reply

intel_iommu kernel parameter might not be necessary on some systems

On my system, adding intel_iommu to the kernel cmdline is not necessary, and the IOMMU seems to be enabled without it, maybe related to this (from dmesg):

[    0.564619] DMAR: Intel-IOMMU force enabled due to platform opt in

See [2]

--Cvlc (talk) 11:03, 15 September 2022 (UTC)Reply

Linux 6.2.1 and vfio_virqfd (GPU pinning)

Since Linux 6.2.* the vfio_virqfd module is part of the kernel. It appears we don't need that in section 3.2.1 any longer; could we remove it?
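A quick way to check on a given kernel (a sketch): modinfo should no longer find a separate module once it has been folded in:

 # prints module info on older kernels, errors out if vfio_virqfd no longer exists separately
 modinfo vfio_virqfd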

Torxed (talk) 18:10, 24 March 2023 (UTC)Reply

mkinitcpio Modification in Section 6.1.2

Please excuse any formatting errors, I'm still a newbie, but correct me if I'm wrong: should the user be encouraged to add a drop-in file instead of directly editing /etc/mkinitcpio.conf?

 Edit /etc/mkinitcpio.conf:
   Add modconf to the HOOKS array and /usr/local/bin/vfio-pci-override.sh to the FILES array.
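
A sketch of what such a drop-in could look like, assuming a hypothetical /etc/mkinitcpio.conf.d/vfio.conf (drop-in support requires a reasonably recent mkinitcpio):

 # /etc/mkinitcpio.conf.d/vfio.conf  (hypothetical name)
 # modconf is already in the default HOOKS array, so only FILES needs extending here
 FILES+=(/usr/local/bin/vfio-pci-override.sh)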

Jonjolt (talk) 12:38, 16 August 2023 (UTC)Reply

Does Nvidia RTD3 really work with vfio?

> "Starting with Linux 4.1, the kernel includes vfio-pci. This is a VFIO driver, meaning it fulfills the same role as pci-stub did, but it can also control devices to an extent, such as by switching them into their D3 state when they are not in use."

But when I successfully isolate my Nvidia dGPU, it always stays in D0 without a VM using it. Heddxh (talk) 16:22, 5 April 2024 (UTC)Reply
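For anyone investigating this, the current state can be read back from sysfs (a sketch; the PCI address is a placeholder for the isolated dGPU):

 cat /sys/bus/pci/devices/0000:01:00.0/power_state
 cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status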

Include non-libvirt parameters to pass directly to qemu

There's a general dearth of documentation on how to use qemu directly without using libvirt, and this article has chapter 7 but it is very thin. It would be really great to bolster this page to bring direct usage of qemu up to parity with doing so through libvirt. I've done some of these things myself and can contribute some documentation, but won't be able to do everything.

Further, is it preferable to keep all of these things as a separate chapter, or weave them in by-topic? Probably my preference would be to weave them in by-topic. Kleptophobiac (talk) 14:33, 15 May 2024 (UTC)Reply

Suggest disabling all nvidia services and removing sleep hook

I was caught off-guard by applications starting very slowly and seeing this in my log many times:

Aug 25 01:48:19 WileECoyote kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 234

Aug 25 01:48:19 WileECoyote kernel: NVRM: GPU 0000:01:00.0 is already bound to vfio-pci.
Aug 25 01:48:19 WileECoyote kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
Aug 25 01:48:19 WileECoyote kernel: NVRM: This can occur when another driver was loaded and
                                    NVRM: obtained ownership of the NVIDIA device(s).
Aug 25 01:48:19 WileECoyote kernel: NVRM: Try unloading the conflicting kernel module (and/or
                                    NVRM: reconfigure your kernel without the conflicting
                                    NVRM: driver(s)), then try loading the NVIDIA kernel module
                                    NVRM: again.
Aug 25 01:48:19 WileECoyote kernel: NVRM: No NVIDIA devices probed.

Aug 25 01:48:19 WileECoyote kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 234

Maybe suggest disabling nvidia-{hibernate,persistenced,powerd,resume,suspend} as well as disabling the systemd-sleep/nvidia hook? JL2210 (talk) 06:02, 25 August 2024 (UTC)Reply