When adding a GPU to a vSphere VM using PCI passthrough there are a couple of additional settings that you need to make or your VM won’t boot.
When creating the VM you’ll need to set the Actions > Edit > VM Options > Boot Options > Firmware and select “EFI”. You need to do this before you install the operating system on the VM. If you don’t do this the GPUs won’t work and the VM won’t boot.
To add a GPU, in vCenter go to the VM, select Actions > Edit > Add New Device. Any GPUs set up as PCI passthrough devices should appear in a pick list. Add one or more GPUs to your VM.
Note that after adding one device, when you add additional GPUs the first GPU you selected still appears in the pick list. If you add the same GPU more than once your VM will not boot. If you add a GPU that’s being used by another running VM your VM will not boot. Pay attention to the PCI bus addresses displayed and make sure that the GPUs you pick are unique and not in use on another VM.
Finally you have to set up memory-mapped I/O (MMIO) to map system memory to the GPU’s framebuffer memory so that the CPU can pass data to the GPU. In vCenter go to the VM, select Actions > Edit > VM Options > Advanced > Edit configuration.
Once you’re on the Configuration parameters screen, add two more parameters:
pciPassthru.use64bitMMIO = TRUE
pciPassthru.64bitMMIOSizeGB = ????
The 64bitMMIOSizeGB value is calculated by adding up the total GB of framebuffer memory on all GPUs attached to the VM. If the total GPU framebuffer memory falls on a power-of-2, setting pciPassthru.64bitMMIOSizeGB to the next power of 2 works.
If the total GPU framebuffer memory falls between two powers-of-2, round up to the next power of 2, then round up again, to get a working setting.
Powers of 2 are 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 …
For example, two NVIDIA A100 cards with 40GB each = 80GB (in between 64GB and 128GB), so round up to the next power of 2 (128GB), then round up again to the next power of 2 after that (256GB) to get the correct setting. If you set it too low the VM won’t boot, but it won’t give you an error message telling you what the issue is either.
Here are some configurations that I’ve tested and verified:
- 2 x 16GB NVIDIA V100 = 32GB, 32 is a power of 2, so round up to the next power of 2 which is 64, set pciPassthru.64bitMMIOSizeGB = 64 to boot.
- 2 x 24GB NVIDIA P40 = 48GB, which is in-between 32 and 64, round up to 64 and again to 128, requires pciPassthru.64bitMMIOSizeGB = 128 to boot.
- 8 x 16GB NVIDIA V100 = 128GB, 128 is a power of 2, so round up to the next power of 2 which is 256, set pciPassthru.64bitMMIOSizeGB = 256 to boot.
- 10 x 16GB NVIDIA V100 = 160GB, which is in-between 128 and 256, round up to 256 and again to 512, set pciPassthru.64bitMMIOSizeGB = 512 to boot.
Hope you find this useful.