Tag: proxmox

  • How do I… enable GPU passthrough from Proxmox to unprivileged LXC?

    Prerequisites

    1. The Nvidia driver is downloaded and installed, including its kernel modules, on the Proxmox host

    Push the driver run-file into the LXC

    From the Proxmox host shell:

    pct push <lxc id> <path/to/source/file> <path/to/target/file>
    pct push 100 ./NVIDIA-Linux-x86_64-580.65.06.run ./NVIDIA-Linux-x86_64-580.65.06.run

    Enter the LXC:

    pct enter 100

    Locate the run-file and make it executable:

    chmod +x NVIDIA-Linux-x86_64-580.65.06.run

    Uninstall any previous drivers

    If a driver is already installed in the LXC, remove it with the uninstall script that the previous installation put in place:

    /usr/bin/nvidia-uninstall

    It will complain that it cannot find the kernel modules. This is fine: with an LXC the kernel modules live on the host (Proxmox), not in the guest (LXC).

    Install the driver

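    Install using the same run-file as on the host, but skip the kernel modules. The user-space libraries in the LXC need to match the version of the kernel modules on the host: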

    ./NVIDIA-Linux-x86_64-580.65.06.run --no-kernel-module
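    The push and install steps above can also be driven from the Proxmox host. The following is a minimal sketch, not a tested procedure: lxc_driver_cmds is a hypothetical helper name, and the /root target directory inside the LXC is an assumption. It only prints the commands so they can be reviewed before running them.

```shell
#!/bin/sh
# Print (do not run) the host-side commands that copy the run-file into an
# LXC and install it without kernel modules.
# lxc_driver_cmds is a hypothetical helper; /root as target is an assumption.
lxc_driver_cmds() {
    ctid=$1       # LXC id, e.g. 100
    runfile=$2    # path to the run-file on the Proxmox host
    name=$(basename "$runfile")
    echo "pct push $ctid $runfile /root/$name"
    echo "pct exec $ctid -- chmod +x /root/$name"
    echo "pct exec $ctid -- /root/$name --no-kernel-module"
}
```

    For example, lxc_driver_cmds 100 ./NVIDIA-Linux-x86_64-580.65.06.run prints the three commands for LXC 100. The uninstall step is left out on purpose, since it is only needed when a driver is already present.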

    Mount the GPU device into the LXC

    external reference: Using an Nvidia GPU with Proxmox LXC

    On the Proxmox host, identify the major device numbers of the Nvidia devices:

    ls -al /dev/nvidia*

    It should return something like this:

    # ls -al /dev/nvidia*
    crw-rw-rw- 1 root root 195,   0 Aug  4 11:19 /dev/nvidia0
    crw-rw-rw- 1 root root 195, 255 Aug  4 11:19 /dev/nvidiactl
    crw-rw-rw- 1 root root 195, 254 Aug  4 11:19 /dev/nvidia-modeset
    crw-rw-rw- 1 root root 234,   0 Aug  5 14:49 /dev/nvidia-uvm
    crw-rw-rw- 1 root root 234,   1 Aug  5 14:49 /dev/nvidia-uvm-tools

    The major device numbers are in column 5, so 195 and 234 for the devices under /dev/ in this case. The numbers may differ on your host. Add or update them in the config of the unprivileged LXC. Using the nano editor here:

    nano /etc/pve/lxc/<lxc id>.conf

    Allow access to these device numbers:

    lxc.cgroup2.devices.allow: c 195:* rw
    lxc.cgroup2.devices.allow: c 234:* rw
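    The allow lines can also be derived from the ls output rather than read off by hand. This is a rough sketch under the assumption that the listing looks like the one above; nvidia_allow_lines is a hypothetical helper name, and the result should be checked before it goes into the config.

```shell
#!/bin/sh
# Read `ls -l /dev/nvidia*` output on stdin and print one allow line per
# distinct major number. Only character devices (lines starting with "c")
# are considered; the major number is column 5 with its trailing comma removed.
# nvidia_allow_lines is a hypothetical helper name.
nvidia_allow_lines() {
    awk '$1 ~ /^c/ { sub(/,$/, "", $5); print $5 }' | sort -un |
    while read -r major; do
        echo "lxc.cgroup2.devices.allow: c ${major}:* rw"
    done
}
```

    On the Proxmox host this would be used as: ls -l /dev/nvidia* | nvidia_allow_lines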

    Then mount the corresponding device paths into the LXC in the same config file:

    lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
    lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
    lxc.mount.entry: /dev/nvram dev/nvram none bind,optional,create=file
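    The mount entries all follow the same pattern, so they can be generated from a list of device paths. A small sketch, assuming the same mount options as in the entries above; nvidia_mount_entries is a hypothetical helper name.

```shell
#!/bin/sh
# Print one bind-mount entry per device path given as an argument.
# The mount target is the same path without the leading slash, matching
# the hand-written entries. nvidia_mount_entries is a hypothetical helper.
nvidia_mount_entries() {
    for dev in "$@"; do
        echo "lxc.mount.entry: ${dev} ${dev#/} none bind,optional,create=file"
    done
}
```

    On the Proxmox host this would be used as: nvidia_mount_entries /dev/nvidia* /dev/nvram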

    Save the config, restart the LXC, and verify that the GPU is available by running nvidia-smi inside the LXC.

  • How do I… install Nvidia Drivers on Debian (or Proxmox)?

    I install Nvidia drivers using the official run-file from Nvidia (Unix Driver Archive), and when it is time to update I always forget the steps. So here are the notes I have taken for my future self:

    Download the run-file

    Go to the Manual Driver Search and find the latest driver for the GPU model. Go to the corresponding download page and copy the URL from the Download Now button.

    SSH into the machine that needs an update and download the driver there:

    wget <download-url>

    In this particular case it was this version:

    wget https://us.download.nvidia.com/XFree86/Linux-x86_64/580.65.06/NVIDIA-Linux-x86_64-580.65.06.run
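    If the version number is already known, the URL can be put together without visiting the download page. This sketch assumes that other versions follow the same URL pattern as the one above, which is not guaranteed; nvidia_driver_url is a hypothetical helper name.

```shell
#!/bin/sh
# Build the download URL for a given driver version, following the pattern
# of the 580.65.06 URL above. Assumption: other versions use the same layout.
# nvidia_driver_url is a hypothetical helper name.
nvidia_driver_url() {
    version=$1
    echo "https://us.download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run"
}
```

    It would be used as: wget "$(nvidia_driver_url 580.65.06)"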

    Make the run-file executable:

    chmod +x <filename>

    In this particular case it was:

    chmod +x NVIDIA-Linux-x86_64-580.65.06.run

    Uninstall the previous driver

    When the driver is installed using the run-file, Nvidia also installs the uninstaller script and other scripts here: /usr/bin/nvidia-*. It might not be necessary to uninstall previous versions, except perhaps when downgrading. In my case, since this is a server, I like to clean up anything that is no longer used.

    Run the uninstaller:

    /usr/bin/nvidia-uninstall

    Select “No”

    Installing the new driver

    Run the installer and build the kernel modules. Pass the --dkms flag so that the kernel modules are rebuilt automatically when the kernel is updated:

    ./NVIDIA-Linux-x86_64-580.65.06.run --dkms

    Test the installation by running nvidia-smi. The output should be something like this:

    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 580.65.06              Driver Version: 580.65.06      CUDA Version: 13.0     |
    +-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA GeForce GTX **** Ti     Off |   00000000:01:00.0 Off |                  N/A |
    |  0%   52C    P0             37W /  180W |       0MiB /   8192MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+
    
    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
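    For a scripted check after an update, the driver version can be queried directly instead of reading it from the table. A minimal sketch: check_driver_version is a hypothetical helper name, and it only compares the version of the first GPU that nvidia-smi reports.

```shell
#!/bin/sh
# Compare the driver version reported by nvidia-smi with the expected one.
# Prints a short status line and returns non-zero on a mismatch.
# check_driver_version is a hypothetical helper name.
check_driver_version() {
    expected=$1
    # One line per GPU; only the first one is used here.
    actual=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: driver $actual"
    else
        echo "MISMATCH: expected $expected, got ${actual:-nothing}"
        return 1
    fi
}
```

    It would be used as: check_driver_version 580.65.06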