Set up SR-IOV on Mellanox ConnectX-6 network interfaces

Driver installation

Install the NVIDIA DOCA drivers:

$ export DOCA_URL="https://linux.mellanox.com/public/repo/doca/3.0.0/ubuntu22.04/x86_64/"
$ curl https://linux.mellanox.com/public/repo/doca/GPG-KEY-Mellanox.pub | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub > /dev/null
$ echo "deb [signed-by=/etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub] $DOCA_URL ./" | sudo tee /etc/apt/sources.list.d/doca.list
$ sudo apt-get update
$ sudo apt-get -y install doca-ofed

For other distributions and DOCA versions, refer to https://developer.nvidia.com/doca-downloads.

Enable SR-IOV

SR-IOV needs to be enabled both on the Mellanox device and in the kernel.

Enable SR-IOV on the Mellanox device

First, find the Mellanox devices available on the system using MST:

$ sudo mst start
$ sudo mst status

From the output, note the path to the MST device. For example:

MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded

MST devices:
------------
/dev/mst/mt4125_pciconf0         - PCI configuration cycles access.
                                  domain:bus:dev.fn=0000:a1:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                  Chip revision is: 00
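
When scripting the next steps, the device path can be pulled out of the status output instead of being copied by hand. The following is a minimal sketch, assuming the output layout shown above; mst_devices is a hypothetical helper name, not part of MST.

```shell
# Print every MST device path found in `mst status` output read from stdin.
# Assumes device lines start with /dev/mst/ in the first column, as in the
# sample output above.
mst_devices() {
    awk '$1 ~ /^\/dev\/mst\// { print $1 }'
}

# Intended use on a host where MST has been started:
#   sudo mst status | mst_devices
```

On a host with several adapters this prints one path per line, so pick the one whose domain:bus:dev.fn matches the card you want to configure.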

The device settings can be queried and changed using mlxconfig.

Querying the device settings
$ sudo mlxconfig -d /dev/mst/mt4125_pciconf0 q
Enabling SR-IOV on a device
$ sudo mlxconfig -d /dev/mst/mt4125_pciconf0 set SRIOV_EN=1
Changing the maximum number of SR-IOV VFs to 16
$ sudo mlxconfig -d /dev/mst/mt4125_pciconf0 set NUM_OF_VFS=16

Note

Changing these settings requires a reboot.
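
Before rebooting, it can be worth confirming that both values were stored. The sketch below filters the query output for a single setting. It assumes that mlxconfig prints each setting as a line with the name in the first column and the configured value in the second. mlx_setting is a hypothetical helper name.

```shell
# Print the value column for one setting from `mlxconfig ... q` output on stdin.
# Assumption: each setting appears as "<NAME> <VALUE>" separated by whitespace.
mlx_setting() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Intended use:
#   sudo mlxconfig -d /dev/mst/mt4125_pciconf0 q | mlx_setting SRIOV_EN
#   sudo mlxconfig -d /dev/mst/mt4125_pciconf0 q | mlx_setting NUM_OF_VFS
```

If the helper prints nothing, the setting name was not found in the output, which usually means the layout differs from what is assumed here.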

Enable SR-IOV in the kernel

Make sure /etc/default/grub contains the boot parameters intel_iommu=on iommu=pt (intel_iommu=on applies to Intel platforms). For example:

/etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""

After updating this file, run:

$ sudo update-grub
$ sudo reboot
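
After the reboot, the running kernel's command line can be checked to confirm that the parameters were picked up. This is a minimal sketch. cmdline_has is a hypothetical helper, and its optional second argument exists only so the check can be tried against a file other than /proc/cmdline.

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line. $2 is an optional file to read instead of /proc/cmdline.
cmdline_has() (
    param=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then match the full line literally.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
)

# Intended use:
#   for p in intel_iommu=on iommu=pt; do
#       cmdline_has "$p" || echo "missing boot parameter: $p"
#   done
```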

Managing virtual functions

VFs are managed per interface, and by default no VFs are created for an interface on boot.

The number of VFs for an interface is managed through the file /sys/class/net/<interface>/device/sriov_numvfs. For example, to manually create 4 VFs on the interface enp161s0f1np1, run:

$ echo 4 | sudo tee /sys/class/net/enp161s0f1np1/device/sriov_numvfs
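
Two details are easy to trip over here. The requested count cannot exceed the value in sriov_totalvfs, and the kernel refuses to change one non-zero VF count directly to another, so the count has to go through 0 first. The sketch below wraps both checks. set_vfs is a hypothetical helper, its third argument is an assumed override of the sysfs root for dry runs, and on a real host it has to run as root.

```shell
# Set the VF count for an interface after validating it against
# sriov_totalvfs. $3 optionally replaces /sys (useful for dry runs).
set_vfs() (
    iface=$1
    want=$2
    dev=${3:-/sys}/class/net/$iface/device

    total=$(cat "$dev/sriov_totalvfs") || exit 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, but $iface supports at most $total" >&2
        exit 1
    fi

    current=$(cat "$dev/sriov_numvfs") || exit 1
    if [ "$current" -eq "$want" ]; then
        exit 0
    fi
    # Reset to 0 before applying a different non-zero count.
    if [ "$current" -ne 0 ]; then
        echo 0 > "$dev/sriov_numvfs" || exit 1
    fi
    if [ "$want" -ne 0 ]; then
        echo "$want" > "$dev/sriov_numvfs" || exit 1
    fi
)

# Intended use (as root):
#   set_vfs enp161s0f1np1 4
```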

Inspect the VFs using ip link:

7: enp161s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether b8:3f:d2:34:6b:33 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
134: enp161s0f1v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether da:04:48:fb:9d:d5 brd ff:ff:ff:ff:ff:ff permaddr a6:bd:ac:37:f8:62
135: enp161s0f1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 3a:53:be:01:ba:f8 brd ff:ff:ff:ff:ff:ff permaddr 02:9c:fb:f2:04:fc
136: enp161s0f1v2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 02:8f:44:8b:e2:71 brd ff:ff:ff:ff:ff:ff permaddr 32:12:54:cf:73:81
137: enp161s0f1v3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 92:2c:e5:ab:7c:93 brd ff:ff:ff:ff:ff:ff permaddr 22:a0:fc:15:54:1c
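
The ip link output does not show the PCI address of each VF, which is often needed when passing a VF through to a virtual machine or container. The kernel exposes one virtfn<N> symlink per VF in the interface's device directory, and the sketch below reads those. list_vfs is a hypothetical helper, and its second argument is an assumed override of the sysfs root.

```shell
# Print "<vf index> <PCI address>" for each VF of an interface.
# $2 optionally replaces /sys (useful for dry runs).
list_vfs() (
    dev=${2:-/sys}/class/net/$1/device
    for link in "$dev"/virtfn*; do
        # Skip the unexpanded glob when no VFs exist.
        [ -L "$link" ] || continue
        idx=${link##*/virtfn}
        printf '%s %s\n' "$idx" "$(basename "$(readlink "$link")")"
    done
)

# Intended use:
#   list_vfs enp161s0f1np1
```

The entries come out in glob order, so with 10 or more VFs, virtfn10 sorts before virtfn2.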

Note that these VFs are not automatically recreated after the host reboots. To automate the creation of the VFs on boot, you can create a udev rule, for example:

/etc/udev/rules.d/99-enp161s0f1np1-sriov.rules
ACTION=="add", SUBSYSTEM=="net", ENV{ID_NET_DRIVER}=="mlx5_core", ATTR{address}=="b8:3f:d2:34:6b:33", ATTR{device/sriov_numvfs}="4"
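
When several interfaces need the same treatment, the rule can be generated rather than typed out. The following sketch prints a rule in the same form as the one above for a given MAC address and VF count. sriov_udev_rule is a hypothetical helper name.

```shell
# Print a udev rule that sets sriov_numvfs when the mlx5_core interface
# with the given MAC address is added. $1 = MAC address, $2 = VF count.
sriov_udev_rule() {
    printf 'ACTION=="add", SUBSYSTEM=="net", ENV{ID_NET_DRIVER}=="mlx5_core", ATTR{address}=="%s", ATTR{device/sriov_numvfs}="%s"\n' "$1" "$2"
}

# Intended use:
#   sriov_udev_rule "$(cat /sys/class/net/enp161s0f1np1/address)" 4 \
#       | sudo tee /etc/udev/rules.d/99-enp161s0f1np1-sriov.rules
#   sudo udevadm control --reload-rules
```

Matching on the MAC address rather than the interface name keeps the rule valid if the interface is renamed.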