
Installing Mellanox NIC Drivers on Linux


On older Linux releases (such as CentOS 7.4), some NICs are not recognized, and the driver has to be installed manually.

Checking the NICs

The BMC interface shows 7 NICs installed, with 13 ports in total, as in the figure below.

[Figure: Mellanox Network List]

Running lspci on the host also lists all 13 ports:

[root@localhost ~]# lspci | grep Mellanox
04:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
04:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
08:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
08:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
2d:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
5b:00.0 Ethernet controller: Mellanox Technologies MT28841
5b:00.1 Ethernet controller: Mellanox Technologies MT28841
5c:00.0 Ethernet controller: Mellanox Technologies MT28841
5c:00.1 Ethernet controller: Mellanox Technologies MT28841
96:00.0 Ethernet controller: Mellanox Technologies MT28841
96:00.1 Ethernet controller: Mellanox Technologies MT28841
97:00.0 Ethernet controller: Mellanox Technologies MT28841
97:00.1 Ethernet controller: Mellanox Technologies MT28841

However, listing the interfaces with ip shows only 5 ens ports. The other 8 are missing, so the driver has to be installed manually.

[root@localhost ~]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
2: ens19f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
3: ens19f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
4: ens21f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
5: ens21f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
6: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
7: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000
8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 
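One way to see which PCI functions are behind the missing ports is to look for Mellanox devices that have no kernel driver bound. The following is a minimal sketch, not part of the original procedure. It assumes the standard sysfs layout under /sys/bus/pci/devices and the Mellanox PCI vendor ID 0x15b3; the function name `unbound_pci_functions` is hypothetical.

```shell
# List PCI functions of a given vendor that have no kernel driver bound.
# Assumptions: sysfs layout under /sys/bus/pci/devices, and 0x15b3 as the
# Mellanox PCI vendor ID. The function name is made up for this sketch.
unbound_pci_functions() {
    sysfs_root=${1:-/sys/bus/pci/devices}
    vendor=${2:-0x15b3}
    for dev in "$sysfs_root"/*; do
        [ -r "$dev/vendor" ] || continue                    # glob matched nothing, or not a device
        [ "$(cat "$dev/vendor")" = "$vendor" ] || continue  # some other vendor
        [ -e "$dev/driver" ] && continue                    # a driver is already bound
        basename "$dev"                                     # print the PCI address
    done
}
```

Called without arguments (`unbound_pci_functions`) it scans the live system. If the missing ports are simply unclaimed by the installed driver, their PCI addresses should appear in the output, but that is an assumption about this machine rather than something the listing above proves.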

Installing the driver

Downloading the driver

Driver download page:

https://developer.nvidia.com/networking/ethernet-software

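The EN package contains only the NIC driver, while the OFED package includes the driver plus some companion software. The OFED package is recommended.

Then select the matching operating system and architecture and download the package.

Check the operating system version: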

[root@localhost ~]# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 
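The package file names used later in this article carry a distribution tag such as rhel7.4. As a small sketch, the tag can be derived from the release string shown above. This assumes that CentOS maps to the "rhel" tag, which is inferred only from the file names in this article; the function name `rhel_tag_from_release` is hypothetical.

```shell
# Derive the "rhelX.Y" tag from a Red Hat style release string.
# Assumption: CentOS uses the same "rhel" tag as RHEL, as the package
# names in this article suggest. The function name is made up.
rhel_tag_from_release() {
    printf '%s\n' "$1" |
        sed -n 's/.*release \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/rhel\1.\2/p'
}
```

On the machine in this article, `rhel_tag_from_release "$(cat /etc/redhat-release)"` yields rhel7.4, and `uname -m` gives the architecture part of the file name.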

Installing the driver (EN)

Extracting the package

[root@localhost Downloads]# ls
mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64.tgz
[root@localhost Downloads]# tar -vxf mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64.tgz 
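To find out which directory an archive unpacks into without extracting it first, the archive's table of contents can be inspected. This is a small illustrative helper, assuming the archive has a single top-level directory; the function name `archive_topdir` is hypothetical.

```shell
# Print the top-level directory of a tar archive without extracting it.
# Assumption: the archive contains a single top-level directory.
archive_topdir() {
    tar -tf "$1" | head -n 1 | cut -d/ -f1
}
```

For example, `cd "$(archive_topdir mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64.tgz)"` after extraction avoids typing the directory name by hand.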

Running the installer

Enter the extracted directory and run the install script:

[root@localhost Downloads]# cd mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64/
[root@localhost mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64]# ls
common_installers.pl  common.pl  create_mlnx_ofed_installers.pl  distro  install  is_kmp_compat.sh  LICENSE  mlnx_add_kernel_support.sh  RPM-GPG-KEY-Mellanox  RPMS  RPMS_ETH  src  uninstall.sh
[root@localhost mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64]# ./install 
Logs dir: /tmp/mlnx-en.55110.logs
General log file: /tmp/mlnx-en.55110.logs/general.log
Verifying KMP rpms compatibility with target kernel...
This program will install the mlnx-en package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with mlnx-en, do not reinstall them.

# Enter y here
Do you want to continue?[y/N]:y

Reloading the new driver

Once the installation finishes, run the command printed by the installer to reload the new driver:

/etc/init.d/mlnx-en.d restart

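Note that the interface names change after the driver is reloaded, so the corresponding network configuration files must be updated as well.

# For example, ens19f0 becomes ens19f0np0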
9: ens19f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
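Updating the configuration files for the new names can be scripted. The sketch below assumes CentOS 7 style ifcfg files in /etc/sysconfig/network-scripts whose DEVICE= and NAME= lines carry the interface name; the article does not show these files, so treat the layout as an assumption. The function name `copy_ifcfg_for_new_name` is hypothetical.

```shell
# Create ifcfg-<new> from ifcfg-<old>, rewriting the DEVICE= and NAME= lines.
# Assumptions: CentOS 7 style ifcfg files; the directory argument defaults
# to /etc/sysconfig/network-scripts. The old file is left untouched.
copy_ifcfg_for_new_name() {
    old=$1
    new=$2
    dir=${3:-/etc/sysconfig/network-scripts}
    if [ ! -f "$dir/ifcfg-$old" ]; then
        echo "no ifcfg-$old in $dir" >&2
        return 1
    fi
    sed -e "s/^DEVICE=.*/DEVICE=$new/" \
        -e "s/^NAME=.*/NAME=$new/" \
        "$dir/ifcfg-$old" > "$dir/ifcfg-$new"
}
```

For example, `copy_ifcfg_for_new_name ens19f0 ens19f0np0` writes ifcfg-ens19f0np0 next to the original. The old file still needs to be reviewed and removed by hand once the new one is confirmed to work.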

Checking the NICs again

Listing the interfaces again shows 13 ports, as expected.

9: ens19f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
10: ens19f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
11: ens21f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
12: ens21f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
13: ens12np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
14: ens13f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
15: ens13f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
16: ens14f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
17: ens14f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
18: ens15f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
19: ens15f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
20: ens16f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
21: ens16f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
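Counting the ports by eye is error-prone, so the comparison can be automated. The helper below is a minimal sketch that counts interface names with a given prefix in `ip` output read from standard input. It assumes one interface per line in the "index: name: flags" layout shown above; the function name `count_ifaces` is hypothetical.

```shell
# Count interfaces whose name starts with a prefix (default "ens").
# Reads "ip -o link" style lines ("<index>: <name>: <flags> ...") on stdin.
count_ifaces() {
    prefix=${1:-ens}
    awk -F': ' -v p="$prefix" 'index($2, p) == 1 { n++ } END { print n + 0 }'
}
```

On this machine, `ip -o link | count_ifaces ens` would be expected to print 5 before the driver installation and 13 afterwards, matching the listings in this article.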

Installing the driver (OFED)

Extract the archive and run the installer:

[root@localhost ~]# tar -vxf MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64.tgz
[root@localhost ~]# cd MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64/ 
[root@localhost MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64]# ./mlnxofedinstall

If dependencies are missing, install them as prompted:

General log file: /tmp/MLNX_OFED_LINUX.74618.logs/general.log
Error: One or more required packages for installing MLNX_OFED_LINUX are missing.
Please install the missing packages using your Linux distribution Package Management tool.
Run:
yum install tcl tk
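The dependency check can also be done before running the installer. The following sketch prints which of the named packages are not installed, using `rpm -q`. The `PKG_QUERY` variable is a hypothetical override hook added for this example, and the package list itself comes from the installer message above.

```shell
# Print the packages from the argument list that are not installed.
# PKG_QUERY is a hypothetical hook for substituting the query command;
# by default the check uses "rpm -q", which fails for missing packages.
missing_packages() {
    for pkg in "$@"; do
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

For example, `missing_packages tcl tk` prints the names that still need `yum install`, and prints nothing when both are present.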

Then run the installer again and enter y at the prompt:

[root@localhost MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64]# ./mlnxofedinstall 
Logs dir: /tmp/MLNX_OFED_LINUX.75823.logs
General log file: /tmp/MLNX_OFED_LINUX.75823.logs/general.log
Verifying KMP rpms compatibility with target kernel...
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

Do you want to continue?[y/N]:y

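After the installation finishes, reload the driver in the same way, using the command printed at the end of the installer output: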

Log File: /tmp/Fpnr9q8X6m
Real log file: /tmp/MLNX_OFED_LINUX.75823.logs/fw_update.log
Failed to update Firmware.
See /tmp/MLNX_OFED_LINUX.75823.logs/fw_update.log
To load the new driver, run:
/etc/init.d/openibd restart
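After the restart it is worth confirming which driver actually serves a port. A minimal sketch: `ethtool -i <interface>` prints "key: value" lines, and the helper below extracts one field from that output. The function name `driver_field` is hypothetical, and the "key: value" layout is assumed rather than shown in this article.

```shell
# Extract one field (for example "driver" or "version") from the
# "key: value" lines that "ethtool -i <interface>" prints on stdin.
driver_field() {
    field=$1
    awk -F': ' -v f="$field" '$1 == f { print $2; exit }'
}
```

For example, `ethtool -i ens19f0np0 | driver_field driver` should name the kernel module bound to the port (mlx5_core for ConnectX-4 and later adapters), and `driver_field version` shows the driver version that is loaded.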

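Afterword

On a Mellanox NIC, the link LED stays lit after ip link set down is run in the OS, whereas on an Intel NIC it goes out. After some testing and troubleshooting, the handling is recorded below.

The fix is to disable the NIC's KEEP_ETH_LINK_UP configuration option.

Mellanox's MFT tools must be downloaded and installed beforehand. They are available from https://www.mellanox.com/products/adapter-software/firmware-tools

The procedure is as follows: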

[root@localhost ~]# mst start
[root@localhost ~]# mst status
# Replace **** with the device name printed by mst status
[root@localhost ~]# mlxconfig -d /dev/mst/**** s KEEP_ETH_LINK_UP_P1=0
[root@localhost ~]# mlxconfig -d /dev/mst/**** s KEEP_ETH_LINK_UP_P2=0
[root@localhost ~]# reboot

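The LED stays lit because the KEEP_ETH_LINK_UP option is enabled by default on Mellanox NICs.

With this option enabled, the NIC's PHY keeps the link up as long as the cable is not physically disconnected.

Later lab testing confirmed that with KEEP_ETH_LINK_UP disabled, the link LED goes out after ip link set down is run.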
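To confirm the setting, the configuration can be queried with `mlxconfig -d <device> q` and filtered down to the relevant lines. The helper below is a sketch that assumes each parameter appears on its own line as a name followed by its value; that layout is an assumption, not something shown in this article. The function name `keep_link_up_settings` is hypothetical.

```shell
# Filter "mlxconfig ... q" output (read on stdin) down to the
# KEEP_ETH_LINK_UP_P<n> parameters, printed as NAME=VALUE.
# Assumption: one "<name> <value>" pair per line in the query output.
keep_link_up_settings() {
    awk '$1 ~ /^KEEP_ETH_LINK_UP_P[0-9]+$/ { print $1 "=" $2 }'
}
```

For example, `mlxconfig -d /dev/mst/**** q | keep_link_up_settings`, with **** replaced by the device name from mst status as before, lists the per-port values so they can be checked before rebooting.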
