Linux: enabling persistent PCIe Gen 4.0 and PCIe bus width x16 for L40S GPUs

In my server application running RHEL, I would like the system to have persistent settings that configure PCIe Gen 4.0 and the maximum PCIe bus width of x16 for all L40S GPUs.
These settings have been confirmed to be achievable with the Supermicro server configuration.

The following forum post suggests PCIe Gen 3 can be forced:

Edit /etc/modprobe.d/nvidia.conf:
options nvidia NVreg_EnablePCIeGen3=1


The NVreg_EnablePCIeGen3 setting appears to force PCIe Gen3 mode.
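
For reference, a minimal sketch of how that Gen3 option would be made persistent on RHEL (standard modprobe.d/dracut steps; note there is no documented Gen4 or link-width equivalent of this option):

```
# Append the module option suggested in the forum post (Gen3 only; no Gen4
# or link-width option is documented for the nvidia module).
echo "options nvidia NVreg_EnablePCIeGen3=1" | sudo tee -a /etc/modprobe.d/nvidia.conf

# Rebuild the initramfs so the option is also present if the driver loads at
# early boot (harmless otherwise), then reboot.
sudo dracut -f
sudo reboot
```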

Q1. Is there a way to force PCIe Gen 4.0 and PCIe bus width x16?

I would expect the output of nvidia-smi --query-gpu=$csv_params --format=csv to look similar to the following:


(Note: this is my 5th edit of this post because I have been unable to upload any images to the NVIDIA site; after each upload the image does not appear. So I am describing the image in words instead.)
The image shows the CSV output with the following values, identical for all GPUs 0 through 7 in my system:
pcie.link.gen.current=4
pcie.link.gen.gpucurrent=4
pcie.link.gen.max=4
pcie.link.gen.gpumax=4
pcie.link.gen.hostmax=5
pcie.link.width.current=16
pcie.link.width.max=16
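
For reference, a query along these lines produces the fields above (a sketch; here $csv_params is assumed to contain exactly those fields):

```
# Query the PCIe link generation/width fields listed above for every GPU.
# $csv_params is assumed to hold exactly the fields shown in the image.
csv_params="pcie.link.gen.current,pcie.link.gen.gpucurrent,pcie.link.gen.max,pcie.link.gen.gpumax,pcie.link.gen.hostmax,pcie.link.width.current,pcie.link.width.max"
nvidia-smi --query-gpu=index,$csv_params --format=csv
```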

In other words:
The following "xxx.current" values change dynamically, from boot to boot and while the system is running: pcie.link.gen.current, pcie.link.gen.gpucurrent, and pcie.link.width.current.
However, I want to make these values FIXED over the life of the system.

Q2. Is there a way to force the PCIe generation and bus width to the maximum supported by the host hardware and the GPU hardware, and leave them at those negotiated settings?

Q3. Do you recommend a specific PCIe Active State Power Management (ASPM) setting such as "off" or "performance" for my high performance server application?


Related links:
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/power_management_guide/aspm
https://forums.developer.nvidia.com/t/enabling-pcie-3-0-with-nvreg-enablepciegen3-on-titan/29542
https://resources.nvidia.com/en-us-l40s/l40s-datasheet-28413

According to this,

cat /proc/driver/nvidia/params

will give you a list of valid parameters your driver/card will accept.

cat /proc/driver/nvidia/params
ResmanDebugLevel: 4294967295
RmLogonRC: 1
ModifyDeviceFiles: 1
DeviceFileUID: 0
DeviceFileGID: 0
DeviceFileMode: 438
InitializeSystemMemoryAllocations: 1
UsePageAttributeTable: 4294967295
EnableMSI: 1
EnablePCIeGen3: 0
MemoryPoolSize: 0
KMallocHeapMaxSize: 0
VMallocHeapMaxSize: 0
IgnoreMMIOCheck: 0
TCEBypassMode: 0
EnableStreamMemOPs: 0
EnableUserNUMAManagement: 1
NvLinkDisable: 0
RmProfilingAdminOnly: 1
PreserveVideoMemoryAllocations: 0
EnableS0ixPowerManagement: 0
S0ixPowerManagementVideoMemoryThreshold: 256
DynamicPowerManagement: 3
DynamicPowerManagementVideoMemoryThreshold: 200
RegisterPCIDriver: 1
EnablePCIERelaxedOrderingMode: 0
EnableResizableBar: 0
EnableGpuFirmware: 18
EnableGpuFirmwareLogs: 2
EnableDbgBreakpoint: 0
OpenRmEnableUnsupportedGpus: 1
DmaRemapPeerMmio: 1
RegistryDwords: ""
RegistryDwordsPerDevice: ""
RmMsg: ""
GpuBlacklist: ""
TemporaryFilePath: ""
ExcludedGpus: ""

So short of filing a feature request to add an "EnablePCIeGen4" parameter, it seems there is no way to do this.

Is there a particular reason you don’t wish the cards to reduce speed and width to save power during idle time?

We don’t want to save power; we want maximum performance.
The devices are running code all the time, and there are only brief moments when a GPU is idle.
We’ve seen many instances where the p2pBandwidthLatencyTest tool indicates half the bandwidth is available for at least one GPU (GPUx): all traffic to and from GPUx is half the value reported for all the other GPUs.

P2P Connectivity Matrix
D\D 0 1 2 3 4 5 6 7
0 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1 1
6 1 1 1 1 1 1 1 1
7 1 1 1 1 1 1 1 1
Test Cycle=0________________
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3 4 5 6 7
0 657.06 21.80 21.86 21.94 22.04 21.91 12.88 21.85
1 21.87 670.31 21.93 22.17 21.83 22.10 12.88 21.94
2 22.06 21.86 668.88 21.92 21.96 22.05 12.89 21.99
3 21.81 21.84 21.87 666.31 22.01 21.84 12.89 21.91
4 21.79 21.75 21.95 21.96 670.89 21.54 12.87 21.67
5 21.76 21.80 21.96 21.92 21.54 671.18 12.87 21.83
6 12.77 12.78 12.77 12.77 12.78 12.77 669.16 12.78
7 21.69 21.71 21.88 21.98 21.68 21.66 12.85 670.60
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
D\D 0 1 2 3 4 5 6 7
0 659.28 22.16 22.15 22.16 19.71 19.71 9.86 19.71
1 22.15 684.11 22.14 22.15 19.71 19.71 9.86 19.71
2 22.16 22.16 687.44 22.15 19.71 19.71 9.86 19.71
3 22.16 22.16 22.16 687.72 19.71 19.70 9.86 19.71
4 19.71 19.71 19.71 19.71 683.81 22.15 11.17 22.15
5 19.70 19.71 19.71 19.71 22.15 683.21 11.17 22.15
6 11.09 11.09 11.09 11.09 11.09 11.09 682.02 11.09
7 19.67 19.71 19.71 19.71 22.16 22.16 11.17 684.41

When the PCIe bus width gets reduced from x16 to x8, the following message gets logged by the kernel:
messages:Jul 7 13:17:26 rhel88-hostname kernel: pci 0000:c0:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:be:01.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)

The particular GPUx shows pcie.link.width.current = 8, while all the other GPUs show pcie.link.width.current = 16.
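
When this happens, a quick way to see where in the chain the width drops is to compare the capable and negotiated link on the GPU and the bridge above it (a sketch; the addresses below are taken from the kernel message above, substitute your own):

```
# Compare capable (LnkCap) vs. negotiated (LnkSta) speed/width for the GPU
# and the upstream bridge port named in the kernel log above.
for dev in 0000:c0:00.0 0000:be:01.0; do
    echo "== $dev =="
    sudo lspci -s "$dev" -vv | grep -E 'LnkCap:|LnkSta:'
done
```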

Once the system gets into this state, the GPU remains there. We close all processes using the GPU, but the width stays limited to x8, and a reboot of the system does not always fix the issue.
We’ve pulled the GPUx card that has the issue, ensured the slot was clean, etc., but the problem still pops up.
Since we recently pair-swapped all the cards (0/1, 2/3, etc.) in their respective PCIe slots, we have not seen the width-limited-to-x8 issue. However, we have major concerns that the bus width issue is going to pop up again!

As far as pcie.link.gen goes:
We’ve also seen pcie.link.gen.current=4 and pcie.link.gen.gpucurrent=4 change to pcie.link.gen.current=1 and pcie.link.gen.gpucurrent=1, and then back to 4. There seems to be a correlation: p2pBandwidth results are lower when the link is initially at pcie.link.gen.current=1 and pcie.link.gen.gpucurrent=1.


So just to clarify your response to my questions:
A1. “No, the current driver does not have the ability to set a user-desired GPU PCIe Gen 4.0 and width x16.”
A2. “No, the current driver does not have the ability to keep the GPU PCIe bus at a fixed maximum negotiated generation and bus width as limited by the GPU and PCIe bus hardware & software configuration.”

Q3. (Earlier) Do you recommend a specific PCIe Active State Power Management (ASPM) setting such as “off” or “performance” for my high performance server application?
A3. ??? Can you please provide an answer?

The only parameter supported is the EnablePCIeGen3 one, which is of no use to you, and there is none for link width.

As an aside, the P2PLatencyTest tool has been superseded by this one.

I can’t offer any other worthwhile information, and while not directly related to the issue you’re seeing, this post may be of use.

Later: My apologies, the comment about the P2PLatencyTest tool being superseded is incorrect. It’s the “BandwidthTest” sample that nvbandwidth replaces.
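
If you want to try it, nvbandwidth is on GitHub and is straightforward to build (a sketch; see the project README for exact prerequisites and options):

```
# Fetch and build nvbandwidth (CMake-based; see the repo README for details),
# then run its default set of host<->device and device<->device tests.
git clone https://github.com/NVIDIA/nvbandwidth.git
cd nvbandwidth
cmake . && make
./nvbandwidth
```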

Thank you for providing some feedback related to performance measurement and for answering questions 1 & 2.
Q3. (Earlier, with added clarification) While using NVIDIA GPUs, with the priority being highest bandwidth and lowest latency: do you recommend a specific PCIe Active State Power Management (ASPM) setting such as “off” or “performance” for my high-performance server application?
A3. ??? Can you please provide an answer?

Q4. Are there any Linux operating system commands or configuration settings that can be used to force PCIe Gen 4 and bus width x16? Possibly using pcie_set_speed.sh, provided here: PCIe Set Speed [Alex Forencich], or the troubleshooting steps here: https://unix.stackexchange.com/questions/42361/force-re-negotiation-of-pcie-speed-on-linux . On my system, pcie_set_speed.sh was unsuccessful at forcing the setting (a sketch of what that approach boils down to is below).
I’ve also confirmed the BIOS settings are configured correctly to support x16 operation.
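
For context, what that script attempts boils down to roughly the following (a sketch only, using the bridge address 0000:be:01.0 from my earlier kernel log; a retrain can only restore what the link will actually negotiate, it cannot exceed it, and on my system this did not stick):

```
# Target the downstream bridge port above the GPU, not the GPU itself.
BRIDGE=0000:be:01.0

# Check capable vs. negotiated link before changing anything.
sudo lspci -s "$BRIDGE" -vv | grep -E 'LnkCap:|LnkSta:'

# Set Target Link Speed (bits 3:0 of Link Control 2, CAP_EXP+0x30) to Gen4.
LNKCTL2=$(sudo setpci -s "$BRIDGE" CAP_EXP+30.w)
sudo setpci -s "$BRIDGE" CAP_EXP+30.w=$(printf '%04x' $(( (0x$LNKCTL2 & 0xFFF0) | 0x4 )))

# Set the Retrain Link bit (bit 5 of Link Control, CAP_EXP+0x10).
LNKCTL=$(sudo setpci -s "$BRIDGE" CAP_EXP+10.w)
sudo setpci -s "$BRIDGE" CAP_EXP+10.w=$(printf '%04x' $(( 0x$LNKCTL | 0x20 )))

# Give the link a moment to retrain, then re-check the negotiated state.
sleep 1
sudo lspci -s "$BRIDGE" -vv | grep 'LnkSta:'
```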

On Q3, I do not know, but given your situation, I’d probably go with “Disable”.
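
If it helps, ASPM can be checked and disabled from the OS side roughly like this (a sketch using standard kernel parameters; “Disable” in the BIOS is the more thorough setting):

```
# Show the current ASPM policy; the active one is shown in brackets,
# e.g. [default] performance powersave powersupersave.
cat /sys/module/pcie_aspm/parameters/policy

# Persistently disable ASPM (or use pcie_aspm.policy=performance instead)
# by adding the parameter to the kernel command line on RHEL, then reboot.
sudo grubby --update-kernel=ALL --args="pcie_aspm=off"
sudo reboot
```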

Q4. I’m not sure any OS-side command would be effective, as the card is negotiating speed/width with the PCIe root controller based on its own power-saving algorithm.

I started to look at this but after developing a huge respect for those that deal with this stuff, dropped out early.


Hi @dan.hartman , the PCIe settings are determined by the lowest common denominator in the chain (GPU → PCIe slot → CPU/chipset).

  1. L40S supports Gen 4 natively
  2. Verify your Hardware Support:
# Check current PCIe capabilities
lspci -vvv | grep -A 20 "NVIDIA"
# Look for "LnkCap" and "LnkSta" lines

# Check PCIe slot capabilities  
sudo dmidecode -t slot | grep -A 10 -B 2 "PCIe"
  3. In your BIOS / UEFI config, you can try setting the PCIe slots explicitly to Gen 4 with x16 instead of ‘Auto’.

Also, could you please help to upload a bug report here? I will take a look.


Thank you for your replies.

Answers to your questions:

  1. To verify hardware support, please see the attached files.
  2. The BIOS/UEFI is configured to support Gen 4 x16.
  3. The bug report is attached as well.

2025-08-06_12-05-57__nvidia-bug-report_SANITIZED.zip (6.4 MB)

2025_08_07_0830__s2_dmidecode.txt (4.3 KB)

2025_08_07_0830__s2_lspci.txt (14.5 KB)