I am running 5 Nvidia graphics cards on an Ubuntu 16.04 Desktop machine with the Nvidia 384.XX driver. I am trying to automate the system overall, but more specifically I would like the overclock settings to persist across restarts.
I have seen multiple ways to do this. The recommended way seems to be either through the xx-nvidia.conf Xorg configuration with the “RegistryDwords” option, or through the nvidia-settings.rc file. The problem is that I cannot find documentation on either for the life of me: specifically, what the RegistryDwords options are, or what the nvidia-settings.rc options are. I have tried to create an nvidia-settings.rc file from the nvidia-settings GUI, but it does not seem to hold any information about the commands I am running below.
For this reason, and also for readability and easy settings changes on my end, I have opted to try to set these values from a bash script.
    #!/bin/bash

    # Enable nvidia-smi settings so they are persistent the whole time the system is on.
    nvidia-smi -pm 1

    # Define the various overclocking settings (powerLimit in watts)
    powerLimit="100"
    coreOffset="150"
    memoryOffset="1000"
    targetFanSpeed="40"

    TOTAL_GPU=5
    GPU_INDEX=0
    while [ $GPU_INDEX -lt $TOTAL_GPU ]; do
        nvidia-smi -i $GPU_INDEX -pl $powerLimit
        nvidia-settings -a [gpu:$GPU_INDEX]/GpuPowerMizerMode=1
        nvidia-settings -a [gpu:$GPU_INDEX]/GPUMemoryTransferRateOffset=$memoryOffset
        nvidia-settings -a [gpu:$GPU_INDEX]/GPUGraphicsClockOffset=$coreOffset
        nvidia-settings -a [gpu:$GPU_INDEX]/GPUFanControlState=1
        nvidia-settings -a [fan:$GPU_INDEX]/GPUTargetFanSpeed=$targetFanSpeed
        let GPU_INDEX=GPU_INDEX+1
    done
This script works perfectly when run from a terminal after login. However, it has problems when run through either a cron job or /etc/rc.local. I am still a bit of a novice so I don’t understand the exact issue, but from what I understand these two options, cron and rc.local, run a bit before “Nvidia is up and running” on the system; I am not sure what that means technically.
I am currently most interested in running this script through rc.local. When it is run in this manner and the output is captured, I find that the power limits for the cards are set (nvidia-smi works correctly), but the calls to nvidia-settings fail and return the error:
    Failed to connect to Mir: Failed to connect to server socket: No such file or directory
    Unable to init server: Could not connect: Connection refused
    ERROR: The control display is undefined; please run `nvidia-settings --help` for usage information.
From what I can tell, this means that “Nvidia” (at this point in the OS’s startup) cannot find the Xorg server/information/display. I have found a few “solutions” to this, which come from headless systems, namely export DISPLAY=:0. However, I don’t think this is the correct/proper solution, and I also can’t seem to get it to work.
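For reference, a minimal sketch of what the headless-style workaround usually looks like when spelled out. It rests on assumptions that are not confirmed for this machine: that the X server is display :0, and that its authority file lives at the LightDM path shown (other display managers keep it elsewhere). run_with_display is a hypothetical helper name, and setting XAUTHORITY alongside DISPLAY is my guess at why export DISPLAY=:0 alone might not be enough for a root-owned boot script.

```shell
#!/bin/bash
# Sketch only: run a command with the X environment nvidia-settings needs.
# ASSUMPTIONS (not from the original setup): display :0, and a LightDM-style
# authority file. Adjust both for the actual display manager in use.
X_DISPLAY=":0"
X_AUTH_FILE="/var/run/lightdm/root/:0"

# run_with_display (hypothetical helper): sets DISPLAY and XAUTHORITY only
# for the single command it launches, leaving the caller's environment alone.
run_with_display() {
    DISPLAY="$X_DISPLAY" XAUTHORITY="$X_AUTH_FILE" "$@"
}
```

With this in place, a call from the script above would become something like run_with_display nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset=150". This can only work once the X server is actually running, so it does not by itself solve the timing problem.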
So my questions are:
1. How do I solve the problem described above, i.e. calling nvidia-settings from my rc.local?
2. When, in terms of OS startup, can I make the first call to nvidia-settings? Do I need to wait for the X server to start up, maybe for a ‘ps ax’ to return something with “nvidia” in it?
3. As a backup, where, if anywhere, can I find information about the options available in nvidia-settings.rc or the specific variables and values of RegistryDwords?
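To make the second question concrete, here is a rough sketch of the kind of wait I have in mind, as an alternative to grepping ‘ps ax’. It polls for a path to appear and gives up after a timeout. wait_for_path is a hypothetical helper, and the idea of watching the X socket for display :0 (/tmp/.X11-unix/X0) is an assumption on my part; the socket existing may not guarantee the driver is fully ready to accept settings.

```shell
#!/bin/bash
# wait_for_path (hypothetical helper): poll once per second until the given
# path exists, or fail once the timeout (in seconds, default 60) has elapsed.
wait_for_path() {
    local path="$1"
    local timeout="${2:-60}"
    local waited=0

    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # timed out; path never appeared
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # path exists
}
```

In rc.local this might be used as wait_for_path /tmp/.X11-unix/X0 60 && /path/to/overclock.sh, where /path/to/overclock.sh is a placeholder for the script above. The wait would probably need to run in the background so that it does not hold up the rest of boot.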