Rendering problems on drivers above 383

Hey guys,

We are a small media agency working with several render engines, including
Octane, Cycles, and V-Ray.

We have about 9 systems with the following configuration:

1x Titan for viewport
2x 1080 Ti for rendering

With this config, the viewports stay fluid while rendering.
But after Windows 10 updates, we ended up with NVIDIA drivers above 383.
Now it feels like all 3 GPUs are rendering, although the load on the viewport GPU is nearly 0%.

I already talked to NVIDIA support and sent the GPU-Z log files. The problem occurs on every workstation,
so it does not seem to be a hardware problem.

It looks like something changed in drivers above 383, so we had to roll back to 381.89.
Everything is fine and fluid there, on all engines.

It would be nice if you guys could test and reproduce this with Blender.
Just download a free scene and render it with Cycles on multiple GPUs,
but disable the display-connected GPU in the settings. Then use two viewports, the first one for rendering and the second one for navigating. The second one should stay fluid while the other cards are under load.

Thanks for help
Have a nice day

Thanks for the report.
Could you please supply some additional information so that this can be triaged as a bug report?

  • Which exact display driver version “above 383” has shown that problem?
  • Which specific Titan model is that?
    The “System Information” saved from the NVIDIA Control Panel accessible via the link in the bottom left or under Help menu would be sufficient.
  • Have you tried if the problem persists with newer drivers?
    Release 383 drivers are over six months old and there are drivers from three newer branches available already here:
    http://www.nvidia.com/Download/Find.aspx?lang=en-us

When trying these, I’d recommend choosing the “Custom Install” and “Clean Install” options inside the display driver installer to make sure previous installations are completely replaced.
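
For collecting the driver version and GPU model from several workstations without clicking through the control panel, a small script around nvidia-smi might help. This is only a sketch: it assumes nvidia-smi is on the PATH (it normally ships with the display driver), and the helper names parse_gpu_csv and query_gpus are made up for illustration. Note that the index nvidia-smi prints follows PCI bus order and does not necessarily match the CUDA device numbering.

```python
import csv
import subprocess

# Assumption: nvidia-smi is reachable via PATH on the workstation.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_csv(text):
    """Turn 'index, name, driver_version' CSV lines into a list of dicts."""
    rows = csv.reader(text.strip().splitlines(), skipinitialspace=True)
    return [
        {"index": int(row[0]), "name": row[1], "driver": row[2]}
        for row in rows
        if row
    ]


def query_gpus():
    """Run nvidia-smi and return one dict per installed GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling query_gpus() on a workstation returns one entry per board, which can be pasted into the bug report alongside the “System Information” file.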

Hey

thanks for the fast response.

  • Which exact display driver version “above 383” has shown that problem?
    All drivers above 383 we tried have the same issue. I think we tried about 7 different drivers.

  • Which specific Titan model is that?
    In my case it is a Titan X Founders Edition. The other workstations have a Titan Black. Same problem…
    It doesn’t matter whether we use the Titan or a 1080 Ti as the display GPU, the behavior is the same.

  • Have you tried if the problem persists with newer drivers?
    Yesterday I was on 390.65 to create the GPU-Z log files. Same issue there.

I did about 5 clean installs, and even uninstalled every GPU manually and installed it again.
In games we generally have no problems, only while rendering with CUDA.

Thank you for help

NVIDIA System Information report created on: 01/12/2018 15:30:41
System name: 3D-018

[Display]
Operating System:	Windows 10 Enterprise, 64-bit
DirectX version:	12.0
GPU processor:		GeForce GTX TITAN X
Driver version:		381.89
Direct3D API version:	12
Direct3D feature level:	12_1
CUDA Cores:		3072
Core clock:		1000 MHz
Memory data rate:	7010 MHz
Memory interface:	384-bit
Memory bandwidth:	336.48 GB/s
Total available graphics memory:	45000 MB
Dedicated video memory:	12288 MB GDDR5
System video memory:	0 MB
Shared system memory:	32712 MB
Video BIOS version:	84.00.45.00.03
IRQ:			Not used
Bus:			PCI Express x16 Gen3
Device ID:		10DE 17C2 113210DE
Part Number:		G600 0000
GPU processor:		GeForce GTX 1080 Ti
Driver version:		381.89
Direct3D API version:	12
Direct3D feature level:	12_1
CUDA Cores:		3584
Core clock:		1480 MHz
Memory data rate:	11010 MHz
Memory interface:	352-bit
Memory bandwidth:	484.44 GB/s
Total available graphics memory:	43976 MB
Dedicated video memory:	11264 MB GDDR5X
System video memory:	0 MB
Shared system memory:	32712 MB
Video BIOS version:	86.02.39.00.22
IRQ:			Not used
Bus:			PCI Express x8 Gen3
Device ID:		10DE 1B06 85E51043
Part Number:		G611 0050
GPU processor:		GeForce GTX 1080 Ti
Driver version:		381.89
Direct3D API version:	12
Direct3D feature level:	12_1
CUDA Cores:		3584
Core clock:		1480 MHz
Memory data rate:	11010 MHz
Memory interface:	352-bit
Memory bandwidth:	484.44 GB/s
Total available graphics memory:	43976 MB
Dedicated video memory:	11264 MB GDDR5X
System video memory:	0 MB
Shared system memory:	32712 MB
Video BIOS version:	86.02.39.00.22
IRQ:			Not used
Bus:			PCI Express x16 Gen3
Device ID:		10DE 1B06 85E51043
Part Number:		G611 0050

[Components]

nvui.dll		8.17.13.8189		NVIDIA User Experience Driver Component
nvxdplcy.dll		8.17.13.8189		NVIDIA User Experience Driver Component
nvxdbat.dll		8.17.13.8189		NVIDIA User Experience Driver Component
nvxdapix.dll		8.17.13.8189		NVIDIA User Experience Driver Component
NVCPL.DLL		8.17.13.8189		NVIDIA User Experience Driver Component
nvCplUIR.dll		8.1.940.0		NVIDIA Control Panel
nvCplUI.exe		8.1.940.0		NVIDIA Control Panel
nvWSSR.dll		6.14.13.8189		NVIDIA Workstation Server
nvWSS.dll		6.14.13.8189		NVIDIA Workstation Server
nvViTvSR.dll		6.14.13.8189		NVIDIA Video Server
nvViTvS.dll		6.14.13.8189		NVIDIA Video Server
nvLicensingS.dll		6.14.13.8189		NVIDIA Licensing Server
NVSTVIEW.EXE		7.17.13.8189		NVIDIA 3D Vision Photo Viewer
NVSTTEST.EXE		7.17.13.8189		NVIDIA 3D Vision Test Application
NVSTRES.DLL		7.17.13.8189		NVIDIA 3D Vision Module
nvDispSR.dll		6.14.13.8189		NVIDIA Display Server
NVMCTRAY.DLL		8.17.13.8189		NVIDIA Media Center Library
nvDispS.dll		6.14.13.8189		NVIDIA Display Server
PhysX		09.17.0524		NVIDIA PhysX
NVCUDA.DLL		6.14.13.8189		NVIDIA CUDA 8.0.0 driver
nvGameSR.dll		6.14.13.8189		NVIDIA 3D Settings Server
nvGameS.dll		6.14.13.8189		NVIDIA 3D Settings Server

OK, I’ve already filed a bug report and have now added this information.

If selecting the display and compute boards individually in the control panel doesn’t work, you could try manually making only the two GeForce 1080 Ti boards visible to the compute driver and see whether that works around the problem.

You can select a subset of the installed GPU devices to be visible for compute via the CUDA_VISIBLE_DEVICES environment variable. The syntax is nicely explained in the linked article at the beginning of this blog entry: https://devblogs.nvidia.com/parallelforall/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/

That means: leave the control panel setting alone and, for example, if device 0 is the Titan and devices 1 and 2 are the GeForce 1080 Ti boards, try setting the environment variable CUDA_VISIBLE_DEVICES to 1,2 in the computer’s advanced system settings (taking effect globally) and restart the application.

If the topology or the CUDA device numbering is different, change the numbers accordingly. That’s three tries at most.
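
The same setting can also be tried per application instead of globally. The following is a minimal sketch, assuming a machine with three GPUs; compute_env, candidate_pairs and launch are hypothetical helper names, and the command passed to launch is whatever starts your renderer.

```python
import itertools
import os
import subprocess


def compute_env(visible_devices, base_env=None):
    """Copy the environment and restrict CUDA to the given device indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible_devices)
    return env


def candidate_pairs(gpu_count=3):
    """All ways to pick two render GPUs out of gpu_count installed boards."""
    return list(itertools.combinations(range(gpu_count), 2))


def launch(command, visible_devices):
    """Start the application with only the chosen CUDA devices visible."""
    return subprocess.Popen(command, env=compute_env(visible_devices))
```

For example, launch(["blender"], (1, 2)) would start Blender with only CUDA devices 1 and 2 visible, assuming blender is on the PATH. If the numbering is unclear, candidate_pairs() lists the three combinations to try.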

I used the environment variable.
The excluded CUDA device was no longer visible,
but the viewport is still laggy.

It feels like the display GPU uses the same refresh rate as the render view.
In other words, while rendering is enabled,
the framerate of the viewport drops to the refresh rate of Cycles (the renderer).
The framerate gets worse when the render view is enlarged.
When the render view is about 100x100 pixels in size, it’s almost fluid.
On older drivers the viewport had high framerates (even with big render borders),
as if it were not connected to the other GPUs at all.

OK guys, I think I found a solution.

In the NVIDIA Control Panel, I had to set “OpenGL rendering GPU” to the Titan X.
Now it looks like the OpenGL viewport is restricted to the display GPU…

Funny, with the earlier drivers I never had to change this setting.

Thank you for your help and time
have a nice day!

Cheers Daniel

Awesome! Thanks for testing. Those are the easiest bug reports. ;-)

Totally, we are very happy!
It would be awesome if the NVIDIA driver could automatically detect the display GPU (if only one is connected).

Another question: is it possible to play games on a 1080 Ti while the display is connected to a different card?
Is the picture transferred over PCIe? Or do I have to connect the display directly to the desired rendering GPU?

Automatically limiting the GPU that has the monitor attached to display-only use is not actually what you would want with many other multi-GPU capable rendering implementations.

That’s exactly what your second question touches on. If you limited the display to only the Titan, the other devices in the system shouldn’t actually be visible during the display device enumeration that applications perform to determine which device to render on.
For example in D3D12 and Vulkan, the application is responsible for the multi-GPU device selection.

I have no experience with what happens when you set a single display device to a board which doesn’t have the monitor attached. My guess would be no image displayed on the monitor.

(Disclaimer: I’m not up to date with recent WDDM versions.) If no single device is selected for display, I’d expect PCI-E transfers to the primary device, which runs the desktop compositor, when rendering on any non-primary device.