ATTN: David Hart -- All drivers after 430.86 are breaking Iray for DAZ Studio 4.12

As the title says, all display drivers after 430.86 seem to be causing Iray CPU fallback on both RTX and non-RTX cards, regardless of the amount of memory used by the scene. The only difference is that the newer the driver, the faster it falls back to CPU. With the latest 441.28 Studio driver I just switch to Iray preview on an empty scene and it falls back to CPU.

Since it worked in 430.86, I am going to assume it is a driver regression. There is no proper channel for a user to report this issue to NVIDIA, so because Iray and OptiX are connected I am posting here, hoping that someone from the NVIDIA OptiX team will pick it up and investigate.

2019-11-23 16:47:04.409 Iray [INFO] - IRAY:RENDER ::   1.15  IRAY   rend info : Using OptiX version 6.1.2
2019-11-23 16:47:04.414 Iray [INFO] - IRAY:RENDER ::   1.15  IRAY   rend info : Initializing OptiX for CUDA device 0
2019-11-23 16:47:04.478 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.15  IRAY   rend error: "buildInputs[0].instanceArray", "numInstances" is non-zero, but "instances" is null
2019-11-23 16:47:04.478 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.15  IRAY   rend error: optixAccelBuild( context,nullptr, &opt,&inst_input,1, get_address(m_d_scratch),tlas_size.tempSizeInBytes, get_address(m_tlas.data),tlas_size.outputSizeInBytes, &m_tlas.traversable, &bbox_desc,1) failed: Invalid value
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.10  IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): Scene setup failed
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.10  IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): Device failed while rendering
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.10  IRAY   rend warn : All available GPUs failed.
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.10  IRAY   rend warn : No devices activated. Enabling CPU fallback.
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.10  IRAY   rend warn : Re-rendering iteration because of device failure
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.10  IRAY   rend error: All workers failed: aborting render

DAZ Studio version is 4.12.1.16 Public Beta. Iray version is 5.0, build 317500.5529 n, 14 Sep 2019, nt-x86-64-vc14. Hardware is RTX 2080 Ti.

Thanks for letting us know, it has been reported to the Iray team.


David.

David,

Thanks for reporting it.

There is a thread on DAZ forums where numerous people complained about it, with different errors than the one I posted here (memory access errors, etc):
https://www.daz3d.com/forums/discussion/341006/daz-studio-pro-beta-version-4-12-1-16-updated

Search the thread for CPU fallback posts and check the DAZ studio logs people included to see the variations.
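
For anyone comparing the logs in that thread, here is a minimal Python sketch that pulls the Iray warnings and errors out of a DAZ Studio log and checks whether the render fell back to CPU. The regular expression is an assumption based only on the log lines quoted in this thread, and the names `iray_problems` and `fell_back_to_cpu` are made up for illustration; other Iray versions may format their lines differently.

```python
import re

# Matches the Iray part of a DAZ Studio log line, for example:
#   Iray [ERROR] - IRAY:RENDER ::   1.10  IRAY   rend error: <message>
# The pattern is inferred from the lines quoted in this thread only.
IRAY_LINE = re.compile(
    r"Iray \[(?P<level>INFO|WARNING|ERROR)\] - IRAY:RENDER ::\s+\S+\s+IRAY\s+"
    r"rend (?:info|warn|error)\s*:\s*(?P<message>.*)$"
)

# Three lines copied from the log posted above.
SAMPLE = r"""
2019-11-23 16:47:04.409 Iray [INFO] - IRAY:RENDER ::   1.15  IRAY   rend info : Using OptiX version 6.1.2
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.10  IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): Scene setup failed
2019-11-23 16:47:04.479 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.10  IRAY   rend warn : No devices activated. Enabling CPU fallback.
"""


def iray_problems(log_text):
    """Return (level, message) pairs for every Iray warning or error line."""
    found = []
    for line in log_text.splitlines():
        match = IRAY_LINE.search(line)
        if match and match.group("level") != "INFO":
            found.append((match.group("level"), match.group("message").strip()))
    return found


def fell_back_to_cpu(log_text):
    """True if any Iray message announces the CPU fallback."""
    return any("Enabling CPU fallback" in msg for _, msg in iray_problems(log_text))


if __name__ == "__main__":
    for level, message in iray_problems(SAMPLE):
        print(level, "|", message)
    print("CPU fallback:", fell_back_to_cpu(SAMPLE))
```

Running the same function over several users' logs makes it easier to see which error precedes the fallback in each case.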

I have this problem with 430.86 as well, but it triggers under conditions that are very hard to pin down. Sometimes just changing a single texture on a model will do it, sometimes switching spectral rendering to ON will do it.

NVIDIA has released so many driver versions recently that it is getting hard to track all this. It seems that quality assurance has gone downhill: the latest CUDA 10.2 even comes with a non-WHQL (unsigned) driver.

It would be nice if NVIDIA slowed down a bit and focused on fixing the bugs for a while now after adding so many new (often half-baked) features.

Just my 0.02c.

If this is scene dependent, do you have a minimal scene or config files you can share with the Iray team that reproduces the problems reliably? It could take longer to fix if the conditions to reproduce aren’t easily discoverable or reproducible all the time.

I’ll pass your comments on, but I think it’s premature to assume this has anything to do with the driver. The error log you shared looks like it’s a bug in Iray to me. It’s possible for application bugs to surface in different driver versions due to increased error reporting and increased QA. OptiX has been adding more error checking recently, and other applications have had bugs exposed because of the extra checking; my initial speculative guess is that’s what’s happening in your case as well.
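
To illustrate the point about stricter checking, here is a toy Python model. It is not OptiX code: `InstanceInput`, `build_lenient` and `build_strict` are hypothetical names, and the only thing borrowed from the log above is the wording of the error message. It shows how the same inconsistent input from an application can pass unnoticed through a lenient build and be rejected once validation is added.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class InstanceInput:
    """Hypothetical stand-in for an instance build input (not the OptiX struct)."""
    instances: Optional[Sequence[object]]
    num_instances: int


def build_lenient(inp):
    """Toy 'older' behaviour: a missing instance list is silently treated as empty."""
    if inp.instances is None:
        return "empty acceleration structure"
    return f"acceleration structure with {inp.num_instances} instances"


def build_strict(inp):
    """Toy 'newer' behaviour: the inconsistency is reported before building."""
    if inp.num_instances != 0 and inp.instances is None:
        raise ValueError('"numInstances" is non-zero, but "instances" is null')
    return build_lenient(inp)


if __name__ == "__main__":
    # The application-side mistake: a count of 1 with no instance data.
    bad = InstanceInput(instances=None, num_instances=1)
    print(build_lenient(bad))
    try:
        build_strict(bad)
    except ValueError as err:
        print("rejected:", err)
```

In this model the mistake is in the caller in both cases; only the visibility of the mistake changes between the two builds.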


David.

David,

The scenes are in DAZ Studio format, not sure if sharing those would help?

Anyway, it seems that we are both right and that there are two separate issues at play here.

Issue #1:

With both DAZ Studio 4.12.0.86 General Release (which is using Iray 5.0-beta, build 317500.3714 n, 19 Jul 2019, nt-x86-64-vc14) and DAZ Studio 4.12.1.16 Public Beta (which is using Iray 5.0, build 317500.5529 n, 14 Sep 2019, nt-x86-64-vc14), on the R440 branch of NVIDIA drivers (I tested with both the 441.12 and 441.28 Studio drivers), just launching DAZ Studio to an empty environment (empty scene) and switching the viewport to Iray preview triggers an immediate CPU fallback with the error shown in the log above.

This particular error does not happen with 431.86 or 430.86 studio driver.

I have just seen another user on DAZ forums post that they don’t have any issues with 441.28 studio driver, but they use GTX 1060 card so this seems to be an RTX specific issue.

Issue #2:

With DAZ Studio 4.12.1.16 Public Beta (which is using Iray 5.0, build 317500.5529 n, 14 Sep 2019, nt-x86-64-vc14) and older studio driver versions such as 431.86 and 430.86, loading a certain human figure, a certain clothing item, and any material for said clothing item causes CPU fallback with a different error:

2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.2   IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): an illegal memory access was encountered (while launching CUDA renderer)
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.2   IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): Failed to launch renderer
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.15  IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): Device failed while rendering
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.15  IRAY   rend warn : All available GPUs failed.
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.15  IRAY   rend warn : No devices activated. Enabling CPU fallback.
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [WARNING] - IRAY:RENDER ::   1.15  IRAY   rend warn : Re-rendering iteration because of device failure
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.15  IRAY   rend error: All workers failed: aborting render
2019-11-24 13:11:22.077 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(304): Iray [ERROR] - IRAY:RENDER ::   1.15  IRAY   rend error: CUDA device 0 (GeForce RTX 2080 Ti): an illegal memory access was encountered (while de-allocating memory)

This does not happen with DAZ Studio 4.12.0.86 General Release so this might as well be an Iray (or even DAZ Studio application) issue. I am thinking that Iray might be more likely to blame because sometimes there is visual corruption in the Iray viewport after loading materials – they affect the parent figure textures (for example the texture of the clothing item seems to be blended with the skin, and not just color but bump as well). Toggling Iray preview or loading another object clears that up.

I will submit a bug report to DAZ 3D for issue #2 with steps to reproduce, but issue #1 seems like a driver bug to me.

If you have scenes that reproduce the issue, yes it would help to have access to those scenes. I’m sure someone on the Iray team can use DAZ assets. If you can post a link, I will attach the assets to the bug report I filed.

Your second log error also indicates the problem is likely an application side issue. It is possible that all of your bug reports here could actually be problems with errors in specific DAZ assets. They could be bugs in DAZ Studio, or Iray. And they could be bugs in RTX or the driver, that is also possible, but it should not be assumed, even if the issues are only being experienced with certain GPUs or certain driver versions. Again, any changes that are tested safe on the driver side can expose latent bugs in the application code. In fact, this just happened with our own SDK samples, here on the forum.

It would be best to report ALL the issues you have while running DAZ to DAZ, and let DAZ rule out asset & application errors and then DAZ can contact the Iray team about any bugs that DAZ believes are out of their hands. In that case, the Iray team will be able to determine whether the problem is an Iray application issue or a driver issue, and get it fixed. Skipping over DAZ and Iray, and jumping straight to driver bug reports, will typically lead to it taking longer to get the bug fixed than if the problem is first vetted properly by DAZ and then Iray.


David.

David,

Of course I did report both issues to DAZ.

  • Issue #1 ticket is #314541.
  • Issue #2 ticket is #314537.

For reproducing both issues, the common SW & HW prerequisites are:

  1. Windows 10 x64 1809 (I am on Enterprise LTSC, shouldn’t matter much as long as RTX is supported)
  2. RTX 2080 Ti card or similar
  3. DAZ Studio 4.12.0.86 General Release
  4. DAZ Studio 4.12.1.16 Public Beta

They can be installed and used side by side. You obtain Beta by “purchasing” it (it’s free) and afterwards it shows in your DAZ Install Manager once you check “Public Build” in Download Filters.

For the issue #1, additionally:

  1. NVIDIA Studio or Game Ready Driver R440 series (any version).

Steps to reproduce issue #1:

  1. Launch DAZ Studio (any version).
  2. Switch to Iray preview.

Iray will immediately fall back to CPU even with an empty scene.

For the issue #2, additionally:

  1. NVIDIA Studio or Game Ready Drivers lower than R440 but at least 430.86 (DAZ Studio requirement).
  2. Genesis 8 Female figure installed
  3. Babina 8 Character (https://www.daz3d.com/babina-8)
  4. Lingerie Robe Set (https://www.daz3d.com/lingerie-robe-set-for-genesis-8-female-s)

Steps to reproduce issue #2:

  1. Launch DAZ Studio 4.12.1.16 Public Beta.
  2. Switch to Iray preview.
  3. Load Babina 8 Character
  4. Load Lingerie Robe Set Panties
  5. Load Lingerie Robe Set Bra

Iray will immediately fall back to CPU.

In the next post I will give my feedback on CUDA and driver releases.

This is what I have to say about recent NVIDIA releases.

First, if NVIDIA releases drivers which break professional graphics applications then in my book that’s a driver regression regardless of whether it is actually a bug in the driver itself, application, Iray, OptiX SDK, or CUDA.

It is totally irresponsible behavior from NVIDIA to release drivers which are breaking professional graphics applications when you know how long it can take for the whole chain to get diagnosed, reported, reproduced, triaged, fixed, and updated before it reaches the customers.

R440 drivers should not have passed QA without checking whether Iray still works in at least the latest release version of DAZ Studio (which is 4.12.0.86).

Sadly, that’s not the only example of poor QA – CUDA 10.2 was released with driver 441.20 in the package which is not even WHQL signed and as such it won’t even be loaded by Windows after install. If you do install it and reboot your PC will boot with a monitor powered off and you will have to reach for the reset button to reboot into VGA mode and fix this by installing proper drivers. How did this release of CUDA 10.2 happen with unsigned driver?

Furthermore, I reported a bug in CUDA and display driver setup (it’s in NVI which is common to both) which corrupts PATH variable which after 6+ months still isn’t fixed, and I was wrongly informed that it was fixed in CUDA 10.2 release.

I spent considerable effort reporting that bug, communicating, waiting, uninstalling CUDA 10.1, installing CUDA 10.2, recovering my system from unsigned drivers, reporting that it’s still broken.

I also spent most of my weekend reinstalling different driver versions around 20 times to diagnose those DAZ Studio issues which ultimately point to driver and Iray, and frankly at this point I feel like an idiot because I paid 1,600 EUR for RTX 2080 Ti, I am doing the job of NVIDIA’s QA team, and I am not getting paid for it.

The worst part? I still can’t even properly use the bloody thing for raytracing most of the time, and it’s been more than a year since I got it. I don’t dare to imagine what my frustration level would have been if I bought a Quadro card.

You guys should try to step into your users’ shoes for a while, and see how long you would tolerate screwup after screwup like that. I know you do a lot of great work too and I appreciate that (I love and use the NVJPEG library), but as I said I think it’s high time to slow down the feature mill and do some serious bug and QA procedure fixing.

Igor, the team is in the process of investigating the issues you brought up and fixing them. They are taking your point seriously and considering this a driver regression as you suggest.

One of the issues you mentioned has a fix already, the one where Iray will fall back to CPU with an empty scene (Issue #1). This fix will appear in the next driver update. You might be able to avoid this problem by not testing empty scenes in the meantime.

For Issue #2, I believe the team is going to try and repro, but yes as you mentioned since it’s a beta version of DAZ, it may be best to wait for the next general release on that one.

We hear your level of frustration loud and clear. I know first hand what it feels like as a user when things don’t work and it seems like nobody is fixing it. The OptiX team doesn’t have any say whatsoever in WHQL signing or CUDA toolkit releases, but we will take your feedback and strive to improve OptiX releases going forward as much as we possibly can.

Out of curiosity, does DAZ publish certified driver version numbers with their releases? Some of our other customers recognize that the fast pace of driver updates isn’t necessarily good for users, and have started publishing their own list of known good driver version numbers so users don’t waste their time trying to update and test the latest driver releases before the app developers have a chance to do it. If DAZ isn’t doing this yet, and you would like us to suggest it, we’d be happy to reach out to them and propose it from our end.
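
As a rough idea of what such a list could look like in practice, here is a minimal Python sketch of a known-good driver check. The `KNOWN_GOOD` table and the function names are hypothetical; the two driver versions listed are simply the ones reported earlier in this thread as not showing the empty-scene fallback with the 4.12.0.86 General Release, not a list published by DAZ.

```python
def parse_version(text):
    """Turn a driver version such as '431.86' into a comparable tuple (431, 86)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


# Hypothetical table: application version -> drivers someone has actually tested.
KNOWN_GOOD = {
    "4.12.0.86": ["430.86", "431.86"],
}


def driver_status(app_version, driver_version, known_good=KNOWN_GOOD):
    """Classify a driver against the known-good list for one application version."""
    tested = known_good.get(app_version)
    if tested is None:
        return "unknown application version"
    if driver_version in tested:
        return "known good"
    newest_tested = max(parse_version(v) for v in tested)
    if parse_version(driver_version) > newest_tested:
        return "newer than any tested driver"
    return "not on the known-good list"


if __name__ == "__main__":
    for driver in ("431.86", "441.28"):
        print(driver, "->", driver_status("4.12.0.86", driver))
```

The "newer than any tested driver" result is the case a user would want to be warned about before updating.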


David.

David,

Thank you very much for your patience with me, and also for escalating this.

Just to clarify regarding issue #1 – I wasn’t testing empty scenes, Iray always falls back to CPU regardless of scene content for me. That’s the reason I classified it as a driver regression.

As for issue #2, I just saw another user reproduce it with the assets I listed. For them it took loading the HD version of the figure and loading and deleting clothing assets four times before it fell back to CPU, but they can reproduce it every time they do that, so they added this information to their support ticket as well. Hopefully DAZ figures out what the problem is and publishes another Public Beta soon.

The reason some of us are even using DAZ Studio Public Beta instead of General Release is that it has some new interesting features (such as strand based hair) and newer Iray which brings better performance for RTX owners. Furthermore, until 4.12.0.86 General Release was published, those Public Betas were the only versions which worked on RTX cards while not even using RTX cores for the most part of this year.

As for WHQL signing and CUDA toolkit I know your team has nothing to do with that but I wanted to give you more examples of poor QA to better get my point across. I would appreciate if you forward that feedback to the relevant teams.

To answer your question. DAZ publishes minimum required driver version – for current General Release and Public Beta it is 430.86 or higher. I believe this is dictated by the Iray/OptiX/CUDA requirements. It would be great if NVIDIA could work with DAZ so things like this don’t happen.

Finally, regarding whether driver updates are good or not – I personally believe that they are, and that most people should always update. However, people working with professional graphics applications and developers who need to use a specific driver version for 3D, video encoding, CUDA, etc., should have a way of knowing which version will work for them and stick to it until the next one is ready.

Sadly, there are two issues with this.

  1. This is a personal example, but I am sure there are others in similar situation – I work with DAZ Studio, but I am also a software developer and I use CUDA libraries in some of my projects. Iray issue has forced me to roll back to 431.86, and as far as I know, CUDA 10.2 requires R440 series driver, and it gets even worse if you throw latest games into the mix. Now what, have a separate PC for each task with different driver version installed?

  2. There exist two mechanisms of automatic driver updates – GeForce Experience (which at least I can avoid installing) and, in Windows 10, Windows Update. Just a few days ago Windows decided to change my video driver from a working one to 432.00 while I was using the PC. For regular users this is good, but I specifically used a group policy designed to prevent this and it still did it, probably because the driver was tagged as a security update.
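
The conflict in point 1 can be written down as a small Python sketch. The requirement rules are my simplified reading of what was said in this thread (DAZ needs 430.86 or higher, issue #1 appears on R440, CUDA 10.2 needs R440 as far as I know), and the function names are made up for illustration; treating every pre-R440 driver as safe from issue #1 is an assumption, since only 430.86 and 431.86 were actually tested.

```python
def parse_version(text):
    """Turn a driver version such as '431.86' into a comparable tuple (431, 86)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


# Driver versions mentioned in this thread (not a complete list of releases).
CANDIDATES = ["430.86", "431.86", "441.12", "441.28", "441.41"]

# Each rule is a simplified reading of a statement in the thread.
REQUIREMENTS = {
    "DAZ Studio minimum (430.86 or higher)": lambda v: v >= (430, 86),
    "avoid issue #1 (assumed: any pre-R440 driver)": lambda v: v[0] < 440,
    "CUDA 10.2 (assumed: any R440 driver)": lambda v: v[0] >= 440,
}


def usable_drivers(candidates, requirements):
    """Drivers that satisfy every requirement at once."""
    return [
        c for c in candidates
        if all(check(parse_version(c)) for check in requirements.values())
    ]


def blockers(version, requirements):
    """Names of the requirements that a given driver fails."""
    parsed = parse_version(version)
    return [name for name, check in requirements.items() if not check(parsed)]


if __name__ == "__main__":
    print("usable:", usable_drivers(CANDIDATES, REQUIREMENTS))
    for candidate in CANDIDATES:
        print(candidate, "fails:", blockers(candidate, REQUIREMENTS))
```

Under these assumptions no single driver satisfies all three requirements, which is exactly the situation I am in.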

In other words, it is getting harder and harder to control our working environments, and with software vendors rushing new features and getting more and more lax with QA we are nearing the point where we spend more time tinkering with our PCs to make everything click in place than actually using them. This is reminding me of the time when I owned Radeon card and had to change drivers for every game I wanted to play because not a single one worked with all of them. That was around 15 years ago, please let’s not go back there.

Igor, I heard a rumor that Daz released a Daz Studio Beta update yesterday that fixes some of these issues you mentioned. Would you mind trying 4.12.1.40 Beta and letting us know which of the problems you listed have been resolved?


David.

David,

It is not a rumor, DAZ did publish 4.12.1.40 Public Beta – they have a public changelog here:
http://docs.daz3d.com/doku.php/public/software/dazstudio/4/change_log

In the details thread there is this info:

- Integrated Iray RTX 2019.1.5 (317500.7473)
   REQUIRES NVIDIA Driver 430.86 (or newer) on Windows
   This requirement has not changed since the 4.12.0.33 build
 - Works around defects in 440 series drivers
 - Removed the "Architectural Sampler" property from Render Settings
   Support for this setting was removed from NVIDIA Iray as of 2017.1 beta (296300.616)

I cannot test myself at the moment but according to other users the issue #1 seems to have been worked around in 4.12.1.40 Public Beta. However, that issue still affects 4.12.0.86 General Release so either DAZ has to push new General Release (which I doubt they would be willing to do so soon, especially given the issue #2 is still present in the new Public Beta) or NVIDIA has to fix this in the next driver. Frankly, I hope for the latter.

The issue #2 with fallback to CPU which was present in 4.12.1.16 Public Beta and all driver versions from 430.86 to 441.28 (the one with figure and clothing triggering CPU fallback) still persists under the exact same conditions for me and at least another user in the latest 4.12.1.40 Public Beta.

That’s the reason I have to stay on 4.12.0.86 and 431.86 until both of those issues are fixed.

If you want to see what people using DAZ Studio Beta are saying about new beta and drivers look here:
https://www.daz3d.com/forums/discussion/341006/daz-studio-pro-beta-version-4-12-1-40-updated

Note that the thread is cumulative (it includes comments for older beta versions as well) so you need to read backwards to see the latest posts. Take what you read there with a grain of salt though, not everyone there is tech-savvy and systematic in their testing efforts.

David,

Is there any news on when the new driver with a fix for CPU fallback will be available for download?

I can’t use CUDA 10.2 with 431.86 and I can’t render with 441.28.

Even a hotfix driver would be of great help.

Is there any ETA for a fixed driver?

No, I’m sorry, but we do not publish release dates in advance of releases. You may be able to guesstimate based on the recent pace of our driver releases. At the moment, I can only recommend trying the DAZ Studio Beta that already has a fix for this problem, or use the most recent stable combination of DAZ & NVIDIA driver from before the problem appeared. I’m not certain whether the next driver will fully resolve all the fallback issues you have or whether it will require another DAZ Studio release.


David.

David,

  1. I reinstalled Windows 10 – now I am on 1909 version, fully patched (10.0.18363.476).
  2. I installed NVIDIA Game Ready Drivers 441.41.
  3. I installed both DAZ Studio 4.12.0.86 and DAZ Studio Public Beta 4.12.1.40.
  • In DAZ Studio 4.12.0.86 I can work if I switch to Iray preview after I load something into the scene.

  • In DAZ Studio Public Beta 4.12.1.40 I still get immediate CPU fallback as soon as I switch any material – for example load different eye material for the figure.

I see that there is a new driver 445 for Windows Insiders which also supports CUDA 11, so I take it we should expect a new release soon. I wonder if the R445 branch has any of those fixes, and I am even tempted to test it, though I’d rather NVIDIA publish a new driver, even if it’s beta, than have to download dumped drivers from some untrusted sources.

This appears to be an issue still. Over a year later. Guess this is a necro bump? Am I mistaken in my attempts to debug issues based on ‘improvements’? Performance seems to have taken a backseat from pushing “RTX” and other abstraction features. If it matters I can put some dump output here. Pretty much had it with Nvidia TBH. If you did not have Cuda half your base would have left.

Hi @raytronics, welcome to the OptiX forum!

Have you contacted DAZ support about this issue? The thread that you’re replying to was a problem supporting features between different versions of DAZ and Iray. It was not a problem in either OptiX or in the NVIDIA display driver, and DAZ and Iray together fixed and resolved that issue more than a year ago. I can understand your frustration, but just so you are aware, this thread should have been a DAZ support ticket and does not really belong on the OptiX developer forum.

What is the issue, exactly? What is your driver version, OptiX SDK version, CUDA toolkit version, GPU model, VRAM size, system OS? Does this problem occur with the latest driver, and latest versions of DAZ and Iray? What versions of these applications are you using?

What have you done to debug your problem, and what evidence do you have that the problem you’re seeing is caused by anything outside of the DAZ Studio application? Is it an issue with a specific DAZ feature?

I don’t know if there’s anything we can do to help you, but if you would like any help with this, then you will need to send some details including the output of the error messages you’re seeing and the version numbers of the software and hardware you’re using.
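
As a rough sketch of how some of those details could be gathered, the Python snippet below queries `nvidia-smi` for the GPU name, driver version and VRAM size and adds the OS description. The function names are hypothetical, the sample row in the parser's docstring is only an illustration of the CSV shape, and the DAZ Studio, Iray, OptiX and CUDA toolkit versions would still need to be added by hand.

```python
import platform
import shutil
import subprocess

# nvidia-smi query for GPU name, driver version and total memory, as CSV without a header.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_rows(csv_text):
    """Parse rows shaped like 'GeForce RTX 2080 Ti, 441.28, 11264 MiB' (illustrative)."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # Split from the right so a comma inside the GPU name would not break parsing.
        name, driver, memory = (field.strip() for field in line.rsplit(",", 2))
        rows.append({"gpu": name, "driver": driver, "vram": memory})
    return rows


def collect_report():
    """Collect OS and GPU details; the GPU list stays empty if nvidia-smi is not found."""
    report = {"os": platform.platform(), "gpus": []}
    if shutil.which("nvidia-smi"):
        result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
        report["gpus"] = parse_gpu_rows(result.stdout)
    return report


if __name__ == "__main__":
    print(collect_report())
```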


David.