OptiX 6 PTX parse error: Internal error while parsing builtin declarations


I am using OptiX 6 on many different GPUs with no problems, and users of the library I am developing cover even more hardware configurations. However, I have found two cases where an error appears that I don’t understand. One of them is on Google Colab, on a Tesla T4 (NVIDIA-SMI 430.40, Driver Version: 418.67, CUDA Version: 10.1); the other was reported by a user with 2x RTX 2080. The error message says:

Parse error (Details: Function "RTresult _rtProgramCreateFromPTXString(RTcontext, const char*, const char*, RTprogram_api**)" caught exception: (api input string): error: Failed to parse input PTX string
Internal error while parsing builtin declarations

What could be the reason? Thanks for any hint!

For which streaming multiprocessor target do you create the *.ptx source code in the failing cases?

This seems to be isolated to newer boards: if you generated PTX for SM 7.5, for example, the OptiX 6.0.0 parser might not be able to handle it.

It’s normally enough to generate the PTX code with the lowest necessary version to run on all OptiX-supported boards (Maxwell and up for OptiX 6 and higher).
That means generating the code for SM 3.0 (Kepler) or SM 5.0 (first Maxwell) should work on all supported GPUs.
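A compile command along these lines should produce such PTX (a sketch only; the input and output file names are hypothetical, and any flags beyond -ptx/-arch depend on your project):

```shell
# Generate PTX for the lowest necessary target, so one file covers
# all supported boards; the driver JIT-compiles it for the actual GPU.
nvcc -ptx -arch=sm_30 -o my_programs.ptx my_programs.cu
```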

The OptiX 6.5 and 7.0 parsers have been updated to handle newer versions.

Thank you for such a prompt reply!

OptiX 6.5 sounds like a solution, but it requires a driver update (Google Colab just updated to 418; they are quite a few versions behind!). OptiX 7, for the moment, terrifies me with the code rewrites needed on my side.

In my code, I first check the device capabilities. In all problematic cases it is SM 7.5, so I use PTX pre-generated for that target. Parsing throws an exception, and the library then falls back to PTX prepared for lower SMs: 7.0, 6.1, 6.0, 5.2, 5.0. So parsing fails even when SM 5.0 is the target on the Tesla T4.
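The fallback described above can be sketched roughly like this (a minimal shell sketch of the selection logic only, not the library's actual code; the `compute_XX` directory naming follows the repository layout linked later in this thread, and `pick_ptx` is a name I made up):

```shell
# Pick the highest pre-generated PTX variant not exceeding the device SM.
# The device SM is passed as an integer, e.g. 75 for a Tesla T4.
pick_ptx() {
  device_sm=$1
  for sm in 75 70 61 60 52 50 30; do
    if [ "$sm" -le "$device_sm" ]; then
      echo "compute_${sm}/default_exception.ptx"
      return 0
    fi
  done
  return 1  # no suitable pre-generated PTX found
}

pick_ptx 75  # a Tesla T4 gets the compute_75 variant tried first
pick_ptx 52
```

In the real library the fallback only continues to the next lower SM after a parse exception, whereas this sketch just picks the first candidate.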

All the PTX files, targeting all these SMs, are pre-generated in advance on a system with an SM 6.1 board (just forcing the SM target in the compiler options).

Do I understand correctly that this fallback chain is unnecessary, and it should be enough for me to use PTX made for SM 5.0? But that one also fails on some boards, while e.g. on a quite new RTX 2070 Super it works well.

I’m normally translating CUDA code to SM 3.0 PTX and have not seen any problems, though I’m working only with Quadro RTX (Turing) or Pascal boards.

There were definitely issues when using SM 6.1 in OptiX 6.0.0 before the parser was updated.

See these threads:

OK, thanks for the links! Now I understand how this works in OptiX.

Yesterday I tried with SM 3.0 as the target. This one also failed on the Tesla T4. The PTX was generated with CUDA 10.0 and I can see “.target sm_30” there, so it should be configured correctly. The PTX file is here for reference: https://github.com/rnd-team-dev/plotoptix/blob/master/plotoptix/bin/cuda/compute_30/default_exception.ptx.
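For anyone wanting to verify their own files the same way, the target can be read straight from the PTX header (a minimal sketch; the sample header written below is fabricated just to illustrate the format nvcc emits):

```shell
# Write a minimal PTX-style header, then read back the .target directive
# to confirm which SM the file was compiled for.
printf '.version 6.3\n.target sm_30\n.address_size 64\n' > /tmp/sample.ptx
grep -m1 '^\.target' /tmp/sample.ptx
```

The same `grep` on a real PTX file shows the SM it was built for.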

I think I have all the nvcc flags set correctly. If I figure out what else to try, I’ll let you know here. Otherwise I will need to move to the newer OptiX/drivers. I have already asked at Google Colab: https://github.com/googlecolab/colabtools/issues/620#issuecomment-527970556