Ptxas error while migrating from OptiX 6.0 to 7.2

Hi everyone,

I have a working project built against OptiX 6.0 and I’m in the process of moving it to 7.2. As there are plenty of API differences, I have to rewrite my CUDA files, however, for some reason I can’t compile them anymore. I tried removing various parts from the code, until I got the simplest possible thing:

#include <optix.h>

struct Constants
{
	float4* input_buffer;
};
extern "C"
{
	__constant__ Constants cs;
}

extern "C" __global__ void __raygen__oxMain()
{
	const uint3 pixelID = optixGetLaunchIndex();
	cs.input_buffer[pixelID.x] = make_float4(0, 0, 0, 0);
}

… and it doesn’t compile either. The error I’m getting is:

Label expected for argument 0 of instruction 'call'
Call target not recognized
ptxas fatal   : Ptx assembly aborted due to errors

Interestingly, when I googled this error (including on this forum), it was only ever mentioned together with some “function X not found” message, whereas I don’t get anything besides the lines above. Even so, I suspect it has something to do with optixGetLaunchIndex not being compiled correctly.

I’m using:
MSVC 2015, 64-bit project.
CUDA Toolkit 9.1

Any ideas? Thanks in advance.

Could you please list some details about your system configuration as well:
OS version, installed GPU(s), and display driver version are always required.

First, do the OptiX SDK 7.2.0 examples compile and work on your system?

As there are plenty of API differences, I have to rewrite my CUDA files, however, for some reason I can’t compile them anymore.

The code looks fine on first glance.

Are you saying this error happens during compile time of your CUDA code to PTX source code?
In that case there shouldn’t be any errors from the PTX assembler (ptxas) because that is not involved at that step.
The NVCC command line needs to compile from CUDA to PTX source only, not obj or cubins.

I can compile that just fine with MSVS 2019 and CUDA 11.1 and OptiX 7.2.0 headers.
(I just plugged your code into the raygeneration.cu of my OptiX 7 examples.)
The resulting PTX code contains one inline assembly “call” instruction which can only be resolved by the OptiX internal compiler.

// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-29069683
// Cuda compilation tools, release 11.1, V11.1.74
// Based on LLVM 3.4svn

.version 7.1
.target sm_50
.address_size 64

  // .globl	__raygen__oxMain
.visible .const .align 8 .b8 cs[8];

.visible .entry __raygen__oxMain(

)
{
  .reg .f32 	%f<2>;
  .reg .b32 	%r<4>;
  .reg .b64 	%rd<5>;


  .loc 1 41 1

  .loc 2 1096 5
  // inline asm
  call (%r1), _optix_get_launch_index_x, ();
  // inline asm
  .loc 1 44 3
  ld.const.u64 	%rd1, [cs];
  cvta.to.global.u64 	%rd2, %rd1;
  mul.wide.u32 	%rd3, %r1, 16;
  add.s64 	%rd4, %rd2, %rd3;
  mov.f32 	%f1, 0f00000000;
  .loc 1 44 3
  st.global.v4.f32 	[%rd4], {%f1, %f1, %f1, %f1};
  .loc 1 45 1
  ret;
}
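Since the PTX handed to OptiX is plain text like the listing above, a quick host-side sanity check is possible before passing a file on. The following is only a sketch in C++: the function names looksLikePtxSource and findOptixIntrinsics are hypothetical helpers, not part of the OptiX or CUDA APIs, and the check is a simple text heuristic (presence of the .version and .target directives, plus a scan for symbols starting with _optix_).

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: true if the buffer looks like PTX source text,
// i.e. it contains the .version and .target directives seen in the listing.
// An object file or cubin would normally fail this heuristic.
bool looksLikePtxSource(const std::string& text)
{
    return text.find(".version") != std::string::npos &&
           text.find(".target") != std::string::npos;
}

// Hypothetical helper: collects every token that begins at an occurrence of
// "_optix_" and runs to the end of the identifier. These are the symbols
// that only the OptiX internal compiler can resolve.
std::set<std::string> findOptixIntrinsics(const std::string& ptx)
{
    std::set<std::string> names;
    const std::string prefix = "_optix_";
    std::size_t pos = ptx.find(prefix);
    while (pos != std::string::npos)
    {
        std::size_t end = pos;
        while (end < ptx.size() &&
               (std::isalnum(static_cast<unsigned char>(ptx[end])) || ptx[end] == '_'))
        {
            ++end;
        }
        assert(end > pos); // the prefix itself consists of identifier characters
        names.insert(ptx.substr(pos, end - pos));
        pos = ptx.find(prefix, end);
    }
    return names;
}
```

For the listing above, findOptixIntrinsics would report _optix_get_launch_index_x, which is exactly the call target a standalone ptxas cannot recognize.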

Please read some of these threads which talk about the CUDA to PTX compilation NVCC command line options:
https://forums.developer.nvidia.com/t/how-would-you-generally-compile-and-run-a-program-which-has-optix-without-make-or-cmake/75302/2
https://forums.developer.nvidia.com/t/problem-with-running-optix-6-5-program-invalid-value-for-gpu-architecture/157284/4

#include <optix.h>

Make sure that is the version from OptiX SDK 7.2.0.

Which SM target version did you use for the PTX compilation?
Try SM 5.0 which is the minimum GPU version (Maxwell) supported by OptiX since version 6.

You could also try updating the CUDA Toolkit to 10.x or 11.x versions.
(Always install toolkits in ascending version order. Never install the display driver components that come with them.)
According to the CUDA installation guide for Windows even CUDA 11.1 still supports versions of MSVS 2015.

If you say this happens with the compiled PTX code during OptiX module compile or pipeline compile and link time, then that would mean you have a new enough display driver for OptiX 7.2.0 installed, otherwise you wouldn’t have passed the function table ABI version check.
The error is from the ptxas which ships with the display driver and is invoked by OptiX at runtime, so it’s GPU and driver dependent (that’s why that system configuration is mandatory).

Just to be sure, did you compile the PTX input sources with debug information, --device-debug or -G on the NVCC command line? (But then the OptiX compiler would have complained earlier.)
If yes, try without.

Ah, that’s interesting. I thought ptxas is a part of CUDA Toolkit and doesn’t depend on the driver.

Thanks, I will update the driver and see if it works.

I’m on Windows 7 and GTX 1060. I noticed OptiX 7.2 and CUDA 11 release notes mention Win 8.1+, but I was hoping to at least compile it for others to use.

  • Didn’t try to build the examples yet.
  • Using sm_50 didn’t seem to help (I was using sm_30 before).
  • --device-debug didn’t print any more details.

You misunderstood. There is a ptxas inside the CUDA toolkit, which is NOT needed in this case either.

Again, if that is invoked during your CUDA *.cu source code to PTX *.ptx source code compilation, your NVCC command line is incorrect.
ptxas is not involved in that step at all. It’s only required if you compile to CUDA binary formats, which is not the case for the input PTX to OptiX and never was.
That means if you have an existing OptiX 6.0.0 based project, the PTX compilation step itself shouldn’t have changed at all.

There is also a PTX assembler and a microcode generator inside the drivers. Those are invoked on the final CUDA kernel code OptiX compiles internally, that is, at runtime. If that’s where a ptxas error happens, that’s the one inside the driver.
Again, this has nothing to do with your compile-time problem.

Mind that CUDA 11.0 discontinued support for SM 3.0 and 3.2 and deprecated support for SM 3.5, 3.7 and 5.0.
There is an NVCC command line option I’m using to suppress these deprecation warnings:
https://github.com/NVIDIA/OptiX_Apps/blob/master/apps/rtigo3/CMakeLists.txt#L180

Not sure what that “CUDA 11 release notes mention Win 8.1+” is about.
The CUDA toolkit parts you require shouldn’t have any OS version dependency like that.
If at all it’s probably about that display driver in there which is a major pet peeve of mine and shouldn’t be in there if you ask me.
Do not install any of those display drivers inside the toolkit unless you are a user of the first hour when there is no official display driver released supporting the newest CUDA toolkit version, yet. At all other times, install the newest available display driver which supports that CUDA version and be done with it.

Anyway, CUDA 10.x versions should do just fine as well.

That said there are display drivers with CUDA 11.1 available under Windows 7 64-bit for your board: https://www.nvidia.com/Download/Find.aspx?lang=en-us

Didn’t try to build the examples yet.

You should do that as the first step before running your own experiments.
If those work, you can then plug your code into one of the examples and see if that compiles.

Hmmm, my command line only calls nvcc (shortened some paths for forum):

"path/nvcc.exe" -gencode=arch=compute_30,code=\"sm_30,compute_30\" --use-local-env --cl-version 2015 -ccbin "path to vc" -x cu -I"path to includes" -I"path to includes" --keep-dir x64\Release72 -maxrregcount=0 --machine 64 --compile -cudart static -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /FS /Zi /MD " -o x64\Release72\denoisePrepare72.cu.obj "D:\denoiser2\denoisePrepare72.cu"

ptxas must be called from somewhere inside nvcc.

Anyway - will update the driver and report back!

That’s the whole problem. You’re not compiling to PTX source code but to an object.

"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\nvcc.exe" --use-local-env --cl-version 2015 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -x cu --keep-dir x64\Release72 -maxrregcount=0 --machine 64 --compile -cudart static -o x64\Release72%(Filename)%(Extension).obj "%(FullPath)"

Do not use --compile and do not use *.obj as output for the *.cu files.

A typical NVCC command line for OptiX PTX code looks something like this:

nvcc.exe --machine=64 --ptx --gpu-architecture=compute_50;--use_fast_math;--relocatable-device-code=true;--generate-line-info;-Wno-deprecated-gpu-targets;-IC:/sdk/OptiX SDK 7.2.0/include;-IC:/optix_apps/apps/intro_runtime/shaders C:/optix_apps/apps/intro_runtime/shaders/raygeneration.cu -o C:/optix_apps/msvs2019/bin/$(ConfigurationName)/intro_runtime_core/raygeneration.ptx

That is the CMake output you get when uncommenting this line in my OptiX 7 examples:
https://github.com/NVIDIA/OptiX_Apps/blob/master/3rdparty/CMake/nvcuda_compile_ptx.cmake#L44
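The difference between the two command lines in this thread comes down to three things: --ptx is present, --compile is absent, and the output file is a *.ptx. As a rough illustration, here is a small C++ sketch that checks an already tokenized NVCC argument list for those three points. The function name findPtxCompileProblems is a hypothetical helper, it only looks at the options mentioned in this thread plus their short forms -ptx and -c, and it expects the output file to be given as a separate argument after -o.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if s ends with the given suffix.
static bool endsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: returns a list of reasons why this NVCC argument list
// would not produce PTX source suitable as OptiX input. An empty result means
// none of the three checks found a problem.
std::vector<std::string> findPtxCompileProblems(const std::vector<std::string>& args)
{
    assert(!args.empty()); // expects at least the nvcc executable name

    std::vector<std::string> problems;
    bool hasPtx = false;
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        const std::string& a = args[i];
        if (a == "--ptx" || a == "-ptx")
        {
            hasPtx = true;
        }
        if (a == "--compile" || a == "-c")
        {
            problems.push_back("--compile produces an object file, which runs ptxas");
        }
        if (a == "-o" && i + 1 < args.size() && !endsWith(args[i + 1], ".ptx"))
        {
            problems.push_back("output file does not end in .ptx: " + args[i + 1]);
        }
    }
    if (!hasPtx)
    {
        problems.push_back("missing --ptx");
    }
    return problems;
}
```

Fed the object-file command line from above, this would report all three problems. For the --ptx command line it returns an empty list.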

Sorry, I’m an idiot. Apparently I somehow reset compilation settings per .cu file. Indeed they just weren’t set to compile to ptx! Ouch.