Strange error while reading a PTX file

Hi all,

I have been facing a strange problem while dealing with RT_CALLABLE_PROGRAMs.
Here is my CUDA code :

RT_CALLABLE_PROGRAM void test(float & value)
{
	const float a = cosf(0.2f);
	value = a;
}

Here is my C++ code from host side :

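#include <iostream>
#include <optixu/optixpp_namespace.h>

// ptxpath() is my own helper that builds the path to the compiled .ptx file.
int main()
{
	optix::Context context = optix::Context::create();
	try // This is gonna explode.
	{
		optix::Program program = context->createProgramFromPTXFile(ptxpath("scene"), "test");
	}
	catch (const optix::Exception& e) // catch by const reference to avoid a copy
	{
		std::cout << e.getErrorString() << std::endl;
	}
	context->destroy();
	return 0;
}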

And here is what I get :

Invalid value (Details: Function "_rtProgramCreateFromPTXFile" caught exception: defs/uses not defined for PTX instruction (Try --use_fast_math):  madc.hi, [1310866])

I tried using fast-math, which did not help. I get the same error for cosf, sinf, tanf… But not for acosf or some others. My guess is that these trigonometric functions are supposed to be intrinsics but can’t be used in RT_CALLABLE_PROGRAMs for some reason.

How can I compute a cosine in an RT_CALLABLE_PROGRAM? I could rewrite some trigonometric functions, but I’m not sure it would be the best way here :)

Thanks.
Hamza

Edit : By the way, I’m on Windows 7 and working on a Nvidia GTX Titan.

are you sure that your cuda version is compatible with the optix version and the drivers are up to date?
optix 3.5 needs cuda <= 5.5 and optix 3.0 needs cuda <= 5.0. cuda 6 won’t always work.
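
a quick way to rule this out is to print the CUDA version your headers report and compare it with the limits above. this is only a rough sketch: the helper names are made up, and the table contains nothing but the two pairs quoted in this post. cuda.h defines CUDA_VERSION as major * 1000 + minor * 10 (5050 for CUDA 5.5), which is the encoding assumed here.

```cpp
#include <string>

// Decode a CUDA_VERSION-style integer (major * 1000 + minor * 10,
// e.g. 5050 for CUDA 5.5) into a "major.minor" string.
std::string cudaVersionString(int cudaVersion)
{
    const int major = cudaVersion / 1000;
    const int minor = (cudaVersion % 1000) / 10;
    return std::to_string(major) + "." + std::to_string(minor);
}

// Hypothetical helper: highest CUDA version (same encoding) that an OptiX
// release is said to accept, using only the pairs quoted in this thread.
// Returns 0 for releases the thread says nothing about.
int maxCudaForOptix(int optixMajor, int optixMinor)
{
    if (optixMajor == 3 && optixMinor == 0) return 5000; // OptiX 3.0 -> CUDA <= 5.0
    if (optixMajor == 3 && optixMinor == 5) return 5050; // OptiX 3.5 -> CUDA <= 5.5
    return 0;
}

// True only when the pair is covered by the table above and within its limit.
bool isSupportedPair(int optixMajor, int optixMinor, int cudaVersion)
{
    const int maxCuda = maxCudaForOptix(optixMajor, optixMinor);
    return maxCuda != 0 && cudaVersion <= maxCuda;
}
```

in a real project you would pass CUDA_VERSION from cuda.h instead of a literal, e.g. isSupportedPair(3, 5, CUDA_VERSION), so that stale or mixed-up headers show up immediately.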

I tried to use

const float a = cosf(0.2f);

in a callable program and it worked.

I know I’m using OptiX 3.5 and I’m almost sure I’m using CUDA 5.5 since I had compatibility issues before.
I don’t have access to my computer right now, but I’ll double check.

Thanks.

oh, another guess is to increase the SM version. as far as I know, the default is 1.3 or earlier, which causes some problems with callable programs.

in cmake this works with

OPTIX_add_sample_executable( helsinki
   starter.cpp
   cudaCode.cu

   OPTIONS -arch sm_21
)

I’m already using SM 3.0 so i guess the problem is not here. But I’ll double check this too to be sure.

Edit : Well, I double checked and it appears I was using CUDA 6.0 headers since my last driver update or something like that. Thanks !

Edit 2: Hmm, actually… No. I’m now using CUDA 5.5 and the problem’s still here. I can bypass it by using __cosf instead of cosf, but that’s not a solution in my case.

I cleaned/reinstalled the Nvidia driver, CUDA 5.5 and OptiX 3.5 and still hit the same problem.

is it possible that you still use old dlls? i had to copy them into the exe directory, otherwise it didn’t work. this caused problems with the update to optix 3.5 later.

i think the cmake script from the samples does this by default.

I just replaced the dlls I was using with the new ones and it still changes nothing.
That’s weird.

weird indeed, but it has to be a build or runtime issue. but then weird is the second name of OptiX :)

three more ideas (but maybe you did it already):

  • did you do a full clean and rebuild?
  • you could use the search function and look for CUDA/OptiX dlls, headers etc.
  • create the project from scratch and copy just the sources

otherwise, did you switch it off and on again (probably yes, since you reinstalled the driver)? i’m serious, it solved an issue with a different project just today.

Thanks :)
I tried all of these ideas (even rebooting, I know it can be really useful sometimes…)

I just managed to make the problem disappear using different compilation options.
When using compute_20,sm_20 or higher, the error appears.

So, when using compute_10,sm_10, I don’t get the previous error. I get another one :

Unknown error (Details: Function "_rtProgramCreateFromPTXFile" caught exception: Assertion failed: "is_const", [11403401])

I guess I’m using a feature that’s only available on compute capability 2.0 and higher, but I can’t use the cosine and sine functions while using compute_20,sm_20.

Could you try the small example I gave before under compute_30,sm_30, for example ?

No, sorry, I can’t test with sm_30 because I have only a sm_21 device. But it did work on sm_21 and sm_20 (i used your code)…

with sm < 20 you would have to pickle the pointer/reference arguments, the programming manual tells how to do that. I didn’t try…

And come on, cos should work always. (:
hence it must be a build issue. if nothing works you’ll have to reinstall your operating system :/

Yeah, cos seems to be kind of an essential feature while dealing with ray-tracing :)
I’ll try to start my project from scratch and see what happens. Otherwise, I have access to some other workstations with similar hardware, I’ll try to execute my code on them.

Thank you again!

I just tried to run simple code with cosf on another workstation (CUDA 5.5, OptiX 3.5.1, Tesla C2070) and it still does not work.
Even for the very simple piece of code with only a cosf call, it does not work unless I’m under compute_10,sm_10 or compute_12,sm_12.

According to http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#extended-precision-arithmetic-instructions-madc , madc is an instruction that’s only defined from compute capability 2.0 onward, and since it appears that some “defs/uses” are not defined for it in PTX, that would explain my problem.

Apart from trying to look into the generated PTX code (which I actually did without much success), I don’t see what I can do…
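
For what it’s worth, here is roughly how I searched the PTX. It is only a sketch: findInstruction is a name I made up, and it does a plain substring match on each line (ignoring anything after //), so it can also match a label or identifier that happens to contain the same text.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the 1-based line numbers of every line in a PTX
// listing whose code part (text before any "//" comment) contains the given
// instruction text, e.g. "madc".
std::vector<std::size_t> findInstruction(const std::string& ptxText,
                                         const std::string& mnemonic)
{
    std::vector<std::size_t> hits;
    std::istringstream in(ptxText);
    std::string line;
    std::size_t lineNo = 0;
    while (std::getline(in, line))
    {
        ++lineNo;
        // substr(0, npos) keeps the whole line when there is no comment.
        const std::string code = line.substr(0, line.find("//"));
        if (code.find(mnemonic) != std::string::npos)
            hits.push_back(lineNo);
    }
    return hits;
}
```

Reading the .ptx file into a std::string (with std::ifstream, for instance) and calling findInstruction(text, "madc") at least tells you whether the instruction named in the error message is in the file at all, and where.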

how are you building? can you provide the full project?

did you try to run the optix sample code? maybe modify one example and add your calls. i mean, it’s working on my sm_21 device, why wouldn’t it work on your almighty titan.

Hi,

Sorry, I was out of my office. Here is a link to the full zipped project :
https://www.dropbox.com/l/Z3AB5QxsKzW2kY9kLNZXfs?
I don’t think it’s linked to the GPU I’m using since what fails is the call to createProgramFromPTXFile. By the way, I just tested this very simple example on a colleague’s computer to get the same problem.

Could you try and tell me what you get?

Hi HamzaC, to send your zip file to us please write an e-mail to optix-help@nvidia.com and we’ll provide you a place where to upload your trace/project.

Thanks!

Sorry for the long delay HamzaC, I was under quite a lot of stress and then forgot about it.

And I have no good news since unfortunately I couldn’t make it run. I can’t even compare to my running project, because it uses a different build system (CMake) and the Visual Studio Settings look different.

But maybe you should see it as a sign not to use OptiX :P, I had huge problems with it…

Edit (didn’t see the answer by marknv):
Was it possible to resolve the issue with the help of NVIDIA by now?

Hi adamce,

Thanks for trying :)
The problem is half solved. Really using the --use_fast_math option actually solves the problem. The culprit was Visual Studio: even though the fast math setting was set to “True”, it wasn’t adding --use_fast_math to the nvcc command line. Manually adding it made the problem disappear.
But it’s equivalent to my solution (using __cosf instead of cosf). I still don’t know why one can’t use the accurate version of cosf inside a PTX program. However, my code works, so for now I’m satisfied.
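
For anyone hitting the same mismatch between the IDE setting and what actually gets passed to the compiler, here is a small sketch of a check you could run on the nvcc invocation as printed by your build. The function name is hypothetical, and it assumes the flag appears as its own whitespace-separated argument; nvcc accepts the option with either one or two leading dashes.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true if a logged nvcc command line contains the
// fast-math switch as a separate argument.
bool hasFastMathFlag(const std::string& commandLine)
{
    std::istringstream in(commandLine);
    std::string arg;
    while (in >> arg) // split on whitespace, one argument at a time
    {
        if (arg == "--use_fast_math" || arg == "-use_fast_math")
            return true;
    }
    return false;
}
```

Comparing whole arguments rather than searching the raw string avoids a false positive when the text merely appears inside a file name.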

About using OptiX or not, I read your post on your website (and Mandelbulb generation reminded me so much of an old project of mine!). I have to say that even if I had a lot of trouble debugging my OptiX code until now, using OptiX saved me a lot of time: I’m pretty sure that if I had written my own raytracer, it’d have been slower, buggier and less efficient.
So even though OptiX is sometimes a little bit hard to understand, I really think it’s a very good framework and, depending on what you’re doing, it can be almost essential.

Hi,

did you make any progress on this? I’m getting the exact same error message since I switched to OptiX 3.5.

Thanks

Make sure to really use --use_fast_math when compiling the *.cu files to *.ptx.
OptiX doesn’t handle some of the more involved implementations for trigonometric functions.

BTW, if you’re still on OptiX 3.5.x, version 3.6.0 is available on the download location you received after registering for 3.5. That added support for CUDA 6.0 which includes Maxwell GPUs.