Access Violation in nvcompiler.dll: Looks like a bug, flies like a bug

I am on 64-bit Windows, using driver 195.62. I tried to extend a kernel that I have had for a while. I added a program-scope constant sampler, 3 functions, and an image2d_t argument to the kernel. The kernel conditionally calls one of the new functions, passing it the image.

The call to clCreateProgramWithSource sets errcode_ret to CL_SUCCESS. The subsequent call to clBuildProgram, with the cl_program it returned, causes the access violation. I am calling from a Java VM. The beginning of the generated log file, below, shows the error in nvcompiler.dll:

[codebox]# A fatal error has been detected by the Java Runtime Environment:
EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000000004899d491, pid=6028, tid=4976
JRE version: 6.0_17-b04
Java VM: Java HotSpot™ 64-Bit Server VM (14.3-b01 mixed mode windows-amd64 )
Problematic frame:
C [nvcompiler.dll+0x6d491]
[/codebox]

I pulled the zip drive, where everything is stored, out of the PC and put it in an OS X machine. There it compiles and runs, although the run-time condition that would actually get the new code called does not exist yet. At least it does not just blow away my process.

Is there any way anyone can think of in which this is not a bug? Is there any problem isolation I should try?

I forgot: I also added a typedef’ed struct, containing just a float4, an int, and an int2. No banned substances like image2d_t (section 6.8). My sampler has program scope, so I have no need to do any of the restricted things with it that are listed just below the ban on image2d_t in structs.

I had the same thing happen before in that shared library. After much selective commenting-out of code, it turned out I was using an undeclared variable in my kernel. It shouldn’t cause a crash, but this is fresh software. :-) Try looking for the use of an undeclared variable in your kernel code.

Thanks for the suggestion, but remember it did compile on OS X. I have tracked it down to a bug in passing a typedef’ed structure to a function. I submitted a problem report. Then, while thinking about a workaround, I realized that I really wanted to pass a pointer, so the function could update the structure and keep track of where in the image I was. That compiles with no problem and is producing reasonable data that still needs to be verified.
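A minimal sketch of that pointer-passing workaround, written as plain host C rather than OpenCL C so it can be compiled anywhere. Everything in it is hypothetical: `cursor_t`, `next_value`, and the flat `float` array standing in for the image are illustrative names, not the poster's actual code, and `width` is assumed to be positive. In a kernel, the struct would live in private memory and the function would receive a pointer to it.

```c
/* Hypothetical cursor that remembers where we are in the image. */
typedef struct {
    int x;      /* current column */
    int y;      /* current row */
    int reads;  /* number of pixels read so far */
} cursor_t;

/*
 * Returns the pixel under the cursor and advances the cursor in
 * row-major order. Because the cursor is passed by pointer, the
 * caller sees the updated position after the call.
 * Returns 0.0f once the cursor has moved past the last row.
 */
float next_value(cursor_t *c, const float *image, int width, int height)
{
    float v;

    if (c->y >= height) {
        return 0.0f;            /* past the end of the image */
    }
    v = image[c->y * width + c->x];
    c->reads++;
    c->x++;
    if (c->x == width) {        /* wrap to the start of the next row */
        c->x = 0;
        c->y++;
    }
    return v;
}
```

The caller keeps a single `cursor_t` and hands its address to every call, so no structure is ever passed by value.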

I went back and tried to set the priority to low in a comment on the report. I have not worked with a non-OO language in over 15 years, and it is hard to imagine structures anymore. My 1978 Kernighan & Ritchie says that passing structures was not even possible then. It has to be by now.

FYI, here is the code example I used in the report:

[codebox]typedef struct {float member;} myType;

float getValue(myType z){
    return 1.3f;
}

kernel void foo(global float *output){
    myType z;
    output[0] = getValue(z);
}
[/codebox]
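On the question of whether passing structures is allowed: standard C has permitted passing a struct by value since C89, and OpenCL C is based on C99, so the report's code should be legal. The sketch below reuses `myType` from the report in plain host C. The `bumpByValue` function is a made-up name for illustration. It shows that the callee works on a copy, which is also why a pointer is needed if the function has to update the caller's struct.

```c
typedef struct { float member; } myType;

/*
 * Receives a copy of the struct. The increment changes only the
 * local copy; the caller's struct is left untouched.
 */
float bumpByValue(myType z)
{
    z.member += 1.0f;
    return z.member;
}
```

Calling `bumpByValue(z)` with `z.member` set to `1.0f` returns `2.0f` while `z.member` in the caller stays `1.0f`.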

I also added this at the bottom of the report:

There are other forum threads, which have complete examples, that might also need to be considered at the same time:

Initializing data structures http://forums.nvidia.com/index.php?showtopic=155503

Structure copy error http://forums.nvidia.com/index.php?showtopic=155418

Structure alignment bug related to calling build-in functions http://forums.nvidia.com/index.php?showtopic=107811