Problem with kernel compilation using the 1.0 SDK

I am getting a problem with kernel compilation while trying to do image processing.

What I get is this:

Program Build Log:

ptxas ptx input, line 430; error   : Illegal argument for formal parameter 'ptxbe_rdo_image'

The problem occurs when a kernel calls the same function twice; the compiler doesn’t complain when the function is called only once. It is strange that this shows up at compile time.

Example:

foo(bla, bla, bla);

will compile just fine.

However, if my kernel does this:

foo(bla, bla, bla);

foo(bla, bla, bla);

It fails!

Any ideas why this would happen? I can’t post the code in its entirety, but rest assured this is the exact problem. Just by commenting or uncommenting one line, which is an exact copy of the line above it, I can make the problem appear or disappear. I have never seen or debugged such a problem before. Please let me know how to solve it.

Regards,

I have the same problem:

[codebox]__constant sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;

unsigned char readPixel(__read_only image2d_t frame, int2 coords)
{
    // add 0.5f and round towards -INF to get the desired rounding
    // behavior on values that are x.50000f
    float norm = read_imagef(frame, sampler, coords).x;
    return (unsigned char) floor(0.5f + 255.0f * norm);
}

void loadBlock(__read_only image2d_t image,
               int x, int y,
               int tid,
               __local unsigned char *target)
{
    // each thread is responsible for 4 pixels.
    y = (tid / 4) + y;
    x = 4 * (tid & 3) + x;

    int four_tid = 4*tid;

    target[four_tid + 0] = readPixel(image, (int2)(x, y));

    x++;
    target[four_tid + 1] = readPixel(image, (int2)(x, y));

    x++;
    target[four_tid + 2] = readPixel(image, (int2)(x, y));

    x++;
    target[four_tid + 3] = readPixel(image, (int2)(x, y));
}

__kernel void test(__read_only image2d_t im1, __read_only image2d_t im2, __global unsigned char* out)
{
    __local unsigned char buf1[256];
    __local unsigned char buf2[256];
    int tid = get_local_id(0);

    loadBlock(im1, 0, 0, tid, buf1);
    loadBlock(im2, 0, 0, tid, buf2);

    out[0] = buf1[0];
    out[1] = buf1[1];
}[/codebox]
image_compile_error.ptx.txt (10.6 KB)

Maybe I should add that I’m using the conformant release with the recommended driver.

Am I doing something obviously stupid here or is this a real bug?
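
One thing that can be checked on paper is the index arithmetic in loadBlock(). The sketch below replays it in plain C on the host. It assumes a work-group of 64 work-items (256 bytes at 4 pixels each; the post does not state the work-group size) and a block origin of (0, 0), as in the test kernel. The names block_coords and tile_is_row_major are made up for this illustration, and nothing here touches the OpenCL runtime.

```c
#include <assert.h>

/* Same index arithmetic as loadBlock(), for a single work-item.
   Stores the four (x, y) pixel coordinates the work-item would read
   and returns the first index it would write in the target buffer. */
int block_coords(int x, int y, int tid, int xs[4], int ys[4])
{
    y = (tid / 4) + y;
    x = 4 * (tid & 3) + x;
    for (int i = 0; i < 4; i++) {
        xs[i] = x + i;   /* mirrors the three x++ steps */
        ys[i] = y;
    }
    return 4 * tid;      /* four_tid in the original */
}

/* Returns 1 if, for work-items 0..n_items-1 and origin (0, 0),
   every write lands at target[16 * y + x], i.e. the local buffer
   is a row-major tile that is 16 pixels wide. Returns 0 otherwise. */
int tile_is_row_major(int n_items)
{
    assert(n_items > 0 && 4 * n_items <= 256); /* must fit buf1/buf2 */
    for (int tid = 0; tid < n_items; tid++) {
        int xs[4], ys[4];
        int base = block_coords(0, 0, tid, xs, ys);
        for (int i = 0; i < 4; i++)
            if (base + i != 16 * ys[i] + xs[i])
                return 0;
    }
    return 1;
}
```

Under those assumptions tile_is_row_major(64) returns 1: each work-item fills four consecutive bytes of a 16x16 row-major tile, and no index falls outside the 256-byte buffers. So the indexing itself looks self-consistent, which points toward the toolchain rather than an obvious mistake in the kernel source.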

I just tried the new version of the SDK and OpenCL driver (based on CUDA 2.3).

The driver version was 190.89 on Windows Vista 32-bit.

I still get a ptxas error when trying to compile the kernel that I posted above.

The message I get looks as follows:

ptxas ptx input, line 461; error   : Illegal argument for formal parameter 'ptxbe_rdo_image'

: Retrieving binary for 'anonymous_jit_identity', for gpu='sm_13', usage mode='

When one of the calls to loadBlock() is removed the following message is generated instead:

ptxas ptx input, line 383; error   : Unknown symbol 'sam_sampler'

: Retrieving binary for 'anonymous_jit_identity', for gpu='sm_13', usage mode='

The generated PTX for both cases is attached to this post.
compile_error_one_call.ptx.txt (16.9 KB)
image_compile_error_new.ptx.txt (17.2 KB)

Can anyone say anything about this problem?

I get a similar error in some code like this:

__global Foo *ctx=....;
__global Bar *ptr0 = ctx->a;

ctx->b->thing = ptr0->thing;

It goes away if I make it this:

__global Foo *ctx=....;
__global Bar *ptr0 = ctx->a;
float4 thing = ptr0->thing;

ctx->b->thing = thing;

Chalk it up to another internal compiler bug. It would be nice if there were a bug database.
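
For what it's worth, the two forms should mean the same thing, which is why a compiler bug is the likely explanation. Here is a rough host-side sketch in plain C. The struct layouts are assumptions (the post only shows the members a, b and thing), float4 is a hand-rolled stand-in for the OpenCL type, and the function names are invented for the illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the types in the post. */
typedef struct { float x, y, z, w; } float4;
typedef struct { float4 thing; } Bar;
typedef struct { Bar *a; Bar *b; } Foo;

/* The form that triggered the error: direct memory-to-memory copy. */
void copy_direct(Foo *ctx)
{
    Bar *ptr0 = ctx->a;
    ctx->b->thing = ptr0->thing;
}

/* The workaround form: load into a temporary, then store. */
void copy_via_temp(Foo *ctx)
{
    Bar *ptr0 = ctx->a;
    float4 thing = ptr0->thing;
    ctx->b->thing = thing;
}

/* Returns 1 if both forms leave the destination in the same state. */
int forms_agree(void)
{
    Bar src = { { 1.0f, 2.0f, 3.0f, 4.0f } };
    Bar dst1 = { { 0 } }, dst2 = { { 0 } };
    Foo c1 = { &src, &dst1 };
    Foo c2 = { &src, &dst2 };
    assert(c1.a == c2.a); /* both contexts read from the same source */
    copy_direct(&c1);
    copy_via_temp(&c2);
    return dst1.thing.x == dst2.thing.x && dst1.thing.y == dst2.thing.y
        && dst1.thing.z == dst2.thing.z && dst1.thing.w == dst2.thing.w;
}
```

An ordinary C compiler treats both functions identically, so the temporary only changes what the OpenCL front end feeds to the back end, not what the program does.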

I just tried the new beta version of the SDK and OpenCL driver (based on CUDA 3.0).
The driver version was 195.39 beta on Windows Vista 32-bit.

Looks like the error is gone now.