use constant memory to pass kernel parameters as struct

FangQ · April 11, 2011, 2:14pm

I need to use the constant memory to pass a bunch of parameters as a struct to a kernel.

To do so, in my .cl file, I define the struct:

typedef struct KernelParams {
  float4 ps,c0;
  float4 maxidx;
  uint4  dimlen,cp0,cp1;
  uint2  cachebox;
  float  minstep;
  float  twin0,twin1,tmax;
  float  oneoverc0;
  unsigned int isrowmajor,save2pt,doreflect,dorefint,savedet;
  float  Rtstep;
  float  minenergy;
  float  skipradius2;
  float  minaccumtime;
  unsigned int maxdetphoton;
  unsigned int maxmedia;
  unsigned int detnum;
  unsigned int idx1dorig;
  unsigned int mediaidorig;
} KParam __attribute__ ((aligned (16)));

...

__kernel void main_loop( ..., __constant KParam gcfg[]){
   ...
   if(gcfg->doreflect && ...){
      ...
   }
}

In my host code, I define exactly the same struct in the header file, and then:

     KParam param={...};
     cl_mem gparam;
     mcx_assess((gparam=clCreateBuffer(context,RO_MEM,sizeof(KParam),&param,&status),status));
     ...
     mcx_assess(clSetKernelArg(kernel,15, sizeof(cl_mem), (void*)&gparam));
     mcx_assess(clEnqueueNDRangeKernel(commands,kernel,1,NULL,mcgrid,mcblock, 0, NULL, NULL));
     ...

When I run this code on an ATI card, it works and the values are passed correctly into the kernel. However, when I run it on an NVIDIA card with CUDA 3.2, the program build phase fails with a segfault. If I replace “if(gcfg->doreflect && …)” with “if(0)”, the program compiles, but the results are incorrect.

Did I do anything obviously wrong here? I am particularly interested in whether my alignment settings and the order of the struct members are OK.

Any comments are welcome! Thanks in advance.

FangQ · April 11, 2011, 2:19pm

By the way, I am aware that the maximum number of constant parameters to a kernel is 8; my kernel has only 4 __constant parameters, and gcfg is the last one.

Rom1 · April 12, 2011, 7:36am

We have often had the same kind of problem …
Check that your struct is aligned to at least 128 bits (its size should be a multiple of 4 ints).
Print sizeof(KernelParams) on both your host and your device; the two values should be the same.
If they differ, or the size is not a multiple of 128 bits, add padding with a few dummy ints.
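
A minimal host-side C sketch of the padding suggestion, assuming 4-byte float and unsigned int. PaddedParams, its pad member, and pad_ints_needed are hypothetical names used only for illustration, and the struct is a shortened stand-in rather than the full KParam from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened parameter block (not the full KParam). */
typedef struct {
    float        minstep;    /* 4 bytes */
    float        tmax;       /* 4 bytes */
    unsigned int doreflect;  /* 4 bytes */
    unsigned int maxmedia;   /* 4 bytes */
    unsigned int detnum;     /* 4 bytes, 20 bytes so far */
    unsigned int pad[3];     /* dummy ints: 20 + 12 = 32 bytes */
} PaddedParams;

/* Fail at compile time if the size is not a multiple of 128 bits. */
static_assert(sizeof(PaddedParams) % 16 == 0,
              "PaddedParams must be a multiple of 16 bytes");

/* Number of dummy 4-byte ints needed to round `bytes` up to a multiple
   of 16. Assumes `bytes` is already a multiple of 4. */
size_t pad_ints_needed(size_t bytes) {
    return ((16 - bytes % 16) % 16) / 4;
}
```

For a 180-byte struct, pad_ints_needed(180) gives 3, which brings the size to 192 bytes. The same sizeof check still has to be done on the device side, since this sketch only covers the host compiler.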

FangQ · April 12, 2011, 7:38pm

Thanks. I printed sizeof(KParam) on the host and on the device, and the sizes are different: 180 on the host and 192 on the device. I prefixed all member types with cl_ in the host code (for example, float4 → cl_float4), and the sizes now match.

But I am still having difficulty running this code on my GTX 470, and the results appear to be random: about 80% of the time I get an “Out of resources” error, and in the remaining 20% of cases I get results, but they look strange. I am now commenting out the code line by line to debug it. My impression is that NVIDIA’s OpenCL compiler is buggy. If any of the developers are interested in testing my code, please email me.
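
The 180 versus 192 byte difference is consistent with tail padding: the members add up to 180 bytes, and a struct that contains 16-byte-aligned vector types is rounded up to the next multiple of 16. Below is a sketch in plain C11 that reproduces this, assuming 4-byte float and unsigned int. The plain_* and dev_* types and the KPARAM_FIELDS macro are hypothetical stand-ins; the thread does not show the original host typedefs, and these are not the actual cl_float4 definitions from the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins with only 4-byte alignment (assumed original host types). */
typedef struct { float s[4]; }        plain_float4;
typedef struct { unsigned int s[4]; } plain_uint4;
typedef struct { unsigned int s[2]; } plain_uint2;

/* Stand-ins aligned to their own size, as OpenCL vector types are. */
typedef struct { _Alignas(16) float s[4]; }        dev_float4;
typedef struct { _Alignas(16) unsigned int s[4]; } dev_uint4;
typedef struct { _Alignas(8)  unsigned int s[2]; } dev_uint2;

/* Member list from the question, parameterized on the vector types. */
#define KPARAM_FIELDS(F4, U4, U2)                                  \
    F4 ps, c0; F4 maxidx; U4 dimlen, cp0, cp1; U2 cachebox;        \
    float minstep; float twin0, twin1, tmax; float oneoverc0;      \
    unsigned int isrowmajor, save2pt, doreflect, dorefint, savedet;\
    float Rtstep; float minenergy; float skipradius2;              \
    float minaccumtime;                                            \
    unsigned int maxdetphoton; unsigned int maxmedia;              \
    unsigned int detnum; unsigned int idx1dorig;                   \
    unsigned int mediaidorig;

typedef struct { KPARAM_FIELDS(plain_float4, plain_uint4, plain_uint2) } KParamPlain;
typedef struct { KPARAM_FIELDS(dev_float4, dev_uint4, dev_uint2) } KParamAligned;

/* Bytes of tail padding added by the 16-byte struct alignment. */
size_t kparam_tail_padding(void) {
    return sizeof(KParamAligned) - sizeof(KParamPlain);
}
```

With these stand-ins every member lands at the same offset in both structs, and only the total size changes. A host buffer created with the smaller sizeof would therefore be shorter than the struct the device expects.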

tonyo_au · April 30, 2011, 9:45am

I took a crude approach to alignment problems: I returned all the parameters from the constants structure in an output buffer, which I read back and compared on the CPU.
Where the results started to mismatch, I added a pad. Eventually it all worked. I am sure I should not have to do this.
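
The host-side comparison step could look roughly like the C sketch below. It assumes the kernel copied the struct into the output buffer as raw 32-bit words, which is one possible way to do the echo and not necessarily how it was done here; first_mismatch and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first 32-bit word where the values echoed
   back by the device differ from the host copy, or n if all n words
   match. The word index times 4 is the byte offset where a pad may be
   needed. */
size_t first_mismatch(const unsigned int *host_words,
                      const unsigned int *device_words,
                      size_t n) {
    assert(host_words != NULL && device_words != NULL);
    for (size_t i = 0; i < n; i++) {
        if (host_words[i] != device_words[i]) {
            return i;
        }
    }
    return n;
}
```

A return value equal to n means the two layouts agree over the compared range.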
