Hello, I'm using a 9800GT and a 9500GT on Windows. I'm just curious why mixing types when calling a kernel actually breaks
the function. I mean, if I call
kernel(int, int, float, …) like this,
somehow on my computer the calculated values become useless, even though they are deterministic. Can somebody help me with this?
Thank you.
I’m not sure I follow your question; you mean that whenever you call a kernel with more than 1 data type in the parameter list, it’ll crash? Are your parameters in device memory?
__global__ void doResample( int* a, cuComplex* d, float* e, int* f, float* g )
{
    /*
       a = dev_frame   b = dev_fft   c = dev_resample
       d = d_in        e = k_resampledspacing
       f = dev_b       g = dev_resamp
    */
    int xx = threadIdx.x;

    // General structure of kernel for parallel computing:
    // do the computation while xx and yy are smaller than XDIM and YDIM
    // and the booleans dev_resample and dev_fft are both true.
    if (xx < XDIM)
    {
        for (int yy = 0; yy < YDIM; yy++)
        {
            int i = int(g[yy*2]);
            d[yy].x = f[a[0]*YDIM*XDIM + xx*YDIM + i]
                    - g[yy*2+1] * (float)(f[a[0]*YDIM*XDIM + xx*YDIM + i + 1]
                                        - f[a[0]*YDIM*XDIM + xx*YDIM + i]) / (float)e[0];
            d[yy].y = 0;
        }
    }
}
When I change a and f into float type it works perfectly. Does anybody know why this happens? Also, I'm not sure why I have to cast
all the variables that are float type in the first place in order to make the code work right. Thank you.
When you change a and f to float, do you also change the calling code? Since a and f are pointers, obviously the kernel and its caller need to agree on what they point to.
Yes, I do cast (int) in front of a whenever I call it. I also cast (float) in front of f, which does not make sense to me but somehow makes the code work as it should.
Casting a pointer to a different type will not change the type of the data it points to (i.e., the binary data remains unchanged), it will only change how the code interprets this binary data. Thus casting a float* to an int* will not work.
First-chance exception at 0x75999673 in Resampling2.exe: Microsoft C++ exception: [rethrow] at memory location 0x00000000…
The program ‘[2604] Resampling2.exe: Native’ has exited with code 0 (0x0).
and when I try to use Parallel Nsight it says
The thread ‘CUDA Default Context’ (0x0) has exited with code 0 (0x0).
The thread ‘’ (0xcb6970) has exited with code 0 (0x0).
The program ‘[3276] Resampling2.exe: CUDA’ has exited with code 0 (0x0).
The part I marked in bold was what my question was about. I strongly suspect that the types of your arguments do not match the types of your kernel, and you are just hiding this with an incompatible type cast.
You managed to again leave out the declarations of many of the relevant variables. However, from the parts you posted I can already see that you are mixing different pointer types: [font=“Courier New”]frame[/font] is of type integer, copied into [font=“Courier New”]dev_frame[/font] of unknown type but size [font=“Courier New”]sizeof(float)[/font] then used as a float kernel parameter, whose content is later cast back to integer. None of these casts is necessary. Copying an int into a float* is just wrong and cannot be undone by casting the resulting float back to int.
If you run the Device Query SDK program, what does it say your max thread dimensions are? (It's probably 512 on the 9800/9500.) Instead of having 1000 threads just in the x-dimension, why not split it up to use both the x and y dimensions? Or have multiple blocks?