#include <stdio.h>
#include <stdlib.h>

// Simple 8-bit bit reversal Compute test
#define N 256

// Each thread reverses the low 8 bits of one array element in place.
__global__ void bitreverse(unsigned int *data)
{
    unsigned int *idata = data;
    unsigned int x = idata[threadIdx.x];

    x = ((0xf0f0f0f0 & x) >> 4) | ((0x0f0f0f0f & x) << 4);  // swap nibbles
    x = ((0xcccccccc & x) >> 2) | ((0x33333333 & x) << 2);  // swap bit pairs
    x = ((0xaaaaaaaa & x) >> 1) | ((0x55555555 & x) << 1);  // swap adjacent bits

    idata[threadIdx.x] = x;
}

int main(void)
{
    unsigned int *d = NULL; int i;
    unsigned int idata[N], odata[N];

    for (i = 0; i < N; i++)
        idata[i] = (unsigned int)i;

    // Copy the input to the device, run one block of N threads, copy back.
    cudaMalloc((void**)&d, sizeof(int)*N);
    cudaMemcpy(d, idata, sizeof(int)*N, cudaMemcpyHostToDevice);

    bitreverse<<<1, N>>>(d);

    cudaMemcpy(odata, d, sizeof(int)*N, cudaMemcpyDeviceToHost);

    for (i = 0; i < N; i++)
        printf("%u -> %u\n", idata[i], odata[i]);

    cudaFree((void*)d);
    return 0;
}
The above code is from the CUDA-GDB manual V2.3, June 2009. It is designed so a student can step through the execution lines and understand what goes on in a CUDA program. I cannot get into the bitreverse subprogram. The command
if it is the only break command in the program, does not stop the program's execution in the bitreverse subprogram. What am I doing wrong? The CUDA-GDB manual states clearly that you cannot step over subroutines, yet that is apparently what I am doing. Even when I step through the program one executable line at a time, I still do not go into the bitreverse subprogram. I want to learn about CUDA code, and that is why I want this. I already know C. The debugger goes into C subprograms, but not CUDA subprograms.
I compile with
nvcc -g -G bitreverse.cu -o bitreverse
and then run
cuda-gdb bitreverse
then once in the debugger I type
It responds as the manual says it should, but it still does not go into CUDA subprograms.
Why is this?
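
For reference, the three mask-and-shift lines in the kernel reverse the low 8 bits of each value. The sketch below is a host-only C version of the same arithmetic, so it can be stepped through in an ordinary debugger and used to check the values x should take inside bitreverse. The function name reverse_bits8 is my own and does not appear in the manual.

```c
/* Host-only sketch of the arithmetic in the bitreverse kernel.
 * reverse_bits8 is a hypothetical name, not part of the manual's example. */
unsigned int reverse_bits8(unsigned int x)
{
    /* Swap the two nibbles of each byte. */
    x = ((0xf0f0f0f0u & x) >> 4) | ((0x0f0f0f0fu & x) << 4);
    /* Swap the two bit pairs inside each nibble. */
    x = ((0xccccccccu & x) >> 2) | ((0x33333333u & x) << 2);
    /* Swap the two bits inside each pair. */
    x = ((0xaaaaaaaau & x) >> 1) | ((0x55555555u & x) << 1);
    return x;
}
```

For example, reverse_bits8(1) gives 128 and reverse_bits8(3) gives 192, which should match the "1 -> 128" and "3 -> 192" lines printed by the CUDA program when the kernel actually runs.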