I’m developing a particle simulation using CUDA and C. I’m having quite a bit more trouble than I should with 2D arrays (and various other memory issues).
The following is a simplified test program to illustrate some of the issues I’m having in my main code right now:
#include <stdio.h>
#include <cuda_runtime.h>
#include "visual/myhelpers.h"

float *fDev;
size_t fPitch;
const int npart = 50;
const int ndim = 2;

__global__ void kernel(float *myfl, int fpitch) {
    const int n = threadIdx.x;
    const int dim = threadIdx.y;
    float *f = (float*)((char*)myfl + dim * fpitch) + n;
    *f = 0;
    __syncthreads();
    printf("n=%i *f= %f / %f, n: %i, dim:%i \n", n, f[n], f[n], n, dim);
    if (*f != 0)
        printf("BUGBUGBUG *f= %f, n: %i, dim: %i\n", *f, n, dim);
}

void init() {
    HANDLE_ERROR(cudaMallocPitch(&fDev, &fPitch, npart * sizeof(float), ndim));
}

int main() {
    dim3 threads = dim3(npart, ndim);
    init();
    HANDLE_ERROR(cudaMemset2D(fDev, fPitch, 0, npart * sizeof(float), ndim));
    kernel<<<1, threads>>>(fDev, fPitch);
    cudaDeviceSynchronize();
    return 0;
}
myhelpers.h is only there for HANDLE_ERROR, whose purpose you can probably guess.
This code does not trip the BUGBUGBUG check, but it still sometimes prints *f = -nan.
To explain some oddities in the code: commenting out the *f=0; line changes nothing. I eventually want to be rid of that line, but the cudaMemset2D call isn’t always doing what it should, so it is there as a backup. The printf line prints the value twice because I was originally printing the second one with %x. That gives even weirder results: the second n prints as a strange large value, and dim prints as n.
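One thing that may be worth checking in the test program is which element the printf actually reads. In the kernel, f already points at element n of its row, so f[n] refers to element 2n of that row rather than to *f. The host-only C sketch below just works through that index arithmetic. The helper names are made up for illustration, and it makes no claim about what the row padding actually contains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element index, counted from the start of the row,
   that f[n] refers to when f already points at row[n]. */
size_t printed_index(size_t n)
{
    return n + n;
}

/* Hypothetical helper: nonzero if element idx lies inside the npart floats
   per row that the kernel and cudaMemset2D actually write. */
int is_written(size_t idx, size_t npart)
{
    assert(npart > 0);
    return idx < npart;
}
```

With npart = 50, printed_index(24) is 48 and is_written(48, 50) holds, while printed_index(25) is 50 and falls outside the written part of the row. If the pitch returned by cudaMallocPitch is larger than npart * sizeof(float), that read lands in padding that nothing has initialised, which might explain values like -nan for the larger n.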
When my actual program is run, the BUGBUGBUG check (i.e. *f != 0) triggers fairly often. The value is either nan or one of two or three small numbers (these seem to correspond to the bit patterns 0x80000000, 0xa0000000 and 0xe0000000). It also seems to be limited to a few values of n (in particular 25 and 46).
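To look at bit patterns like these without handing a float to %x (a format/argument mismatch, which is probably why the later arguments came out shifted), the value can be copied into an integer first. This is a minimal host-side sketch, assuming 32-bit IEEE 754 floats; float_bits is a hypothetical helper name, not something from the program above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw 32-bit pattern of a float.
   memcpy copies the bytes without reinterpreting the pointer type. */
uint32_t float_bits(float x)
{
    uint32_t u;
    assert(sizeof u == sizeof x);  /* assumes float is 32 bits wide */
    memcpy(&u, &x, sizeof u);
    return u;
}
```

The result can then be printed with something like printf("%08x", (unsigned)float_bits(v)). Inside a kernel, the __float_as_uint intrinsic should serve the same purpose.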
I’m compiling with: nvcc -arch=sm_30 -g -G -o bugtest bugtest.cu, on a system with two Quadro K5000 cards (only one is used at a time).
Any ideas? I’ve been just scratching my head on this one.