poona
January 28, 2010, 5:02pm
1
This is my kernel
__device__ void AddOne(int *acc)
{
    atomicAdd(acc, 1);
}
extern "C"
__global__ void test(int* var)
{
    __shared__ int acc;
    if (threadIdx.x == 0)
        acc = 0;
    __syncthreads();
    AddOne(&acc);
    __syncthreads();
    if (threadIdx.x == 0)
        var[0] = acc;
}
I compiled this with the following command, and nvcc reported no errors or warnings.
nvcc kernel.cu --ptx -arch sm_11
When I run the program, cuModuleLoad() fails to load the module, and the error code it returns isn't any of the ones listed in the reference manual.
If I replace the line
atomicAdd(acc, 1);
with
acc[0] = 1;
the module gets loaded correctly, and I am able to retrieve the value 1 from the kernel.
Can someone help me with this? Thanks for reading.
I have a compute capability 1.1 device and I am running version 2.3 of the toolkit.
Shared memory atomic operations are not supported on compute capability 1.1 hardware; they require compute capability 1.2 or higher.
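A quick way to confirm what a given card supports is to query its compute capability from the host. The following is only a minimal sketch, and it assumes the runtime API (cudaGetDeviceProperties) and device 0, whereas the original post uses the driver API; the variable names globalAtomics and sharedAtomics are made up for illustration. The thresholds encoded are 1.1 for 32-bit atomics on global memory and 1.2 for atomics on shared memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Assumption: we inspect device 0; adjust the index on multi-GPU systems.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // 32-bit atomics on global memory need compute capability 1.1 or higher.
    bool globalAtomics = (prop.major > 1) || (prop.major == 1 && prop.minor >= 1);
    // Atomics on shared memory need compute capability 1.2 or higher.
    bool sharedAtomics = (prop.major > 1) || (prop.major == 1 && prop.minor >= 2);

    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);
    printf("global-memory atomics: %s\n", globalAtomics ? "yes" : "no");
    printf("shared-memory atomics: %s\n", sharedAtomics ? "yes" : "no");
    return 0;
}
```

On a 1.1 device this should report global-memory atomics as available and shared-memory atomics as unavailable, which matches the behaviour described in this thread.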
poona
January 28, 2010, 5:57pm
3
Right. Just saw it in the appendix. Is the atomic operation expensive to use, assuming I am using it on shared data?
poona
January 28, 2010, 6:15pm
4
I am running this code instead.
__device__ int acc;
__device__ void AddOne()
{
    atomicAdd(&acc, 1);
}
extern "C"
__global__ void test(int *var)
{
    if (threadIdx.x == 0)
        acc = 0;
    __syncthreads();
    AddOne();
    __syncthreads();
    if (threadIdx.x == 0)
        var[0] = acc;
}
I am invoking the kernel with a single block containing 31 threads.
The first time I run ./a.out I get the right value, 31. The next time I run it I get 62, and each further run adds another 31 to the value left over from the previous invocation. According to the guide, a __device__ variable should have the lifetime of the application, so I am not sure how it is persisting across multiple invocations.
31 poona@poona_desktop:~/development/cuda# ./a.out
62 poona@poona_desktop:~/development/cuda# ./a.out
93 poona@poona_desktop:~/development/cuda# ./a.out
124 poona@poona_desktop:~/development/cuda# ./a.out
155 poona@poona_desktop:~/development/cuda# ./a.out
186 poona@poona_desktop:~/development/cuda# ./a.out
217 poona@poona_desktop:~/development/cuda# ./a.out
and so on.
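One way to rule out stale state in the counter is to zero it from the host before every launch instead of relying on thread 0 inside the kernel. The following is only a sketch under stated assumptions: it uses the runtime API (cudaMemcpyToSymbol and the <<<...>>> launch syntax) rather than the driver API used in this thread, it assumes a single block of 31 threads as described above, and the names d_var and h_result are made up for illustration. It does not explain why the value persisted in the runs shown; it only makes the starting value explicit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ int acc;  // counter in global memory, as in the post above

__global__ void test(int *var)
{
    atomicAdd(&acc, 1);        // every thread adds one
    __syncthreads();           // single block assumed: all adds are done here
    if (threadIdx.x == 0)
        var[0] = acc;
}

int main()
{
    // Reset the device-side counter from the host before launching.
    int zero = 0;
    cudaError_t err = cudaMemcpyToSymbol(acc, &zero, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "reset failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int *d_var = NULL;         // hypothetical name for the output buffer
    cudaMalloc((void **)&d_var, sizeof(int));

    test<<<1, 31>>>(d_var);    // one block of 31 threads
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int h_result = 0;          // hypothetical name for the host copy
    cudaMemcpy(&h_result, d_var, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d\n", h_result);

    cudaFree(d_var);
    return 0;
}
```

Global-memory atomicAdd needs compute capability 1.1 or higher, so with the toolkit discussed here this would be compiled with -arch sm_11.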
Try declaring acc explicitly as __global__.
poona
January 28, 2010, 6:23pm
6
Can we declare a variable as __global__? I tried it with
__global__ int acc;
and I got these errors:
kernel_atomic.cu(1): warning: invalid attribute for variable "acc"
kernel_atomic.cu(6): error: identifier "acc" is undefined
kernel_atomic.cu(14): error: identifier "acc" is undefined
kernel_atomic.cu(23): error: identifier "acc" is undefined