atomicAdd with shared memory failing last err "invalid device function"

gatoatigrado · July 17, 2008, 5:14pm

Maybe I’m making a stupid mistake; the following code fails on my machine.

#include <assert.h>
#include <cuda.h>
#include <stdio.h>

#define CU_ASSERT(a) (cudaAssert(a, #a, __PRETTY_FUNCTION__, __LINE__))

void cudaAssert(cudaError err, const char *function_name,
                const char *function, int line);

__global__ void simple_shared() {
    __shared__ int a;
    atomicAdd(&a, 1);
}

int main() {
    simple_shared<<<1, 1>>>();
    CU_ASSERT(cudaGetLastError());
    CU_ASSERT(cudaThreadSynchronize());
    return 0;
}

cudaAssert is pretty much just what it says: it prints cudaGetErrorString. The output is

ERROR - [int main():16] - 'cudaGetLastError()' failed, error 'invalid device function '.

It seems to work fine when I take out the shared variable, or do standard operations (read/write). I have an 8600 GTS, nvcc built “Tue_Jun_10_05:42:45_PDT_2008”, NVIDIA driver 177.13, and am running SuSE 10.3 (kernel 2.6.22).

Thanks in advance.

SPWorley · July 17, 2008, 5:23pm

Native atomic functions on shared memory require compute capability 1.2; see Appendix A of the programming guide.

I made a little library for doing shared memory atomics with any device, but that’s not as efficient as hardware support.
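
For readers who want to check this on their own hardware, here is a minimal sketch that queries the compute capability at runtime and compares it against the 1.2 threshold mentioned above. The helper name supports_shared_atomics and the choice of device 0 are assumptions made for illustration; they are not part of the original post or of the CUDA API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): true if the given compute
// capability is at least 1.2, the minimum for shared-memory atomics.
static bool supports_shared_atomics(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 2);
}

int main() {
    int device = 0;  // assumption: inspect the first CUDA device
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    // Report the capability and whether shared-memory atomics are available.
    printf("%s: compute capability %d.%d, shared-memory atomics %s\n",
           prop.name, prop.major, prop.minor,
           supports_shared_atomics(prop.major, prop.minor)
               ? "supported" : "not supported");
    return 0;
}
```

On a compute capability 1.1 card this would report "not supported", which is consistent with the "invalid device function" error in the first post.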

gatoatigrado · July 17, 2008, 6:09pm

Thanks so much. Perhaps the compiler should stop it when I compile with “--gpu-name compute_11”.

I’m interested in seeing your library if it’s free; I’ve seen regular → atomic registers in a book, but it would be neat to see it in practice.
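
On the idea of having the build stop for an unsupported target, one possible sketch is a guard inside the kernel based on the `__CUDA_ARCH__` macro, which nvcc defines during device compilation (110 for compute capability 1.1, 120 for 1.2). This assumes a toolkit that provides `__CUDA_ARCH__`, which may not include the 2008 release used above. The `out` parameter and the `run_simple_shared` wrapper are hypothetical additions so that the result can be read back.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void simple_shared(int *out) {
    // Fail the device compilation pass when targeting anything below 1.2.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 120)
#error "atomicAdd on __shared__ memory needs compute capability 1.2 or higher"
#endif
    __shared__ int a;
    if (threadIdx.x == 0) a = 0;     // initialize the shared counter once
    __syncthreads();
    atomicAdd(&a, 1);                // every thread in the block adds 1
    __syncthreads();
    if (threadIdx.x == 0) *out = a;  // publish the block's total
}

// Hypothetical wrapper: launches one block and returns the counter,
// or -1 if any CUDA call fails.
int run_simple_shared(int threads) {
    int *d_out = NULL;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    simple_shared<<<1, threads>>>(d_out);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

int main() {
    int count = run_simple_shared(32);
    if (count < 0) {
        fprintf(stderr, "kernel launch or copy failed\n");
        return 1;
    }
    printf("count = %d\n", count);
    return 0;
}
```

The guard only turns a runtime failure into a compile-time one; it does not make the kernel work on older hardware.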

SPWorley · July 20, 2008, 9:08pm

Take a look at http://forums.nvidia.com/index.php?showtopic=72925 .
