The problem is as stated in the title. The code is below (tested). It compiles without any problem, but the program gives a wrong result (member ‘m’ is not assigned the value 10). What’s the problem?
#include <stdio.h>
#include <stdlib.h>

struct g {
    int m;
};

__device__ struct g *d;

__global__ void kernel()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    d[tid].m = 10;
}

int main()
{
    size_t size = 1 * sizeof(struct g);
    cudaMalloc(&d, size);
    kernel<<<1,1>>>();
    cudaDeviceSynchronize();
    struct g *h = (struct g*)malloc(size);
    cudaMemcpy(h, d, size, cudaMemcpyDeviceToHost);
    printf("Result: %d\n", h[0].m);
    return 0;
}
If you add some error checking (all of those API functions return error codes), things will become a lot clearer.
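For reference, one common checking pattern looks roughly like this (a minimal sketch, not from the original post; the CUDA_CHECK name is illustrative, and it needs <stdio.h> and <stdlib.h>):

#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "CUDA error %s at %s:%d\n",             \
                    cudaGetErrorString(err_), __FILE__, __LINE__);  \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

/* Usage: wrap each runtime API call; kernel launches themselves  */
/* return no status, so check cudaGetLastError() right after them: */
kernel<<<1,1>>>();
CUDA_CHECK(cudaGetLastError());       /* catches launch errors */
CUDA_CHECK(cudaDeviceSynchronize());  /* catches errors during execution */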
The basic problem is the cudaMalloc call. You cannot call cudaMalloc directly on a device symbol. Instead, perform the cudaMalloc on a host pointer, then copy the value of that pointer into the device variable using cudaMemcpyToSymbol.
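A minimal sketch of that approach (the host-side pointer name hptr is illustrative):

struct g *hptr = NULL;                       /* ordinary host pointer       */
cudaMalloc((void**)&hptr, size);             /* allocate the device memory  */
cudaMemcpyToSymbol(d, &hptr, sizeof(hptr));  /* copy the pointer value into */
                                             /* the device symbol 'd'       */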
I have error checking in my original code; I removed it here to keep the listing short and easy to read. But the checking didn’t point to the right place: it only told me the kernel launch had an unspecified error. My checking code is as below. Is it OK?
1. cudaMalloc a host pointer (so the memory it points to is in device memory)
2. modify its value in kernel functions
3. cudaMemcpyToSymbol its value to the device variable
This may work. However, here is what I intended to do: I have tens of kernel functions, and most of them will use the device variable (to share information). By making the device variable global, all kernel functions can see it (and hence use it). Following the way above, I could instead pass everything as parameters, but that means modifying the definitions and call sites of all the kernel functions, which looks inconvenient (see the sketch below). Is there any way to get what I wanted?
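For illustration, the parameter-passing alternative described above would look roughly like this (a sketch; the pointer name devp is hypothetical):

__global__ void kernel(struct g *devp)  /* every kernel definition gains a parameter */
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    devp[tid].m = 10;
}

struct g *devp = NULL;
cudaMalloc((void**)&devp, size);
kernel<<<1,1>>>(devp);                  /* ...and every launch must pass it */

With tens of kernels, every definition and every launch site would need this change, which is the inconvenience described above.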
Thanks, it worked! “Result: 10”! See the new code below (note ‘sizep’ used in the symbol copies). But I’m still not sure about the meaning of “copy the symbol address” (not the content!), and why NVIDIA created such a complex memory-handling scheme for global variables (“global” here refers only to the scope of the variable, i.e. visible to the whole application, not to global memory). Could you please explain the process in detail, and the reason if possible? It doesn’t seem to be covered in the NVIDIA guides. And it’s not using double the device memory (one allocation for ‘d’, one for ‘ld’), right?
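The final listing itself is not reproduced in this thread; the following is a reconstruction of what the working version presumably looks like, pieced together from the names mentioned above (‘ld’ for the host copy of the pointer, ‘sizep’ for the pointer size):

#include <stdio.h>
#include <stdlib.h>

struct g {
    int m;
};

__device__ struct g *d;                 /* device-scope pointer, visible to all kernels */

__global__ void kernel()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    d[tid].m = 10;
}

int main()
{
    size_t size  = 1 * sizeof(struct g);
    size_t sizep = sizeof(struct g *);  /* size of the pointer itself */

    struct g *ld = NULL;                /* host copy of the device pointer */
    cudaMalloc((void**)&ld, size);      /* the only allocation of the struct storage */
    cudaMemcpyToSymbol(d, &ld, sizep);  /* copy the pointer value (sizep bytes) into 'd' */

    kernel<<<1,1>>>();
    cudaDeviceSynchronize();

    struct g *h = (struct g*)malloc(size);
    cudaMemcpy(h, ld, size, cudaMemcpyDeviceToHost);
    printf("Result: %d\n", h[0].m);     /* prints "Result: 10" */

    cudaFree(ld);
    free(h);
    return 0;
}

As to the memory question: the struct storage is allocated once by cudaMalloc; ‘d’ itself occupies only the few bytes of a pointer in device memory, and ‘ld’ is an ordinary host variable, so the data is not duplicated.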