How to allocate global dynamic memory on device from host

Hi,

Is it possible to allocate global dynamic memory on the device from the host?
I have to calculate an amount of memory on the host; after an init process this amount is fixed.
So I want to allocate that memory on the device and reach it through the pointer p_GLOBAL_HOLDER_Picture.
The cudaMemcpyToSymbol(…) call executes without any error, but the allocation never ends up behind the pointer…
The example is the same as in the CUDA Programming Guide.

__device__ unsigned char * p_GLOBAL_HOLDER_Picture;   // global device pointer
unsigned char * ptr;

// allocate on the device, then copy the pointer value into the device symbol
cudaMalloc((void **)&ptr, SIZE);
cudaMemcpyToSymbol(p_GLOBAL_HOLDER_Picture, &ptr, sizeof(ptr));


__global__ void CKNL_GrayTransform()
{
    if (NULL == p_GLOBAL_HOLDER_Picture)
    {
        // always true: the pointer is still NULL
    }
}

Regards TinTin


It’s possible. Not very sensible, but possible.

You do it exactly the way you have indicated.
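The reason it is usually not very sensible: the ordinary way to hand a kernel dynamically allocated global memory is to pass the cudaMalloc'ed pointer as a kernel argument, so no __device__ pointer symbol is needed at all. A rough sketch (parameter name made up for illustration):

__global__ void CKNL_GrayTransform(unsigned char * pic)
{
    pic[0] = 0;   // use the allocation directly through the argument
}

// host side, after the init process has fixed SIZE:
unsigned char * ptr;
cudaMalloc((void **)&ptr, SIZE);
CKNL_GrayTransform<<<1,1>>>(ptr);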

Hi Robert, thanks for the answer. Can you briefly give me a working example? It does not work for me: the memory is never placed behind my global pointer; the pointer remains NULL.

It's not much more than what you have shown already.

$ cat t402.cu
#include <stdio.h>

__device__ unsigned char * p_GLOBAL_HOLDER_Picture;


__global__ void CKNL_GrayTransform()
{
    if (NULL == p_GLOBAL_HOLDER_Picture)
    {
        printf("omg bob does not know what he is talking about\n");
    }
    else
      printf("the data is: %u\n", p_GLOBAL_HOLDER_Picture[0]);

}



int main(){

  const int SIZE = 1;
  unsigned char * ptr;
  cudaMalloc((void **)&ptr, SIZE);                                 // device allocation
  cudaMemcpyToSymbol(p_GLOBAL_HOLDER_Picture, &ptr, sizeof(ptr));  // store the pointer value in the device symbol
  cudaMemset(ptr, 43, SIZE);                                       // fill with a recognizable value
  CKNL_GrayTransform<<<1,1>>>();
  cudaDeviceSynchronize();
}

$ nvcc -o t402 t402.cu
$ cuda-memcheck ./t402
========= CUDA-MEMCHECK
the data is: 43
========= ERROR SUMMARY: 0 errors
$
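For brevity the example has no error checking. In a real program I'd check the return status of the runtime calls, along these lines:

cudaError_t err = cudaMemcpyToSymbol(p_GLOBAL_HOLDER_Picture, &ptr, sizeof(ptr));
if (err != cudaSuccess)
    printf("cudaMemcpyToSymbol failed: %s\n", cudaGetErrorString(err));

That should help pinpoint why the pointer stays NULL in your full application.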

Hi Robert,
okay, you are right. I reduced everything in my program to this minimal example and it works.
So it has to be something else. There are still a few differences which I will now check piece by piece.
Many thanks for the support.

Hi Robert, as soon as I move the global device pointer into a separate file and access it via extern, my project no longer works.
File1.cu

__device__ unsigned char * p_GLOBAL_HOLDER_Picture;

File2.cu

extern __device__ unsigned char * p_GLOBAL_HOLDER_Picture;

unsigned char * ptr;

// host-side init code:
cudaMalloc((void **)&ptr, SIZE);
cudaMemcpyToSymbol(p_GLOBAL_HOLDER_Picture, &ptr, sizeof(ptr));

__global__ void CKNL_GrayTransform()
{
    if (NULL == p_GLOBAL_HOLDER_Picture)
    {
        // always true: the pointer is still NULL
    }
}

Okay, for separate compilation you have to explicitly tell the CUDA compiler to generate relocatable device code:
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#code-changes-for-separate-compilation
Visual Studio: Configuration Properties → CUDA C/C++ → Common → Generate Relocatable Device Code = Yes (-rdc=true)
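On a plain nvcc command line the equivalent would be something like this (file and output names taken from your post, just for illustration):

nvcc -rdc=true File1.cu File2.cu -o app

The -rdc=true flag makes the device code relocatable, so the extern __device__ declaration in File2.cu can be resolved against the definition in File1.cu at device link time; without it the symbol cannot be referenced across .cu files.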

Now it works, thanks for the help!