Where to declare constant memory for the multi-GPU case?
constant memory, multiple gpus, multi-gpu

Hi there,

My code uses constant memory, and it works well on a single GPU. Since I am greedy, I want to use multiple GPUs to run the same job. The parallelization idea is straightforward: I use pthread.h to create 4 threads on the CPU (my machine has 8 cores), and each thread drives one GPU device and writes its part of the results into one big array. This works fine for a small test job that does not use constant memory. For my real project, though, I don't know where or how to declare the constant memory.

In the single-GPU code, I declare it at the very top of the file and then call the copy as follows:

[codebox]__device__ __constant__ float c_A_n[CHUNK_SIZE_fwd];[/codebox]



Apparently I cannot do this anymore, since the compiler may get confused about which device I mean. So I moved the same lines into the pthread functions, expecting each CPU thread to declare constant memory on its corresponding GPU. It gets through compilation, but I get this error when I run it:

multifwd_func.cu:95: error: cannot convert ‘float’ to ‘const char’ for argument ‘1’ to ‘cudaError_t cudaMemcpyToSymbol(const char*, const void*, size_t, size_t, cudaMemcpyKind)’

I did not get this error when the same code ran on a single GPU, and I don't know what to do. Is there any way to specify the device when I declare the constant memory? Or can I just put the same declaration somewhere smarter?



Hi Kun,

I am facing a similar problem while trying to use constant memory from multiple GPUs. Could you please share the solution if you found one?


I usually divide my code into two files, code.cu and code_kernel.cu. I always put constant memory declarations in the kernel file and I’ve never had any problems with constant memory for multi-GPU.
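
For anyone who wants something concrete to try, here is a minimal sketch of that layout, squeezed into one file so it stands alone. It is not the original poster's code. The value of CHUNK_SIZE_fwd, the read_constant kernel, the WorkerArg struct and run_all_devices are all assumptions made up for illustration. The idea is that the constant array is declared once at file scope, each device holds its own copy of it, and each CPU thread calls cudaSetDevice before copying, so the copy lands on that thread's GPU. Note that the symbol itself is passed to cudaMemcpyToSymbol, not a host pointer.

```cuda
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

#define CHUNK_SIZE_fwd 256  // assumed value; the real size is not given in this thread
#define MAX_GPUS 4          // matches the 4 CPU threads mentioned in the question

// Declared once, at file scope. Each device gets its own copy of this symbol.
__constant__ float c_A_n[CHUNK_SIZE_fwd];

// Hypothetical kernel: copies the constant array out so the host can check it.
__global__ void read_constant(float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < CHUNK_SIZE_fwd) out[i] = c_A_n[i];
}

struct WorkerArg { int device; int ok; };  // hypothetical per-thread argument

static void *worker(void *p)
{
    WorkerArg *arg = (WorkerArg *)p;
    arg->ok = 0;
    // Later CUDA calls made by this CPU thread target this device.
    if (cudaSetDevice(arg->device) != cudaSuccess) return NULL;

    float h_in[CHUNK_SIZE_fwd], h_out[CHUNK_SIZE_fwd];
    for (int i = 0; i < CHUNK_SIZE_fwd; ++i) h_in[i] = arg->device + 0.5f;

    // Pass the symbol itself; the copy goes to the device selected above.
    if (cudaMemcpyToSymbol(c_A_n, h_in, sizeof(h_in)) != cudaSuccess) return NULL;

    float *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess) return NULL;
    read_constant<<<(CHUNK_SIZE_fwd + 127) / 128, 128>>>(d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return NULL;

    // Check that this device read back the values this thread uploaded.
    arg->ok = 1;
    for (int i = 0; i < CHUNK_SIZE_fwd; ++i)
        if (h_out[i] != h_in[i]) arg->ok = 0;
    return NULL;
}

// Returns the number of devices whose constant data did not round-trip.
int run_all_devices()
{
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess || n == 0) {
        printf("no CUDA device found\n");
        return 0;
    }
    if (n > MAX_GPUS) n = MAX_GPUS;

    pthread_t threads[MAX_GPUS];
    WorkerArg args[MAX_GPUS];
    for (int i = 0; i < n; ++i) {
        args[i].device = i;
        args[i].ok = 0;
        pthread_create(&threads[i], NULL, worker, &args[i]);
    }
    int failures = 0;
    for (int i = 0; i < n; ++i) {
        pthread_join(threads[i], NULL);
        printf("device %d: %s\n", i, args[i].ok ? "ok" : "FAILED");
        if (!args[i].ok) failures++;
    }
    return failures;
}

int main()
{
    return run_all_devices();
}
```

Something like nvcc sketch.cu -lpthread should build it (the file name is arbitrary). Each thread uploads a different value, so if the copies were shared between GPUs the read-back check would fail on at least one device.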