CUDA constant memory

I want to use constant memory, but I ran into a problem.
If I define a constant symbol in the .cu file and want to initialize it from a .cpp file, I have to pass its size and pointer out through functions, such as:

#include <cuda_runtime.h>
#include <cstdio>
#include <iostream>

__constant__ int const_data;

extern size_t get_size() {
    size_t size;
    cudaGetSymbolSize(&size, const_data);
    return size;
}

extern void* get_pointer() {
    void* ptr;
    cudaGetSymbolAddress(&ptr, const_data);
    std::cout << "cu ptr: " << ptr << std::endl;
    return ptr;
}

__global__ void kernel_print_const() {
    printf("value = %d\n", const_data);
}

cudaError_t print_const() {
    kernel_print_const<<<1, 1>>>();
    return cudaDeviceSynchronize();
}

and my main.cpp:

#include <cuda_runtime.h>
#include <iostream>

extern cudaError_t print_const();
extern void* get_pointer();
extern size_t get_size();

int main() {
    std::cout << get_size() << std::endl;
    void* d_const_data = get_pointer();
    std::cout << "main ptr: " << d_const_data << std::endl;

    int h_const_data = 10;
    cudaMemcpy(d_const_data, &h_const_data, sizeof(int), cudaMemcpyHostToDevice);
    return 0;
}

How can I use cudaGetSymbolSize() and cudaMemcpyToSymbol() in my main.cpp?

By the way, suppose I define lots of constant symbols in the .cu file:

__constant__ int a1;
__constant__ int a2;
__constant__ int a3;
__constant__ int a4;

Do I have to write a get_pointer() and get_size() for each one?
Can I instead write a single function that takes the symbol as a parameter, such as:

template<class T>
void* get_pointer(T symbol) {
    void* ptr;
    cudaGetSymbolAddress(&ptr, symbol);
    std::cout << "cu ptr: " << ptr << std::endl;
    return ptr;
}

Yes. The __constant__ syntax is not something that can be processed by the host compiler. When a file in your project is named filename.cpp, that file will be processed by the host compiler by default, and the host compiler does not understand the __constant__ syntax.

If you switch to using a .cu file, you can use an extern declaration in order to access a constant symbol defined in another compilation unit. For this you will need to compile with relocatable device code (-rdc=true) and device linking. There are examples like this already published.
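As a rough sketch of that arrangement (file names here are illustrative, and both files must be compiled with nvcc -rdc=true and device-linked):

```cuda
// symbols.cu -- defines the symbol
__constant__ int const_data;

// other.cu -- a separate compilation unit
#include <cuda_runtime.h>

extern __constant__ int const_data;  // refers to the definition in symbols.cu

void upload(int value) {
    // cudaMemcpyToSymbol works here because the symbol is visible
    // to the device compiler in this translation unit
    cudaMemcpyToSymbol(const_data, &value, sizeof(value));
}
```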

I haven’t tried using a constant symbol with templating. At first glance, I don’t think it will work. (A simple test suggests to me it will not work: cudaErrorInvalidSymbol).

It is not clear to me what exactly you are trying to achieve, but an alternative to consider may be passing the desired data as a kernel argument, which is itself placed in constant memory. With recent hardware and software, CUDA is able to pass a good-sized chunk of data this way. This avoids separate cudaMemcpyToSymbol() calls.

CUDA kernel function parameters are passed to the device through constant memory and have been limited to 4,096 bytes. CUDA 12.1 increases this parameter limit from 4,096 bytes to 32,764 bytes on all device architectures including NVIDIA Volta and above.
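A minimal sketch of that alternative (the struct and names are made up for illustration): instead of a __constant__ symbol, the values travel as a by-value struct argument, which the driver stages in the constant-memory parameter buffer for you:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Aggregate the values you would otherwise put in __constant__ memory.
struct Params {
    int a1, a2, a3, a4;
};

// The struct is passed by value; kernel parameters are staged in
// constant memory by the driver, so reads of p.* are constant-memory reads.
__global__ void kernel(Params p) {
    printf("%d %d %d %d\n", p.a1, p.a2, p.a3, p.a4);
}

int main() {
    Params p{1, 2, 3, 4};
    kernel<<<1, 1>>>(p);   // no cudaMemcpyToSymbol needed
    cudaDeviceSynchronize();
    return 0;
}
```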

In fact, I want to encapsulate a C++ class for CUDA's constant memory that implements upload, download, and memset. But I have to export each device pointer from the .cu file that defines the constant memory, so if there are many constant variables, a get_pointer() has to be implemented for each one.

I do not know your use case, but off-hand I fail to see the benefits of such an arrangement.

Note that there is not one single constant memory; there are multiple banks of constant memory (the details differ by GPU architecture), one of which is used to back __constant__ data.

You could always aggregate multiple variables into a single data object, a struct in the simplest case, to reduce the number of distinct data objects in constant memory.
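A sketch of that aggregation idea (the struct and field names are hypothetical): one __constant__ struct replaces the four separate symbols, so a single get_pointer()/get_size() pair and a single cudaMemcpyToSymbol() cover all of them:

```cuda
#include <cuda_runtime.h>

struct ConstPack {
    int a1, a2, a3, a4;
};

__constant__ ConstPack c_pack;   // one symbol instead of four

// One accessor pair serves every field.
void* get_pointer() {
    void* ptr;
    cudaGetSymbolAddress(&ptr, c_pack);
    return ptr;
}

size_t get_size() {
    size_t size;
    cudaGetSymbolSize(&size, c_pack);
    return size;                 // sizeof(ConstPack)
}

void upload(const ConstPack& host_pack) {
    cudaMemcpyToSymbol(c_pack, &host_pack, sizeof(ConstPack));
}
```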
