BUG: Linking together multiple .cu files and using the same __constant__ symbols

Erik_Valkering · October 24, 2009, 11:42am

Hello everyone,

I’m trying to separate my kernels into multiple .cu files in order to fine-tune the nvcc compile options per kernel. Since they all depend on the same __constant__ data structures, I must call cudaMemcpyToSymbol for each file. However, this does not work as expected: when the kernels are launched, all but one of the data structures are empty. The only copy that arrives correctly is the one in the last object file passed to the linker. It seems that although __constant__ declarations cannot be referenced externally, they also cannot share a name across files.

The following code can be used to reproduce the problem:

kernel1.cu:

[codebox]#include <stdio.h>

__constant__ float value;
__constant__ float value_kernel1;

extern "C" void set_values_kernel1(float value)
{
	cudaMemcpyToSymbol("value", &value, sizeof(float), 0, cudaMemcpyHostToDevice);
	cudaMemcpyToSymbol("value_kernel1", &value, sizeof(float), 0, cudaMemcpyHostToDevice);
}

__global__ void kernel1(float *data)
{
	if (threadIdx.x == 0 && threadIdx.y == 0)
	{
		data[0] = value;
		data[1] = value_kernel1;
	}
}

extern "C" void launch_kernel1(void)
{
	float value[2];
	float *devPtr;
	cudaMalloc((void **)&devPtr, sizeof(value));
	dim3 grid(1);
	dim3 block(32);
	kernel1<<<grid, block>>>(devPtr);
	cudaThreadSynchronize();
	cudaMemcpy(&value, devPtr, sizeof(value), cudaMemcpyDeviceToHost);
	printf("kernel1, __constant__ value: %f\n", value[0]);
	printf("kernel1, __constant__ value_kernel1: %f\n", value[1]);
	cudaFree(devPtr);
}
[/codebox]

kernel2.cu:

[codebox]#include <stdio.h>

__constant__ float value;
__constant__ float value_kernel2;

extern "C" void set_values_kernel2(float value)
{
	cudaMemcpyToSymbol("value", &value, sizeof(float), 0, cudaMemcpyHostToDevice);
	cudaMemcpyToSymbol("value_kernel2", &value, sizeof(float), 0, cudaMemcpyHostToDevice);
}

__global__ void kernel2(float *data)
{
	if (threadIdx.x == 0 && threadIdx.y == 0)
	{
		data[0] = value;
		data[1] = value_kernel2;
	}
}

extern "C" void launch_kernel2(void)
{
	float value[2];
	float *devPtr;
	cudaMalloc((void **)&devPtr, sizeof(value));
	dim3 grid(1);
	dim3 block(32);
	kernel2<<<grid, block>>>(devPtr);
	cudaThreadSynchronize();
	cudaMemcpy(&value, devPtr, sizeof(value), cudaMemcpyDeviceToHost);
	printf("kernel2, __constant__ value: %f\n", value[0]);
	printf("kernel2, __constant__ value_kernel2: %f\n", value[1]);
	cudaFree(devPtr);
}
[/codebox]

main.cpp:

[codebox]extern "C" void set_values_kernel1(float value);
extern "C" void set_values_kernel2(float value);
extern "C" void launch_kernel1(void);
extern "C" void launch_kernel2(void);

int main(int argc, char *argv[])
{
	// Upload 0.5 through kernel1.cu and 0.7 through kernel2.cu, then read both back.
	set_values_kernel1(0.5f);
	set_values_kernel2(0.7f);
	launch_kernel1();
	launch_kernel2();
}
[/codebox]

To build the test-program, issue:

nvcc -c kernel1.cu

nvcc -c kernel2.cu

nvcc main.cpp kernel1.o kernel2.o

Running the program produces incorrect output (every kernel1 value should be 0.5 and every kernel2 value should be 0.7):

kernel1, __constant__ value: 0.000000

kernel1, __constant__ value_kernel1: 0.500000

kernel2, __constant__ value: 0.700000

kernel2, __constant__ value_kernel2: 0.700000

When linking kernel2 before kernel1, the output becomes:

nvcc main.cpp kernel2.o kernel1.o

kernel1, __constant__ value: 0.700000

kernel1, __constant__ value_kernel1: 0.500000

kernel2, __constant__ value: 0.000000

kernel2, __constant__ value_kernel2: 0.700000

It seems that the __constant__ value of kernel1 is overwritten by the third cudaMemcpyToSymbol call (using the value 0.7).

The platform I’m using is Ubuntu 9.04 32-bit, CUDA 2.3, GeForce 8800M GTX.

Is this normal behaviour, or should it be considered a bug?

montmorency · November 14, 2009, 4:55pm

I have the same problem. The Programming guide points out:

Did you find a workaround for this?

Erik_Valkering · November 15, 2009, 10:55am

Actually I have.

I created a utility class, which I call the ‘symbol manager’.

Each __constant__ declaration in a .cu file uses a macro to give the variable a unique name and to register it with the symbol manager.

When data is uploaded to constant memory, the symbol manager copies it to the matching variable in every registered file.

Include in each .cu file the following:

[codebox]
#include "symbolmanager.h"

#define KERNEL_NAME(name) name ## _ ## kernelName // replace 'kernelName' with a unique name identifying this .cu file[/codebox]

To declare a constant, replace

[codebox]__constant__ type name;[/codebox]

with:

[codebox]
DEFINE_SYMBOL(type, name);

#define name REF_SYMBOL(name)
[/codebox]

To upload the data to constant memory for each file simultaneously, call:

[codebox]

memcpyToSymbol("systemParameters", &systemParameters);

[/codebox]

To use the constant data, replace the name of the constant variable with:

[codebox]

name // replace name with the name of the variable

[/codebox]

symbolmanager.h:

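[codebox]
#pragma once

#include <string.h> // strcmp, used inside DEFINE_SYMBOL

// Declares a per-file __constant__ variable with a unique name, a callback that
// copies to it when asked for 'name', and a Symbol object that registers the callback.
// KERNEL_NAME must be defined by the including .cu file. CUDA_SAFE_CALL is assumed
// to be the error-checking macro from the CUDA SDK's cutil.h.
#define DEFINE_SYMBOL(type, name) \
	__constant__ type KERNEL_NAME(name); \
	extern "C" void KERNEL_NAME(memcpyToSymbol_##name)(const char *symbolName, void *data) \
	{ \
		if (strcmp(symbolName, #name) == 0) \
		{ \
			size_t size; \
			CUDA_SAFE_CALL(cudaGetSymbolSize(&size, KERNEL_NAME(name))); \
			CUDA_SAFE_CALL(cudaMemcpyToSymbol(KERNEL_NAME(name), data, size)); \
		} \
	} \
	Symbol KERNEL_NAME(symbol_##name)(&KERNEL_NAME(memcpyToSymbol_##name))

#define REF_SYMBOL(name) KERNEL_NAME(name)

class Symbol;

void registerSymbol(Symbol *symbol);
void memcpyToSymbol(const char *symbolName, void *data);

typedef void (*MEMCPYTOSYMBOLCALLBACK)(const char *, void *);

// Registers its callback with the symbol manager on construction.
class Symbol
{
public:
	Symbol(MEMCPYTOSYMBOLCALLBACK callback)
	{
		this->callback = callback;
		registerSymbol(this);
	}

	MEMCPYTOSYMBOLCALLBACK callback;
};
[/codebox]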

symbolmanager.cpp:

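[codebox]
#include "symbolmanager.h"
#include <vector>

// Set on first registration; stays null if no symbol was ever registered.
std::vector<Symbol *> *symbols = 0;

void registerSymbol(Symbol *symbol)
{
	// Function-local static: constructed on the first call, so it is ready even
	// when Symbol constructors in other files run before this file's globals.
	static std::vector<Symbol *> storage;
	storage.push_back(symbol);
	symbols = &storage;
}

void memcpyToSymbol(const char *symbolName, void *data)
{
	if (!symbols)
		return;

	// Every callback sees the request and copies only if the name matches.
	for (size_t i = 0; i < symbols->size(); i++)
	{
		(*symbols)[i]->callback(symbolName, data);
	}
}
[/codebox]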

Kind Regards,

Erik

montmorency · November 17, 2009, 9:18pm

great. thanks very much.
