Set the address of a symbol in Driver API

MartyMcFly · July 27, 2010, 7:48am

Hello,

Maybe this question has been asked before, but I couldn’t find an entry.

I’m working in C# with the driver API. For example, I have the following device code:

__device__ float *A = NULL;

__global__ void someFunc(float Theta)
{
    // Some access to A
}

In C#, I get the symbol from the module via cuModuleGetGlobal(), which gives me the device address of A.

The problem is that I don’t know the size of the array at design time, so what I need is something like this:

CUdeviceptr devptr;
cuMemAlloc(&devptr, 100 /* Example */);
cuSetGlobalAddress("A", devptr); /* A (hypothetical) method to set the address of the symbol */

Of course I could pass the pointer as a kernel parameter, but for performance reasons I would prefer not to, since the parameter list would be much longer than necessary.

Is there a way to set the address of the symbol programmatically?

Thanks

Martin

tera · July 27, 2010, 12:07pm

After calling cuMemAlloc(), you’ve got a pointer to device memory, which is all you need. You can pass that pointer as an argument to a kernel, or you can write it to a global variable on the device for later use.

What are you trying to achieve by setting the symbol address?

MartyMcFly · July 27, 2010, 2:33pm

Hello tera,

The first solution you mentioned would work, but I was thinking of a method where I don’t have to pass the parameters to the kernel every time.

I simply want to reduce the overhead of setting the parameters with the cuParamSet*() calls, which would also reduce the complexity of my code. Currently I have to call the kernel about 4000 times and set the kernel parameters every time. Most of the parameters I could simply set once as global variables.

What I’m looking for is a way to set the address of a global variable from host code, but as far as I know there is currently no such method.

You mentioned writing the pointer to a global variable on the device: how can I write the device pointer returned by cuMemAlloc to a variable on the device?

Do you have an idea for this?

Thanks

Martin

tera · July 27, 2010, 4:05pm

Just declare a pointer variable on the device, and copy the device pointer that you got from cudaMalloc() on the host into it:

// File scope: a pointer-sized variable in device memory
__device__ float *device_data;

// Host code
float *data;
cudaMalloc((void **)&data, 1234*sizeof(float));
cudaMemcpyToSymbol(device_data, &data, sizeof(device_data), 0, cudaMemcpyHostToDevice);

You cannot set the address of a variable on the device, because that would require some kind of linker to be run afterwards to fix up every access to that variable in device code.
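
For reference, a complete runtime API program built around this idea might look like the minimal sketch below. The kernel name scale, the scale factor, and the launch configuration are illustrative assumptions and do not come from this thread; the loop of 4000 launches only mirrors the use case described earlier.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Pointer-sized global in device memory; kernels read the buffer address from it.
__device__ float *device_data;

// Hypothetical kernel: only the scalar and the length are passed as parameters.
__global__ void scale(float theta, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        device_data[i] *= theta;
}

int main()
{
    const int n = 1234;  // array length, known only at run time in practice
    float *data = nullptr;

    // Allocate the buffer and clear it.
    if (cudaMalloc((void **)&data, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMemset(data, 0, n * sizeof(float)) != cudaSuccess) return 1;

    // Copy the pointer value itself (sizeof(float *) bytes), not the array contents.
    if (cudaMemcpyToSymbol(device_data, &data, sizeof(data)) != cudaSuccess) return 1;

    // The pointer is set once and reused by every launch.
    for (int iter = 0; iter < 4000; ++iter)
        scale<<<(n + 255) / 256, 256>>>(2.0f, n);

    cudaError_t err = cudaDeviceSynchronize();
    std::printf("%s\n", cudaGetErrorString(err));

    cudaFree(data);
    return err == cudaSuccess ? 0 : 1;
}
```

The important detail is the third argument of cudaMemcpyToSymbol(): it is the size of the pointer, because the symbol stores an address and not the array.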

MartyMcFly · July 28, 2010, 9:04am

Hello Tera,

Great, this is the method I was looking for. I had completely forgotten that it exists.

Thanks for your help.

MartyMcFly · July 28, 2010, 4:23pm

Can somebody tell me what the driver API equivalent of cudaMemcpyToSymbol is?

Thanks

Martin

tera · July 28, 2010, 4:44pm

In the driver API you get the symbol address with cuModuleGetGlobal() and then do a normal cuMemcpyHtoD().
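
A minimal host-side sketch of these two calls, under some assumptions: the device code shown in the leading comment is compiled separately into a file called kernel.ptx (a hypothetical name), the symbol and kernel are declared extern "C" so their names are not mangled, and the kernel takes an extra length parameter n that is not in the original post. It also uses cuLaunchKernel(), which in later CUDA versions replaced the cuParamSet*() calls discussed above.

```cuda
// Device side, compiled separately (assumption), e.g. with: nvcc -ptx kernel.cu
//
//   extern "C" {
//   __device__ float *A;
//   __global__ void someFunc(float Theta, int n)
//   {
//       int i = blockIdx.x * blockDim.x + threadIdx.x;
//       if (i < n) A[i] *= Theta;
//   }
//   }

#include <cstdio>
#include <cstdlib>
#include <cuda.h>

// Abort with a message when a driver API call fails.
#define CHECK(call)                                                     \
    do {                                                                \
        CUresult res_ = (call);                                         \
        if (res_ != CUDA_SUCCESS) {                                     \
            std::fprintf(stderr, "%s failed (%d)\n", #call, (int)res_); \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

int main()
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction func;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));
    CHECK(cuDevicePrimaryCtxRetain(&ctx, dev));
    CHECK(cuCtxSetCurrent(ctx));
    CHECK(cuModuleLoad(&mod, "kernel.ptx"));  // hypothetical file name
    CHECK(cuModuleGetFunction(&func, mod, "someFunc"));

    // 1. Allocate the array; its size is only known at run time.
    int n = 100;
    CUdeviceptr buffer;
    CHECK(cuMemAlloc(&buffer, n * sizeof(float)));
    CHECK(cuMemsetD32(buffer, 0, n));

    // 2. Look up the symbol. symSize is the size of the pointer variable A,
    //    not the size of any array.
    CUdeviceptr symAddr;
    size_t symSize;
    CHECK(cuModuleGetGlobal(&symAddr, &symSize, mod, "A"));
    if (symSize != sizeof(buffer)) {
        std::fprintf(stderr, "unexpected symbol size %zu\n", symSize);
        return EXIT_FAILURE;
    }

    // 3. Store the buffer's address in A: copy the pointer value, not the data.
    CHECK(cuMemcpyHtoD(symAddr, &buffer, symSize));

    // 4. Launch repeatedly; only the small scalar arguments are passed each time.
    float theta = 2.0f;
    void *args[] = { &theta, &n };
    for (int iter = 0; iter < 4000; ++iter)
        CHECK(cuLaunchKernel(func, (n + 127) / 128, 1, 1, 128, 1, 1,
                             0, nullptr, args, nullptr));
    CHECK(cuCtxSynchronize());

    CHECK(cuMemFree(buffer));
    CHECK(cuModuleUnload(mod));
    CHECK(cuDevicePrimaryCtxRelease(dev));
    return 0;
}
```

The source of the copy in step 3 is the host variable that holds the device address, and the byte count is the size of that address. The array itself is allocated by cuMemAlloc() and filled through its own CUdeviceptr.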

cmaster.matso · August 10, 2011, 5:24am

The problem is that it DOES NOT work this way. When you call cuModuleGetGlobal() on the name of a device pointer, you get only the address of THE POINTER (and the size of THE POINTER in bytes); nothing is allocated for it. Thus, calling cuMemcpyHtoD() with a data array of a given length, say N, returns CUDA_ERROR_INVALID_VALUE. How is the driver supposed to know the amount of memory needed at the call to cuMemcpyHtoD() (here N times the size of the pointed-to type, in bytes)? With no prior allocation you are trying to copy host data of a given length into device data of fixed size (not N), namely the size of the pointer (8 bytes on my computer). The question is how to allocate such a pointer, or how to map memory allocated with cuMemAlloc() to a name declared in the kernel code as a device memory variable. I don’t want to use kernel parameters, of course.

My hardware is GeForce 9600 GT. Please correct me if I got something wrong here.

Regards,

MK
