QD4_33
October 22, 2008, 12:37pm
1
Hi,
I tried to copy some kernel calculated values to constant memory, because I want to reuse them in other kernels.
Unfortunately I get segmentation faults.
Some little code snippet:
#include <stdio.h>

__constant__ float testVar[1];

__global__ void test_kernel( float *testVar_d )
{
    testVar_d[ 0 ] = 7.0f;
}

__global__ void output_kernel( float *out_d )
{
    out_d[ 0 ] = testVar[0];
}

int main()
{
    float *testVar_d;
    cudaMalloc( (void**)&testVar_d, sizeof( float ) );
    float *out_d;
    cudaMalloc( (void**)&out_d, sizeof( float ) );
    float h = 1;
    cudaMemcpyToSymbol( testVar, &h, sizeof( float ) );
    dim3 dimGrid( 1 );
    dim3 dimBlock( 1 );
    test_kernel<<< dimGrid, dimBlock >>>( testVar_d );
    cudaThreadSynchronize();
    cudaMemcpyToSymbol( testVar, testVar_d, sizeof( float ), cudaMemcpyDeviceToDevice );
    output_kernel<<< dimGrid, dimBlock >>>( out_d );
    float result = 2;
    cudaMemcpy( &result, out_d, sizeof( float ), cudaMemcpyDeviceToHost );
    printf( "%f\n", result );
    return 0;
}
This program should print 7 if everything works fine…
Any ideas how to copy from global memory to constant memory without making a copy to host memory?
Thx ;o)
DarkAr
October 23, 2008, 7:39am
2
You cannot copy from device memory to constant memory directly.
You first need to copy the value from device memory to system memory, and then from system memory to the symbol (constant memory) :D
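The host round trip described in this reply could look roughly like the sketch below. It reuses testVar, test_kernel and output_kernel from the first post; the host variable name staged is only illustrative, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ float testVar[1];

__global__ void test_kernel(float *testVar_d)
{
    testVar_d[0] = 7.0f;
}

__global__ void output_kernel(float *out_d)
{
    out_d[0] = testVar[0];
}

int main()
{
    float *testVar_d = nullptr;
    float *out_d = nullptr;
    cudaMalloc((void**)&testVar_d, sizeof(float));
    cudaMalloc((void**)&out_d, sizeof(float));

    // Compute the value on the device.
    test_kernel<<<1, 1>>>(testVar_d);

    // Step 1: device (global) memory -> host memory.
    // This cudaMemcpy runs after the kernel above has finished.
    float staged = 0.0f;  // illustrative name, not from the thread
    cudaMemcpy(&staged, testVar_d, sizeof(float), cudaMemcpyDeviceToHost);

    // Step 2: host memory -> constant symbol.
    // offset and kind are left at their defaults (0, cudaMemcpyHostToDevice).
    cudaMemcpyToSymbol(testVar, &staged, sizeof(float));

    // Read the constant back through a second kernel.
    output_kernel<<<1, 1>>>(out_d);
    float result = 0.0f;
    cudaMemcpy(&result, out_d, sizeof(float), cudaMemcpyDeviceToHost);
    printf("%f\n", result);

    cudaFree(testVar_d);
    cudaFree(out_d);
    return 0;
}
```

The cost of this approach is two transfers across the bus for every update of the constant, which matters only if the value changes often.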
QD4_33
October 27, 2008, 10:20am
3
I think you are right.
But the reference manual says…
cudaMemcpyToSymbol - copies data from host memory to GPU
[…]
Copies count bytes from the memory area pointed to by src to the memory area pointed to by offset bytes from the start of symbol symbol. The memory areas may not overlap. symbol can either be a variable that resides in global or constant memory space, or it can be a character string, naming a variable that resides in global or constant memory space. kind can be either cudaMemcpyHostToDevice or cudaMemcpyDeviceToDevice.
I hope whoever writes the reference manual is reading this.
If a device-to-device copy is allowed but copying to constant device memory is not, the manual should say so.
Yes, I had the same problem.
I was trying to copy data from device (global) memory to constant memory,
but it never worked.
So I think that we can only copy data from host memory to constant memory.
I think the reference manual is mistaken on this point.
QD4_33
November 11, 2008, 12:05pm
5
I got a hint!
Definition after includes…
__constant__ float constant_device_variable;
Host code…
float *device_pointer;
cudaMalloc( (void**)&device_pointer, device_pointer_size );
[...]
cudaMemcpyToSymbol( "constant_device_variable", device_pointer, sizeof( float ), 0, cudaMemcpyDeviceToDevice );
This version works.
The important part is to pass an explicit offset of 0. Perhaps cudaMemcpyDeviceToDevice is interpreted as the offset when the offset argument is left out of the call.
edit: perhaps cudaMemcpyHostToDevice is equal to zero. If this assumption is true, the reference manual is ok but the sample code in the programming guide is misleading.
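For reference, the CUDA runtime declares cudaMemcpyToSymbol with offset as the fourth parameter (default 0) and kind as the fifth (default cudaMemcpyHostToDevice). cudaMemcpyDeviceToDevice has the value 3 in the cudaMemcpyKind enum, so the four-argument call in the first post most likely bound it to offset and left kind at the host-to-device default. Below is a sketch of the first post's program with the offset written out. Two things in it are assumptions rather than part of the thread: the helper ok is hypothetical, and the symbol is passed directly instead of by name, because character-string symbol names were removed in CUDA 5.0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__constant__ float testVar[1];

__global__ void test_kernel(float *testVar_d)
{
    testVar_d[0] = 7.0f;
}

__global__ void output_kernel(float *out_d)
{
    out_d[0] = testVar[0];
}

// Hypothetical helper (not from the thread): report a failed runtime call.
static bool ok(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    float *testVar_d = nullptr;
    float *out_d = nullptr;
    if (!ok(cudaMalloc((void**)&testVar_d, sizeof(float)), "cudaMalloc testVar_d")) return 1;
    if (!ok(cudaMalloc((void**)&out_d, sizeof(float)), "cudaMalloc out_d")) return 1;

    test_kernel<<<1, 1>>>(testVar_d);
    if (!ok(cudaDeviceSynchronize(), "test_kernel")) return 1;

    // Fourth argument: offset into the symbol (0).
    // Fifth argument: copy kind, here global memory -> constant memory.
    if (!ok(cudaMemcpyToSymbol(testVar, testVar_d, sizeof(float), 0,
                               cudaMemcpyDeviceToDevice),
            "cudaMemcpyToSymbol")) return 1;

    output_kernel<<<1, 1>>>(out_d);
    float result = 0.0f;
    if (!ok(cudaMemcpy(&result, out_d, sizeof(float), cudaMemcpyDeviceToHost),
            "cudaMemcpy")) return 1;
    printf("%f\n", result);

    cudaFree(testVar_d);
    cudaFree(out_d);
    return 0;
}
```

Checking the return value of each call makes this kind of argument mix-up show up as an error message rather than a crash later on.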