Accessing GPU global memory allocated on device - by host

Hi again, let me put some code in here first:

// memory allocation done by the GPU
__global__ void
malloc_test( int** p ) {
	*p = new int(7);
	printf( "malloc address:\t%p\n", *p );
}

// deallocation done by the GPU
__global__ void
free_test( int* p ) {
	printf( "deallo address:\t%p\n", p );
	delete p;
}

// print an address and the value stored there
__global__ void
print( int* p ) {
	printf( "address print:\t%p\tvalue:\t%d\n", p, *p );
}

// returns a pointer to memory allocated on the GPU (by the GPU itself)
int*
malloc_gpu() {
	int** d_ptr;
	rcudaError( cudaMalloc( &d_ptr, sizeof(int*) ) );

	malloc_test <<<1,1>>> (d_ptr);
	cudaDeviceSynchronize();

	int* result;
	rcudaError( cudaMemcpy( &result, d_ptr, sizeof(int*), cudaMemcpyDeviceToHost ) );

	rcudaError( cudaFree(d_ptr) );

	return result;
}

int
main() {
	// get a pointer to GPU memory (allocated FROM THE KERNEL)
	int* dev_p = malloc_gpu();

	// print its value
	print <<<1,1>>> (dev_p);
	cudaDeviceSynchronize();

	int p[] = {5};
	// an attempt to copy a value from host to device -
	// unfortunately, it returns an error...
	rcudaError( cudaMemcpy( dev_p, p, sizeof(int), cudaMemcpyHostToDevice ) );

	// the same thing, but using a pointer allocated on the GPU by the host
	int* dev_malloc_p;
	rcudaError( cudaMalloc( &dev_malloc_p, sizeof(int) ) );
	rcudaError( cudaMemcpy( dev_malloc_p, p, sizeof(int), cudaMemcpyHostToDevice ) );

	// print again from the pointer allocated by the GPU (still shows 7 instead of 5)
	print <<<1,1>>> (dev_p);
	cudaDeviceSynchronize();

	// print from the pointer allocated by the HOST (prints the correct value, 5)
	print <<<1,1>>> (dev_malloc_p);
	cudaDeviceSynchronize();

	rcudaError( cudaFree( dev_malloc_p ) );

	// free the memory (deallocated FROM THE KERNEL)
	free_test <<<1,1>>> (dev_p);
	cudaDeviceSynchronize();

	SYSTEM_PAUSE;
}

So… as described step by step, I see only one explanation.
Unless I made some stupid mistake, the host has no right to write to memory allocated by the GPU!
That seems stupid, but I think it's because cudaMemcpy "checks" its pointer arguments and doesn't recognize dev_p - it wasn't allocated with cudaMalloc, so it gets treated as an invalid argument.

Please tell me if I'm wrong and made a mistake somewhere - or at least whether there is a way to solve this problem!

Here is a link to a screenshot of the console output:
(line 54 in the screenshot differs from my code above; here it is line number 50)

Thanks in advance!

The following line seems odd to me:

in malloc_gpu()

int* result;
rcudaError( cudaMemcpy( &result, d_ptr, sizeof(int*), cudaMemcpyDeviceToHost ) );

You declare a pointer that points nowhere (no initialization), then you try to copy a value back from the GPU - which can only be copied to… nowhere.

Try adding:

int* result;
result = (int*) malloc(sizeof(int));
rcudaError( cudaMemcpy( &result, d_ptr, sizeof(int*), cudaMemcpyDeviceToHost ) );

[…] deallocate with free elsewhere

I didn't try to dig further into your code, so there may be other problems.

One more piece of advice: you might also be interested in cudaMemcpyFromSymbol to get addresses of functions or any other data pointer residing in global or constant memory.
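A hedged sketch of what cudaMemcpyFromSymbol can do here - note that it only works with statically declared `__device__`/`__constant__` symbols, not with arbitrary runtime pointers (the names `g_ptr`, `alloc_into_symbol`, and `read_symbol` below are illustrative, not from the original code):

```cuda
#include <cuda_runtime.h>

// Hypothetical statically declared device pointer - a named symbol the
// runtime can find, unlike a pointer created at runtime.
__device__ int* g_ptr;

__global__ void alloc_into_symbol() {
	g_ptr = new int(7);  // device-heap allocation stored in the symbol
}

// Host side: retrieve the pointer value stored in the symbol.
int* read_symbol() {
	alloc_into_symbol <<<1,1>>> ();
	cudaDeviceSynchronize();

	int* h_copy = nullptr;
	// Copies the contents of g_ptr (the pointer value itself) back to the host.
	cudaMemcpyFromSymbol( &h_copy, g_ptr, sizeof(int*) );
	return h_copy;  // still a device-heap address - not usable with host cudaMemcpy
}
```

The "invalid device symbol" error appears when the second argument is not such a statically declared symbol.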

Good luck

Hmm… it is not copied to nowhere -
you could say this actually *is* the initialization of that pointer:
cudaMemcpy writes to 'result' itself, not to whatever it points to (note the & reference before 'result').

I thought about cudaMemcpyFromSymbol and cudaGetSymbolAddress, but…
they don't work when I pass this pointer to them (error: invalid device symbol).

Nobody knows ): ?

For me- it’s very weird and… i’d rather even say: pathetic that i cannot access memory allocated on GPU by GPU from host. I can’t believe it, because i don’t see a real reason for this.
So i still think, there must be a way to make it.
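For what it's worth, this restriction is documented: memory allocated with device-side malloc/new comes from a separate device heap, and such pointers cannot be passed to host-side runtime calls like cudaMemcpy or cudaFree. A sketch of a possible workaround (assuming the dev_p pointer from the code above; the kernel and helper names are made up for illustration) is to let a tiny kernel do the copy between a cudaMalloc'd staging buffer and the device-heap pointer:

```cuda
#include <cuda_runtime.h>

// Runs on the device, where the device-heap pointer IS usable.
__global__ void copy_into_heap_ptr( int* heap_p, const int* staging ) {
	*heap_p = *staging;
}

// Host side (sketch): write `value` into memory returned by malloc_gpu().
void write_to_device_heap( int* dev_p, int value ) {
	int* staging;
	cudaMalloc( &staging, sizeof(int) );                                 // host-visible handle
	cudaMemcpy( staging, &value, sizeof(int), cudaMemcpyHostToDevice );  // host -> staging
	copy_into_heap_ptr <<<1,1>>> ( dev_p, staging );                     // staging -> device heap
	cudaDeviceSynchronize();
	cudaFree( staging );
}
```

Reading back works the same way in reverse: a kernel copies from the device-heap pointer into the staging buffer, then cudaMemcpy brings the staging buffer to the host.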