I’ve tried to share memory between OpenCL and CUDA functions. I allocated GPU memory with cudaMalloc and then passed that pointer to clCreateBuffer as the host pointer.
When I first copy data from host memory to device memory with cudaMemcpy and the original CUDA pointer, and then copy it back to host memory with clEnqueueReadBuffer on the cl_mem object returned by clCreateBuffer, I successfully get the original data back.
However, if I first copy the data from host memory to device memory with clEnqueueWriteBuffer and then copy it back with cudaMemcpy, I get invalid data.
I’m not sure what causes this behaviour. Is it just luck that the first case works, or is this a valid way to share memory between OpenCL and CUDA applications? Does it have anything to do with CUDA’s UVA (Unified Virtual Addressing) support?
I would be glad if anybody could help me…