Can pinned memory be accessed by the device?

Rookie_programmer · March 17, 2024, 9:26am

I was learning about pinned memory and zero-copy memory.

But I found that pinned memory can be accessed by the device without cudaMemcpy.

The GPU is an A800 and the CUDA version is 12. My code is as follows:

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void sumArraysGPU(float*a)
{
  int i=blockIdx.x*blockDim.x+threadIdx.x;
  printf("%f\n", a[i]);
}


int main(int argc,char **argv)
{
  int dev = 0;
  cudaSetDevice(dev);

  int nElem=32;

  int nByte=sizeof(float)*nElem;

  float *a_h=(float*)malloc(nByte);
  float *a_d;  
  cudaMallocHost((float**)&a_d,nByte);

  for(int i=0; i<32; i++)
    a_h[i] = i;
  
  cudaError_t error = cudaMemcpy(a_d,a_h,nByte,cudaMemcpyHostToDevice);
  
  if(error == cudaSuccess){
    printf("Success\n");
  }

  sumArraysGPU<<<1,32>>>(a_d);
  // Wait for the kernel to finish so its printf output is flushed.
  cudaDeviceSynchronize();

  free(a_h);
  cudaFreeHost(a_d);


  return 0;
}

I have some questions about the code:

In my understanding, pinned memory is located in CPU DRAM, which is different from zero-copy memory, which has two addresses, one in CPU DRAM and one in GPU DRAM. Why does the statement cudaMemcpy(a_d,a_h,nByte,cudaMemcpyHostToDevice) not throw an error, given that I tried to transfer data from CPU to CPU but used the cudaMemcpyHostToDevice parameter?

In the kernel function I printed the data and found that the data in a_d is the same as in a_h, so the data transfer was successful, but I didn’t declare a_d as zero-copy memory.

Is there something wrong with my understanding?

Robert_Crovella · March 17, 2024, 7:44pm

In my opinion, we shouldn’t draw much of a distinction between “pinned” and “zero-copy”. They are roughly the same thing.

Pinning memory is the act of placing it in a kind of storage that the GPU can access directly. Unit 7 of this online training series discusses this in more detail.

Zero copy means to access this data directly in GPU kernel code, as if you were accessing device memory. Both device allocations and pinned allocations reside in the logical global space, and both types of allocations are directly accessible from GPU kernel code, without requiring cudaMemcpy, per se.

Yes, pinned memory is located in CPU DRAM.

No, it is not different from zero-copy memory. If you are going to use a “zero-copy technique”, the data you do that with must be in pinned memory.
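A minimal sketch of the zero-copy technique, assuming a 64-bit system where unified virtual addressing is in effect so the host pointer can be passed straight to a kernel. The kernel name scaleInPlace and the helper zeroCopyMismatches are hypothetical names chosen for illustration, not part of the code discussed in this thread. The kernel writes into the pinned buffer and the host reads the result back with no cudaMemcpy in between.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each element of the buffer in place.
__global__ void scaleInPlace(float *data, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: returns the number of elements that do not hold the
// expected value after the kernel ran, or -1 if a CUDA call failed.
int zeroCopyMismatches(int n)
{
  float *buf = nullptr;
  // Pinned (page-locked) host allocation. Assumption: UVA is active, so this
  // pointer is also valid in device code.
  if (cudaMallocHost((void **)&buf, n * sizeof(float)) != cudaSuccess) return -1;
  for (int i = 0; i < n; i++) buf[i] = (float)i;

  int threads = 32;
  int blocks = (n + threads - 1) / threads;
  scaleInPlace<<<blocks, threads>>>(buf, n);  // no cudaMemcpy before the launch

  // The host must wait for the kernel before reading what it wrote.
  if (cudaDeviceSynchronize() != cudaSuccess) {
    cudaFreeHost(buf);
    return -1;
  }

  int bad = 0;
  for (int i = 0; i < n; i++)
    if (buf[i] != 2.0f * (float)i) bad++;

  cudaFreeHost(buf);
  return bad;
}

int main()
{
  printf("mismatches: %d\n", zeroCopyMismatches(32));
  return 0;
}
```

Every access the kernel makes to buf travels over the host interconnect, so this pattern tends to suit data that is touched once rather than reused many times.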

Pinned memory is accessible from device code, so you could say it is device-accessible, and it is a legitimate target for cudaMemcpy. I think what you will find is that you could use either cudaMemcpyHostToHost or cudaMemcpyHostToDevice. Either will work, because pinned memory can be accessed from either host or device.
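One way to check this is to copy into a pinned buffer with each transfer kind and compare the contents on the host. This is a sketch under the same unified-addressing assumption; the helper name copyToPinnedMatches is hypothetical, and cudaMemcpyDefault is included as a third kind that lets the runtime infer the direction from the pointers.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: copies a pageable host array into a pinned host
// buffer using the given transfer kind. Returns true only if cudaMemcpy
// reports success and the pinned buffer then holds the same values.
bool copyToPinnedMatches(cudaMemcpyKind kind, int n)
{
  size_t bytes = n * sizeof(float);
  float *src = (float *)malloc(bytes);
  float *pinned = nullptr;
  if (src == nullptr) return false;
  if (cudaMallocHost((void **)&pinned, bytes) != cudaSuccess) {
    free(src);
    return false;
  }
  for (int i = 0; i < n; i++) src[i] = (float)i;

  cudaError_t err = cudaMemcpy(pinned, src, bytes, kind);
  // Make sure any outstanding transfer has finished before the host reads.
  if (err == cudaSuccess) err = cudaDeviceSynchronize();

  bool ok = (err == cudaSuccess);
  for (int i = 0; ok && i < n; i++) ok = (pinned[i] == src[i]);

  cudaFreeHost(pinned);
  free(src);
  return ok;
}

int main()
{
  printf("HostToHost:   %s\n", copyToPinnedMatches(cudaMemcpyHostToHost, 32) ? "ok" : "failed");
  printf("HostToDevice: %s\n", copyToPinnedMatches(cudaMemcpyHostToDevice, 32) ? "ok" : "failed");
  printf("Default:      %s\n", copyToPinnedMatches(cudaMemcpyDefault, 32) ? "ok" : "failed");
  return 0;
}
```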

Rookie_programmer · March 18, 2024, 1:52am

The CUDA Runtime API documentation says that the cudaHostAllocMapped flag must be set in cudaHostAlloc to get mapped device memory for pinned memory.

The description of cudaHostAllocMapped in the documentation is as follows:

cudaHostAllocMapped: Maps the allocation into the CUDA address space. The device pointer to the memory may be obtained by calling cudaHostGetDevicePointer().

In my test code I didn’t set the flag above, so why can the memory be accessed by the device?

Robert_Crovella · March 18, 2024, 3:40am

Because in a 64-bit host OS, UVA is in effect, and in a UVA regime, all pinned allocations are automatically mapped, even if you don’t set the flag.

From here:

Automatic Mapping of Host Allocated Host Memory

All host memory allocated through all devices using cudaMallocHost() and cudaHostAlloc() is always directly accessible from all devices that support unified addressing. This is the case regardless of whether or not the flags cudaHostAllocPortable and cudaHostAllocMapped are specified.

The pointer value through which allocated host memory may be accessed in kernels on all devices that support unified addressing is the same as the pointer value through which that memory is accessed on the host. It is not necessary to call cudaHostGetDevicePointer() to get the device pointer for these allocations.
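The quoted behavior can be checked with a short sketch that queries the unifiedAddressing device property and compares the host pointer from cudaMallocHost with the device pointer returned by cudaHostGetDevicePointer. The helper name pinnedPointerMatches is hypothetical, and the sketch assumes device 0 is the device in use.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns 1 if the device pointer for a cudaMallocHost
// allocation equals the host pointer, 0 if they differ, and -1 if a CUDA
// call fails or the device does not report unified addressing.
int pinnedPointerMatches()
{
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return -1;
  if (!prop.unifiedAddressing) return -1;  // assumption of this sketch

  float *hostPtr = nullptr;
  // Note: no cudaHostAllocMapped flag is passed anywhere.
  if (cudaMallocHost((void **)&hostPtr, 32 * sizeof(float)) != cudaSuccess) return -1;

  float *devPtr = nullptr;
  cudaError_t err = cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0);

  // Ask the runtime how it classifies the pointer.
  cudaPointerAttributes attr;
  if (err == cudaSuccess) err = cudaPointerGetAttributes(&attr, hostPtr);

  int result = -1;
  if (err == cudaSuccess) {
    printf("is pinned host memory: %d\n", attr.type == cudaMemoryTypeHost);
    result = (devPtr == hostPtr) ? 1 : 0;
  }

  cudaFreeHost(hostPtr);
  return result;
}

int main()
{
  printf("pointers match: %d\n", pinnedPointerMatches());
  return 0;
}
```

On a system without unified addressing the helper returns -1 instead of attempting the comparison, since the automatic mapping described above would not apply there.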

system · April 1, 2024, 3:41am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
