Pinned memory that's not

rs277 · February 11, 2024, 7:28pm

CUDA 10.2, compute capability 6.1

I allocate a couple of chunks of pinned memory, intending to cudaMemcpyAsync to them:

    // host mem
    cudaMallocHost((void **) &Iv.t, 53 * sizeof(uint32_t));
    cudaMallocHost((void **) &Args.can_h, 67000 * sizeof(uint64_t));

When I run the code and check the results in Nsight Systems, I find that the Iv.t memcpy behaves synchronously, and the tooltip for the transfer states:
Begins: 1.57065s
Ends: 1.57065s (+1.248 μs)
DtoH memcpy 212 bytes
Source memory kind: Device
Destination memory kind: Pageable
Throughput: 162.002 MiB/s
Launched from thread: 4200
Latency: ←46.367 μs
Correlation ID: 1654
Stream: Stream 14

The memcpy for Args.can_h behaves as expected:
Begins: 1.57049s
Ends: 1.57065s (+156.326 μs)
DtoH memcpy 523,480 bytes
Source memory kind: Device
Destination memory kind: Pinned
Throughput: 3.11867 GiB/s
Launched from thread: 4200
Latency: ←10.176 μs
Correlation ID: 1650
Stream: Stream 15

Why is Iv.t not being pinned?

Robert_Crovella · February 11, 2024, 10:57pm

Perhaps you re-allocated the pointer somewhere else using an ordinary allocator.

Perhaps you passed &Iv.t to cudaMemcpyAsync rather than Iv.t.

There may be other possibilities. A short, complete reproducer with full environment details may help to identify the reason.

rs277 · February 12, 2024, 12:01am

Thanks, Robert, and apologies for the noise: I had declared an array (t[53];) inside the host struct…
