Pinned memory that's not

CUDA 10.2, compute capability 6.1

I allocate a couple of chunks of pinned memory, intending to cudaMemcpyAsync to them:

    // host mem
    cudaMallocHost((void **) &Iv.t, 53 * sizeof(uint32_t));
    cudaMallocHost((void **) &Args.can_h, 67000 * sizeof(uint64_t));

When I run the code and check the results in Nsight Systems, I find that the Iv.t memcpy behaves synchronously, and the tooltip for the transfer states:
Begins: 1.57065s
Ends: 1.57065s (+1.248 μs)
DtoH memcpy 212 bytes
Source memory kind: Device
Destination memory kind: Pageable
Throughput: 162.002 MiB/s
Launched from thread: 4200
Latency: ←46.367 μs
Correlation ID: 1654
Stream: Stream 14

The memcpy for Args.can_h behaves as expected:
Begins: 1.57049s
Ends: 1.57065s (+156.326 μs)
DtoH memcpy 523,480 bytes
Source memory kind: Device
Destination memory kind: Pinned
Throughput: 3.11867 GiB/s
Launched from thread: 4200
Latency: ←10.176 μs
Correlation ID: 1650
Stream: Stream 15

Why is Iv.t not being pinned?

Perhaps you re-allocated the pointer somewhere else using an ordinary allocator.

Perhaps you passed &Iv.t to cudaMemcpyAsync rather than Iv.t.

There may be other possibilities. A short, complete reproducer with full environment details may help to identify the reason.

Thanks Robert, and apologies for the noise. I had declared an array, (t[53];), inside the host struct…
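
For anyone who hits the same thing, here is a rough sketch of why an array member cannot end up pinned. It assumes the struct looked roughly like IvArray below; the real definition was not posted. fakeMallocHost is a hypothetical stand-in with the same shape as cudaMallocHost(void **, size_t), used only so the sketch builds without the CUDA toolkit, and mallocHostTyped, IvArray and IvPointer are illustrative names too. The point is that &Iv.t on an array member is a pointer-to-array, and the (void **) cast hides that mismatch from the compiler. The array's storage stays inside the struct, so Iv.t still refers to ordinary pageable memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Hypothetical stand-in with the same shape as cudaMallocHost(void**, size_t),
// so the sketch builds without the CUDA toolkit. It returns ordinary heap memory.
inline int fakeMallocHost(void** ptr, std::size_t bytes) {
    *ptr = std::malloc(bytes);
    return *ptr != nullptr ? 0 : 1;
}

// Typed wrapper (hypothetical): T** only binds to the address of a real
// pointer variable, so passing the address of an array fails to compile.
template <typename T>
int mallocHostTyped(T** ptr, std::size_t count) {
    void* raw = nullptr;
    const int err = fakeMallocHost(&raw, count * sizeof(T));
    *ptr = static_cast<T*>(raw);
    return err;
}

struct IvArray   { std::uint32_t t[53]; };  // storage is embedded in the struct
struct IvPointer { std::uint32_t* t; };     // t can be pointed at an allocation

// &iv.t on the array member is a pointer-to-array, not a pointer-to-pointer.
static_assert(std::is_same<decltype(&std::declval<IvArray&>().t),
                           std::uint32_t (*)[53]>::value,
              "address of an array member is pointer-to-array");
static_assert(std::is_same<decltype(&std::declval<IvPointer&>().t),
                           std::uint32_t**>::value,
              "address of a pointer member is pointer-to-pointer");

// Allocates 53 elements for the pointer member; returns true on success.
inline bool allocateIv(IvPointer& iv) {
    return mallocHostTyped(&iv.t, 53) == 0 && iv.t != nullptr;
}

// With the array member, the line below is rejected at compile time:
//   IvArray bad;  mallocHostTyped(&bad.t, 53);   // error: no matching function
```

With the pointer member, allocateIv(iv) leaves iv.t pointing at the new allocation. In real code, the body of mallocHostTyped would call cudaMallocHost instead of the stand-in and check the returned cudaError_t.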
