Hi, I’m testing P2P memory access and want to know how many sectors are transferred when a local GPU data cache miss occurs. I ran a kernel that reads peer GPU memory and collected the lts__t_requests_srcunit_tex_aperture_peer metric, but got a value of zero. Did I misunderstand something? And how to test pe…

Problems with lts__t_requests_srcunit_tex_aperture_peer

jmarusarz September 28, 2022, 8:34pm 4

I didn’t realize you were using NVLink, as opposed to PCIe. The peer metrics are for PCIe-connected GPUs. The NVLink section is what will show the communication between GPUs in your system. The only thing to be aware of is that the NVLink metrics are device-wide, so make sure that no other applications are generating traffic at the same time.
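For anyone who wants to reproduce this, here is a minimal sketch of the kind of kernel described in the question: device 0 reads a buffer that lives on device 1. The device indices, buffer size, and the `readPeer` and `CHECK` names are assumptions for illustration, not the original poster's code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the CUDA error and bail out of main().
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Runs on device 0; `peer` points to memory allocated on device 1.
__global__ void readPeer(const float* peer, float* out, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) out[i] = peer[i];
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) { printf("This sketch needs two GPUs.\n"); return 0; }

    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 0, 1));
    if (!canAccess) { printf("Device 0 cannot access device 1.\n"); return 0; }

    const size_t n = 1 << 24;  // assumed size: 64 MiB of floats
    float *src = nullptr, *dst = nullptr;

    // Allocate and initialize the source buffer on the peer (device 1).
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&src, n * sizeof(float)));
    CHECK(cudaMemset(src, 0, n * sizeof(float)));
    CHECK(cudaDeviceSynchronize());

    // Switch to device 0 and map device 1's memory into its address space.
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));
    CHECK(cudaMalloc(&dst, n * sizeof(float)));

    const unsigned threads = 256;
    const unsigned blocks = (unsigned)((n + threads - 1) / threads);
    readPeer<<<blocks, threads>>>(src, dst, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(1));
    CHECK(cudaFree(src));
    return 0;
}
```

When profiling something like this with Nsight Compute, `ncu --list-sections` shows which section identifiers (including the NVLink ones) are available in your version, and `--section` selects them. Because the NVLink counters are device-wide, it is best to run on an otherwise idle system.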
