Data Transfers Optimization aka Pinned Host Memory utilization

I’ve read This Article and tried the corresponding code From Here

The results are not what I expected: I can’t see any significant bandwidth increase for the “pinned memory” case. In some test runs it is even worse than for “usual” (pageable) memory. Any idea why, please?

./memtst

Device: NVIDIA GeForce GTX 1060 6GB
Transfer size (MB): 16

Pageable transfers
  Host to Device bandwidth (GB/s): 0.388620
  Device to Host bandwidth (GB/s): 0.417222

Pinned transfers
  Host to Device bandwidth (GB/s): 0.359622
  Device to Host bandwidth (GB/s): 0.419293

./memtst

Device: NVIDIA GeForce GTX 1060 6GB
Transfer size (MB): 16

Pageable transfers
  Host to Device bandwidth (GB/s): 0.387992
  Device to Host bandwidth (GB/s): 0.417588

Pinned transfers
  Host to Device bandwidth (GB/s): 0.390906
  Device to Host bandwidth (GB/s): 0.418569

./memtst

Device: NVIDIA GeForce GTX 1060 6GB
Transfer size (MB): 16

Pageable transfers
  Host to Device bandwidth (GB/s): 0.387717
  Device to Host bandwidth (GB/s): 0.416807

Pinned transfers
  Host to Device bandwidth (GB/s): 0.390276
  Device to Host bandwidth (GB/s): 0.419327

./memtst

Device: NVIDIA GeForce GTX 1060 6GB
Transfer size (MB): 16

Pageable transfers
  Host to Device bandwidth (GB/s): 0.387582
  Device to Host bandwidth (GB/s): 0.417603

Pinned transfers
  Host to Device bandwidth (GB/s): 0.389954
  Device to Host bandwidth (GB/s): 0.419298

Did you run this code on Linux or Windows?

The throughput numbers reported are incredibly low. With PCIe gen3 x16, one typically sees 12 to 13 GB/sec transferred across the interconnect (per direction). The numbers shown are below what I would expect even from a PCIe gen3 x1 connection. Check your system configuration; the GTX 1060 appears to be in the wrong PCIe slot.

Because the PCIe bandwidth is so low, the additional system memory copy required for pageable transfers (which the pinned case avoids) has no noticeable performance impact, since that copy runs at 40 GB/sec to 100 GB/sec (or more) depending on the platform.

Here is what I get on a Gen3 system:

$ ./t7

Device: Tesla V100-PCIE-32GB
Transfer size (MB): 16

Pageable transfers
  Host to Device bandwidth (GB/s): 3.236626
  Device to Host bandwidth (GB/s): 3.196644

Pinned transfers
  Host to Device bandwidth (GB/s): 11.669516
  Device to Host bandwidth (GB/s): 12.332706

$

While the GPU is busy, run the command “nvidia-smi -q”. What are the results below the “GPU Link Info” section?

Linux

It’s gen2 in fact ))