New DLI Training: Accelerating CUDA C++ Applications with Multiple GPUs

Originally published at: https://developer.nvidia.com/blog/new-dli-training-accelerating-cuda-c-applications-with-multiple-gpus/

Computationally intensive CUDA C++ applications in high-performance computing, data science, bioinformatics, and deep learning can be accelerated by using multiple GPUs, which can increase throughput, decrease total runtime, or both. When computation is overlapped concurrently with memory transfers, it can be scaled across multiple GPUs without increasing the cost of those transfers.…
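As a rough illustration of the idea (not code from the course itself), the sketch below splits one host array across every visible GPU and, within each GPU, across several non-default streams, so that the copy for one chunk can overlap with the kernel for another. The names `scale_kernel`, `scale_on_all_gpus`, and `streams_per_gpu` are hypothetical, the kernel is a placeholder workload, and error handling is reduced to an early return that does not release resources. The host buffer is assumed to be pinned (for example, allocated with `cudaMallocHost`), since asynchronous copies from pageable memory generally do not overlap with computation.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstddef>
#include <vector>

// Minimal error handling for a sketch: bail out on the first failing call.
// (Resources already created are not released on this path.)
#define CHECK(call) do { if ((call) != cudaSuccess) return false; } while (0)

// Hypothetical placeholder workload: doubles each element in place.
__global__ void scale_kernel(float* data, size_t n) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Splits `host` (n floats, assumed pinned) into one chunk per (GPU, stream)
// pair and runs copy-in, kernel, copy-out for each chunk in its own stream.
// Returns false if no GPU is available or any CUDA call fails.
bool scale_on_all_gpus(float* host, size_t n, int streams_per_gpu = 4) {
    int num_gpus = 0;
    if (cudaGetDeviceCount(&num_gpus) != cudaSuccess || num_gpus == 0) return false;
    if (streams_per_gpu < 1) return false;

    const size_t num_chunks = static_cast<size_t>(num_gpus) * streams_per_gpu;
    const size_t chunk = (n + num_chunks - 1) / num_chunks;  // ceiling division

    std::vector<cudaStream_t> streams(num_chunks);
    std::vector<float*> dev(num_chunks, nullptr);

    // Pass 1: create streams and allocate device buffers up front, because
    // cudaMalloc may synchronize the device and would hinder overlap later.
    for (int g = 0; g < num_gpus; ++g) {
        CHECK(cudaSetDevice(g));
        for (int s = 0; s < streams_per_gpu; ++s) {
            const size_t c = static_cast<size_t>(g) * streams_per_gpu + s;
            const size_t lo = std::min(c * chunk, n);
            const size_t len = std::min(chunk, n - lo);
            CHECK(cudaStreamCreate(&streams[c]));
            if (len > 0) CHECK(cudaMalloc(&dev[c], len * sizeof(float)));
        }
    }

    // Pass 2: enqueue asynchronous work. Calls return immediately, so all
    // GPUs and all streams are kept busy at the same time.
    for (int g = 0; g < num_gpus; ++g) {
        CHECK(cudaSetDevice(g));
        for (int s = 0; s < streams_per_gpu; ++s) {
            const size_t c = static_cast<size_t>(g) * streams_per_gpu + s;
            const size_t lo = std::min(c * chunk, n);
            const size_t len = std::min(chunk, n - lo);
            if (len == 0) continue;
            CHECK(cudaMemcpyAsync(dev[c], host + lo, len * sizeof(float),
                                  cudaMemcpyHostToDevice, streams[c]));
            const unsigned blocks = static_cast<unsigned>((len + 255) / 256);
            scale_kernel<<<blocks, 256, 0, streams[c]>>>(dev[c], len);
            CHECK(cudaGetLastError());
            CHECK(cudaMemcpyAsync(host + lo, dev[c], len * sizeof(float),
                                  cudaMemcpyDeviceToHost, streams[c]));
        }
    }

    // Pass 3: wait for every stream to finish, then release its resources.
    for (int g = 0; g < num_gpus; ++g) {
        CHECK(cudaSetDevice(g));
        for (int s = 0; s < streams_per_gpu; ++s) {
            const size_t c = static_cast<size_t>(g) * streams_per_gpu + s;
            CHECK(cudaStreamSynchronize(streams[c]));
            if (dev[c]) CHECK(cudaFree(dev[c]));
            CHECK(cudaStreamDestroy(streams[c]));
        }
    }
    return true;
}
```

A caller would allocate the input with `cudaMallocHost`, fill it, and call `scale_on_all_gpus(ptr, n)`. Whether the copies and kernels actually overlap depends on the hardware and on the chunk sizes, so a profiler such as Nsight Systems is the usual way to confirm it.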