programs in two GPU cards

welcoming · November 1, 2017, 12:19am

Hi, all:

I have a program written in OpenCL 1.2 that runs on two GPU cards at the same time. On the CPU side, I use OpenMP or Pthreads to start two threads, and each thread drives one GPU card. I have two machines: one has two K40 cards, the other has two P100 cards. The system environments of the two machines differ somewhat, but the compiler and runtime library are the same: both use the Intel compiler 2016.

That is the description of my program.

First, I ran my program on the two K40 cards. Everything was fine: with two cards the program runs about 1.6 times as fast as with a single card.

Then I ran my program on the two P100 cards, and something interesting happened: with two cards the program is slower than with a single card.

Here is an example.

When all the data is on one card, the time is 150 ms.
When the data is split across two cards, each card processes only half of the data:
When the two cards run in parallel, the total time is 220 ms; that is, each card takes a little less than 220 ms.
When the two cards run sequentially, the total time is 160 ms; that is, each card takes about 80 ms.
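
As a rough sanity check on the numbers above, here is a small Python sketch that turns the reported P100 timings into speedup figures. It is only arithmetic on the measurements in this post; the function and variable names are made up for illustration, and the "ideal" case assumes the two cards could overlap perfectly, which is an assumption and not something that was measured.

```python
def speedup(single_ms, multi_ms):
    """Speedup of a two-card run relative to the single-card run."""
    return single_ms / multi_ms


# Timings reported in this post for the P100 machine (milliseconds).
single_ms = 150.0      # all data on one card
parallel_ms = 220.0    # two cards driven concurrently, half the data each
sequential_ms = 160.0  # two cards driven one after the other

# Each card takes about half of the sequential total, as reported (~80 ms).
per_card_ms = sequential_ms / 2

# Assumption: with perfect overlap, the parallel wall time would be close
# to the time one card needs for its half of the data.
ideal_parallel_ms = per_card_ms

observed = speedup(single_ms, parallel_ms)
ideal = speedup(single_ms, ideal_parallel_ms)

# How much longer each card takes when both run at the same time,
# using 220 ms as an upper bound for the per-card parallel time.
slowdown = parallel_ms / per_card_ms

print(f"observed two-card speedup: {observed:.2f}x")
print(f"ideal two-card speedup:    {ideal:.2f}x")
print(f"per-card slowdown when concurrent: {slowdown:.2f}x")
```

In other words, the observed two-card speedup on the P100 machine is below 1 (roughly 0.68), while each card's share of the work takes up to about 2.75 times longer when both cards are active at once than when it runs alone.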

(1) I double-checked the overhead of CPU thread initialization. It is less than 1 ms, so it can be ignored.
(2) I also confirmed that it is not a data transfer problem between the CPU and the GPU, because the pure kernel running time is longer in the parallel mode than in the sequential mode.
(3) The problem happens in both multithreading environments (OpenMP and Pthreads).
(4) When I use the K40 cards, there is no such problem.

Does anyone have any idea? Many thanks.

Best,

Zhongjun
