cudaSetDevice in each device function call?

nightingx9 · August 20, 2009, 3:21am

Hi All,

I’m coding on CUDA 2.3 and found a small problem with the function cudaSetDevice. I ran a profiler, and the results are listed below:

| Calls | % Incl | % Excl | Depth | Function | Module | Incl Time | Excl Time |
|---|---|---|---|---|---|---|---|
| 79,862 | 14.67 | 0.01 | 9 | CuSubSetotprob | HERest.OffInst.exe | 8,256,686,016 | 6,742,320 |
| 79,862 | 1.42 | 0.00 | 10 | RtlVirtualUnwind + 3 | HERest.OffInst.exe | 797,731,140 | 1,210,080 |
| 79,862 | 1.42 | 1.42 | 11 | cudaConfigureCall | cudart.dll | 796,521,060 | 796,521,060 |
| 79,862 | 11.37 | 0.02 | 10 | _my_device_function | HERest.OffInst.exe | 6,398,619,732 | 8,985,072 |
| 878,482 | 3.17 | 0.00 | 11 | _cudaRegisterFunction + 3 | HERest.OffInst.exe | 1,784,407,788 | 0 |
| 79,862 | 8.18 | 0.01 | 11 | cudaSetDevice + 3 | HERest.OffInst.exe | 4,605,226,872 | 3,018,540 |
| 159,724 | 1.87 | 0.01 | 10 | _cudaUnregisterFatBinary | HERest.OffInst.exe | 1,053,592,824 | 3,624,420 |

You can see that cudaSetDevice is called on every device function call and takes up 8.18% of the whole program's CPU time:

…

_my_device_function → _cudaRegisterFunction // called every time, but reasonable

_my_device_function → cudaSetDevice // called every time, and unreasonable

…

In fact, I already call cudaSetDevice at the very beginning of my program, so I think this 8.18% is wasted. Can anyone explain this?
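For readers following along, here is a minimal sketch of the pattern described in this post: the device is selected once at startup, and the kernel is then launched many times with no further cudaSetDevice calls in user code. The kernel `scale`, the buffer size, and the iteration count are hypothetical stand-ins, not the poster's actual code, and the sketch assumes a newer toolkit than CUDA 2.3 (cudaDeviceSynchronize did not exist in that release).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the poster's device function.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    // Select the device once, before any other runtime call.
    cudaError_t err = cudaSetDevice(0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    const int n = 1024;  // illustrative size
    float *d_data = NULL;
    err = cudaMalloc((void **)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Repeated launches; user code does not call cudaSetDevice again.
    for (int iter = 0; iter < 1000; ++iter) {
        scale<<<(n + 255) / 256, 256>>>(d_data, 1.0f, n);
    }

    // Wait for all launches and report any error they produced.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel execution failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

If a profile of a program shaped like this still attributes per-launch time to cudaSetDevice, the calls are not coming from the user's own source, which is the situation the post is asking about.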

nightingx9 · August 21, 2009, 10:16am

Any ideas on this?

SPWorley · August 21, 2009, 3:30pm

It’s likely that the SetDevice call was just initializing the whole GPU context. If you didn’t call it, the next CUDA function would do the init, and THAT would look slow.
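One way to check this explanation is to time the first runtime calls separately from later ones. The sketch below is an illustration under assumptions, not a definitive benchmark: it uses cudaFree(0) as a common idiom to force context creation (whether cudaSetDevice alone creates the context depends on the toolkit version), the helper `ms_since` and the repeat count are made up for the example, and it needs a C++11-capable nvcc.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: milliseconds elapsed since `start`.
static double ms_since(std::chrono::steady_clock::time_point start) {
    return std::chrono::duration<double, std::milli>(
               std::chrono::steady_clock::now() - start).count();
}

int main() {
    // First calls: expected to include one-time context initialization.
    std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
    cudaError_t err = cudaSetDevice(0);
    if (err == cudaSuccess) {
        err = cudaFree(0);  // idiom to force context creation
    }
    double first_ms = ms_since(t0);
    if (err != cudaSuccess) {
        fprintf(stderr, "initialization failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Later calls: the context already exists on this thread.
    const int repeats = 10000;  // illustrative count
    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        cudaSetDevice(0);
    }
    double later_ms = ms_since(t1) / repeats;

    printf("first cudaSetDevice + cudaFree(0): %.3f ms\n", first_ms);
    printf("average repeated cudaSetDevice:    %.6f ms\n", later_ms);
    return 0;
}
```

If initialization is the cost, the first figure should be far larger than the per-call average of the repeated calls. A per-call cost that stays high would point to something other than one-time context setup.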

nightingx9 · August 24, 2009, 1:54am

Hi SPWorley,

Based on my general programming experience, initialization calls happen only a few times. But in my profile, cudaSetDevice is called on every function call. Does this mean CUDA needs to initialize the GPU context every time? Can we avoid it?
