I’m experiencing a problem running CUDA programs on a Linux system (x64, Tesla T10). Every program (even the SDK samples) takes about 2-4 seconds to execute its first CUDA command (initialization, sometimes a memory allocation, etc.).
I guessed that the CUDA runtime was compiling PTX code for the T10 architecture, but adding the -arch and -code options to my nvcc command line didn’t help (and googling for an answer didn’t help either).
The problem gets really annoying when I try to use 4 GPUs, because it then takes about 12 seconds to initialize all of them.
What’s more interesting: initializing one GPU slows down the others (memory allocation takes about 1.5 seconds on each GPU, and I would have assumed it could be done in parallel, or shouldn’t it?).
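For concreteness, the kind of build line I mean is something like the following (just a sketch; the values assume the T10 is compute capability 1.3, so nvcc embeds native sm_13 code and the runtime shouldn’t need to JIT-compile any PTX at startup):

    # Build native sm_13 code for the Tesla T10 so no PTX JIT should be needed at run time
    nvcc -arch=compute_13 -code=sm_13 -o my_app my_app.cu

Even built like that, the first CUDA call is still slow, so the delay doesn’t seem to come from PTX compilation.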
I run nvidia-smi in a background loop after every reboot to remove the annoying delay. I forget whether the output file in that command line needs to exist the first time; create something that can be overwritten if so. 59 is my choice for the number of seconds between re-runs of the smi utility (it’s part of the SDK or toolkit).
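Roughly, the kind of line I mean is the following (exact flags vary between driver versions; -l <seconds> is the loop interval in recent nvidia-smi builds, and the log path is just an example), placed in /etc/rc.local or a root crontab so it runs after every boot:

    # Re-run nvidia-smi every 59 seconds so the driver stays initialized;
    # output goes to a throwaway file that simply gets overwritten.
    nohup nvidia-smi -l 59 > /tmp/nvidia-smi.log 2>&1 &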
Thanks for the reminder about this trick! After updating the kernel on my Ubuntu 10.04 system last week, I also started seeing these very slow CUDA initialization times. Running deviceQuery required 4 seconds, but with nvidia-smi running in the background, it only takes 0.03 seconds.
At first I thought this worked, but I still have a delay of a few seconds at the beginning. Persistence mode is enabled, and the command shows that the cards are still in persistence mode.
It is independent of which commands are used; the first CUDA call in the source has this delay. Is there anything I can do, like deinitializing the cards at the end of my program or something like that? Any other ideas?
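For reference, what I check and set is roughly this (the -pm flag and the -q query output are as described in the nvidia-smi documentation; details may differ between driver versions):

    # Enable persistence mode on all GPUs (needs root) and verify it afterwards
    sudo nvidia-smi -pm 1
    nvidia-smi -q | grep -i "Persistence Mode"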
System consists of 2x Tesla C2050
EDIT: The problem appears even if I launch the program twice (the second run immediately after the first).