I noticed that errors, freezes, and system crashes happen when launching too many applications simultaneously. The errors begin when I launch 80 instances of the SDK sample application that adds two float vectors into a third. Does anyone know what causes this problem? Here are my two leads:
1) If CUDA creates a CUDA stream for each app, it may have difficulty managing 80 CUDA streams simultaneously.
2) If CUDA puts operations from all apps in the same default CUDA stream, it may have difficulty scheduling that many operations (but since nothing is asynchronous, there are at most 80 operations in flight at once, which is not a big number).

I’d assume the problem is a shortage of device memory for all the contexts: each process gets its own CUDA context, and every context takes up some device memory of its own.
Just out of curiosity: Why do you want to run 80 CUDA apps on the same device in parallel?
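
To make the memory-shortage guess above a bit more concrete, here is a rough back-of-envelope sketch in Python. Every number in it is an assumption for illustration only: the 1 GiB card, the 32 MiB per-context overhead, and the 50,000-element vectors are placeholders, and the function names are made up for this example. The real per-context overhead depends on the GPU and driver and would have to be measured, for instance by watching the free memory reported by cudaMemGetInfo as more processes start.

```python
# Back-of-envelope estimate of device memory use when many independent
# CUDA processes share one card. All figures are illustrative assumptions.

MiB = 1024 * 1024


def bytes_per_app(n_elements, context_overhead_bytes):
    """Footprint of one vector-add process: three float32 vectors plus its context."""
    vectors = 3 * n_elements * 4  # a, b and c, 4 bytes per float
    return vectors + context_overhead_bytes


def max_concurrent_apps(device_bytes, n_elements, context_overhead_bytes):
    """Largest number of such processes whose combined footprint fits on the device."""
    return device_bytes // bytes_per_app(n_elements, context_overhead_bytes)


def budget_per_app(device_bytes, n_apps):
    """Memory available to each process if n_apps of them share the device evenly."""
    return device_bytes // n_apps


device = 1024 * MiB   # assumed 1 GiB card
overhead = 32 * MiB   # assumed per-context overhead, not a measured value
n = 50_000            # assumed vector length

print("bytes per app:       ", bytes_per_app(n, overhead))
print("apps that fit:       ", max_concurrent_apps(device, n, overhead))
print("budget with 80 apps: ", budget_per_app(device, 80))
```

The point of the sketch is only that the vectors themselves are tiny compared with any plausible per-context overhead, so the number of processes that fit is set almost entirely by the contexts, not by the data.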

Sorry for the late answer, I was sick. Some applications don’t have enough data to use the GPU efficiently on their own. If they are all submitted simultaneously, they take less time to execute. I’m also considering sharing a GPU between multiple clients (such as virtual machines).

If buying a card with more memory doesn’t help (there are GeForce cards with 3 GB of memory), maybe you can run a server process that does the GPU work on behalf of the other programs using just a single context.
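
A minimal sketch of that single-context server idea, in Python, might look like the following. It is only an illustration under stated assumptions: GpuServer, submit, and vector_add are hypothetical names, the vector add runs on the CPU as a stand-in for a kernel launch, and client threads in one process stand in for what would really be separate programs talking to the server over sockets or some other IPC. What it shows is the structure: one worker thread owns the (single) context and all requests are funnelled through a queue to it.

```python
import queue
import threading


def vector_add(a, b):
    """CPU stand-in for the GPU kernel; a real server would launch it in its one CUDA context."""
    return [x + y for x, y in zip(a, b)]


class GpuServer:
    """Hypothetical server: all requests run on the one thread that would own the context."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Only this thread ever touches the (stand-in) GPU.
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            a, b, reply = item
            reply.put(vector_add(a, b))

    def submit(self, a, b):
        """Queue one request and block until its result is ready."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((a, b, reply))
        return reply.get()

    def close(self):
        """Ask the worker to stop and wait for it to finish."""
        self._requests.put(None)
        self._worker.join()


server = GpuServer()
print(server.submit([1.0, 2.0], [3.0, 4.0]))
server.close()
```

With this layout the device only ever holds one context no matter how many clients connect, at the cost of serializing the requests through the worker.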

You could also have each app free its context in between, but that is probably too slow to gain any advantage over using the CPU alone.

That’s what I’m doing: creating a “server” that receives GPU processing requests and executes them efficiently on the available GPUs. I was just trying to understand why running many CUDA applications crashed my system. Thanks for your help.