How to use CUDA in different CPU threads?

I have only one graphics card (and therefore only one GPU device) in my computer. My program crashes whenever I use CUDA resources from different CPU threads, even when the threads do not run at the same time. How can I solve this problem?

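There is a way to do this with the driver API: a context can be detached from one thread with cuCtxPopCurrent and attached to another with cuCtxPushCurrent. Unfortunately, the runtime API offers no equivalent; in CUDA releases before 4.0, a runtime API context is tied to the CPU thread that created it, so resources allocated in one thread are not valid in another.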