cudaMalloc and sharing between CPU threads

How can one CPU thread allocate device memory with cudaMalloc() (or some other API) so that another thread can use it?
I know that each host process has a different GPU context, but can’t this context be passed to other cooperating threads?

Does CUDA 2.2 support this?