CUDA issues in a multithreaded host application


Can anybody help me resolve the following problem?

My host application has two threads:
Thread1 creates the D3D10 device.
Thread2 creates D3D10 resources using the device and attempts to register them in CUDA and process them, but every call to the CUDA API from the second thread fails with the error cudaErrorInvalidDevice.
When I invoke CUDA API functions from the first thread, all calls succeed.
I call cudaD3D10SetDirect3DDevice() from both threads, and both calls succeed. This is strange, since the second thread cannot use any CUDA API function other than cudaThreadExit().

It seems that this problem can be solved with the driver-level API functions cuCtxPopCurrent() and cuCtxPushCurrent(), but I use the high-level (runtime) API.
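A possible workaround that stays within the high-level API is to route every CUDA call through one dedicated thread, so the context never has to move between threads. The following is only a minimal sketch in standard C++, under the assumption that the CUDA context is tied to the thread that first used it. The class CudaThread and its methods submit() and id() are hypothetical names introduced for illustration, not part of CUDA, and the submitted lambdas stand in for real CUDA runtime calls.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical helper (not a CUDA API): owns one worker thread and runs
// every submitted task on that thread, in submission order.
class CudaThread {
public:
    CudaThread() : worker_([this] { run(); }) {}

    ~CudaThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();  // remaining queued tasks are drained before exit
    }

    // Queue a callable and return a future for its result.
    // In a real application the callable would contain the CUDA calls.
    template <class F>
    auto submit(F f) -> std::future<decltype(f())> {
        using R = decltype(f());
        auto task = std::make_shared<std::packaged_task<R()>>(std::move(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

    // Identifier of the thread that executes all submitted tasks.
    std::thread::id id() const { return worker_.get_id(); }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutdown requested, queue empty
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock so callers can keep submitting
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

With such a helper, Thread2 would still create the D3D10 resources itself but would wrap each CUDA call in a lambda, pass it to submit(), and wait on the returned future. Whether this fits depends on how much synchronization between the two threads the application can tolerate.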

Probably the same problem is reported in…orInvalidDevice
But there are no replies.