Two threads for two GPUs - what can be wrong?

I have a main class CContainer that contains all the items that are to be evaluated on the GPU. As I have two GPUs, two threads are created, one for each of them. Schematically, it is done in this manner:

class CContainer
{
public:
	static UINT ThreadFunc(void* pData);
	void Thread(); // Here are the calculations
};

UINT CContainer::ThreadFunc(void* pData)
{
	((CContainer*)pData)->Thread();
	return 0;
}

I start ThreadFunc as a thread and everything works just fine.
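
Schematically, the worker threads are started like the sketch below. AfxBeginThread is only an illustration here (any thread API that takes a UINT (*)(void*) entry point works the same way), and m_Containers is just a placeholder name:

	// one CContainer per GPU, each evaluated on its own worker thread
	CContainer* pGpu0 = &m_Containers[0];	// container for GPU 0 (placeholder)
	CContainer* pGpu1 = &m_Containers[1];	// container for GPU 1 (placeholder)

	AfxBeginThread(CContainer::ThreadFunc, pGpu0);
	AfxBeginThread(CContainer::ThreadFunc, pGpu1);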

However, when I decided to implement a separate class for the calculations, everything stops working with an ‘unspecified launch failure’:

class CCalculator
{
public:
	static UINT ThreadFunc(void* pData);
	virtual void Thread() = 0;
};

UINT CCalculator::ThreadFunc(void* pData)
{
	((CCalculator*)pData)->Thread();
	return 0;
}

class CCalculator_GPU : public CCalculator
{
public:
	virtual void Thread();
};

So, when I call through the pure virtual function that each CCalculator child overrides, exactly the same code that worked in the first case stops working. Thread() contains the same code in both CContainer and CCalculator_GPU.
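
The threads in this second case are started the same way, just through the base-class pointer (again, AfxBeginThread and the constructor arguments are only an illustration):

	// one concrete calculator per GPU, passed around as CCalculator*
	CCalculator* pCalc0 = new CCalculator_GPU(/* GPU 0 */);
	CCalculator* pCalc1 = new CCalculator_GPU(/* GPU 1 */);

	// ThreadFunc casts pData back to CCalculator* and calls the virtual Thread()
	AfxBeginThread(CCalculator::ThreadFunc, pCalc0);
	AfxBeginThread(CCalculator::ThreadFunc, pCalc1);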

The failure occurs when trying to cudaFree device memory or to cudaUnbindTexture. It seems like something is wrong with the CUDA thread contexts, but I have no idea what exactly.
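
In case it helps, each Thread() boils down to something like the sketch below; the texture, the kernel, the sizes and the m_nDevice member are simplified placeholders, not the real code:

texture<float, 1, cudaReadModeElementType> g_Tex;	// placeholder texture reference

__global__ void EvaluateKernel(float* pData)	// placeholder kernel
{
	pData[threadIdx.x] *= 2.0f;
}

void CCalculator_GPU::Thread()
{
	cudaSetDevice(m_nDevice);	// this thread's GPU (0 or 1)

	const size_t nBytes = 256 * sizeof(float);	// placeholder size
	float* d_Data = 0;
	cudaMalloc((void**)&d_Data, nBytes);
	cudaBindTexture(0, g_Tex, d_Data, nBytes);

	EvaluateKernel<<<1, 256>>>(d_Data);

	cudaUnbindTexture(g_Tex);	// <- 'unspecified launch failure' is reported here
	cudaFree(d_Data);	// ...or here
}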

Any advice is appreciated.