cudaFree crash in destructor when exit() is called

Hi there,

I built a thread-safe singleton class that holds memory allocated with cudaMalloc().
The instance of the class is stored in the singleton using a C++ auto_ptr smart pointer,
so that whenever I leave the program, the destructor is called automatically.
In the destructor I make sure the memory is released using cudaFree().

When I call exit(0), the program crashes (EXC_BAD_ACCESS) in cudaFree().

If I manually release my singleton just before the call to exit(0), it does not crash.
However, manually releasing the singleton is not acceptable.

Any ideas?

Can you post the singleton code and the CUDA operations you do with it?

Also, does this happen when only one thread uses the singleton, or more than one?

eyal

I tried a different strategy that does not involve dynamic allocation of the instance:

class T
{
........
........
........
public:
	static T* GetInstance()
	{
		if (!m_p_singleton)
		{
			pthread_mutex_lock(&m_singleton_mutex);
			if (!m_p_singleton)
			{
				static T instance;
				m_p_singleton = &instance;
				m_p_singleton->Initialise();
			}
			pthread_mutex_unlock(&m_singleton_mutex);
		}
		return m_p_singleton;
	}

	static void Destroy()
	{
		if (m_p_singleton)
		{
			pthread_mutex_lock(&m_singleton_mutex);
			if (m_p_singleton)
			{
				// Release the memory that was dynamically allocated (both on host and device)
				m_p_singleton->ClearMemory();
			}
			pthread_mutex_unlock(&m_singleton_mutex);
		}
	}
........
........
........
private:
	static pthread_mutex_t m_singleton_mutex;
	static T* m_p_singleton;
........
........
........
};
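A side note on the locking above: m_p_singleton is read outside the mutex, which is formally a data race under the C++11 memory model, and the pointer is published before Initialise() has finished, so another thread could in principle see a half-initialised object. A minimal sketch of an alternative, assuming a C++11 or later compiler (which may not have been available here), relies on the guarantee that a function-local static is initialised exactly once even under concurrent calls. The class name Resource and the construction counter are hypothetical and only there for illustration.

```cpp
// Hypothetical stand-in for the class T in the post.
class Resource {
public:
	// Since C++11 the initialisation of a function-local static runs
	// exactly once, even if several threads call this concurrently,
	// so no explicit mutex or double check is needed here.
	static Resource& GetInstance() {
		static Resource instance;
		return instance;
	}

	// Number of times the constructor has run (illustration only).
	static int ConstructionCount() { return s_constructions; }

	Resource(const Resource&) = delete;
	Resource& operator=(const Resource&) = delete;

private:
	// Any one-time setup (the Initialise() step) would go here, so the
	// object is never visible to callers before it is fully set up.
	Resource() { ++s_constructions; }

	static int s_constructions;
};

int Resource::s_constructions = 0;
```

Callers would use Resource::GetInstance() wherever T::GetInstance() is used now. This only addresses the initialisation side; it does not change when the destructor runs relative to the CUDA runtime shutting down.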

ClearMemory calls cudaFree (with checks so that I only free memory that is actually allocated).

It is also called automatically from ~T().

And now it works without crash :-)
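Since ClearMemory() can be reached both from Destroy() and from ~T(), it has to be safe to call more than once. A rough sketch of that pattern is below. It is not the actual class: Buffer, FakeDeviceAlloc and FakeDeviceFree are made-up names, and host malloc/free stand in for cudaMalloc/cudaFree so that it builds without the CUDA toolkit.

```cpp
#include <cstdlib>

// Counts calls to the release function (illustration only).
int g_release_calls = 0;

// Hypothetical stand-ins for cudaMalloc()/cudaFree(); they use host
// memory so the sketch compiles without CUDA.
void* FakeDeviceAlloc(std::size_t bytes) { return std::malloc(bytes); }
void FakeDeviceFree(void* p) { ++g_release_calls; std::free(p); }

class Buffer {
public:
	Buffer() : m_device_ptr(nullptr) {}
	~Buffer() { ClearMemory(); }

	Buffer(const Buffer&) = delete;
	Buffer& operator=(const Buffer&) = delete;

	bool Allocate(std::size_t bytes) {
		ClearMemory();  // drop any previous allocation first
		m_device_ptr = FakeDeviceAlloc(bytes);
		return m_device_ptr != nullptr;
	}

	// Safe to call any number of times: frees only if something is
	// allocated, then resets the pointer so a later call is a no-op.
	void ClearMemory() {
		if (m_device_ptr) {
			FakeDeviceFree(m_device_ptr);
			m_device_ptr = nullptr;
		}
	}

	bool IsAllocated() const { return m_device_ptr != nullptr; }

private:
	void* m_device_ptr;
};
```

Resetting the pointer after the free is what makes an explicit Destroy() followed by the destructor harmless: the second call finds nothing to release.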

For information, the version that did not work was:

	static T* GetInstance()
	{
		if (!m_p_singleton.get())
		{
			pthread_mutex_lock(&m_singleton_mutex);
			if (!m_p_singleton.get())
			{
				m_p_singleton.reset(new T);
			}
			pthread_mutex_unlock(&m_singleton_mutex);
		}
		return m_p_singleton.get();
	}

	static std::auto_ptr<T> m_p_singleton;

In this version, the program crashes in the destructor ~T (in ClearMemory) on cudaFree when exit(0) is called.

The only explanation I can think of is that some kind of CUDA context is torn down first, and only then does my auto_ptr delete the instance, whose destructor calls cudaFree.

But this is just a guess.
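The C++ rule that would make this guess plausible is that, at exit(), static objects are destroyed and atexit() handlers are run in the reverse order of their construction or registration. Below is a small sketch of that rule in plain C++, with no CUDA involved. It assumes (this is not confirmed anywhere in the thread) that the CUDA runtime registers its shutdown when the first CUDA call is made; FirstRuntimeCall and FakeRuntimeTeardown are made-up names that model that assumption.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Event log, created on first use and deliberately never freed so that
// it stays valid for the whole run.
std::vector<std::string>& Log() {
	static std::vector<std::string>* log = new std::vector<std::string>;
	return *log;
}

// Records its construction and prints a line when it is destroyed.
struct Tracer {
	const char* name;
	explicit Tracer(const char* n) : name(n) {
		Log().push_back(std::string("ctor ") + n);
	}
	~Tracer() { std::printf("dtor %s\n", name); }
};

// Plays the role of the static auto_ptr member: built before main().
Tracer g_namespace_scope("namespace-scope static");

// Hypothetical stand-in for the runtime shutting down at exit.
void FakeRuntimeTeardown() { std::printf("runtime teardown\n"); }

// Hypothetical stand-in for the first CUDA call, ASSUMED to register
// the runtime's teardown handler at that moment.
void FirstRuntimeCall() {
	static bool registered = false;
	if (!registered) {
		std::atexit(FakeRuntimeTeardown);
		registered = true;
		Log().push_back("runtime init");
	}
}

// Plays the role of "static T instance;" inside GetInstance().
Tracer& LocalStatic() {
	static Tracer t("function-local static");
	return t;
}
```

If main() calls FirstRuntimeCall() and then LocalStatic(), the lines printed at exit are "dtor function-local static", then "runtime teardown", then "dtor namespace-scope static". In other words, an object built before main() is destroyed after a teardown handler registered during main(), which is the ordering the guess describes for the auto_ptr version. Whether the static-local version lands before or after the runtime teardown depends on where the first CUDA call happens relative to the construction of the instance, which the code posted here does not show.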