I'm working in Win7 x64, building a MATLAB MEX file using Visual Studio 2008
and CUDA 3.0 with two GTX 470s in an i7 box. This code has worked on WinXP 32.
My code is structured with 3 threads.
Main thread performs resource loading.
The next 2 threads are for CUDA computation (one per CUDA device).
The code hangs while allocating memory (cudaMallocPitch/cudaHostAlloc) in the main
thread, while the computation threads are “simultaneously” allocating (same calls) or
setting device constants for the computation. I have not yet been able to replicate
the bug in the standalone C++ version of the source code, which really disturbs me
(I'm running more tests on it).
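To make the thread layout described above concrete, here is a minimal sketch. It is not the actual source: it assumes std::thread (which Visual Studio 2008 does not provide), the names stub_allocate, device_worker and run_layout are hypothetical, and the CUDA calls are replaced by a stub so it runs without a GPU.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Counts calls to the stub so the sketch can be checked without a GPU.
static std::atomic<int> g_alloc_calls{0};

// Hypothetical stand-in for a real allocation call such as
// cudaMallocPitch or cudaHostAlloc.
void stub_allocate() { ++g_alloc_calls; }

// Hypothetical per-device worker. The real code would select its device
// here, then allocate memory and set device constants.
void device_worker(int device_id) {
    (void)device_id;  // unused in the stub version
    stub_allocate();
}

// One worker thread per device, with the calling (main) thread doing its
// own allocation concurrently, as in the resource-loading step.
int run_layout(int device_count) {
    g_alloc_calls = 0;
    std::vector<std::thread> workers;
    for (int d = 0; d < device_count; ++d)
        workers.emplace_back(device_worker, d);
    stub_allocate();  // main-thread allocation, overlapping the workers
    for (auto& t : workers) t.join();
    return g_alloc_calls.load();
}
```

Calling run_layout(2) mirrors the two-GPU setup: three allocation calls in total, one from the main thread and one from each worker, with no ordering between them.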
Any direction would be greatly appreciated!
TIA,
Mark