Hi, I’m working with CUDA 3.2 and OpenGL to decode multiple videos at the same time. I’m doing this across threads and I’m running into some serious problems, e.g. BSODs. I think the problems stem from me not really understanding the CUDA threading model.
What is the purpose of pushing the CUDA context? Do I need to pop the context after creating it? What is the difference between pushing the CUDA context and locking it? Which CUDA calls need to be protected while being accessed — specifically, the decoding code, the source loading, and the CUDA initialization?
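To make the question concrete, here is a minimal sketch of the two mechanisms as I understand them — the floating-context push/pop pattern and the cuvid context lock. The names follow the CUDA 3.2 driver API and nvcuvid headers; error checking is elided, and I’m not certain this usage is correct, which is exactly what I’m asking about.

```c
#include <cuda.h>
#include <nvcuvid.h>

CUcontext      g_ctx;
CUvideoctxlock g_lock;

void init_on_main_thread(void)
{
    CUdevice dev;
    cuInit(0);
    cuDeviceGet(&dev, 0);

    /* cuCtxCreate attaches the new context to the calling thread. */
    cuCtxCreate(&g_ctx, CU_CTX_SCHED_AUTO, dev);

    /* Create the cuvid lock the decoder uses (it gets passed in
       CUVIDDECODECREATEINFO.vidLock). */
    cuvidCtxLockCreate(&g_lock, g_ctx);

    /* Pop so the context is "floating" and other threads can push it. */
    cuCtxPopCurrent(NULL);
}

/* Option A: each worker pushes/pops the shared context around its calls. */
void worker_push_pop(void)
{
    cuCtxPushCurrent(g_ctx);
    /* ... cuMemAlloc / cuvidDecodePicture / etc. ... */
    cuCtxPopCurrent(NULL);
}

/* Option B: the cuvid lock, which (as I understand it) both pushes the
   context onto the calling thread and serializes access in one call. */
void worker_ctx_lock(void)
{
    cuvidCtxLock(g_lock, 0);
    /* ... decode calls ... */
    cuvidCtxUnlock(g_lock, 0);
}
```

Is option B meant to replace option A entirely, or do I still need explicit push/pop somewhere?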
I am currently creating a thread per video to load, initialize, and decode it, then having the pixel buffer object (PBO) copied to a texture and rendered in the main thread. Is this a good idea? Should I instead copy the PBO into the texture in the video thread and only render the texture in the main thread?
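For reference, the main-thread upload I mean is roughly the following (plain OpenGL; `pboId`, `texId`, `width`, and `height` are placeholder names, and I’m assuming a BGRA decode target — adjust the format to taste):

```c
#include <GL/glew.h>

/* Main thread: copy the PBO that a decoder thread filled into the texture.
   With a buffer bound to GL_PIXEL_UNPACK_BUFFER, glTexSubImage2D sources
   its pixels from the buffer object (offset 0) instead of client memory. */
void upload_pbo_to_texture(GLuint pboId, GLuint texId, int width, int height)
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pboId);
    glBindTexture(GL_TEXTURE_2D, texId);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_BGRA, GL_UNSIGNED_BYTE, (const void *)0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    /* ... then draw a textured quad ... */
}
```

Since GL contexts are bound to one thread at a time, moving this call into the video threads would presumably mean sharing or switching GL contexts, which is part of why I’m unsure which design is right.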
Would it be a better plan to have one thread for all the video loading and initialization, another thread for all the decoding and PBO copying, and do all the rendering in the main thread?
If it also helps, I am using a Quadro 6000, and I can get a maximum of 6 videos at once inside my application and 7 standalone. I used to be able to get 7 and 8 respectively, but the new NVIDIA driver (275.65) has dropped those numbers.
Also, if I go over my maximum number of videos, my system blue screens. No matter what I do, I can’t catch any errors or get any error codes before it crashes.