Concurrent CUDA kernel runs on TX2

Hi folks,

We have two IMX274 cameras from Leopard Imaging on our TX2/TX1 systems. I am looking to run two instances of our CUDA kernel concurrently, in two independent threads. It seems that one needs to take special care, or use ‘CUDA streams’, for this.

Could someone please point me to an appropriate example to follow? I have a CUDA kernel wrapped in a C++ class, and I instantiate and run two objects of that class (one for each camera). I get a ‘Bus error’ at times, and on a few runs the entire system chokes up and needs a reboot.

I am quite sure I have not written the kernels in a thread-safe manner. I would appreciate any pointers to educate myself about running multiple CUDA contexts on TX1/TX2.



A bus error usually occurs when data is read from different processes simultaneously.
Try adding ‘cudaDeviceSynchronize()’ (the replacement for the deprecated ‘cudaThreadSynchronize()’) before reading the memory.
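
As a rough illustration of synchronizing before reading memory, here is a minimal sketch that gives each camera its own CUDA stream and waits on that stream before the CPU touches the results. The kernel scaleKernel, the function runTwoStreams, and the buffer sizes are hypothetical stand-ins, not the code from the question. Whether the two kernels actually overlap on the GPU depends on available resources.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the per-camera kernel: scales every pixel.
__global__ void scaleKernel(float* data, int n, float gain) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= gain;
}

// Print the CUDA error and return 1 from the enclosing function on failure.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s (line %d)\n",             \
                    cudaGetErrorString(err), __LINE__);               \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Assumed layout: one stream, one pinned host buffer and one device
// buffer per camera, so the two pipelines never share memory.
int runTwoStreams() {
    const int kCameras = 2;
    const int n = 1 << 20;                  // pixels per frame (assumed)
    const size_t bytes = n * sizeof(float);

    cudaStream_t stream[kCameras];
    float* host[kCameras];
    float* dev[kCameras];

    for (int c = 0; c < kCameras; ++c) {
        CUDA_CHECK(cudaStreamCreate(&stream[c]));
        // Pinned host memory is needed for truly asynchronous copies.
        CUDA_CHECK(cudaMallocHost(reinterpret_cast<void**>(&host[c]), bytes));
        CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&dev[c]), bytes));
        for (int i = 0; i < n; ++i) host[c][i] = 1.0f;
    }

    // Queue copy-in, kernel, and copy-out on each camera's own stream.
    for (int c = 0; c < kCameras; ++c) {
        CUDA_CHECK(cudaMemcpyAsync(dev[c], host[c], bytes,
                                   cudaMemcpyHostToDevice, stream[c]));
        scaleKernel<<<(n + 255) / 256, 256, 0, stream[c]>>>(dev[c], n, 2.0f + c);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpyAsync(host[c], dev[c], bytes,
                                   cudaMemcpyDeviceToHost, stream[c]));
    }

    // Wait for each stream to finish before the CPU reads its buffer.
    for (int c = 0; c < kCameras; ++c) {
        CUDA_CHECK(cudaStreamSynchronize(stream[c]));
        printf("camera %d: first pixel = %.1f\n", c, host[c][0]);
    }

    for (int c = 0; c < kCameras; ++c) {
        CUDA_CHECK(cudaFree(dev[c]));
        CUDA_CHECK(cudaFreeHost(host[c]));
        CUDA_CHECK(cudaStreamDestroy(stream[c]));
    }
    return 0;
}

int main() { return runTwoStreams(); }
```

cudaStreamSynchronize() waits only for the work queued on one stream, so each camera thread can wait on its own stream without stalling the other; cudaDeviceSynchronize() waits for all outstanding work on the device.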

I am not sure which concurrency sample you would prefer.
Here is an example to start with: