Jetson TK1 CUDA streams

Hi,

I am trying to use CUDA streams on the Jetson TK1 for concurrent kernel execution. However, according to the profiler, the streams seem to execute serially. I found a post online describing software issues on the Jetson TK1 that prevent the use of CUDA streams. I am using CUDA version 6.5.

http://devblogs.nvidia.com/parallelforall/jetson-tk1-mobile-embedded-supercomputer-cuda-everywhere/

Has anyone tried CUDA streams on the TK1? Could you please point me to, or share, a working example?

Thank you in advance.

regards,
Barath Ramesh

I don’t know whether the CUDA version matters, but which version are you using? Version 6.0 is a bit old and runs on Jetson only with L4T R19.x; CUDA 6.5 runs on L4T R21.x. If you are on CUDA 6.0, could you try 6.5?

I am using version 6.5.
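
For anyone looking for a starting point, here is a minimal sketch of the kind of stream test asked about above. It is not taken from the linked post and has not been verified on a TK1. The names `spin_kernel` and `run_streams_demo` and the cycle count are made up for illustration. The sketch assumes that each kernel must be small (one block of 32 threads here) so that several can be resident on the GPU at once, and that every launch goes into its own non-default stream. It prints the device's `concurrentKernels` property and the total elapsed time, which can be compared against a single-stream run.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: busy-waits for roughly `cycles` GPU clock cycles so that
// each launch is long enough to be visible on a profiler timeline.
__global__ void spin_kernel(long long cycles, int *flag)
{
    long long start = clock64();
    while (clock64() - start < cycles) { /* busy-wait */ }
    if (threadIdx.x == 0) *flag = 1;
}

// Launches one small kernel per stream and reports the total elapsed time.
// Returns 0 on success and 1 on any CUDA error.
int run_streams_demo(int num_streams, long long cycles)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    printf("%s: concurrentKernels = %d\n", prop.name, prop.concurrentKernels);

    int *d_flags = NULL;
    if (cudaMalloc(&d_flags, num_streams * sizeof(int)) != cudaSuccess) return 1;

    // One non-default stream per kernel; work queued in the default stream
    // would serialize with everything else.
    std::vector<cudaStream_t> streams(num_streams);
    for (int i = 0; i < num_streams; ++i) cudaStreamCreate(&streams[i]);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int i = 0; i < num_streams; ++i)
        spin_kernel<<<1, 32, 0, streams[i]>>>(cycles, d_flags + i);
    cudaError_t launch_err = cudaGetLastError();    // launch configuration errors
    cudaError_t sync_err = cudaDeviceSynchronize(); // wait for every stream
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("%d kernel(s) took %.2f ms in total\n", num_streams, ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    for (int i = 0; i < num_streams; ++i) cudaStreamDestroy(streams[i]);
    cudaFree(d_flags);
    return (launch_err == cudaSuccess && sync_err == cudaSuccess) ? 0 : 1;
}
```

To try it, call `run_streams_demo(1, 1000000)` and then `run_streams_demo(4, 1000000)` from a `main` function and build with `nvcc`. If the four-stream run takes about as long as the single-stream run, the kernels overlapped; if it takes about four times as long, they were serialized. The result may differ when the program runs under the profiler, so it is worth checking both ways.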