Hi,
Please note that you need to handle the CUDA context push/pop yourself in multi-threaded use cases. An example can be found in the topic below:
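
Independent of that topic, the general pattern can be sketched roughly as follows. This is only a minimal illustration, assuming a PyCUDA-style context object that exposes `push()` and `pop()`. The names `cuda_context`, `FakeContext`, `worker`, and `run_workers` are hypothetical, and `FakeContext` is a stand-in that only records calls so the sketch runs without a GPU.

```python
import threading
from contextlib import contextmanager


@contextmanager
def cuda_context(ctx):
    """Make `ctx` current on the calling thread, then release it again."""
    ctx.push()  # bind the context to this thread before any GPU call
    try:
        yield ctx
    finally:
        ctx.pop()  # always unbind, even if the GPU work raised


class FakeContext:
    """Hypothetical stand-in for a real CUDA context; it only records calls."""

    def __init__(self):
        self.calls = []
        self._lock = threading.Lock()

    def push(self):
        with self._lock:
            self.calls.append(("push", threading.current_thread().name))

    def pop(self):
        with self._lock:
            self.calls.append(("pop", threading.current_thread().name))


def worker(ctx, results, index):
    with cuda_context(ctx):
        # Real GPU work (kernel launches, memory copies, inference) goes here.
        results[index] = index * 2


def run_workers(ctx, n_threads=2):
    """Run `worker` on several threads, each pushing/popping the context."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=worker, args=(ctx, results, i), name=f"worker-{i}")
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With PyCUDA, you would presumably pass a real context (for example one created with `pycuda.driver.Device(0).make_context()`) instead of `FakeContext`. As far as I know, `make_context()` leaves the new context current on the creating thread, so it is usually popped there once before the worker threads start.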
Thanks.