I can only ask a general question, since I cannot share the code here:
I am running TensorRT 7.0 on a fairly large problem that involves computing a few billion inferences.
I run my code on a multi-GPU workstation (Titan Vs and Titan Xs, with TCC mode enabled).
I built engines on this workstation for both GPU architectures (compute capability 6.1 and 7.0).
I call TensorRT from multiple threads: each thread creates its own runtime and execution context, loads its pre-built engine for its pre-assigned GPU device, and uses its own pre-allocated device memory (via IGpuAllocator).
The problem is: my code runs and returns correct results, but sometimes (actually more often than not) it runs for a while and then quits without any error. No TensorRT errors, no CUDA errors, no return value, no debug output; the program just exits.
It takes a few hours for my code to finish all the inferences, if it does not quit earlier.
Are there any possible explanations for this?
My rig: TensorRT 7.0 / Windows 10 64-bit / cuDNN 7.6.5 / MSVC 2019 / CUDA 10.2, Titan V and Titan X (Pascal). Code is compiled for compute/sm 6.1 and 7.0; engines are built with FP16 enabled for the Titan V and FP32 for the Titan X.
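For reference, the per-device precision choice at engine-build time looks roughly like this. This is a non-runnable fragment against the TensorRT 7 C++ API; the helper name `buildForDevice` and the `isVolta` flag are illustrative, and `builder`/`network` are assumed to come from the usual createInferBuilder / createNetworkV2 calls elsewhere:

```cpp
#include "NvInfer.h"

// Sketch: build an engine with FP16 enabled only on the Volta card.
nvinfer1::ICudaEngine* buildForDevice(nvinfer1::IBuilder* builder,
                                      nvinfer1::INetworkDefinition* network,
                                      bool isVolta /* compute capability 7.0 */) {
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1ULL << 30);  // 1 GiB scratch space (example value)
    if (isVolta)
        config->setFlag(nvinfer1::BuilderFlag::kFP16);  // Titan V: FP16 kernels
    // Titan X (Pascal, SM 6.1) has poor FP16 throughput: keep default FP32
    nvinfer1::ICudaEngine* engine =
        builder->buildEngineWithConfig(*network, *config);
    config->destroy();  // TensorRT 7 still uses destroy(), not C++ deleters
    return engine;
}
```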