How does CUcontext affect TRT Inference?


I’m running a TensorRT inference engine in C++, calling it from a CPU thread.
However, if I use cuCtxCreate to create a new context for that thread (before setting up the engine, buffers, etc.), the TRT outputs become all zeros.
If I skip calling cuCtxCreate before setting up the TRT engine, it produces valid outputs just fine.
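For context, the problematic setup is roughly the following. This is a minimal sketch, not the actual code: the point is that cuCtxCreate both creates a context and makes it current on the calling thread, so all subsequent runtime-API and TRT calls on that thread run inside the new, non-primary context. Error checking is omitted for brevity.

```cpp
#include <cuda.h>

// Driver-API sketch (assumed setup, not the poster's exact code).
CUdevice dev;
CUcontext ctx;
cuInit(0);
cuDeviceGet(&dev, 0);

// Creates a NEW (non-primary) context and pushes it onto this
// thread's context stack, making it current here.
cuCtxCreate(&ctx, 0, dev);

// ... deserialize the TRT engine and allocate I/O buffers here;
// all of these allocations now live in `ctx` ...

// Any OTHER thread that touches the same engine or buffers must
// first call cuCtxSetCurrent(ctx); otherwise it operates in a
// different context (typically the primary one) and the device
// pointers it passes to TRT are invalid there.
```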

What’s more, when I run both TensorFlow and TRT inference on the same GPU, TensorFlow inference simply breaks, whether or not TensorFlow and TRT run in different CUcontexts (they always run on different threads).
The error from TensorFlow is:

[Switching to Thread 0x7ffc1de50700 (LWP 26137)]
Cuda API error detected: cudaPointerGetAttributes returned (0x1)
(cuda-gdb) bt
#0  0x00007ffca7b41130 in cudbgReportDriverApiError () from /usr/lib/x86_64-linux-gnu/
#1  0x00007ffca7b440ca in cudbgReportDriverInternalError () from /usr/lib/x86_64-linux-gnu/
#2  0x00007ffca7b475a3 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/
#3  0x00007ffca7d8abe3 in cudbgMain () from /usr/lib/x86_64-linux-gnu/
#4  0x00007ffca7d9d5a3 in cudbgMain () from /usr/lib/x86_64-linux-gnu/
#5  0x00007fffcbc60d99 in cudaPointerGetAttributes () from /home/.cache/_solib_local/_U@tensorflow_S_S_Ctensorflow___Uexternal_Stensorflow_Slib/
#6  0x00007fffccbe2c24 in void stream_executor::gpu::(anonymous namespace)::CheckPointerIsValid<void const*>(void const*, absl::string_view) [clone .constprop.161] ()
   from /home/.cache/_solib_local/_U@tensorflow_S_S_Ctensorflow___Uexternal_Stensorflow_Slib/
  • Two TensorFlow models seem to work fine with separate CUcontexts, but TensorFlow together with a TRT model/engine does not, so I suspect the TRT engine changes something on the device during setup.

So my questions are:

  • Can TRT inference work with a CUcontext that is not the device’s primary context?
  • Does the TRT inference engine change anything related to the CUcontext during setup?


TensorRT Version: 7.0
GPU Type: 2080Ti
Nvidia Driver Version: 440
CUDA Version: 10.2
Operating System + Version: Ubuntu 16.04
TensorFlow Version (if applicable): 1.15

Any updates on this?

From my side, only very rarely was I able to run these two models together (TF with TRT), so I’m guessing this is a race condition between the frameworks.

But why TRT inference doesn’t work correctly under a new CUcontext still surprises me.

Hi @golden0080gba,

Could you please share the model and script files to reproduce this issue so we can help better?


Sorry, I can’t provide details about the models.
I managed to make TRT work with a new CUcontext by calling cudaSetDevice before inference on each thread.
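For anyone hitting the same issue, the workaround described here can be sketched as follows. This is an assumed outline (`inferenceThread` and the commented-out enqueue call are illustrative, not the poster’s code); the key point is that calling cudaSetDevice on each worker thread before any runtime-API or TRT call binds the thread to a known device context, so every thread operates in the same context as the one the buffers were allocated in.

```cpp
#include <cuda_runtime.h>

// Hypothetical per-thread entry point; engine/buffers/stream would
// be passed in from the setup code.
void inferenceThread(/* engine, buffers, stream, ... */) {
    // Bind this thread to device 0 before any TRT or runtime calls.
    // Without this, the thread may end up in a different CUcontext
    // than the one the engine and buffers were created in, which
    // can produce all-zero outputs or invalid-pointer errors.
    cudaSetDevice(0);

    // ... then run inference, e.g.:
    // context->enqueueV2(bindings, stream, nullptr);
}
```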

But I couldn’t make TF and TRT work together in C++.