Description
I’m running inference engine of TRT in C++. I call the inference engine in a CPU thread.
However, if I used cuCtxCreate
to provide a new context for that thread (before setting up engine, buffers etc), I spotted the outputs of TRT becomed all zeroes.
If I skipped calling cuCtxCreate
before setting up TRT engine etc, the engine outputs results just fine - valid outputs.
What’s more, when I’m running both Tensorflow and TRT inference on the same GPU, Tensorflow inference is just broken - whether Tensorflow inference and TRT runs in different CUcontexts or not (they are always on different threads).
The error from Tensorflow is:
[Switching to Thread 0x7ffc1de50700 (LWP 26137)]
Cuda API error detected: cudaPointerGetAttributes returned (0x1)
(cuda-gdb) bt
#0 0x00007ffca7b41130 in cudbgReportDriverApiError () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007ffca7b440ca in cudbgReportDriverInternalError () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffca7b475a3 in cudbgApiDetach () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffca7d8abe3 in cudbgMain () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffca7d9d5a3 in cudbgMain () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007fffcbc60d99 in cudaPointerGetAttributes () from /home/.cache/_solib_local/_U@tensorflow_S_S_Ctensorflow___Uexternal_Stensorflow_Slib/libtensorflow_cc.so.1
#6 0x00007fffccbe2c24 in void stream_executor::gpu::(anonymous namespace)::CheckPointerIsValid<void const*>(void const*, absl::string_view) [clone .constprop.161] ()
from /home/.cache/_solib_local/_U@tensorflow_S_S_Ctensorflow___Uexternal_Stensorflow_Slib/libtensorflow_cc.so.1
...
- Seems like 2 Tensorflow models works well with separated CUcontext, but not Tensorflow with TRT model/engines, so I’m suspecting TRT engine during setup changed something of that device.
So my questions are:
- Can TRT inference work with CUcontext (non primary context on a device)?
- Does TRT inference engine changed anything during setup, related to CUcontext?
Environment
TensorRT Version: 7.0
GPU Type: 2080Ti
Nvidia Driver Version: 440
CUDA Version: 10.2
Operating System + Version: Ubuntu 16.04
TensorFlow Version (if applicable): 1.15