Nvinferserver apps crashing just by importing torch

Please provide complete information as applicable to your setup.

**• Hardware Platform:** GPU
**• DeepStream Version:** 6.4
**• NVIDIA GPU Driver Version:** 535.104.12
**• Issue Type:** Bug
**• How to reproduce the issue?**

  1. Dockerfile:
FROM nvcr.io/nvidia/deepstream:6.4-triton-multiarch
# (deepstream:6.4-gc-triton-devel has the same issue)
RUN ./user_deepstream_python_apps_install.sh --build-bindings
RUN pip3 install torch torchvision torchaudio
  2. Any Python test app that uses nvinferserver, for example:
    deepstream_python_apps/apps/deepstream-ssd-parser
    i. Follow the instructions to prepare the model repo.
    ii. In the test app folder, run: python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
    This runs successfully.
    iii. Add just ‘import torch’ to deepstream_ssd_parser.py; the run then fails with the following error:
Triton: failed to create repo server, triton_err_str:Not found, err_msg:unable to load shared library: /opt/tritonserver/backends/pytorch/libtorchtrt_runtime.so: undefined symbol:

Full log since starting pipeline:

Starting pipeline 

WARNING: infer_proto_utils.cpp:201 backend.trt_is is deprecated. updated it to backend.triton
I0125 12:04:42.911446 1450 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 2080 Ti
I0125 12:04:42.911583 1450 metrics.cc:703] Collecting CPU metrics
I0125 12:04:42.911684 1450 tritonserver.cc:2435] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                               |
| server_version                   | 2.37.0                                                                                                               |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration sys |
|                                  | tem_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging                          |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo                                                      |
| model_control_mode               | MODE_EXPLICIT                                                                                                        |
| strict_model_config              | 0                                                                                                                    |
| rate_limit                       | OFF                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                  |
| strict_readiness                 | 1                                                                                                                    |
| exit_timeout                     | 30                                                                                                                   |
| cache_enabled                    | 0                                                                                                                    |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------+

I0125 12:04:42.911702 1450 server.cc:302] No server context available. Exiting immediately.
ERROR: infer_trtis_server.cpp:994 Triton: failed to create repo server, triton_err_str:Not found, err_msg:unable to load shared library: /opt/tritonserver/backends/pytorch/libtorchtrt_runtime.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
ERROR: infer_trtis_server.cpp:840 failed to initialize trtserver on repo dir: root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4

0:00:00.193890512  1450 0x561017ec0d00 ERROR          nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in createNNBackend() <infer_trtis_context.cpp:256> [UID = 5]: model:ssd_inception_v2_coco_2018_01_28 get triton server instance failed. repo:root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4

0:00:00.193915178  1450 0x561017ec0d00 ERROR          nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in initialize() <infer_base_context.cpp:79> [UID = 5]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.193921810  1450 0x561017ec0d00 WARN           nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Failed to initialize InferTrtIsContext
0:00:00.193926659  1450 0x561017ec0d00 WARN           nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Config file path: dstest_ssd_nopostprocess.txt
0:00:00.194430776  1450 0x561017ec0d00 WARN           nvinferserver gstnvinferserver.cpp:518:gst_nvinfer_server_start:<primary-inference> error: gstnvinferserver_impl start failed
Error: gst-resource-error-quark: Failed to initialize InferTrtIsContext (1): gstnvinferserver_impl.cpp(592): start (): /GstPipeline:pipeline0/GstNvInferServer:primary-inference:
Config file path: dstest_ssd_nopostprocess.txt

Eventually we wish to use torch in probe functions.
We have verified that the same issue occurs with the latest three versions of torch (the latest being v2.1.2).
We hope you can shed some light on this issue.
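For reference, the mangled symbol in the error log demangles to c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&), a core libc10 function, which suggests the Triton pytorch backend is resolving against a mismatched libc10/libtorch. One way to see which copies actually get mapped is to dump the process's shared-object list before and after import torch. This is a Linux-only diagnostic sketch of our own (the helper name is ours, not part of the sample apps):

```python
# Diagnostic sketch (Linux only): list shared objects mapped into this process.
# Running it before and after `import torch` shows whether the pip-installed
# torch maps its own libtorch/libc10 copies ahead of the Triton backend's.
def loaded_libs(substr):
    """Return mapped file paths whose basename contains `substr`."""
    with open("/proc/self/maps") as f:
        paths = {line.split()[-1] for line in f if line.split()[-1].startswith("/")}
    return sorted(p for p in paths if substr in p.rsplit("/", 1)[-1])

if __name__ == "__main__":
    print(loaded_libs("libtorch"))  # typically empty before `import torch`
    # import torch
    # print(loaded_libs("libtorch"))  # would then list torch's bundled libraries
```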

Thanks for sharing! I can reproduce this issue in the nvcr.io/nvidia/deepstream:6.4-triton-multiarch docker container. We are investigating. It seems that nvinferserver loads a different .so when ‘import torch’ is enabled.

Sorry for the late reply. The corresponding PyTorch version in nvcr.io/nvidia/deepstream:6.4-triton-multiarch is 2.0; please refer to this link. Please use “pip install torch==2.0.1” to reinstall torch.

Thanks for checking @fanzh .
However, I have done exactly the above, and I triple-checked, but still got the same result: using nvcr.io/nvidia/deepstream:6.4-triton-multiarch with pip install torch==2.0.1 produced the same error log originally posted. There must be something else we need to fix.

Please uninstall torch first. After reinstalling, please check that the current version is 2.0.1.
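As a sketch (not an official image), the pin can also be baked into the Dockerfile from the original report, so the reinstall step cannot be skipped or drift:

```dockerfile
FROM nvcr.io/nvidia/deepstream:6.4-triton-multiarch
RUN ./user_deepstream_python_apps_install.sh --build-bindings
# Pin torch to match the container's Triton pytorch backend (2.0.x),
# and fail the image build early if a different version gets resolved.
RUN pip3 install torch==2.0.1 && \
    python3 -c "import torch; assert torch.__version__.startswith('2.0')"
```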

The issue still occurs after uninstalling torch, reinstalling 2.0.1, and validating that the installed version is 2.0.1.
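For what it's worth, the installed wheel version can be checked without importing torch at all (importing it is exactly what triggers the crash). A small stdlib-only sketch, with a helper name of our own:

```python
# Read the installed torch version from package metadata, without importing
# torch (and therefore without loading any of its shared libraries).
from importlib.metadata import PackageNotFoundError, version

def installed_version(pkg):
    """Return the installed wheel version of `pkg`, or None if absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

if __name__ == "__main__":
    print(installed_version("torch"))  # e.g. '2.0.1' after the reinstall
```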

To make sure, I reproduced the issue using the steps below, and this time attached the log output of the successful run, hoping it can help identify the issue.

  1. Dockerfile:

FROM nvcr.io/nvidia/deepstream:6.4-triton-multiarch
RUN ./user_deepstream_python_apps_install.sh --build-bindings

  2. Checked pip list | grep torch: no torch is installed.
  3. Installed torch: pip install torch==2.0.1
  4. Installed the model repo using prepare_ds_triton_model_repo.sh
    • (from /opt/nvidia/deepstream/deepstream-6.4/samples/)
  5. Ran the ssd parser test app successfully: python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
    • (from /opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser)
/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser# python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 

(gst-plugin-scanner:313): GStreamer-WARNING **: 09:08:34.273: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.1: cannot open shared object file: No such file or directory
Creating Pipeline 
 
Creating Source
Creating H264Parser
Creating Decoder
Creating NvStreamMux
Creating Nvinferserver
Creating Nvvidconv
Creating OSD (nvosd)
Creating Queue
Creating Converter 2 (nvvidconv2)
Creating capsfilter
Creating Encoder
Creating Code Parser
Creating Container
Creating Sink
Playing file /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 
Adding elements to Pipeline 

Linking elements in the Pipeline 

/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser/deepstream_ssd_parser.py:402: DeprecationWarning: Gst.Element.get_request_pad is deprecated
  sinkpad = streammux.get_request_pad("sink_0")
Starting pipeline 

WARNING: infer_proto_utils.cpp:201 backend.trt_is is deprecated. updated it to backend.triton
I0214 09:08:34.783982 309 libtorch.cc:2507] TRITONBACKEND_Initialize: pytorch
I0214 09:08:34.784002 309 libtorch.cc:2517] Triton TRITONBACKEND API version: 1.15
I0214 09:08:34.784005 309 libtorch.cc:2523] 'pytorch' TRITONBACKEND API version: 1.15
I0214 09:08:34.862168 309 pinned_memory_manager.cc:241] Pinned memory pool is created at '0x7f9a10000000' with size 268435456
I0214 09:08:34.862430 309 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 67108864
I0214 09:08:34.862984 309 server.cc:604] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0214 09:08:34.863004 309 server.cc:631] 
+---------+---------------------------------------------------------+--------+
| Backend | Path                                                    | Config |
+---------+---------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {}     |
+---------+---------------------------------------------------------+--------+

I0214 09:08:34.863017 309 server.cc:674] 
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0214 09:08:34.874873 309 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 2080 Ti
I0214 09:08:34.875009 309 metrics.cc:703] Collecting CPU metrics
I0214 09:08:34.875085 309 tritonserver.cc:2435] 
+----------------------------------+------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                          |
+----------------------------------+------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                         |
| server_version                   | 2.37.0                                                                                         |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy m |
|                                  | odel_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters stati |
|                                  | stics trace logging                                                                            |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo                                |
| model_control_mode               | MODE_EXPLICIT                                                                                  |
| strict_model_config              | 0                                                                                              |
| rate_limit                       | OFF                                                                                            |
| pinned_memory_pool_byte_size     | 268435456                                                                                      |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                       |
| min_supported_compute_capability | 6.0                                                                                            |
| strict_readiness                 | 1                                                                                              |
| exit_timeout                     | 30                                                                                             |
| cache_enabled                    | 0                                                                                              |
+----------------------------------+------------------------------------------------------------------------------------------------+

I0214 09:08:34.875914 309 model_lifecycle.cc:462] loading: ssd_inception_v2_coco_2018_01_28:1
I0214 09:08:35.046489 309 tensorflow.cc:2577] TRITONBACKEND_Initialize: tensorflow
I0214 09:08:35.046513 309 tensorflow.cc:2587] Triton TRITONBACKEND API version: 1.15
I0214 09:08:35.046518 309 tensorflow.cc:2593] 'tensorflow' TRITONBACKEND API version: 1.15
I0214 09:08:35.046522 309 tensorflow.cc:2617] backend configuration:
{"cmdline":{"allow-soft-placement":"true","auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","gpu-memory-fraction":"0.400000","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0214 09:08:35.046862 309 tensorflow.cc:2683] TRITONBACKEND_ModelInitialize: ssd_inception_v2_coco_2018_01_28 (version 1)
I0214 09:08:35.048143 309 tensorflow.cc:2732] TRITONBACKEND_ModelInstanceInitialize: ssd_inception_v2_coco_2018_01_28_0_0 (GPU device 0)
2024-02-14 09:08:35.053251: I tensorflow/core/platform/cpu_feature_guard.cc:183] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-14 09:08:35.054007: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.055821: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.055956: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056182: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056318: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056440: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1636] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4403 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:07:00.0, compute capability: 7.5
2024-02-14 09:08:35.156000: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
I0214 09:08:35.198216 309 model_lifecycle.cc:819] successfully loaded 'ssd_inception_v2_coco_2018_01_28'
INFO: infer_trtis_backend.cpp:218 TrtISBackend id:5 initialized model: ssd_inception_v2_coco_2018_01_28
2024-02-14 09:08:37.422963: I tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:753] failed to allocate 4.30GiB (4617351936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2024-02-14 09:08:37.578745: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8904
Frame Number=0 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=1 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=2 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=3 Number of Objects=5 Vehicle_count=2 Person_count=2
 . . . 
Frame Number=1438 Number of Objects=4 Vehicle_count=4 Person_count=0
Frame Number=1439 Number of Objects=5 Vehicle_count=4 Person_count=1
Frame Number=1440 Number of Objects=6 Vehicle_count=5 Person_count=1
Frame Number=1441 Number of Objects=0 Vehicle_count=0 Person_count=0
End-of-stream
I0214 09:09:05.946420 309 tensorflow.cc:2770] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0214 09:09:05.946431 309 server.cc:305] Waiting for in-flight requests to complete.
I0214 09:09:05.946451 309 server.cc:321] Timeout 30: Found 0 model versions that have in-flight inferences
I0214 09:09:05.946456 309 server.cc:336] All models are stopped, unloading models
I0214 09:09:05.946463 309 server.cc:343] Timeout 30: Found 1 live models and 0 in-flight non-inference requests
I0214 09:09:05.946499 309 tensorflow.cc:2709] TRITONBACKEND_ModelFinalize: delete model state
I0214 09:09:05.963695 309 model_lifecycle.cc:604] successfully unloaded 'ssd_inception_v2_coco_2018_01_28' version 1
I0214 09:09:06.946560 309 server.cc:343] Timeout 29: Found 0 live models and 0 in-flight non-inference requests


  6. Edited deepstream_ssd_parser.py: added import torch as the first or last import.
  7. Reran: fails immediately.

/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser# python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 
Creating Pipeline 
 
Creating Source
Creating H264Parser
Creating Decoder
Creating NvStreamMux
Creating Nvinferserver
Creating Nvvidconv
Creating OSD (nvosd)
Creating Queue
Creating Converter 2 (nvvidconv2)
Creating capsfilter
Creating Encoder
Creating Code Parser
Creating Container
Creating Sink
Playing file /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 
Adding elements to Pipeline 

Linking elements in the Pipeline 

/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser/deepstream_ssd_parser.py:403: DeprecationWarning: Gst.Element.get_request_pad is deprecated
  sinkpad = streammux.get_request_pad("sink_0")
Starting pipeline 

WARNING: infer_proto_utils.cpp:201 backend.trt_is is deprecated. updated it to backend.triton
I0214 09:15:00.493329 442 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 2080 Ti
I0214 09:15:00.493481 442 metrics.cc:703] Collecting CPU metrics
I0214 09:15:00.493565 442 tritonserver.cc:2435] 
+----------------------------------+------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                          |
+----------------------------------+------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                         |
| server_version                   | 2.37.0                                                                                         |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy m |
|                                  | odel_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters stati |
|                                  | stics trace logging                                                                            |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo                                |
| model_control_mode               | MODE_EXPLICIT                                                                                  |
| strict_model_config              | 0                                                                                              |
| rate_limit                       | OFF                                                                                            |
| pinned_memory_pool_byte_size     | 268435456                                                                                      |
| min_supported_compute_capability | 6.0                                                                                            |
| strict_readiness                 | 1                                                                                              |
| exit_timeout                     | 30                                                                                             |
| cache_enabled                    | 0                                                                                              |
+----------------------------------+------------------------------------------------------------------------------------------------+

I0214 09:15:00.493587 442 server.cc:302] No server context available. Exiting immediately.
ERROR: infer_trtis_server.cpp:994 Triton: failed to create repo server, triton_err_str:Not found, err_msg:unable to load shared library: /opt/tritonserver/backends/pytorch/libtorchtrt_runtime.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
ERROR: infer_trtis_server.cpp:840 failed to initialize trtserver on repo dir: root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4

0:00:00.121740825   442 0x556af32e8500 ERROR          nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in createNNBackend() <infer_trtis_context.cpp:256> [UID = 5]: model:ssd_inception_v2_coco_2018_01_28 get triton server instance failed. repo:root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4

0:00:00.121762726   442 0x556af32e8500 ERROR          nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in initialize() <infer_base_context.cpp:79> [UID = 5]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.121772445   442 0x556af32e8500 WARN           nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Failed to initialize InferTrtIsContext
0:00:00.121778817   442 0x556af32e8500 WARN           nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Config file path: dstest_ssd_nopostprocess.txt
0:00:00.122079852   442 0x556af32e8500 WARN           nvinferserver gstnvinferserver.cpp:518:gst_nvinfer_server_start:<primary-inference> error: gstnvinferserver_impl start failed
Error: gst-resource-error-quark: Failed to initialize InferTrtIsContext (1): gstnvinferserver_impl.cpp(592): start (): /GstPipeline:pipeline0/GstNvInferServer:primary-inference:
Config file path: dstest_ssd_nopostprocess.txt
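Since the failure seems tied purely to when torch's libraries are mapped, one experiment worth trying (a speculative sketch of ours, not something validated in this thread) is deferring the torch import until after the pipeline, and therefore the Triton server, has started:

```python
import importlib

def lazy_import(name, _cache={}):
    """Import a module on first use instead of at script start-up.
    Hypothetical experiment for this thread: call lazy_import("torch")
    inside a probe function, after the Triton server has initialized,
    instead of a top-level `import torch`."""
    if name not in _cache:
        _cache[name] = importlib.import_module(name)
    return _cache[name]

# Inside a probe function, replace module-level usage with:
# torch = lazy_import("torch")
```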

Sorry for the late reply! I will try to reproduce it again.

I can’t reproduce that “undefined symbol” issue; please refer to the attached log: log.txt (10.8 KB)
Here are my test steps:

  1. Start the docker container, execute user_deepstream_python_apps_install.sh --version 1.1.10, then execute “pip install torch==2.0.1”.
  2. Add ‘import torch’, then execute the application.