The issue still occurs after previously uninstalling torch, reinstalling version 2.0.1, and validating that 2.0.1 is the installed version.
To make sure, I reproduced it using the steps below, and this time I have also attached the log output of the successful run, hoping it helps identify the issue.
- Dockerfile:
FROM nvcr.io/nvidia/deepstream:6.4-triton-multiarch
RUN ./user_deepstream_python_apps_install.sh --build-bindings
- Checked whether torch is already present:
pip list | grep torch
- No torch is installed.
- Installed torch:
pip install torch==2.0.1
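For the "validated 2.0.1" step above, the check can be as simple as the sketch below (check_torch.py is just a placeholder name, run inside the container):
# check_torch.py - sanity check of the pip-installed torch
import torch
print(torch.__version__)          # should report 2.0.1 (possibly with a +cu suffix)
print(torch.cuda.is_available())  # True if the wheel can see the GPU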
- Prepared the Triton model repo using:
prepare_ds_triton_model_repo.sh
(from /opt/nvidia/deepstream/deepstream-6.4/samples/)
- Ran the SSD parser test app successfully:
python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
(from /opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser)
/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser# python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
(gst-plugin-scanner:313): GStreamer-WARNING **: 09:08:34.273: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.1: cannot open shared object file: No such file or directory
Creating Pipeline
Creating Source
Creating H264Parser
Creating Decoder
Creating NvStreamMux
Creating Nvinferserver
Creating Nvvidconv
Creating OSD (nvosd)
Creating Queue
Creating Converter 2 (nvvidconv2)
Creating capsfilter
Creating Encoder
Creating Code Parser
Creating Container
Creating Sink
Playing file /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
Adding elements to Pipeline
Linking elements in the Pipeline
/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser/deepstream_ssd_parser.py:402: DeprecationWarning: Gst.Element.get_request_pad is deprecated
sinkpad = streammux.get_request_pad("sink_0")
Starting pipeline
WARNING: infer_proto_utils.cpp:201 backend.trt_is is deprecated. updated it to backend.triton
I0214 09:08:34.783982 309 libtorch.cc:2507] TRITONBACKEND_Initialize: pytorch
I0214 09:08:34.784002 309 libtorch.cc:2517] Triton TRITONBACKEND API version: 1.15
I0214 09:08:34.784005 309 libtorch.cc:2523] 'pytorch' TRITONBACKEND API version: 1.15
I0214 09:08:34.862168 309 pinned_memory_manager.cc:241] Pinned memory pool is created at '0x7f9a10000000' with size 268435456
I0214 09:08:34.862430 309 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 67108864
I0214 09:08:34.862984 309 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0214 09:08:34.863004 309 server.cc:631]
+---------+---------------------------------------------------------+--------+
| Backend | Path | Config |
+---------+---------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
+---------+---------------------------------------------------------+--------+
I0214 09:08:34.863017 309 server.cc:674]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+
I0214 09:08:34.874873 309 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 2080 Ti
I0214 09:08:34.875009 309 metrics.cc:703] Collecting CPU metrics
I0214 09:08:34.875085 309 tritonserver.cc:2435]
+----------------------------------+------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.37.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy m |
| | odel_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters stati |
| | stics trace logging |
| model_repository_path[0] | /opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo |
| model_control_mode | MODE_EXPLICIT |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
+----------------------------------+------------------------------------------------------------------------------------------------+
I0214 09:08:34.875914 309 model_lifecycle.cc:462] loading: ssd_inception_v2_coco_2018_01_28:1
I0214 09:08:35.046489 309 tensorflow.cc:2577] TRITONBACKEND_Initialize: tensorflow
I0214 09:08:35.046513 309 tensorflow.cc:2587] Triton TRITONBACKEND API version: 1.15
I0214 09:08:35.046518 309 tensorflow.cc:2593] 'tensorflow' TRITONBACKEND API version: 1.15
I0214 09:08:35.046522 309 tensorflow.cc:2617] backend configuration:
{"cmdline":{"allow-soft-placement":"true","auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","gpu-memory-fraction":"0.400000","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0214 09:08:35.046862 309 tensorflow.cc:2683] TRITONBACKEND_ModelInitialize: ssd_inception_v2_coco_2018_01_28 (version 1)
I0214 09:08:35.048143 309 tensorflow.cc:2732] TRITONBACKEND_ModelInstanceInitialize: ssd_inception_v2_coco_2018_01_28_0_0 (GPU device 0)
2024-02-14 09:08:35.053251: I tensorflow/core/platform/cpu_feature_guard.cc:183] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-14 09:08:35.054007: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.055821: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.055956: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056182: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056318: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056440: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-14 09:08:35.056559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1636] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4403 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:07:00.0, compute capability: 7.5
2024-02-14 09:08:35.156000: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
I0214 09:08:35.198216 309 model_lifecycle.cc:819] successfully loaded 'ssd_inception_v2_coco_2018_01_28'
INFO: infer_trtis_backend.cpp:218 TrtISBackend id:5 initialized model: ssd_inception_v2_coco_2018_01_28
2024-02-14 09:08:37.422963: I tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:753] failed to allocate 4.30GiB (4617351936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2024-02-14 09:08:37.578745: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8904
Frame Number=0 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=1 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=2 Number of Objects=5 Vehicle_count=2 Person_count=2
Frame Number=3 Number of Objects=5 Vehicle_count=2 Person_count=2
. . .
Frame Number=1438 Number of Objects=4 Vehicle_count=4 Person_count=0
Frame Number=1439 Number of Objects=5 Vehicle_count=4 Person_count=1
Frame Number=1440 Number of Objects=6 Vehicle_count=5 Person_count=1
Frame Number=1441 Number of Objects=0 Vehicle_count=0 Person_count=0
End-of-stream
I0214 09:09:05.946420 309 tensorflow.cc:2770] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0214 09:09:05.946431 309 server.cc:305] Waiting for in-flight requests to complete.
I0214 09:09:05.946451 309 server.cc:321] Timeout 30: Found 0 model versions that have in-flight inferences
I0214 09:09:05.946456 309 server.cc:336] All models are stopped, unloading models
I0214 09:09:05.946463 309 server.cc:343] Timeout 30: Found 1 live models and 0 in-flight non-inference requests
I0214 09:09:05.946499 309 tensorflow.cc:2709] TRITONBACKEND_ModelFinalize: delete model state
I0214 09:09:05.963695 309 model_lifecycle.cc:604] successfully unloaded 'ssd_inception_v2_coco_2018_01_28' version 1
I0214 09:09:06.946560 309 server.cc:343] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
- Edited the .py file:
vi deepstream_ssd_parser.py
and added import torch as either the first or the last import, as sketched below.
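For reference, the import section after the edit looks roughly like this (existing imports abbreviated; only the torch line is new):
# Top of deepstream_ssd_parser.py after the edit
import sys
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
# ... remaining sample imports unchanged ...
import torch  # the single added line; torch is not used anywhere else in the script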
- Reran the app - it fails immediately:
/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser# python3 deepstream_ssd_parser.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
Creating Pipeline
Creating Source
Creating H264Parser
Creating Decoder
Creating NvStreamMux
Creating Nvinferserver
Creating Nvvidconv
Creating OSD (nvosd)
Creating Queue
Creating Converter 2 (nvvidconv2)
Creating capsfilter
Creating Encoder
Creating Code Parser
Creating Container
Creating Sink
Playing file /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
Adding elements to Pipeline
Linking elements in the Pipeline
/opt/nvidia/deepstream/deepstream-6.4/sources/deepstream_python_apps/apps/deepstream-ssd-parser/deepstream_ssd_parser.py:403: DeprecationWarning: Gst.Element.get_request_pad is deprecated
sinkpad = streammux.get_request_pad("sink_0")
Starting pipeline
WARNING: infer_proto_utils.cpp:201 backend.trt_is is deprecated. updated it to backend.triton
I0214 09:15:00.493329 442 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 2080 Ti
I0214 09:15:00.493481 442 metrics.cc:703] Collecting CPU metrics
I0214 09:15:00.493565 442 tritonserver.cc:2435]
+----------------------------------+------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.37.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy m |
| | odel_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters stati |
| | stics trace logging |
| model_repository_path[0] | /opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo |
| model_control_mode | MODE_EXPLICIT |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
+----------------------------------+------------------------------------------------------------------------------------------------+
I0214 09:15:00.493587 442 server.cc:302] No server context available. Exiting immediately.
ERROR: infer_trtis_server.cpp:994 Triton: failed to create repo server, triton_err_str:Not found, err_msg:unable to load shared library: /opt/tritonserver/backends/pytorch/libtorchtrt_runtime.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
ERROR: infer_trtis_server.cpp:840 failed to initialize trtserver on repo dir: root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4
0:00:00.121740825 442 0x556af32e8500 ERROR nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in createNNBackend() <infer_trtis_context.cpp:256> [UID = 5]: model:ssd_inception_v2_coco_2018_01_28 get triton server instance failed. repo:root: "/opt/nvidia/deepstream/deepstream-6.4/samples/triton_model_repo"
log_level: 2
tf_gpu_memory_fraction: 0.4
0:00:00.121762726 442 0x556af32e8500 ERROR nvinferserver gstnvinferserver.cpp:408:gst_nvinfer_server_logger:<primary-inference> nvinferserver[UID 5]: Error in initialize() <infer_base_context.cpp:79> [UID = 5]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.121772445 442 0x556af32e8500 WARN nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Failed to initialize InferTrtIsContext
0:00:00.121778817 442 0x556af32e8500 WARN nvinferserver gstnvinferserver_impl.cpp:592:start:<primary-inference> error: Config file path: dstest_ssd_nopostprocess.txt
0:00:00.122079852 442 0x556af32e8500 WARN nvinferserver gstnvinferserver.cpp:518:gst_nvinfer_server_start:<primary-inference> error: gstnvinferserver_impl start failed
Error: gst-resource-error-quark: Failed to initialize InferTrtIsContext (1): gstnvinferserver_impl.cpp(592): start (): /GstPipeline:pipeline0/GstNvInferServer:primary-inference:
Config file path: dstest_ssd_nopostprocess.txt
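For reference, the missing symbol demangles to c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&), so it looks as if libtorchtrt_runtime.so ends up resolving against the pip-installed libtorch/libc10 once import torch has pulled them into the process, instead of against Triton's bundled copy. Below is a minimal stand-alone sketch of that check outside DeepStream (hypothetical - I have not run it; the library path is copied from the error above, and symbol_clash_check.py is just a placeholder name):
# symbol_clash_check.py (hypothetical): optionally import the pip torch first,
# then try to dlopen the Triton PyTorch backend runtime named in the error message.
import ctypes
import sys

LIB = "/opt/tritonserver/backends/pytorch/libtorchtrt_runtime.so"

if "--with-torch" in sys.argv:
    import torch  # maps pip torch 2.0.1's libc10/libtorch into the process first
    print("imported torch", torch.__version__)

try:
    ctypes.CDLL(LIB, mode=ctypes.RTLD_GLOBAL)
    print("libtorchtrt_runtime.so loaded OK")
except OSError as err:
    # With --with-torch, the expectation (assumption) is the same
    # "undefined symbol: _ZN3c106detail14torchCheckFail..." as in the log above.
    print("load failed:", err)
Running it once plainly and once with --with-torch should show whether the import alone changes which libc10 the backend library binds against.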