SSD Model, TRT, DeepStream, Triton: SegFault

Hi!

I’m trying to export this TF model to TF-TRT and run it with DeepStream nvinferserver, i.e. Triton.

Platform: Jetson TX2.

The model I tried:
ssd_mobilenet_v1_coco_2018_01_28.tar.gz

The code to export the models to TF-TRT:
Tensorflow TensorRT Integration
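
A minimal sketch of what this export looks like with the TF 2.x TrtGraphConverterV2 API (not my exact script; paths and parameter values here are illustrative):

# Minimal TF-TRT export sketch (TF 2.x trt_convert API); paths are placeholders.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.TrtConversionParams(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=268435456,
    minimum_segment_size=10,
)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="ssd_mobilenet_v1_coco_2018_01_28/saved_model",
    conversion_params=params,
)
converter.convert()

# Save into the Triton model repository layout: <model-name>/<version>/model.savedmodel
converter.save("models/ssd_mobilenet_v1_trt/5/model.savedmodel")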

Environment to export the models:
l4t-tensorflow:r32.6.1-tf2.5-py3

Environment to run the models:
deepstream-l4t:6.0-triton

When I start deepstream-app, I get the log below.
I tried to debug with gdb, and the segfault seems to occur in libtriton.so.

root@host:/opt/nvidia/deepstream/deepstream-6.0/projects/project/branch/apps/app# deepstream-app -c config.txt
(Argus) Error FileOperationFailed: Connecting to nvargus-daemon failed: No such file or directory (in src/rpc/socket/client/SocketClientDispatch.cpp, function openSocketConnection(), line 205)
(Argus) Error FileOperationFailed: Cannot create camera provider (in src/rpc/socket/client/SocketClientDispatch.cpp, function createCameraProvider(), line 106)

(gst-plugin-scanner:27): GStreamer-WARNING **: 00:00:43.783: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.0: cannot open shared object file: No such file or directory

(deepstream-app:26): GLib-GObject-WARNING **: 00:00:44.137: g_object_set_is_valid_property: object class 'GstNvInferServer' has no property named 'input-tensor-meta'
0:00:01.573802028    26     0x1ae40360 WARN           nvinferserver gstnvinferserver_impl.cpp:284:validatePluginConfig:<primary_gie> warning: Configuration file batch-size reset to: 1
WARNING: backend.trt_is is deprecated. updated it to backend.triton
I1214 00:00:44.312613 26 shared_library.cc:108] OpenLibraryHandle: /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtriton_tensorflow1.so
2021-12-14 00:00:45.076182: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
I1214 00:00:45.486738 26 tensorflow.cc:2171] TRITONBACKEND_Initialize: tensorflow
I1214 00:00:45.486797 26 tensorflow.cc:2184] Triton TRITONBACKEND API version: 1.4
I1214 00:00:45.486829 26 tensorflow.cc:2190] 'tensorflow' TRITONBACKEND API version: 1.4
I1214 00:00:45.486853 26 tensorflow.cc:2211] backend configuration:
{"cmdline":{"allow-soft-placement":"true","gpu-memory-fraction":"0.250000"}}
I1214 00:00:45.487114 26 shared_library.cc:108] OpenLibraryHandle: /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/onnxruntime/libtriton_onnxruntime.so
I1214 00:00:45.540140 26 onnxruntime.cc:1972] TRITONBACKEND_Initialize: onnxruntime
I1214 00:00:45.540347 26 onnxruntime.cc:1985] Triton TRITONBACKEND API version: 1.4
I1214 00:00:45.540481 26 onnxruntime.cc:1991] 'onnxruntime' TRITONBACKEND API version: 1.4
I1214 00:00:45.576490 26 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x101070000' with size 67108864
I1214 00:00:45.576725 26 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I1214 00:00:45.578204 26 backend_factory.h:45] Create TritonBackendFactory
I1214 00:00:45.578280 26 plan_backend_factory.cc:49] Create PlanBackendFactory
I1214 00:00:45.578306 26 plan_backend_factory.cc:56] Registering TensorRT Plugins
I1214 00:00:45.578357 26 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I1214 00:00:45.578391 26 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I1214 00:00:45.578424 26 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I1214 00:00:45.578455 26 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I1214 00:00:45.578488 26 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I1214 00:00:45.578517 26 logging.cc:52] Registered plugin creator - ::Clip_TRT version 1
I1214 00:00:45.578549 26 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I1214 00:00:45.578604 26 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I1214 00:00:45.578628 26 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I1214 00:00:45.578648 26 logging.cc:52] Registered plugin creator - ::ScatterND version 1
I1214 00:00:45.578670 26 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I1214 00:00:45.578708 26 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I1214 00:00:45.578729 26 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I1214 00:00:45.578751 26 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I1214 00:00:45.578774 26 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I1214 00:00:45.578803 26 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I1214 00:00:45.578834 26 logging.cc:52] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
I1214 00:00:45.578864 26 logging.cc:52] Registered plugin creator - ::EfficientNMS_TRT version 1
I1214 00:00:45.578894 26 logging.cc:52] Registered plugin creator - ::Proposal version 1
I1214 00:00:45.578920 26 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I1214 00:00:45.578941 26 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I1214 00:00:45.578973 26 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I1214 00:00:45.579003 26 logging.cc:52] Registered plugin creator - ::Split version 1
I1214 00:00:45.579027 26 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I1214 00:00:45.579065 26 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I1214 00:00:45.579094 26 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I1214 00:00:45.579179 26 server.cc:504] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I1214 00:00:45.579309 26 server.cc:543] 
+-------------+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------+
| Backend     | Path                                                                                           | Config                                                                       |
+-------------+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------+
| tensorrt    | <built-in>                                                                                     | {}                                                                           |
| tensorflow  | /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtriton_tensorflow1.so | {"cmdline":{"allow-soft-placement":"true","gpu-memory-fraction":"0.250000"}} |
| onnxruntime | /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/onnxruntime/libtriton_onnxruntime.so | {}                                                                           |
+-------------+------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------+

I1214 00:00:45.579354 26 model_repository_manager.cc:570] BackendStates()
I1214 00:00:45.579394 26 server.cc:586] 
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I1214 00:00:45.579632 26 tritonserver.cc:1718] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                               |
| server_version                   | 2.13.0                                                                                                                                                               |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tens |
|                                  | or_data statistics                                                                                                                                                   |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.0/projects/project/myproject/models                                                                                              |
| model_control_mode               | MODE_EXPLICIT                                                                                                                                                        |
| strict_model_config              | 1                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 67108864                                                                                                                                                             |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                             |
| min_supported_compute_capability | 5.3                                                                                                                                                                  |
| strict_readiness                 | 1                                                                                                                                                                    |
| exit_timeout                     | 30                                                                                                                                                                   |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I1214 00:00:45.579711 26 model_repository_manager.cc:570] BackendStates()
I1214 00:00:45.579758 26 model_repository_manager.cc:638] GetInferenceBackend() 'ssd_mobilenet_v1_trt' version 5
I1214 00:00:45.584830 26 model_repository_manager.cc:749] AsyncLoad() 'ssd_mobilenet_v1_trt'
I1214 00:00:45.585101 26 model_repository_manager.cc:988] TriggerNextAction() 'ssd_mobilenet_v1_trt' version 5: 1
I1214 00:00:45.585153 26 model_repository_manager.cc:1026] Load() 'ssd_mobilenet_v1_trt' version 5
I1214 00:00:45.585180 26 model_repository_manager.cc:1045] loading: ssd_mobilenet_v1_trt:5
I1214 00:00:45.686111 26 model_repository_manager.cc:1105] CreateInferenceBackend() 'ssd_mobilenet_v1_trt' version 5
I1214 00:00:45.689946 26 tensorflow.cc:2273] TRITONBACKEND_ModelInitialize: ssd_mobilenet_v1_trt (version 5)
I1214 00:00:45.701077 26 model_config_utils.cc:1524] ModelConfig 64-bit fields:
I1214 00:00:45.701188 26 model_config_utils.cc:1526] 	ModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds
I1214 00:00:45.701261 26 model_config_utils.cc:1526] 	ModelConfig::dynamic_batching::max_queue_delay_microseconds
I1214 00:00:45.701328 26 model_config_utils.cc:1526] 	ModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds
I1214 00:00:45.701369 26 model_config_utils.cc:1526] 	ModelConfig::ensemble_scheduling::step::model_version
I1214 00:00:45.701411 26 model_config_utils.cc:1526] 	ModelConfig::input::dims
I1214 00:00:45.701448 26 model_config_utils.cc:1526] 	ModelConfig::input::reshape::shape
I1214 00:00:45.701490 26 model_config_utils.cc:1526] 	ModelConfig::instance_group::secondary_devices::device_id
I1214 00:00:45.701531 26 model_config_utils.cc:1526] 	ModelConfig::model_warmup::inputs::value::dims
I1214 00:00:45.701569 26 model_config_utils.cc:1526] 	ModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim
I1214 00:00:45.701605 26 model_config_utils.cc:1526] 	ModelConfig::optimization::cuda::graph_spec::input::value::dim
I1214 00:00:45.701645 26 model_config_utils.cc:1526] 	ModelConfig::output::dims
I1214 00:00:45.701678 26 model_config_utils.cc:1526] 	ModelConfig::output::reshape::shape
I1214 00:00:45.701712 26 model_config_utils.cc:1526] 	ModelConfig::sequence_batching::direct::max_queue_delay_microseconds
I1214 00:00:45.701744 26 model_config_utils.cc:1526] 	ModelConfig::sequence_batching::max_sequence_idle_microseconds
I1214 00:00:45.701776 26 model_config_utils.cc:1526] 	ModelConfig::sequence_batching::oldest::max_queue_delay_microseconds
I1214 00:00:45.701807 26 model_config_utils.cc:1526] 	ModelConfig::version_policy::specific::versions
I1214 00:00:45.702544 26 tensorflow.cc:1427] model configuration:
{
    "name": "ssd_mobilenet_v1_trt",
    "platform": "tensorflow_savedmodel",
    "backend": "tensorflow",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 1,
    "input": [
        {
            "name": "image_tensor",
            "data_type": "TYPE_UINT8",
            "format": "FORMAT_NONE",
            "dims": [
                300,
                300,
                3
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        }
    ],
    "output": [
        {
            "name": "num_detections",
            "data_type": "TYPE_FP32",
            "dims": [
                1
            ],
            "reshape": {
                "shape": []
            },
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "detection_classes",
            "data_type": "TYPE_FP32",
            "dims": [
                100
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "detection_scores",
            "data_type": "TYPE_FP32",
            "dims": [
                100
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "detection_boxes",
            "data_type": "TYPE_FP32",
            "dims": [
                100,
                4
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "execution_accelerators": {
            "gpu_execution_accelerator": [
                {
                    "name": "tensorrt",
                    "parameters": {
                        "max_workspace_size_bytes": "268435456",
                        "precision_mode": "FP16",
                        "minimum_segment_size": "10"
                    }
                }
            ],
            "cpu_execution_accelerator": []
        },
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "instance_group": [
        {
            "name": "ssd_mobilenet_v1_trt_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "secondary_devices": [],
            "profile": [],
            "passive": false,
            "host_policy": ""
        }
    ],
    "default_model_filename": "model.savedmodel",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {},
    "model_warmup": []
}
I1214 00:00:45.703470 26 tensorflow.cc:2323] TRITONBACKEND_ModelInstanceInitialize: ssd_mobilenet_v1_trt_0 (GPU device 0)
I1214 00:00:45.704920 26 backend_model_instance.cc:110] Creating instance ssd_mobilenet_v1_trt_0 on GPU 0 (6.2) using artifact 'model.savedmodel'
I1214 00:00:45.706709 26 tensorflow.cc:895] TensorRT Execution Accelerator is set for ssd_mobilenet_v1_trt
2021-12-14 00:00:45.706889: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /opt/nvidia/deepstream/deepstream-6.0/projects/project/myproject/models/ssd_mobilenet_v1_trt/5/model.savedmodel
2021-12-14 00:00:45.873394: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-12-14 00:00:46.073559: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2021-12-14 00:00:46.074131: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1ab389a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-12-14 00:00:46.074232: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-12-14 00:00:46.074756: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-12-14 00:00:46.075018: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:46.075299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1666] Found device 0 with properties: 
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3
pciBusID: 0000:00:00.0
2021-12-14 00:00:46.075501: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-12-14 00:00:46.075735: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-12-14 00:00:46.078440: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-12-14 00:00:46.086887: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-12-14 00:00:46.101865: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-12-14 00:00:46.112217: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-12-14 00:00:46.112483: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-12-14 00:00:46.112644: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:46.112889: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:46.112989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0
2021-12-14 00:00:50.389704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-14 00:00:50.389806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0 
2021-12-14 00:00:50.389850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N 
2021-12-14 00:00:50.390040: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:50.390272: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:50.390468: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1049] ARM64 does not support NUMA - returning NUMA node zero
2021-12-14 00:00:50.390680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1962 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2021-12-14 00:00:50.399212: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7ee007e960 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-12-14 00:00:50.399305: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA Tegra X2, Compute Capability 6.2
2021-12-14 00:00:50.768844: I tensorflow/cc/saved_model/loader.cc:251] Restoring SavedModel bundle.
2021-12-14 00:00:51.502181: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 4 ops of 3 different types in the graph that are not converted to TensorRT: StatefulPartitionedCall, NoOp, Placeholder, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-12-14 00:00:51.502330: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 0
2021-12-14 00:00:51.663961: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-14 00:00:52.347699: I tensorflow/cc/saved_model/loader.cc:200] Running initialization op on SavedModel bundle at path: /opt/nvidia/deepstream/deepstream-6.0/projects/project/myproject/models/ssd_mobilenet_v1_trt/5/model.savedmodel
2021-12-14 00:00:52.923749: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 4 ops of 2 different types in the graph that are not converted to TensorRT: PartitionedCall, NoOp, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-12-14 00:00:52.923876: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 0
2021-12-14 00:00:53.090327: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-14 00:00:53.097324: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-14 00:00:53.213619: I tensorflow/cc/saved_model/loader.cc:379] SavedModel load for tags { serve }; Status: success. Took 7506710 microseconds.
I1214 00:00:53.214262 26 dynamic_batch_scheduler.cc:230] Starting dynamic-batch scheduler thread 0 at nice 5...
I1214 00:00:53.214447 26 model_repository_manager.cc:1212] successfully loaded 'ssd_mobilenet_v1_trt' version 5
I1214 00:00:53.214495 26 model_repository_manager.cc:988] TriggerNextAction() 'ssd_mobilenet_v1_trt' version 5: 0
I1214 00:00:53.214523 26 model_repository_manager.cc:1003] no next action, trigger OnComplete()
I1214 00:00:53.214696 26 model_repository_manager.cc:594] VersionStates() 'ssd_mobilenet_v1_trt'
I1214 00:00:53.214744 26 model_repository_manager.cc:594] VersionStates() 'ssd_mobilenet_v1_trt'
I1214 00:00:53.214773 26 model_repository_manager.cc:638] GetInferenceBackend() 'ssd_mobilenet_v1_trt' version 5
INFO: TrtISBackend id:1 initialized model: ssd_mobilenet_v1_trt
I1214 00:00:54.246747 26 model_repository_manager.cc:638] GetInferenceBackend() 'ssd_mobilenet_v1_trt' version 5
I1214 00:00:54.246859 26 infer_request.cc:524] prepared: [0x0x1ac7fda0] request id: 0, model: ssd_mobilenet_v1_trt, requested version: 5, actual version: 5, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x1c2c5928] input: image_tensor, type: UINT8, original shape: [1,300,300,3], batch + shape: [1,300,300,3], shape: [300,300,3]
override inputs:
inputs:
[0x0x1c2c5928] input: image_tensor, type: UINT8, original shape: [1,300,300,3], batch + shape: [1,300,300,3], shape: [300,300,3]
original requested outputs:
detection_boxes
detection_classes
detection_scores
num_detections
requested outputs:
detection_boxes
detection_classes
detection_scores
num_detections

I1214 00:00:54.247060 26 tensorflow.cc:2394] model ssd_mobilenet_v1_trt, instance ssd_mobilenet_v1_trt_0, executing 1 requests
I1214 00:00:54.247175 26 tensorflow.cc:1567] TRITONBACKEND_ModelExecute: Running ssd_mobilenet_v1_trt_0 with 1 requests
I1214 00:00:54.250084 26 tensorflow.cc:1820] TRITONBACKEND_ModelExecute: input 'image_tensor' is GPU tensor: false
2021-12-14 00:00:55.052147: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 4 ops of 3 different types in the graph that are not converted to TensorRT: StatefulPartitionedCall, NoOp, Placeholder, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-12-14 00:00:55.052295: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 0
2021-12-14 00:00:55.198460: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-14 00:00:55.203888: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-14 00:00:55.599404: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
./app1-jetson.sh: line 3:    26 Segmentation fault      (core dumped) deepstream-app -c app1_toplevel_jetson.txt

Hi,

Did you convert the TF-TRT model on the TX2?

Please note that the TensorRT engine is not portable.
You will need to convert it on the target platform directly.

Thanks.

Hi @AastaLLL

Yes, I converted the model on the TX2.

I forgot to mention that the TX2 has

  • Python 3.6.9
  • JetPack 4.6
  • TensorRT 8.0.1
  • CUDA 10.2
  • cuDNN 8.2
  • GStreamer 1.14.5
  • Triton 2.13

As much as I would like to figure out what’s going on with Triton Server and the model, I can’t solve it.
Here is the gdb output.

Thread 36 "deepstream-app" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ea4ff8d70 (LWP 54)]
0x0000007f0ff42070 in tensorflow::{lambda(tensorflow::shape_inference::InferenceContext*)#9}::_FUN(tensorflow::shape_inference::InferenceContext*) ()
   from /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
(gdb) bt
#0  0x0000007f0ff42070 in tensorflow::{lambda(tensorflow::shape_inference::InferenceContext*)#9}::_FUN(tensorflow::shape_inference::InferenceContext*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#1  0x0000007f09ce2a7c in std::_Function_handler<tensorflow::Status (tensorflow::shape_inference::InferenceContext*), tensorflow::Status (*)(tensorflow::shape_inference::InferenceContext*)>::_M_invoke(std::_Any_data const&, tensorflow::shape_inference::InferenceContext*&&) () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#2  0x0000007f102a40d4 in tensorflow::shape_inference::InferenceContext::Run(std::function<tensorflow::Status (tensorflow::shape_inference::InferenceContext*)> const&) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#3  0x0000007f1010ff38 in tensorflow::grappler::SymbolicShapeRefiner::InferShapes(tensorflow::NodeDef const&, tensorflow::grappler::SymbolicShapeRefiner::NodeContext*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#4  0x0000007f10115c74 in tensorflow::grappler::SymbolicShapeRefiner::UpdateNode(tensorflow::NodeDef const*, bool*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#5  0x0000007f10116e68 in tensorflow::grappler::GraphProperties::UpdateShapes(tensorflow::grappler::SymbolicShapeRefiner*, std::unordered_map<tensorflow::NodeDef const*, tensorflow::NodeDef const*, std::hash<tensorflow::NodeDef const*>, std::equal_to<tensorflow::NodeDef const*>, std::allocator<std::pair<tensorflow::NodeDef const* const, tensorflow::NodeDef const*> > > const&, tensorflow::NodeDef const*, bool*) const () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#6  0x0000007f10117008 in tensorflow::grappler::GraphProperties::PropagateShapes(tensorflow::grappler::SymbolicShapeRefiner*, tensorflow::grappler::TopoQueue*, std::unordered_map<tensorflow::NodeDef const*, tensorflow::NodeDef const*, std::hash<tensorflow::NodeDef const*>, std::equal_to<tensorflow::NodeDef const*>, std::allocator<std::pair<tensorflow::NodeDef const* const, tensorflow::NodeDef const*> > > const&, int) const () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#7  0x0000007f1011177c in tensorflow::grappler::GraphProperties::InferStatically(bool, bool, bool, bool) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#8  0x0000007f100a70a0 in tensorflow::grappler::ConstantFolding::RunOptimizationPass(tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem const&, tensorflow::GraphDef*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#9  0x0000007f100a7cc8 in tensorflow::grappler::ConstantFolding::Optimize(tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem const&, tensorflow::GraphDef*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#10 0x0000007f0ffc1640 in tensorflow::grappler::MetaOptimizer::RunOptimizer(tensorflow::grappler::GraphOptimizer*, tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem*, tensorflow::GraphDef*, tensorflow::grappler::MetaOptimizer::GraphOptimizationResult*) () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#11 0x0000007f0ffc279c in tensorflow::grappler::MetaOptimizer::OptimizeGraph(tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem const&, tensorflow::GraphDef*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#12 0x0000007f0ffc4b9c in tensorflow::grappler::MetaOptimizer::Optimize(tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem const&, tensorflow::GraphDef*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#13 0x0000007f0ffc58dc in tensorflow::grappler::RunMetaOptimizer(tensorflow::grappler::GrapplerItem const&, tensorflow::ConfigProto const&, tensorflow::DeviceBase*, tensorflow::grappler::Cluster*, tensorflow::GraphDef*) () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#14 0x0000007f0ffb7f34 in tensorflow::GraphExecutionState::OptimizeGraph(tensorflow::BuildGraphOptions const&, std::unique_ptr<tensorflow::Graph, std::default_delete<tensorflow::Graph> >*, std::unique_ptr<tensorflow::FunctionLibraryDefinition, std::default_delete<tensorflow::FunctionLibraryDefinition> >*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#15 0x0000007f0ffb967c in tensorflow::GraphExecutionState::BuildGraph(tensorflow::BuildGraphOptions const&, std::unique_ptr<tensorflow::ClientGraph, std::default_delete<tensorflow::ClientGraph> >*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#16 0x0000007f0febf6c8 in tensorflow::DirectSession::CreateGraphs(tensorflow::BuildGraphOptions const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<tensorflow::Graph, std::default_delete<tensorflow::Graph> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::unique_ptr<tensorflow::Graph, std::default_delete<tensorflow::Graph> > > > >*, std::unique_ptr<tensorflow::FunctionLibraryDefinition, std::default_delete<tensorflow::FunctionLibraryDefinition> >*, tensorflow::DirectSession::RunStateArgs*, absl::InlinedVector<tensorflow::DataType, 4ul, std::allocator<tensorflow::DataType> >*, absl::InlinedVector<tensorflow::DataType, 4ul, std::allocator<tensorflow::DataType> >*, long long*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#17 0x0000007f0fec05e0 in tensorflow::DirectSession::CreateExecutors(tensorflow::CallableOptions const&, std::unique_ptr<tensorflow::DirectSession::ExecutorsAndKeys, std::default_delete<tensorflow::DirectSession::ExecutorsAndKeys> >*, std::unique_ptr<tensorflow::DirectSession::FunctionInfo, std::default_delete<tensorflow::DirectSession::FunctionInfo> >*, tensorflow::DirectSession::RunStateArgs*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#18 0x0000007f0fec2020 in tensorflow::DirectSession::GetOrCreateExecutors(absl::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, absl::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, absl::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, tensorflow::DirectSession::ExecutorsAndKeys**, tensorflow::DirectSession::RunStateArgs*) () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#19 0x0000007f0fec30d4 in tensorflow::DirectSession::Run(tensorflow::RunOptions const&, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*, tensorflow::RunMetadata*) () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#20 0x0000007f0fec2e98 in tensorflow::DirectSession::Run(std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor>, std::allocator<std::pai---Type <return> to continue, or q <return> to quit---
r<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::Tensor> > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#21 0x0000007f09d7d7c0 in (anonymous namespace)::ModelImpl::Run(tritontf_tensorlist_struct*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, tritontf_tensorlist_struct**) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#22 0x0000007f09d7dc94 in TRITONTF_ModelRun () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtensorflow_triton.so.1
#23 0x0000007f446fcf18 in triton::backend::tensorflow::ModelInstanceState::ProcessRequests(TRITONBACKEND_Request**, unsigned int) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtriton_tensorflow1.so
#24 0x0000007f446ff594 in TRITONBACKEND_ModelInstanceExecute () at /opt/nvidia/deepstream/deepstream-6.0/lib/triton_backends/tensorflow1/libtriton_tensorflow1.so
#25 0x0000007f4de7370c in std::_Function_handler<void (unsigned int, std::vector<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, std::default_delete<nvidia::inferenceserver::InferenceRequest> >, std::allocator<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, std::default_delete<nvidia::inferenceserver::InferenceRequest> > > >&&), nvidia::inferenceserver::TritonModel::Create(nvidia::inferenceserver::InferenceServer*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > > const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long, inference::ModelConfig const&, std::unique_ptr<nvidia::inferenceserver::TritonModel, std::default_delete<nvidia::inferenceserver::TritonModel> >*)::{lambda(unsigned int, std::vector<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, std::default_delete<nvidia::inferenceserver::InferenceRequest> >, std::allocator<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, 
std::default_delete<nvidia::inferenceserver::InferenceRequest> > > >&&)#2}>::_M_invoke(std::_Any_data const&, unsigned int&&, std::vector<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, std::default_delete<nvidia::inferenceserver::InferenceRequest> >, std::allocator<std::unique_ptr<nvidia::inferenceserver::InferenceRequest, std::default_delete<nvidia::inferenceserver::InferenceRequest> > > >&&) () at /opt/nvidia/deepstream/deepstream-6.0/lib/libtritonserver.so
#26 0x0000007f4dcd35c8 in nvidia::inferenceserver::DynamicBatchScheduler::SchedulerThread(unsigned int, int, std::shared_ptr<std::atomic<bool> > const&, std::promise<bool>*) ()
    at /opt/nvidia/deepstream/deepstream-6.0/lib/libtritonserver.so
#27 0x0000007f84675e94 in  () at /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#28 0x0000007f90331088 in start_thread () at /lib/aarch64-linux-gnu/libpthread.so.0
#29 0x0000007f901d0ffc in  () at /lib/aarch64-linux-gnu/libc.so.6

Hi @AastaLLL

I have an update.

On a dGPU container, with the model reconverted there as well, I run the Triton Server with

tritonserver --model-store models --strict-model-config=true --log-verbose=3 --model-control-mode=explicit --load-model=ssd_model

and it stays waiting for requests with the model in the [READY] state.

Then, with a client based on the simple_http_infer_client.py example, I request an inference from the model.
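
The request is essentially the following (a simplified sketch based on the tritonclient HTTP API; the server URL and the dummy input are placeholders):

import numpy as np
import tritonclient.http as httpclient

# Connect to the local Triton server (default HTTP port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Dummy uint8 batch matching the model's image_tensor input.
image = np.zeros((1, 432, 768, 3), dtype=np.uint8)

inputs = [httpclient.InferInput("image_tensor", list(image.shape), "UINT8")]
inputs[0].set_data_from_numpy(image)

outputs = [
    httpclient.InferRequestedOutput("detection_boxes"),
    httpclient.InferRequestedOutput("detection_classes"),
    httpclient.InferRequestedOutput("detection_scores"),
]

# The server segfaults while handling this call.
result = client.infer("ssd_model", inputs, outputs=outputs)
print(result.as_numpy("detection_scores"))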

It segfaults, this time without DeepStream acting as a client.

I1215 21:14:43.937407 218 http_server.cc:2715] HTTP request: 2 /v2/models/ssd_model/infer
I1215 21:14:43.937438 218 model_repository_manager.cc:638] GetInferenceBackend() 'ssd_model' version -1
I1215 21:14:43.937446 218 model_repository_manager.cc:638] GetInferenceBackend() 'ssd_model' version -1
I1215 21:14:44.091723 218 infer_request.cc:524] prepared: [0x0x7f0d980025c0] request id: , model: ssd_model, requested version: -1, actual version: 5, flags: 0x0, correlation id: 0, batch size: 0, priority: 0, timeout (us): 0
original inputs:
[0x0x7f0d98002b98] input: image_tensor, type: UINT8, original shape: [1,432,768,3], batch + shape: [1,432,768,3], shape: [1,432,768,3]
override inputs:
inputs:
[0x0x7f0d98002b98] input: image_tensor, type: UINT8, original shape: [1,432,768,3], batch + shape: [1,432,768,3], shape: [1,432,768,3]
original requested outputs:
detection_boxes
detection_classes
detection_scores
requested outputs:
detection_boxes
detection_classes
detection_scores

I1215 21:14:44.091792 218 tensorflow.cc:2389] model ssd_model, instance ssd_model_0, executing 1 requests
I1215 21:14:44.091804 218 tensorflow.cc:1563] TRITONBACKEND_ModelExecute: Running ssd_model_0 with 1 requests
I1215 21:14:44.092192 218 tensorflow.cc:1815] TRITONBACKEND_ModelExecute: input 'image_tensor' is GPU tensor: false
2021-12-15 21:14:44.342181: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:486] There are 4 ops of 3 different types in the graph that are not converted to TensorRT: StatefulPartitionedCall, Placeholder, NoOp, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-12-15 21:14:44.342232: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:647] Number of TensorRT candidate segments: 0
2021-12-15 21:14:44.371577: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-15 21:14:44.372440: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-15 21:14:44.451053: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-12-15 21:14:44.512876: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
Segmentation fault (core dumped)

So, this is not a Jetson device or DeepStream problem. It is a Triton Server, TensorFlow, TRT problem.

Thanks!!!

Hi,

Thanks for the testing and feedback.
Would you mind trying the following experiments as well?

First, it looks like you have enabled the TF-TRT config.
Please check whether the model works with the Triton server in pure TensorFlow mode (turn off the TF-TRT optimization).

Next, we also have a prebuilt TensorFlow package for Jetson.
Could you check whether the model works with the TensorFlow library directly?
https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

Thanks.

Hi @AastaLLL
I tested the suggested experiments:

  1. I disabled the TF-TRT optimization by commenting out this section of the Triton model config.
    Is this the correct way to disable TRT mode? I followed this page: Triton optimization.
    The result is still a segfault.
#optimization { execution_accelerators {
#  gpu_execution_accelerator {
#    name : "tensorrt"
#    parameters { key: "precision_mode" value: "FP16" }
#} } }
  2. Regarding the TensorFlow package: I have previously found that this behavior (segfault) is common across dGPU and Jetson architectures. It is important to note that this model worked just fine before the TF-TRT optimization, both on dGPU and on Jetson combined with Triton. I think this issue is arch-independent; what do you think?
    In addition, after running the export script, I ran a dummy inference test in the TensorFlow container and got no errors; I can see the output nodes (see the sketch below).
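
That dummy test is essentially loading the exported SavedModel and running one inference on dummy data, along these lines (a sketch; the path is a placeholder, and the signature input key is assumed to match the image_tensor name from my Triton config):

import numpy as np
import tensorflow as tf

# Load the TF-TRT-converted SavedModel and grab its serving signature.
model = tf.saved_model.load("models/ssd_mobilenet_v1_trt/5/model.savedmodel")
infer = model.signatures["serving_default"]
print(infer.structured_input_signature)   # shows the actual input key and dtype

# Run one inference on a dummy uint8 image (input key assumed to be 'image_tensor').
image = tf.constant(np.zeros((1, 300, 300, 3), dtype=np.uint8))
outputs = infer(image_tensor=image)

# Print the output node names and shapes.
for name, tensor in outputs.items():
    print(name, tensor.shape)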

Thanks!

Yes. It sounds like some issue with the optimization.

We are going to reproduce this internally.
Will share more information with you later.

Hi,

Would you mind also sharing with us the script you use to load ssd_mobilenet_v1 and convert it into a TF-TRT model?

Thanks.

Hi @AastaLLL
I sent you the information in a private message.
For people reading this, the references and resources to reproduce the issue are:
tf1_detection_zoo.md
TensorRT/tree/main/quickstart
nvidia/containers/tensorflow
https://github.com/triton-inference-server/client
Thanks!

Hi,

We tested the ssd_mobilenet_v1_coco_2018_01_28 model with the deepstream-l4t:6.0-triton container.
The pipeline runs correctly on TX2 without a segmentation fault.

Is there anything missing in our testing?
Below is our procedure for your reference.

$ export DISPLAY=:0
$ xhost +
$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/deepstream-l4t:6.0-triton
$ mkdir -p /opt/nvidia/deepstream/deepstream-6.0/samples/triton_model_repo/ssd_mobilenet_v1_coco_2018_01_28/1
$ wget -O /tmp/ssd_mobilenet_v1_coco_2018_01_28.tar.gz http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz
$ cd /tmp && tar xzf ssd_mobilenet_v1_coco_2018_01_28.tar.gz
$ mv /tmp/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb /opt/nvidia/deepstream/deepstream-6.0/samples/triton_model_repo/ssd_mobilenet_v1_coco_2018_01_28/1/
$ rm -fr /tmp/ssd_mobilenet_v1_coco_2018_01_28.tar.gz  /tmp/ssd_mobilenet_v1_coco_2018_01_28
$ cd -

Update the configs: source1_primary_detector.txt (3.7 KB), config.pbtxt (2.6 KB)

$ export DISPLAY=:0
$ cd samples/configs/deepstream-app-triton/
$ deepstream-app -c source1_primary_detector.txt

Thanks.


Hi @AastaLLL

Very interesting…

In your procedure, there is no separate step where a script converts the pure TF model to a TF-TRT model.
I didn’t know it was possible to use a pure TF model and have Triton apply the TF-TRT optimization online by adding this block, as we saw earlier:

optimization { execution_accelerators {
  gpu_execution_accelerator {
    name : "tensorrt"
...
...

Maybe I was wrong in trying to optimize the model with a script.

Let me check whether your procedure works, and I will get back to you.

Thanks!!!

Thanks @AastaLLL , it works.

Q: Are the weights/engines generated by this optimization procedure saved, or does this have to ‘warm up’ every time?

Thanks

Hi,

This follows the Triton mechanism.
You can use Triton’s model warmup (the model_warmup section of config.pbtxt) to avoid the model startup/optimization slowdown.

Thanks.

