Error when using ensemble model with deepstream-5.1: failed to get input buffer in CPU memory

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU Tesla T4
• DeepStream Version 5.1
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only) 450.119.03
• Issue Type( questions, new requirements, bugs) questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

I am trying to implement an ensemble model consisting of three models run in sequence: a Face Detection model, a Python Backend model that preprocesses the image according to the detected bounding boxes, and a Multi-Label Classifier model.

This ensemble model runs fine in the tritonserver:21.06-py3 container with a client script that I run from the tritonserver:21.06-py3-sdk container.
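
The client script is a plain tritonclient call against the ensemble, roughly like the sketch below (the tensor names, shape, and endpoint are placeholders, not my actual script):

import numpy as np
import tritonclient.grpc as grpcclient

# Connect to Triton's gRPC endpoint (default port 8001)
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Placeholder input; the real name/shape/dtype come from the ensemble config
image = np.zeros((1, 3, 416, 416), dtype=np.float32)
inp = grpcclient.InferInput("INPUT", list(image.shape), "FP32")
inp.set_data_from_numpy(image)

result = client.infer(model_name="ENSEMBLE_MODEL", inputs=[inp])
labels = result.as_numpy("LABELS")  # placeholder output name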

But on deploying it in DeepStream I get this error:
ERROR: infer_trtis_server.cpp:258 Triton: TritonServer response error received., triton_err_str:Unsupported, err_msg:in ensemble 'ENSEMBLE_MODEL', failed to get input buffer in CPU memory

The complete error log:

Creating Pipeline 
 
Creating streamux 
 
inputs: ['file:///opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/DEEPSTREAM_RGB_INGESTION_PRIMARY_INFERENCE/SAMPLE_RGB_STREAM/1.avi']
Creating source_bin  0  
 
Creating source bin
source-bin-00
Creating Pgie 
 
Creating nvvidconv 
 
Creating nvvidconv2 
 
Creating nvosd 
 
Creating capsfilter 

WARNING: Overriding infer-config batch-size 0  with number of sources  1  

Creating mp4 Encoder 

Creating mp4 Container 

Creating Filesink 

Adding elements to Pipeline 


(python3:3240): GStreamer-WARNING **: 16:23:49.420: Name 'queue6' is not unique in bin 'pipeline0', not adding
Linking elements in the Pipeline 

In the probe
Now playing...
Number of Input streams: 1
0 :  file:///opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/DEEPSTREAM_RGB_INGESTION_PRIMARY_INFERENCE/SAMPLE_RGB_STREAM/1.avi
Starting pipeline 

2021-06-29 16:23:52.625964: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-29 16:23:53.160714: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2499995000 Hz
2021-06-29 16:23:53.161177: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f29049caa70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-29 16:23:53.161210: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-29 16:23:53.161323: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-06-29 16:23:53.161453: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.161984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:1e.0
2021-06-29 16:23:53.162017: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-29 16:23:53.162084: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-06-29 16:23:53.162112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-06-29 16:23:53.162135: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-06-29 16:23:53.162167: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2021-06-29 16:23:53.162194: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-06-29 16:23:53.162222: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-06-29 16:23:53.162327: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.162856: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.163321: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2021-06-29 16:23:53.163463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 16:23:53.163484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0 
2021-06-29 16:23:53.163494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N 
2021-06-29 16:23:53.163617: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.164180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.164772: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.165289: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9952 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
2021-06-29 16:23:53.166889: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f29063c76f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-29 16:23:53.166918: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2021-06-29 16:23:53.167432: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.167976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:1e.0
2021-06-29 16:23:53.168013: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-29 16:23:53.168033: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-06-29 16:23:53.168050: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-06-29 16:23:53.168067: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-06-29 16:23:53.168079: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2021-06-29 16:23:53.168095: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-06-29 16:23:53.168119: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-06-29 16:23:53.168198: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.168744: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.169216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2021-06-29 16:23:53.169246: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-29 16:23:53.169262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0 
2021-06-29 16:23:53.169272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N 
2021-06-29 16:23:53.169380: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.169946: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-29 16:23:53.170454: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9952 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
INFO: infer_trtis_backend.cpp:206 TrtISBackend id:1 initialized model: ENSEMBLE_MODEL
2021-06-29 16:23:54.340138: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-06-29 16:23:56.259073: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
ERROR: infer_trtis_server.cpp:258 Triton: TritonServer response error received., triton_err_str:Unsupported, err_msg:in ensemble 'ENSEMBLE_MODEL', failed to get input buffer in CPU memory
ERROR: infer_trtis_backend.cpp:586 TRTIS server failed to parse response with request-id:0 model:
ERROR: infer_trtis_backend.cpp:341 failed to specify dims after running inference failed on model:ENSEMBLE_MODEL, nvinfer error:NVDSINFER_TRTIS_ERROR
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
Aborted (core dumped)

Is there any solution to this error? I will be glad to send more information if needed.

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Can you provide your Triton config file for this model?

Hi, thanks for the reply. Following are the Triton config files I am using; the filenames have been changed (from config.pbtxt) to tell them apart easily.

config_detectionprocessing.pbtxt (755 Bytes)
config_classifier.pbtxt (399 Bytes)
config_facedetect.pbtxt (740 Bytes)
config_ensemblemodel.pbtxt (1.7 KB)
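
In case it helps, the ensemble config follows the standard Triton ensemble_scheduling layout. A minimal sketch of that shape (all model names, tensor names, and dims below are placeholders, not the actual attached files):

name: "ENSEMBLE_MODEL"
platform: "ensemble"
max_batch_size: 1
input [
  { name: "INPUT", data_type: TYPE_FP32, dims: [ 3, 416, 416 ] }
]
output [
  { name: "LABELS", data_type: TYPE_FP32, dims: [ 10 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "facedetect"
      model_version: -1
      input_map { key: "detection_input" value: "INPUT" }
      output_map { key: "detection_boxes" value: "bboxes" }
    },
    {
      model_name: "detectionprocessing"
      model_version: -1
      input_map { key: "image" value: "INPUT" }
      input_map { key: "boxes" value: "bboxes" }
      output_map { key: "preprocessed" value: "classifier_input" }
    },
    {
      model_name: "classifier"
      model_version: -1
      input_map { key: "input" value: "classifier_input" }
      output_map { key: "output" value: "LABELS" }
    }
  ]
}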

Can you provide the DeepStream nvinferserver config file too?

Here is my config for nvinferserver:
pgie_config_ensemble_model.txt (661 Bytes)
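
For reference, a DeepStream 5.1 nvinferserver config for a Triton ensemble generally takes the following shape (the repo path, batch size, and preprocessing values below are placeholders, not my actual file):

infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    trt_is {
      model_name: "ENSEMBLE_MODEL"
      version: -1
      model_repo {
        root: "./triton_model_repo"
        log_level: 2
        strict_model_config: true
      }
    }
  }
  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
    normalize {
      scale_factor: 1.0
      channel_offsets: [0, 0, 0]
    }
  }
  postprocess {
    other {}
  }
  extra {
    copy_input_to_host_buffers: false
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}

If I understand the docs correctly, extra.copy_input_to_host_buffers only affects the buffer handed to the first model; the tensors passed between ensemble steps are managed inside Triton itself.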

A similar issue is posted here -

Also, in the model.py script that I have as the Python backend, I haven't done any data transfer between GPU memory and CPU memory or vice versa. But if the issue itself is in this regard, how do I do this transfer?
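
For reference, the model.py follows the standard TritonPythonModel structure, roughly like the sketch below (the tensor names are placeholders). The as_numpy() calls expect the input tensors to already be in CPU memory, which appears to be exactly where the ensemble fails:

import numpy as np
# triton_python_backend_utils is provided by the Triton Python backend runtime
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # as_numpy() expects the tensor to be in CPU memory; the error
            # "failed to get input buffer in CPU memory" suggests the
            # ensemble cannot provide it there.
            image = pb_utils.get_input_tensor_by_name(request, "image").as_numpy()
            boxes = pb_utils.get_input_tensor_by_name(request, "boxes").as_numpy()

            # Placeholder preprocessing: the real code crops and resizes
            # the image according to the detected boxes.
            crops = image.astype(np.float32)

            out = pb_utils.Tensor("preprocessed", crops)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses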

According to python backend not support TRITONSERVER_MEMORY_GPU · Issue #2369 · triton-inference-server/server · GitHub, "failed to get input buffer in CPU memory" is a bug in Triton that was fixed in Triton 21.02.
DS 6.0 EA uses Triton 21.02; could you apply for DS 6.0 EA at https://developer.nvidia.com/deepstream-sdk#faq and try it on DS 6.0 EA?

