Docker image:
FROM nvcr.io/nvidia/deepstream:6.1.1-devel
Hardware setup:
2x RTX 2080 Ti
I'm using the sample app deepstream-transfer-learning with the preprocessor configured.
If I run everything with gpu-id=0, everything works perfectly.
If I run with gpu-id=1, nvinfer builds the model successfully:
0:00:39.666998731 716 0x55e96d7dbd60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt_b24_gpu1_int8.engine successfully
The pipeline appears to build correctly:
**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg) FPS 4 (Avg) FPS 5 (Avg) FPS 6 (Avg) FPS 7 (Avg) FPS 8 (Avg) FPS 9 (Avg) FPS 10 (Avg) FPS 11 (Avg)
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
** INFO: <bus_callback:194>: Pipeline ready
Then I get this error:
ERROR from preprocess0: Memory Compatibility Error:Input surface gpu-id doesnt match with configured gpu-id for element, please allocate input using unified memory, or use same gpu-ids OR, if same gpu-ids are used ensure appropriate Cuda memories are used
Debug info: gstnvdspreprocess.cpp(1248): gst_nvdspreprocess_on_frame (): /GstPipeline:pipeline/GstBin:preprocess_bin/GstNvDsPreProcess:preprocess0:
surface-gpu-id=1,preprocess0-gpu-id=0
Here is my overall config:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl
[tiled-display]
enable=1
rows=3
columns=4
width=2560
height=1440
gpu-id=1
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
#type=3
#uri=file:///opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h264.mp4
type=4
uri=rtsp://admin:sT1nkeye@10.0.0.20:554//h264Preview_01_main
num-sources=8
#drop-frame-interval=2
gpu-id=1
#(0): memtype_device - Memory type Device
#(1): memtype_pinned - Memory type Host Pinned
#(2): memtype_unified - Memory type Unified
cudadec-memtype=0
[source1]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source2]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source3]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source4]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source5]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source6]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source7]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source8]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source9]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source10]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[source11]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=1
nvbuf-memory-type=0
[osd]
enable=1
gpu-id=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0
[streammux]
gpu-id=1
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=24
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
##Set muxer output width and height
width=2560
height=1920
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
##If set to TRUE, system timestamp will be attached as ntp timestamp
##If set to FALSE, ntp timestamp from rtspsrc, if available, will be attached
#attach-sys-ts-as-ntp=1
[pre-process]
enable=1
config-file=G1_config_preprocess.txt
#config-file property is mandatory for any gie section.
#Other properties are optional and if set will override the properties set in
#the infer config file.
[primary-gie]
enable=1
gpu-id=1
#model-engine-file=/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
#model-engine=/project/deepstream/nvidia/tao/tao-experiments/detectnet_v2/experiment_dir_final/resnet18_detector_qat.trt.int8
batch-size=24
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
#config-file=/project/deepstream/prod_model/dn2_rs18/config_infer_primary_detectnet_v2.txt
#config-file=/project/deepstream/nvidia/tao/tao-experiments/detectnet_v2/experiment_dir_final/config_infer_primary.yml
config-file=/project/deepstream/prod_model/dn2_rs18/G1_config_infer_primary.yml
input-tensor-meta=1
[tracker]
enable=1
#For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=/opt/nvidia/deepstream/deepstream-6.1/samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
gpu-id=1
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1
[tests]
file-loop=0
[img-save]
enable=1
output-folder-path=./output
save-img-cropped-obj=0
save-img-full-frame=1
frame-to-skip-rules-path=./configs/capture_time_rules.csv
second-to-skip-interval=03
min-confidence=0.4
max-confidence=1.0
min-box-width=5
min-box-height=5
I verified that gpu-id=1 is set in every section of this config that has a gpu-id key.
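For what it's worth, this is roughly how that check can be scripted. It is only a minimal sketch in Python, assuming the config parses as INI with the standard configparser module. The helper name gpu_ids_by_section is my own, and the sample string is a trimmed excerpt of the config above, not the full file. It prints the gpu-id for each section and marks sections that do not set one.

```python
import configparser


def gpu_ids_by_section(config_text):
    """Return {section: gpu-id string, or None if the key is absent}."""
    # strict=False tolerates repeated keys; interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(config_text)
    return {name: parser[name].get("gpu-id") for name in parser.sections()}


# Trimmed excerpt of the app config above (assumption: same INI layout).
sample = """
[streammux]
gpu-id=1
batch-size=24
[pre-process]
enable=1
config-file=G1_config_preprocess.txt
[primary-gie]
enable=1
gpu-id=1
"""

report = gpu_ids_by_section(sample)
for section, gpu_id in report.items():
    shown = gpu_id if gpu_id is not None else "(not set)"
    print(f"{section}: gpu-id={shown}")
```

On this excerpt the [pre-process] section is reported as not set, because in the app config it only points at the separate preprocess config file, which carries its own gpu-id.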
Here is my preprocess config:
[property]
enable=1
gpu-id=1
target-unique-ids=1
#0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=0
#if enabled maintain the aspect ratio while scaling
maintain-aspect-ratio=1
#if enabled pad symmetrically with maintain-aspect-ratio enabled
symmetric-padding=1
#processing width/height at which image scaled
processing-width=1248
processing-height=384
scaling-buf-pool-size=6
tensor-buf-pool-size=6
#tensor shape based on network-input-order
network-input-shape= 24;3;384;1248
#0=RGB, 1=BGR, 2=GRAY
network-color-format=0
#0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input_1
#0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE 3=NVBUF_MEM_CUDA_UNIFIED
scaling-pool-memory-type=0
#scaling-pool-memory-type=3
#0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU 2=NvBufSurfTransformCompute_VIC
scaling-pool-compute-hw=0
#scaling-pool-compute-hw=1
#Scaling Interpolation method
#0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
#3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
#6=NvBufSurfTransformInter_Default
scaling-filter=0
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/gst-plugins/libcustom2d_preprocess.so
custom-tensor-preparation-function=CustomTensorPreparation
[user-configs]
pixel-normalization-factor=0.003921568
#mean-file=
#offsets=
[group-0]
gpu-id=1
src-ids=0;1;2;3;4;5;6;7;8;9;10;11
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
roi-params-src-0=40;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-1=1200;1000;1248;384;0;900;2496;768
roi-params-src-2=0;200;2496;768
roi-params-src-3=0;000;2496;768;0;750;2496;768
roi-params-src-4=600;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-5=0;100;2496;768;0;900;2496;768
roi-params-src-6=0;400;2496;768
roi-params-src-7=50;100;2496;768
roi-params-src-8=100;400;1248;384;680;800;1248;384
roi-params-src-9=0;600;2496;768
roi-params-src-10=0;700;1248;384;0;850;2496;768
roi-params-src-11=0;0;2496;768;0;700;2496;768
Where is this preprocess0-gpu-id=0 coming from?
Debug info: gstnvdspreprocess.cpp(1248): gst_nvdspreprocess_on_frame (): /GstPipeline:pipeline/GstBin:preprocess_bin/GstNvDsPreProcess:preprocess0:
surface-gpu-id=1,preprocess0-gpu-id=0