Preprocess configured gpu-id doesn't match gpu-id for element

Docker image:
FROM nvcr.io/nvidia/deepstream:6.1.1-devel

Hardware setup:
2x RTX 2080 Ti

I am using the sample app deepstream-transfer-learning with the preprocessor configured.
If I run everything with gpu-id=0, it works perfectly.

If I run with gpu-id=1, nvinfer builds the model successfully:
0:00:39.666998731 716 0x55e96d7dbd60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt_b24_gpu1_int8.engine successfully

The pipeline appears to build correctly:
**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg) FPS 4 (Avg) FPS 5 (Avg) FPS 6 (Avg) FPS 7 (Avg) FPS 8 (Avg) FPS 9 (Avg) FPS 10 (Avg) FPS 11 (Avg)
**PERF: 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00) 0.00 (0.00)
** INFO: <bus_callback:194>: Pipeline ready

Then it fails with:
ERROR from preprocess0: Memory Compatibility Error:Input surface gpu-id doesnt match with configured gpu-id for element, please allocate input using unified memory, or use same gpu-ids OR, if same gpu-ids are used ensure appropriate Cuda memories are used
Debug info: gstnvdspreprocess.cpp(1248): gst_nvdspreprocess_on_frame (): /GstPipeline:pipeline/GstBin:preprocess_bin/GstNvDsPreProcess:preprocess0:
surface-gpu-id=1,preprocess0-gpu-id=0
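
For anyone reading the debug line above: it reports two numbers, the GPU the incoming buffers were allocated on and the GPU the element itself is set to. Below is a minimal sketch for pulling both out of a log. The helper name parse_gpu_mismatch is made up for illustration and is not part of DeepStream.

```python
import re

# Matches the plugin's debug detail line, e.g.
# "surface-gpu-id=1,preprocess0-gpu-id=0"
MISMATCH_RE = re.compile(r"surface-gpu-id=(\d+),(\w+)-gpu-id=(\d+)")


def parse_gpu_mismatch(line):
    """Return (surface_gpu, element_name, element_gpu), or None if no match."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))


result = parse_gpu_mismatch("surface-gpu-id=1,preprocess0-gpu-id=0")
if result is not None:
    surface_gpu, element, element_gpu = result
    print(f"buffers arrive on GPU {surface_gpu}, "
          f"but {element} is configured for GPU {element_gpu}")
```

So upstream elements are producing buffers on GPU 1 as configured, and it is only the preprocess0 element that is still at GPU 0.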

Here is my overall app config:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=3
columns=4
width=2560
height=1440
gpu-id=1
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
#type=3
#uri=file:///opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h264.mp4
type=4
uri=rtsp://admin:sT1nkeye@10.0.0.20:554//h264Preview_01_main
num-sources=8
#drop-frame-interval=2
gpu-id=1
#(0): memtype_device - Memory type Device
#(1): memtype_pinned - Memory type Host Pinned
#(2): memtype_unified - Memory type Unified
cudadec-memtype=0

[source1]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source2]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source3]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source4]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source5]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source6]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source7]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source8]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source9]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source10]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[source11]
enable=1
type=4
uri=rtsp://
num-sources=8
#drop-frame-interval=2
gpu-id=1
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=1
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=1
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=24
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
##Set muxer output width and height
width=2560
height=1920
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
##If set to TRUE, system timestamp will be attached as ntp timestamp
##If set to FALSE, ntp timestamp from rtspsrc, if available, will be attached
#attach-sys-ts-as-ntp=1

[pre-process]
enable=1
config-file=G1_config_preprocess.txt

#config-file property is mandatory for any gie section.
#Other properties are optional and if set will override the properties set in
#the infer config file.
[primary-gie]
enable=1
gpu-id=1
#model-engine-file=/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
#model-engine=/project/deepstream/nvidia/tao/tao-experiments/detectnet_v2/experiment_dir_final/resnet18_detector_qat.trt.int8
batch-size=24
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
#config-file=/project/deepstream/prod_model/dn2_rs18/config_infer_primary_detectnet_v2.txt
#config-file=/project/deepstream/nvidia/tao/tao-experiments/detectnet_v2/experiment_dir_final/config_infer_primary.yml
config-file=/project/deepstream/prod_model/dn2_rs18/G1_config_infer_primary.yml
input-tensor-meta=1

[tracker]
enable=1
#For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=/opt/nvidia/deepstream/deepstream-6.1/samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
gpu-id=1
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1

[tests]
file-loop=0

[img-save]
enable=1
output-folder-path=./output
save-img-cropped-obj=0
save-img-full-frame=1
frame-to-skip-rules-path=./configs/capture_time_rules.csv
second-to-skip-interval=03
min-confidence=0.4
max-confidence=1.0
min-box-width=5
min-box-height=5

I verified gpu-id=1 throughout.

Here is my preprocess config:

[property]
enable=1
gpu-id=1
target-unique-ids=1
#0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=0
#if enabled maintain the aspect ratio while scaling
maintain-aspect-ratio=1
#if enabled pad symmetrically with maintain-aspect-ratio enabled
symmetric-padding=1
#processing width/height at which image scaled
processing-width=1248
processing-height=384
scaling-buf-pool-size=6
tensor-buf-pool-size=6
#tensor shape based on network-input-order
network-input-shape= 24;3;384;1248
#0=RGB, 1=BGR, 2=GRAY
network-color-format=0
#0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input_1
#0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE 3=NVBUF_MEM_CUDA_UNIFIED
scaling-pool-memory-type=0
#scaling-pool-memory-type=3
#0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU 2=NvBufSurfTransformCompute_VIC
scaling-pool-compute-hw=0
#scaling-pool-compute-hw=1
#Scaling Interpolation method
#0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
#3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
#6=NvBufSurfTransformInter_Default
scaling-filter=0
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/gst-plugins/libcustom2d_preprocess.so
custom-tensor-preparation-function=CustomTensorPreparation

[user-configs]
pixel-normalization-factor=0.003921568
#mean-file=
#offsets=

[group-0]
gpu-id=1
src-ids=0;1;2;3;4;5;6;7;8;9;10;11
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
roi-params-src-0=40;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-1=1200;1000;1248;384;0;900;2496;768
roi-params-src-2=0;200;2496;768
roi-params-src-3=0;000;2496;768;0;750;2496;768
roi-params-src-4=600;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-5=0;100;2496;768;0;900;2496;768
roi-params-src-6=0;400;2496;768
roi-params-src-7=50;100;2496;768
roi-params-src-8=100;400;1248;384;680;800;1248;384
roi-params-src-9=0;600;2496;768
roi-params-src-10=0;700;1248;384;0;850;2496;768
roi-params-src-11=0;0;2496;768;0;700;2496;768

Where is this preprocess0-gpu-id=0 coming from?
Debug info: gstnvdspreprocess.cpp(1248): gst_nvdspreprocess_on_frame (): /GstPipeline:pipeline/GstBin:preprocess_bin/GstNvDsPreProcess:preprocess0:
surface-gpu-id=1,preprocess0-gpu-id=0

How many RTX 2080 Ti GPUs are in your system?

Two RTX 2080 Ti GPUs:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:0B:00.0  On |                  N/A |
|  0%   32C    P8    29W / 250W |    551MiB / 11264MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:0C:00.0 Off |                  N/A |
|  0%   32C    P8     1W / 260W |     10MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I triple-checked my configs, both visually and with grep.
I have gpu-id=1 throughout.
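
A plain grep for gpu-id=1 only shows the sections where the key is present; it does not show sections that have no gpu-id key at all. Here is a rough sketch of a per-section audit using Python's standard configparser. The function name audit_gpu_ids is hypothetical, and the sample text is a shortened excerpt of the app config posted above. Some sections (for example [application] or [tests]) will legitimately show up as missing, so treat the output as a list of places to look, not as a list of errors.

```python
import configparser


def audit_gpu_ids(config_text, expected_gpu):
    """Return (all_ids, problems).

    all_ids maps every section name to its gpu-id (None if the key is absent).
    problems keeps only sections whose gpu-id is missing or differs from
    expected_gpu.
    """
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    all_ids = {}
    for section in parser.sections():
        raw = parser.get(section, "gpu-id", fallback=None)
        all_ids[section] = int(raw) if raw is not None else None
    problems = {s: g for s, g in all_ids.items() if g != expected_gpu}
    return all_ids, problems


# Shortened excerpt of the app config from this thread.
SAMPLE = """
[streammux]
gpu-id=1
batch-size=24

[pre-process]
enable=1
config-file=G1_config_preprocess.txt

[primary-gie]
enable=1
gpu-id=1
"""

all_ids, problems = audit_gpu_ids(SAMPLE, expected_gpu=1)
for section, gpu in problems.items():
    print(f"[{section}] gpu-id = {gpu}")
```

On this excerpt the only section flagged is [pre-process], which carries just enable and config-file in the app config. Whether the sample app forwards the gpu-id from the preprocess config file to the element is not established by this check.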

My run command is:

./deepstream-transfer-learning-app -c ./G1_configs/G1_jmd_xfer_learn_preproc_DNet2_RN18.txt

So I checked:
./G1_configs/G1_jmd_xfer_learn_preproc_DNet2_RN18.txt
(that is the app config shown above)
and that config points to G1_config_preprocess.txt, which is also shown above.

I want to run one DeepStream DAG on GPU 0 and another one independently on GPU 1.
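
One possible way to get two independent pipelines, one per GPU, without changing gpu-id in the configs would be to start each app process with CUDA_VISIBLE_DEVICES restricted to a single card. CUDA then exposes that card as device 0 inside the process, so a config that already works with gpu-id=0 could in principle be reused for both. This is a different technique from setting gpu-id=1, and it has not been tested with this app in this thread. The helper name build_launch and the config paths are hypothetical.

```python
import os
import subprocess


def build_launch(app, config_path, physical_gpu, base_env=None):
    """Return (argv, env) for one pipeline process pinned to one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # With a single visible device, CUDA renumbers it as device 0
    # inside the process, so the config would use gpu-id=0 throughout.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu)
    return [app, "-c", config_path], env


def launch(app, config_path, physical_gpu):
    """Start the app as a child process restricted to one GPU."""
    argv, env = build_launch(app, config_path, physical_gpu)
    return subprocess.Popen(argv, env=env)


# Dry run: show what would be started, without launching anything.
# Both config paths are placeholders, not files from this thread.
for gpu, cfg in [(0, "./G0_configs/app.txt"), (1, "./G1_configs/app.txt")]:
    argv, env = build_launch("./deepstream-transfer-learning-app", cfg, gpu)
    print(f"GPU {gpu}: CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} "
          f"{' '.join(argv)}")
```

Calling launch() for each entry would start the two processes. Each process sees only its own card, so the gpu-id mismatch check cannot trigger across devices.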

The [group-N] section does not support gpu-id; please refer to the Gst-nvdspreprocess (Alpha) page in the DeepStream 6.1.1 release documentation.
Please set the gpu-id of nvdspreprocess to 1.

I took gpu-id out of the [group-0] section (no change).
I also tried putting it in [user-configs] (no change).
Here is the current preprocess config. I still get the mismatch error:

[property]
enable=1
gpu-id=1
target-unique-ids=1
#- 0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=0
#- if enabled maintain the aspect ratio while scaling
maintain-aspect-ratio=1
#- if enabled pad symmetrically with maintain-aspect-ratio enabled
symmetric-padding=1
#- processing width/height at which image scaled
processing-width=1248
processing-height=384
scaling-buf-pool-size=6
tensor-buf-pool-size=6
#- tensor shape based on network-input-order
network-input-shape= 24;3;384;1248
#- 0=RGB, 1=BGR, 2=GRAY
network-color-format=0
#- 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input_1
#- 0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE 3=NVBUF_MEM_CUDA_UNIFIED
#- scaling-pool-memory-type=0
scaling-pool-memory-type=1
#- scaling-pool-memory-type=2
#- scaling-pool-memory-type=3
#- 0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU 2=NvBufSurfTransformCompute_VIC
scaling-pool-compute-hw=0
#- scaling-pool-compute-hw=1
#- Scaling Interpolation method
#- 0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
#- 3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
#- 6=NvBufSurfTransformInter_Default
scaling-filter=0
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/gst-plugins/libcustom2d_preprocess.so
custom-tensor-preparation-function=CustomTensorPreparation

[user-configs]
pixel-normalization-factor=0.003921568
#mean-file=
#offsets=

[group-0]
src-ids=0;1;2;3;4;5;6;7;8;9;10;11
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
roi-params-src-0=40;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-1=1200;1000;1248;384;0;900;2496;768
roi-params-src-2=0;200;2496;768
roi-params-src-3=0;000;2496;768;0;750;2496;768
roi-params-src-4=600;125;1248;384;0;100;2496;768;0;900;2496;768
roi-params-src-5=0;100;2496;768;0;900;2496;768
roi-params-src-6=0;400;2496;768
roi-params-src-7=50;100;2496;768
roi-params-src-8=100;400;1248;384;680;800;1248;384
roi-params-src-9=0;600;2496;768
roi-params-src-10=0;700;1248;384;0;850;2496;768
roi-params-src-11=0;0;2496;768;0;700;2496;768

ERROR from preprocess0: Memory Compatibility Error:Input surface gpu-id doesnt match with configured gpu-id for element, please allocate input using unified memory, or use same gpu-ids OR, if same gpu-ids are used ensure appropriate Cuda memories are used
Debug info: gstnvdspreprocess.cpp(1248): gst_nvdspreprocess_on_frame (): /GstPipeline:pipeline/GstBin:preprocess_bin/GstNvDsPreProcess:preprocess0:
surface-gpu-id=1,preprocess0-gpu-id=0

When I use the same config files with gpu-id=0, it runs on GPU 0 as expected.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic.
If you need further support, please open a new one.
Thanks

  1. Please set the preprocess element's gpu-id in the app code this way:
    g_object_set (G_OBJECT (preprocess), "gpu-id", 1, NULL);
  2. Please try cudadec-memtype=2 (memtype_unified) in the [sourceN] sections.