What is the meaning of the "use_device_mem" variable in the DeepStream sample deepstream-infer-tensor-meta-test?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 5.0.0
• TensorRT Version: 7.0

Hi,
I'm reading the sample code of deepstream-infer-tensor-meta-test because I need a way to do post-processing for one of my models, whose output type is none of classifier, detector, or segmentation. So I think doing the post-processing in a pad probe might be a good idea. But the "use_device_mem" variable in the sample code confuses me.
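For reference, here is the part of the probe I mean (condensed from pgie_pad_buffer_probe() in the sample; exact lines may differ slightly between DeepStream versions):

    static guint use_device_mem = 0;
    /* ... inside the loop over the frame's user meta ... */
    NvDsInferTensorMeta *meta =
        (NvDsInferTensorMeta *) user_meta->user_meta_data;
    for (unsigned int i = 0; i < meta->num_output_layers; i++) {
      NvDsInferLayerInfo *info = &meta->output_layers_info[i];
      info->buffer = meta->out_buf_ptrs_host[i];
      if (use_device_mem && meta->out_buf_ptrs_dev[i]) {
        /* Overwrite the host buffer with a fresh copy from the device
           (4 bytes per element, assuming FP32 output). */
        cudaMemcpy (meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
            info->inferDims.numElements * 4, cudaMemcpyDeviceToHost);
      }
    }
    /* ... */
    /* Alternate between host and device memory on every call. */
    use_device_mem = 1 - use_device_mem;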


Is the output of the model's output layers in device memory or in host memory?
Either way, why is a variable used to decide whether to call cudaMemcpy?

Thank you!

Yeah, it's just a demo; actually, you can access both device and host memory.
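Both copies are exposed on the tensor meta (field names as in gstnvdsinfer.h):

    NvDsInferTensorMeta *meta = (NvDsInferTensorMeta *) user_meta->user_meta_data;
    void *dev  = meta->out_buf_ptrs_dev[i];   /* device copy of output layer i */
    void *host = meta->out_buf_ptrs_host[i];  /* host copy of output layer i   */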

Hi binCao,
Thanks for the reply!
I already know that I can access the device/host memory through NvDsInferTensorMeta. BUT I want to know the LOCATION of the model's inference result in this NvDsInferTensorMeta: if it is on the host, why does the demo need the cudaMemcpy at all? If it is not, why isn't cudaMemcpy called on every invocation of the probe?

Thanks.

Good question. Actually, the inference results are in both host and device memory (a minimal access sketch follows the list below):

  1. DS uses TensorRT to do the inference, and the raw inference result is in device memory.
  2. DS then copies the result from device to host; you can refer to nvdsinfer_context_impl.cpp → NvDsInferContextImpl::queueInputBatch().
  3. For how NvDsInferTensorMeta is populated, you can refer to gstnvinfer_meta_utils.cpp → attach_tensor_output_meta().
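So a minimal sketch of a probe doing custom post-processing straight from the host copy, with no cudaMemcpy, could look like this (assuming output-tensor-meta=1 with frame-level tensor meta, FP32 output layers, and omitting error handling):

    #include <gst/gst.h>
    #include "gstnvdsmeta.h"
    #include "gstnvdsinfer.h"

    static GstPadProbeReturn
    pgie_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
    {
      NvDsBatchMeta *batch_meta =
          gst_buffer_get_nvds_batch_meta (GST_BUFFER (info->data));

      for (NvDsMetaList * l_frame = batch_meta->frame_meta_list; l_frame;
          l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;

        for (NvDsMetaList * l_user = frame_meta->frame_user_meta_list; l_user;
            l_user = l_user->next) {
          NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
          if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
            continue;

          NvDsInferTensorMeta *meta =
              (NvDsInferTensorMeta *) user_meta->user_meta_data;
          for (unsigned int i = 0; i < meta->num_output_layers; i++) {
            NvDsInferLayerInfo *layer = &meta->output_layers_info[i];
            /* DS has already copied this layer to host memory. */
            float *data = (float *) meta->out_buf_ptrs_host[i];
            /* Run your custom post-processing on `data` here;
               layer->inferDims describes its shape. */
            (void) layer;
            (void) data;
          }
        }
      }
      return GST_PAD_PROBE_OK;
    }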

OK, thanks. So for normal use we don't need to do the cudaMemcpy ourselves, right?

Right.
