Cannot get tensor metadata from the deepstream_infer_tensor_meta.cpp example

• Hardware Platform (Jetson / GPU) - Jetson Nano
• DeepStream Version - 5.0
• JetPack Version (valid for Jetson only) - 4.4
• TensorRT Version - 7+

Hi,
I am using the deepstream_infer_tensor_meta.cpp example with FaceNet. I am able to start the app with YOLO face as the PGIE and FaceNet as SGIE1. The output tensor of the FaceNet model has shape 128, and I am able to get this tensor in the Python test app 2 with the following lines:

ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
probs = np.array(np.ctypeslib.as_array(ptr, shape=(layer.dims.numElements,)), copy=True)

The shape of the output is 128. If I try to do the same in C++, I am not able to fetch the tensor; all I am able to get is the address of the object in memory.

    NvDsInferTensorMeta *meta =
        (NvDsInferTensorMeta *) user_meta->user_meta_data;
    // std::cout << "size " << meta->num_output_layers;
    for (unsigned int i = 0; i < meta->num_output_layers; i++) {
      NvDsInferLayerInfo *info = &meta->output_layers_info[i];
      info->buffer = meta->out_buf_ptrs_host[i];
      int len = sizeof(meta->out_buf_ptrs_host) / sizeof(meta->out_buf_ptrs_host[0]);

I tried to check the length of meta->out_buf_ptrs_host this way, and it is 1. I am not sure how I can get this 128-d tensor.

When I print std::cout << info->inferDims.numElements << std::endl; I get 128.

Can someone please address this issue? @Fiona.Chen

Hi,

meta->out_buf_ptrs_host contains one pointer per output buffer.

For example, a detection network has two output layers, bbox and coverage.
So their buffer pointers are located at meta->out_buf_ptrs_host[0] and meta->out_buf_ptrs_host[1], respectively.

To access the values of the index=0 output buffer, please try the following:

double* ptr = (double*)meta->out_buf_ptrs_host[0];  // output layer 0
for( size_t i=0; i<info->inferDims.numElements; i++ )
{
    std::cout << "Tensor " << i << ": " << ptr[i] << std::endl;
}

Thanks.

Thanks for the response, but when I do this I get an illegal memory access error.

float (*array)[130] = (float (*)[130]) info->buffer;
for (int m = 0; m < 128; m++) {
  std::cout << " m " << m << " " << (*array)[m];
}

This worked for me. It may not be the most elegant way; help me improve it.

Thanks for the feedback.
Good to know it works now.

Is there a better way to do this? @AastaLLL

I’ll see why this isn’t working and update.

Hi,

I can get the tensor output with the following modification:

diff --git a/deepstream_infer_tensor_meta_test.cpp b/deepstream_infer_tensor_meta_test.cpp
index 1f5cc17..0620d74 100644
--- a/deepstream_infer_tensor_meta_test.cpp
+++ b/deepstream_infer_tensor_meta_test.cpp
@@ -20,11 +20,11 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include <iostream>
 #include <gst/gst.h>
 #include <glib.h>
 
 #include <math.h>
-
 #include <stdio.h>
 #include <string.h>
 #include "cuda_runtime_api.h"
@@ -231,6 +231,11 @@ pgie_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
           cudaMemcpy (meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
               info->inferDims.numElements * 4, cudaMemcpyDeviceToHost);
         }
+       double* ptr = (double*)info->buffer;
+        for( size_t i=0; i<info->inferDims.numElements; i++ )
+        {
+            std::cout << "Tensor " << i << ": " << ptr[i] << std::endl;
+        }
       }
       /* Parse output tensor and fill detection results into objectList. */
       std::vector < NvDsInferLayerInfo >

Could you check if there is any difference between our setups?
Thanks.

Hello, @AastaLLL
I tried your solution.
The output of the code below looks like this:

      double* ptr = (double*)meta->out_buf_ptrs_host[0];  // output layer 0
      std::cout << "128d tensor" << std::endl;
      for (size_t i = 0; i < info->inferDims.numElements; i++) {
        std::cout << i << ": " << ptr[i];
      }

      std::cout << "outside for" << std::endl;

      if (use_device_mem && meta->out_buf_ptrs_dev[i]) {
        cudaMemcpy (meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
            info->inferDims.numElements * 4, cudaMemcpyDeviceToHost);
      }

    }

Output:
128d tensor
0: -1.27551e-081: -2.27742e-062: 4.25616e-053: -6.19286e-174: -0.03849015: -0.01932566: 9.29061e-067: 0.0001478238: -0.01419139: 4.45965e-1010: 0.03303511: -5.76624e-1012: -0.15514513: 0.0047787214: 9.00847e-0515: 0.00016798116: 2.06846e-0617: 2.14049e-1218: -0.00023378319: 0.0012690720: -5.39608e-1221: 4.47182e-0922: -0.02730623: 3.4926e-0724: 2.76525e-0525: -1.17586e-0626: 6.50356e-0527: -0.0029565728: -0.015065529: -3.71175e-1330: 0.0002087631: 3.9161132: 9.9588e-0533: -0.00054118234: -0.010515235: 6.06898e-0736: 0.0088954837: -7.48422e-0538: -0.0007131639: -5.11344e-0940: -2.79556e-0541: 7.7189e-1042: -0.0066417443: -0.097011244: 0.040547745: 0.0027350446: -2.18553e-0747: 8.0536548: 1.15911e-0749: -2.29102e-0750: 1.97968e-0651: -9.16086e-0752: 0.14389953: -0.0020358554: -0.4963555: -0.00011560256: 0.029020657: 4.03043e-1158: -0.0023073659: 1.02279e-1160: -1.9275e-0561: -0.00017212462: -2.52416e-1363: -0.00030093664: 065: 066: 067: 068: 069: 070: 071: 072: 073: 074: 075: 076: 077: 078: 079: 080: 081: 082: 083: 084: 085: 086: 087: 088: 089: 090: 091: 092: 093: 094: 095: 096: 097: 098: 099: 0100: 0101: 0102: 0103: 0104: 0105: 0106: 0107: 0108: 0109: 0110: 0111: 0112: 0113: 0114: 0115: 0116: 0117: 0118: 0119: 0120: 0121: 0122: 0123: 0124: 0125: 0126: 0127: 0outside for

As you can see, after 63, I am getting 0 for every value.

while the output of the code below

      std::cout << "Shape " << info->inferDims.numElements << std::endl;
      std::cout << "128d Tensor" << std::endl;
      for (int m = 0; m < info->inferDims.numElements; m++) {
        std::cout << " " << (*array)[m];
      }
      std::cout << "outside for" << std::endl;

looks like this
Output:
Shape 128
128d Tensor
-0.110474 -0.198364 1.09863 -0.380859 0.859863 0.524414 -0.427002 -0.017746 1.00977 -1.2793 0.160156 -1.15332 1.76172 0.444336 -1.08203 0.637695 0.97998 -1.10156 -0.556152 0.124451 0.310791 1.25684 -0.905273 -0.128662 -0.136108 -1.5293 -0.142212 0.951172 0.455566 0.592773 -0.18335 0.648438 -0.693359 0.377441 0.291748 0.0638428 -0.383301 -0.682129 -0.988281 0.831055 0.463135 -0.0740967 0.521973 0.174805 0.438721 -1.21875 -0.222412 0.295898 1.55469 0.494141 -1.87793 -0.351318 -0.357178 0.566895 -2.17188 -0.906738 0.518066 -1.11621 -1.3418 -0.0532837 -2.03125 0.669434 0.648438 2.24023 -1.29688 0.602051 1.51855 -0.756836 1.75391 -1.04395 -0.0301819 0.321289 -0.219482 1.0166 -1.18945 -0.57666 0.0865479 -0.77832 0.515625 -0.177734 0.941406 -0.494873 0.339355 0.135254 1.43555 -0.981934 1.11523 -1.44336 -1.57715 1.28711 0.850098 0.899902 -0.598145 -0.276123 -0.412598 2.50195 -1.21875 0.249268 -0.81543 -0.278809 -0.225464 0.376221 -0.0220337 -0.341309 -0.608887 1.51855 -1.40625 -0.87793 -1.17578 -1.74805 0.89502 -0.618164 0.230957 1.23145 0.307373 0.0968018 1.27051 -0.88623 -1.33008 0.0813599 0.639648 -0.476562 0.34375 -0.650391 -0.0307617 -0.0512085 0.276855 -0.702148outside for

As you can see, I got all 128 outputs.

Any update, @AastaLLL?

Hi,

Do you run the inference in half mode?
I parse the output directly as double (8-byte), but it seems the tensor data may be stored in a smaller precision, e.g. half mode (2-byte).

But it should work for you now, is that correct?

Thanks.

Half mode as in FP16?
What do you mean by half mode? @AastaLLL
I tested both codes with the same settings.
Thanks.

I tried both FP16 and FP32 and received similar results.
Thanks.