Crowd Estimation Model giving density map as output seems to have persistent buffers

Please provide complete information as applicable to your setup.

• Hardware Platform: GPU
• DeepStream Version: 6.2.0
• TensorRT Version: 8.2.0
• NVIDIA GPU Driver Version (valid for GPU only): 525.85.12
• Issue Type (bugs): I am running a crowd estimation model (SAS-Net) as the pgie. It outputs a 360x640 buffer holding the crowd density over the whole frame. When I generate a heatmap from this output, the buffer seems to contain persistent values: it looks as if it holds values from the previous frame and is not reset for every frame. As a result, the heatmaps contain abnormal pixels that resemble the trajectory a person has followed across frames.
The model was originally trained in PyTorch; I converted it to ONNX and then to a TensorRT engine using DeepStream. I tested the PyTorch and ONNX models on the same video and saw no such issue. It appears only when I run DeepStream. Attaching images for reference:




We can’t tell what you have done from your description. Did you customize the postprocessing for your model? If so, how did you implement it?

For postprocessing, I first used this configuration for the pgie:

## 0=Detector, 1=Classifier, 2=Segmentation, 100=Other
network-type=100
# Enable tensor metadata output
output-tensor-meta=1

Then I use the following logic to process the tensor output:

/* Iterate user metadata in frames to search PGIE's tensor metadata */
    for (NvDsMetaList *l_user = frame_meta->frame_user_meta_list;
         l_user != NULL; l_user = l_user->next)
    {
      NvDsUserMeta *user_meta = (NvDsUserMeta *)l_user->data;
      if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
        continue;

      // number of elements in the output layer: equal to that in the input layer for the crowd model
      int outLayerElements = 0;
      /* convert to tensor metadata */
      NvDsInferTensorMeta *meta =
          (NvDsInferTensorMeta *)user_meta->user_meta_data;
      for (unsigned int i = 0; i < meta->num_output_layers; i++)
      {
        NvDsInferLayerInfo *info = &meta->output_layers_info[i];
        info->buffer = meta->out_buf_ptrs_host[i];
        if (use_device_mem && meta->out_buf_ptrs_dev[i])
        {
          CUDA_CHECK(cudaMemset(meta->out_buf_ptrs_host[i], 0, info->inferDims.numElements * 4));
          CUDA_CHECK(cudaMemcpy(meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
                     info->inferDims.numElements * 4, cudaMemcpyDeviceToHost));
          CUDA_CHECK(cudaMemset(meta->out_buf_ptrs_dev[i], 0, info->inferDims.numElements * 4));         
        }
        outLayerElements = info->inferDims.numElements;

        //cout<<"-------------------------------------------------------------------------------------"<<endl;
        //cout<<"Number of dimensions in the Output Layer - "<<i<<" "<<info->inferDims.numDims<<endl;
        //cout<<"Number of channnels = "<<info->inferDims.d[0]<<endl;
        //cout<<"Height ="<<info->inferDims.d[1]<<endl;
        //cout<<"Width ="<<info->inferDims.d[2]<<endl;
      }
      /* Parse output tensor and fill detection results into objectList. */
      std::vector<NvDsInferLayerInfo>
          outputLayersInfo(meta->output_layers_info,
                           meta->output_layers_info + meta->num_output_layers);


      //store the crowd model output layer buffer in a vector
      std::vector<float> crowdBuffer((float*)outputLayersInfo[0].buffer,
                                                (float*)outputLayersInfo[0].buffer + outLayerElements);
    }

I get the crowd output as a 1-D buffer in outputLayersInfo[0].buffer, which I then copy into a vector. Afterwards I reshape it from 1-D to 2-D to get the heatmap.
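For readers following along, the 1-D to 2-D step can be sketched as below. This is a minimal illustration, assuming the output is a single-channel, row-major map of 360 rows by 640 columns; the names `flatIndex` and `to2D` are hypothetical helpers, not part of DeepStream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: one channel, row-major, height x width (360 x 640 here).
constexpr std::size_t kMapHeight = 360;  // rows
constexpr std::size_t kMapWidth = 640;   // columns

// Flat offset of element (row, col) in a row-major map with the given width.
inline std::size_t flatIndex(std::size_t row, std::size_t col,
                             std::size_t width = kMapWidth) {
  return row * width + col;
}

// Copy a flat row-major buffer into a vector of rows.
std::vector<std::vector<float>> to2D(const std::vector<float>& flat,
                                     std::size_t height, std::size_t width) {
  assert(flat.size() == height * width);
  std::vector<std::vector<float>> map(height, std::vector<float>(width, 0.0f));
  for (std::size_t row = 0; row < height; ++row) {
    for (std::size_t col = 0; col < width; ++col) {
      map[row][col] = flat[flatIndex(row, col, width)];
    }
  }
  return map;
}
```

With this layout, the first element of the second row sits at flat offset 640, so a loop that walks rows must multiply the row index by the width (640), not the height.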

How did you find out there are persistent values in the buffer?

I dumped the heatmaps and noticed that the glitches in them followed the trajectory of the moving person’s head. I therefore concluded that the buffer I copied holds persistent values.

std::vector<float> crowdBuffer((float*)outputLayersInfo[0].buffer,
                                                (float*)outputLayersInfo[0].buffer + outLayerElements);
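As a cross-check that does not depend on the heatmap rendering, the copied vector itself can be summarized per frame and compared with the same statistics from the ONNX run. The sketch below is only a suggestion; `BufferStats` and `summarize` are hypothetical names. If the model follows the usual density-map convention, the sum approximates the crowd count, but that is an assumption about this model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame summary of the raw output tensor.
struct BufferStats {
  float minValue;
  float maxValue;
  double sum;                 // accumulated in double to limit rounding error
  std::size_t positiveCount;  // number of elements strictly greater than zero
};

// Compute min, max, sum and positive-element count of a non-empty buffer.
BufferStats summarize(const std::vector<float>& buf) {
  assert(!buf.empty());
  BufferStats stats{buf[0], buf[0], 0.0, 0};
  for (float v : buf) {
    stats.minValue = std::min(stats.minValue, v);
    stats.maxValue = std::max(stats.maxValue, v);
    stats.sum += v;
    if (v > 0.0f) ++stats.positiveCount;
  }
  return stats;
}
```

If these numbers match the ONNX run frame by frame while the dumped images still show trails, the stale values are more likely introduced after the copy than inside the tensor.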

Why did you use cudaMemset() to clean “meta->out_buf_ptrs_dev[i]” here? Please remove it.

Actually, I used it to clear the device (GPU) buffers after the cudaMemcpy() so that they are reset to 0. I tried removing the cudaMemset(), but the issue is still there.

Please do not do this. “meta->out_buf_ptrs_dev[i]” is the address of the device memory, not the memory itself. gst-nvinfer handles the device memory internally. Please do not do anything to it.

Okay…got your point. So the issue of persistent buffers in this case is a bug in gst-nvinfer?

No. “meta->out_buf_ptrs_dev[i]” is managed inside gst-nvinfer; you don’t need to do anything to it. The bug is in your code.

No, I meant that after removing the cudaMemset() call from my code, the issue of persistent buffers is still there. I am not doing anything extra, just a cudaMemcpy() of the buffers from the meta.

Can you provide the model and your code so we can reproduce the issue?

This is the code I used in the callback gie_primary_processing_done_buf_prob() in deepstream_app.c to dump the heatmaps:

  for (NvDsMetaList * l_frame = batch_meta->frame_meta_list; l_frame != NULL;
      l_frame = l_frame->next)
  {
    string filePrefix = "HeatMapDebug/";
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    // process crowd model metadata
      /* Iterate user metadata in frames to search PGIE's tensor metadata */
      for (NvDsMetaList *l_user = frame_meta->frame_user_meta_list;
          l_user != NULL; l_user = l_user->next)
      {
        NvDsUserMeta *user_meta = (NvDsUserMeta *)l_user->data;
        if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
          continue;

        // number of elements in the output layer: equal to that in the input layer for the crowd model
        int outLayerElements = 0;
        /* convert to tensor metadata */
        NvDsInferTensorMeta *meta =
            (NvDsInferTensorMeta *)user_meta->user_meta_data;
        for (unsigned int i = 0; i < meta->num_output_layers; i++)
        {
          //cout<<"meta->num_output_layers ="<<meta->num_output_layers<<endl;
          NvDsInferLayerInfo *info = &meta->output_layers_info[i];
          info->buffer = meta->out_buf_ptrs_host[i];
          if (use_device_mem && meta->out_buf_ptrs_dev[i])
          {
            cudaMemcpy(meta->out_buf_ptrs_host[i], meta->out_buf_ptrs_dev[i],
                      info->inferDims.numElements * 4, cudaMemcpyDeviceToHost);       
          }
          outLayerElements = info->inferDims.numElements;

          //cout<<"-------------------------------------------------------------------------------------"<<endl;
          //cout<<"Number of dimensions in the Output Layer - "<<i<<" "<<info->inferDims.numDims<<endl;
          //cout<<"Number of channnels = "<<info->inferDims.d[0]<<endl;
          //cout<<"Height ="<<info->inferDims.d[1]<<endl;
          //cout<<"Width ="<<info->inferDims.d[2]<<endl;
        }
        /* Parse output tensor and fill detection results into objectList. */
        std::vector<NvDsInferLayerInfo>
            outputLayersInfo(meta->output_layers_info,
                            meta->output_layers_info + meta->num_output_layers);

        //store the crowd model output layer buffer in a vector
        std::vector<float> crowdBuffer((float*)outputLayersInfo[0].buffer,
                                                  (float*)outputLayersInfo[0].buffer + outLayerElements);
        
        #if 1
        //#########################################################
        //Debug code to generate and dump heatmaps from here

        Mat denseCrowdFrameHeatMap;
        //640*360
        double densityArr[640][360];
        for(int i=0; i< 640; i++)
          for(int j=0; j<360; j++)
            if(crowdBuffer[i*360 + j] > 0)
              densityArr[i][j] = 255.0 * crowdBuffer[i*360 + j];

        cv::Mat densityMat(360, 640, CV_64F, densityArr);
        //scale to 8 bit
        densityMat.convertTo(densityMat, CV_8UC3);
        //Apply colorMap
        applyColorMap(densityMat, denseCrowdFrameHeatMap, COLORMAP_JET);
        resize(denseCrowdFrameHeatMap, denseCrowdFrameHeatMap, Size(1920,1080), INTER_LINEAR);

        //dump the heat map
        string heatMapImgName = filePrefix + "heatMap_" + to_string(frame_meta->frame_num) + ".jpg";
        imwrite(heatMapImgName, denseCrowdFrameHeatMap);
        crowdBuffer.clear();
        denseCrowdFrameHeatMap.release();
        #endif
      }
  }
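One thing that may be worth ruling out in the debug code above: `densityArr` is a local array that is never initialized and is only written where `crowdBuffer` is positive, so every other element keeps whatever was already in that memory, possibly values left over from an earlier frame. It is also about 1.8 MB (640 x 360 doubles) on the stack. The sketch below is an assumption about a possible cause, not a confirmed fix; `densityTo8Bit` is a hypothetical helper that writes every output pixel on every call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: scale a density map to 8-bit grayscale.
// Every pixel is written on each call, so nothing carries over between frames.
std::vector<std::uint8_t> densityTo8Bit(const std::vector<float>& density,
                                        std::size_t height, std::size_t width) {
  assert(density.size() == height * width);
  std::vector<std::uint8_t> pixels(height * width, 0);  // zero-initialized, heap-allocated
  for (std::size_t i = 0; i < density.size(); ++i) {
    float scaled = 255.0f * density[i];
    // Clamp to the 8-bit range; non-positive values become 0.
    scaled = std::min(255.0f, std::max(0.0f, scaled));
    pixels[i] = static_cast<std::uint8_t>(scaled + 0.5f);  // round to nearest
  }
  return pixels;
}
```

The result could then be wrapped without copying, for example as `cv::Mat(360, 640, CV_8UC1, pixels.data())`, before calling applyColorMap, provided `pixels` outlives the Mat.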

The model file (ONNX) can be downloaded from here:
https://drive.google.com/file/d/13CtFYfvwqQDnVxUykYgGbXj7GxPaTS2F/view?usp=sharing

@Fiona.Chen Are you able to reproduce this issue?

@Fiona.Chen any update?

@Fiona.Chen ??

Please provide the complete project (all source code and configurations).