Segmentation fault in NvBufSurfTransform ()

My project is based on “deepstream-app” and “runtime_source_add_delete”, running on a V100 with “deepstream:4.0.1-19.09-devel”. I have two problems:

  1. How can I change the batch-size of streammux and nvinfer to match the number of sources when adding and deleting sources? I tried modifying the property, but it seems to take effect only in the NULL state.
  2. I get a segmentation fault in NvBufSurfTransform () when saving images from a probe. The fault seems to be related to the batch-size property, but I can’t figure out how. Can you give me some suggestions?
void CustomGstApp::gie_src_pad_buffer_probe(AppCtx * appCtx, GstBuffer * buf, NvDsBatchMeta * batch_meta, uint index){
    //structures of metalist: frame, user, display
    NvDsMetaList *l_frame = nullptr;
//    NvDsMetaList *l_obj = nullptr;
    NvDsMetaList *l_user = nullptr;
//    NvDsMetaList *l_display = nullptr;

    //structures contained in a frame meta: obj, user, display;
    NvDsFrameMeta *frame_meta = nullptr;
//    NvDsObjectMeta *obj_meta = nullptr;
    NvDsUserMeta *user_meta = nullptr;
//    NvDsDisplayMeta *display_meta = nullptr;

//    structures contained in up-layer(obj, user, display)
//    NvDsMetaList *l_obj_user = nullptr;
//    NvDsUserMeta *obj_user_meta = nullptr;
//    NvOSD_TextParams *text_params = nullptr;
//    NvOSD_RectParams *rect_params = nullptr;
//    NvDsMetaList *l_classifier = nullptr;
//    NvDsClassifierMeta *classifier_meta = nullptr;
//    NvDsLabelInfoList *l_label = nullptr;
//    NvDsLabelInfo *label_info = nullptr;

//structures of infer meta;
    NvDsInferTensorMeta *infer_meta = nullptr;
    NvDsInferLayerInfo *infer_info = nullptr;

    static int use_device_mem = 0;
    int source_id = 0;
    int batch_id = 0;
    int num_frames_in_batch = batch_meta->num_frames_in_batch;

    //get the frame data from a batch.
    for (l_frame = batch_meta->frame_meta_list; l_frame != nullptr; l_frame = l_frame->next){
        frame_meta = (NvDsFrameMeta *) (l_frame->data);
        source_id = frame_meta->source_id;
        batch_id = frame_meta->batch_id;

        //get each user data in the frame.
        int infer_model_count = 0;
        bool infer_result_bool = false;
        for (l_user = frame_meta->frame_user_meta_list; l_user != nullptr; l_user = l_user->next){
            user_meta = (NvDsUserMeta *) (l_user->data);
            if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
                continue;

            //convert data into infer data.
            infer_meta = (NvDsInferTensorMeta *) user_meta->user_meta_data;
            for (int i = 0; i < (int)infer_meta->num_output_layers; i++) {
                infer_info = &infer_meta->output_layers_info[i];
                infer_info->buffer = infer_meta->out_buf_ptrs_host[i];
                if (use_device_mem){
                    cudaMemcpy(infer_meta->out_buf_ptrs_host[i], infer_meta->out_buf_ptrs_dev[i], infer_info->dims.numElements *4, cudaMemcpyDeviceToHost);
                }
            }

            // get the infer result and attach it to custom_post_params.
            NvDsInferDimsCHW dims;
            getDimsCHWFromDims (dims, infer_meta->output_layers_info[0].dims);
            int numClasses = dims.c != 1 ? dims.c : (dims.h != 1 ? dims.h :dims.w);
            auto *outputCoverageBuffer = (float *) infer_meta->output_layers_info[0].buffer;

            /**
             * specified model, specified results to collect.
             * waiting to change;using the nvdsinfer_custom_impl.c instead.
             */
            std::string model_name(appCtx->custom_post_params[source_id].model_infer_result[infer_model_count].infer_used_model_name);
            auto infer_result_post = &appCtx->custom_post_params[source_id].model_infer_result[infer_model_count];
            if (model_name == MODEL_SELECTED_PORN){
                infer_result_post->infer_classified_confidence = outputCoverageBuffer[1] + outputCoverageBuffer[3] + outputCoverageBuffer[4] + outputCoverageBuffer[6];
            } else if (model_name == MODEL_SELECTED_VIOLENCE){
                infer_result_post->infer_classified_confidence = outputCoverageBuffer[3] + outputCoverageBuffer[4] + outputCoverageBuffer[5] + outputCoverageBuffer[6];
            } else if (model_name == MODEL_SELECTED_POLITIC){
                infer_result_post->infer_classified_confidence = -1;
            } else {
                infer_result_post->infer_classified_confidence = 0;
            }
            if (infer_result_post->infer_classified_confidence > infer_result_post->infer_confidence_min_threshold){
                infer_result_bool = true;
            }
            // record the infer results in the log.
            NVGSTDS_INFO_MSG_V ("batch ID:%d/%d, source ID:%d, frame num:%d, prob:%f\n", batch_id, num_frames_in_batch, source_id, frame_meta->frame_num, infer_result_post->infer_classified_confidence);
            std::stringstream infer_result_string;
            for (int i = 0; i < numClasses; i++){
                infer_result_string << outputCoverageBuffer[i] << ";";
            }
            GST_INFO_OBJECT (nullptr, "source ID:%d, prob:%s\n", source_id, infer_result_string.str().c_str());
            infer_model_count++;
        }

        //save evidence and post to custom the result.
        if (infer_result_bool){
            int ret = ERROR_SUCCESS;
            ret = image_probe(appCtx, buf, batch_meta, batch_id, source_id);
//            if ( ret != ERROR_SUCCESS){
//                GST_ERROR_OBJECT(nullptr, "fail to save image evidence.");
//            }else{
//                ret = http_post_to_custom(appCtx, source_id);
//                if ( ret != ERROR_SUCCESS){
//                    GST_ERROR_OBJECT(nullptr, "fail to post to callback uri.");
//                }
//            }
            bzero(appCtx->custom_post_params[source_id].image_save_name, MAX_STRING_LENGTH);
        }
        //clear the infer_classified_confidence
        for (infer_model_count--; infer_model_count > -1; infer_model_count--){
            appCtx->custom_post_params[source_id].model_infer_result[infer_model_count].infer_classified_confidence = 0;
        }
    }
    use_device_mem = 1 - use_device_mem;
}

/**
 * function to copy the image data from GPU to CPU.
 * convert the image from NV12 to RGB and save it.
 * @param appCtx
 * @param buf
 * @param batch_meta
 * @param source_id
 */
int CustomGstApp::image_probe(AppCtx * appCtx, GstBuffer * buf, NvDsBatchMeta * batch_meta, uint frame_id, uint source_id){
    int ret = ERROR_SUCCESS;
    NvBufSurface *surface = nullptr;
    GstMapInfo in_map_info;
    cudaError_t cuda_err;
    int batch_size;
    std::string frame_name(appCtx->custom_request_params[source_id].unique_id);
    std::string frame_full_name(appCtx->custom_post_params[source_id].image_save_path);

    frame_name += get_current_time() + ".jpg";
    g_strlcat(appCtx->custom_post_params[source_id].image_save_name, frame_name.c_str(), MAX_STRING_LENGTH);
    frame_full_name += frame_name;

    memset (&in_map_info, 0, sizeof (in_map_info));
    if (!gst_buffer_map (buf, &in_map_info, GST_MAP_READ)) {
        // Mapping failed, so there is nothing to unmap here.
        GST_ERROR_OBJECT (nullptr, "Error: Failed to map gst buffer\n");
        ret = ERROR_IMAGE_PROB;
        return ret;
    }

    NvBufSurfTransformRect src_rect, dst_rect;
    surface = (NvBufSurface *) in_map_info.data;

    batch_size= surface->batchSize;
    GST_DEBUG_OBJECT(nullptr, "Batch Size : %d, resolution : %dx%d \n",batch_size, surface->surfaceList[frame_id].width, surface->surfaceList[frame_id].height);

    src_rect.top   = 0;
    src_rect.left  = 0;
    src_rect.width = surface->surfaceList[frame_id].width;
    src_rect.height= surface->surfaceList[frame_id].height;

    dst_rect.top   = 0;
    dst_rect.left  = 0;
    dst_rect.width = surface->surfaceList[frame_id].width;
    dst_rect.height= surface->surfaceList[frame_id].height;

    NvBufSurfTransformParams nvbufsurface_params;
    nvbufsurface_params.src_rect = &src_rect;
    nvbufsurface_params.dst_rect = &dst_rect;
    nvbufsurface_params.transform_flag =  NVBUFSURF_TRANSFORM_CROP_SRC | NVBUFSURF_TRANSFORM_CROP_DST;
    nvbufsurface_params.transform_filter = NvBufSurfTransformInter_Default;

    NvBufSurface *dst_surface = nullptr;
    NvBufSurfaceCreateParams nvbufsurface_create_params;

    /* An intermediate buffer for NV12/RGBA to BGR conversion  will be
     * required. Can be skipped if custom algorithm can work directly on NV12/RGBA. */
    nvbufsurface_create_params.gpuId  = surface->gpuId;
    nvbufsurface_create_params.width  = surface->surfaceList[frame_id].width;
    nvbufsurface_create_params.height = surface->surfaceList[frame_id].height;
    nvbufsurface_create_params.size = 0;
    nvbufsurface_create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
    nvbufsurface_create_params.layout = NVBUF_LAYOUT_PITCH;
    nvbufsurface_create_params.memType = NVBUF_MEM_CUDA_UNIFIED;

    cuda_err = cudaSetDevice (surface->gpuId);

    cudaStream_t cuda_stream;

    cuda_err=cudaStreamCreate (&cuda_stream);

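    ret = NvBufSurfaceCreate(&dst_surface,batch_size,&nvbufsurface_create_params);
    if (ret != ERROR_SUCCESS ){
        GST_ERROR_OBJECT (nullptr, "NvBufSurfaceCreate failed.\n");
        // Release what was acquired above before bailing out.
        cudaStreamDestroy (cuda_stream);
        gst_buffer_unmap (buf, &in_map_info);
        ret = ERROR_IMAGE_PROB;
        return ret;
    }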

    NvBufSurfTransformConfigParams transform_config_params;
    NvBufSurfTransform_Error err;

    transform_config_params.compute_mode = NvBufSurfTransformCompute_Default;
    transform_config_params.gpu_id = surface->gpuId;
    transform_config_params.cuda_stream = cuda_stream;
    err = NvBufSurfTransformSetSessionParams (&transform_config_params);

    NvBufSurfaceMemSet (dst_surface, frame_id, 0, 0);
    err = NvBufSurfTransform (surface, dst_surface, &nvbufsurface_params);
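    if (err != NvBufSurfTransformError_Success) {
        GST_ERROR_OBJECT (nullptr, "NvBufSurfTransform failed with error %d while converting buffer\n", err);
        // Release the destination surface, the stream and the mapping before bailing out.
        NvBufSurfaceDestroy (dst_surface);
        cudaStreamDestroy (cuda_stream);
        gst_buffer_unmap (buf, &in_map_info);
        ret = ERROR_IMAGE_PROB;
        return ret;
    }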
    NvBufSurfaceMap (dst_surface, frame_id, 0, NVBUF_MAP_READ);
    NvBufSurfaceSyncForCpu (dst_surface, frame_id, 0);

    cv::Mat bgr_frame = cv::Mat (cv::Size(nvbufsurface_create_params.width, nvbufsurface_create_params.height), CV_8UC3);
    cv::Mat in_mat = cv::Mat (nvbufsurface_create_params.height, nvbufsurface_create_params.width,
                     CV_8UC4, dst_surface->surfaceList[frame_id].mappedAddr.addr[0],
                     dst_surface->surfaceList[frame_id].pitch);
    cv::cvtColor (in_mat, bgr_frame, CV_RGBA2BGR);
    //GST_ERROR_OBJECT(nullptr, "%s", frame_full_name.c_str());
    cv::imwrite(frame_full_name, bgr_frame);
    in_mat.release();
    bgr_frame.release();
    //imshow("original", bgr_frame);
    //waitKey(1);

    NvBufSurfaceUnMap (dst_surface, frame_id, 0);
    NvBufSurfaceDestroy (dst_surface);
    cudaStreamDestroy (cuda_stream);
    gst_buffer_unmap (buf, &in_map_info);

    return ret;
}
[application]
enable-perf-measurement=0

[source0]
enable=0
type=2
uri=rtmp://
intra-decode-enable=0
gpu-id=6
drop-frame-interval=0
cudadec-memtype=0

[streammux]
gpu-id=6
live-source=1
batch-size=30
batched-push-timeout=40000
width=1920
height=1080
enable-padding=1
nvbuf-memory-type=0

[primary-gie]
enable=1
gie-unique-id=1
config-file=porn_resnet152_gie_config.txt
gpu-id=6
nvbuf-memory-type=0

[tracker]
enable=0

[secondary-gie0]
enable=1
gie-unique-id=2
config-file=violence_resnet101_gie_config.txt
operate-on-gie-id=1
gpu-id=6
nvbuf-memory-type=0

[tiled-display]
enable=0

[osd]
enable=0

[sink0]
enable=1
type=1
sync=0
qos=0
source-id=0
#gpu-id=6
container=3
codec=1
bitrate=4000000
output-file=rtmp://
#nvbuf-memory-type=0
#rtsp-port=8554
#udp-port=5400
#overlay-id=1
#width=1920
#height=1080
offset-x=0
offset-y=0
#display-id=0
iframeinterval=30
#msg-conv-config=msgconv_config.txt
#msg-broker-proto-lib=/home/ubuntu/libnvds_amqp_proto.so
#msg-broker-conn-str=foo.bar.com;80;dsapp
#topic=test-ds4
#msg-conv-payload-type=0

[tests]
file-loop=0


print_info.JPG
core.JPG
core_dump_info.JPG

  1. I got a Segmentation fault in NvBufSurfTransform () when saving images by using a probe
    Did you try batch-size=1? I can’t find the issue from reviewing the code. gstnvinfer.cpp, gstdsexample.cpp, and nvbufsurftransform.h contain sample usage of NvBufSurfTransform.
  2. How to change the batch-size of streammux and nvinfer to match the number of sources when adding and deleting sources?

“frame_num” is set in the decoder (“gstnvcuvidbasedec.cpp -> pfrminfo_meta->frame_num = nvcudec->frame_cnt;”), and the value comes from nvcuvid (nv video codec).
Can you share what “frame_num” is used for?
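
On the crash itself, one pattern worth comparing against: the gstdsexample.cpp sample appears not to hand the whole batched surface to NvBufSurfTransform when it converts a single frame. It makes a shallow copy of the NvBufSurface header, points surfaceList at the chosen frame, and sets batchSize and numFilled to 1, so that one src_rect/dst_rect pair matches a one-frame batch. The probe above passes the full batch together with a single rectangle each, which may be why the fault only appears with batch-size greater than 1; that is an assumption, not something confirmed in this thread. A minimal sketch of the idea follows. SurfaceParams and BatchedSurface are hypothetical stand-ins reduced to the fields used here (the real types are NvBufSurfaceParams and NvBufSurface from nvbufsurface.h), and single_frame_view is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-ins for NvBufSurfaceParams / NvBufSurface,
// reduced to the fields this sketch touches.
struct SurfaceParams {
    uint32_t width;
    uint32_t height;
};

struct BatchedSurface {
    uint32_t gpuId;
    uint32_t batchSize;          // capacity of surfaceList
    uint32_t numFilled;          // frames actually present in this batch
    SurfaceParams *surfaceList;  // one entry per frame
};

// Returns a shallow one-frame view of `batch`. The view shares the
// frame entries with `batch` and must not outlive it.
BatchedSurface single_frame_view(const BatchedSurface &batch, uint32_t idx) {
    assert(batch.surfaceList != nullptr);
    if (idx >= batch.numFilled) {
        throw std::out_of_range("frame index is outside the filled part of the batch");
    }
    BatchedSurface view = batch;                 // copy the header only
    view.surfaceList = &batch.surfaceList[idx];  // start at the chosen frame
    view.batchSize = 1;
    view.numFilled = 1;
    return view;
}
```

With the real API, the view (rather than the full batch) would be the source argument to NvBufSurfTransform, and the destination surface would be created with a batch size of 1 and read back at index 0.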

  1. I have tried batch-size=1 and it works well. But I wonder whether it would be more efficient to use a batch-size equal to the number of sources. The segmentation fault comes up randomly when the batch-size is greater than 1; it is reported in NvBufSurfTransform () or cv::imwrite().

  2. “frame_num” isn’t used otherwise; “frame_meta->frame_num” is only used for printing information. I don’t get your point.
I want to confirm whether I can set the batch-size dynamically for “nvstreammux” and “nvinfer”, or whether I should directly set batch-size=1024 for “nvstreammux”.

Can anybody help me with this? Segmentation fault in NvBufSurfTransform

https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream_Plugin_Manual%2Fdeepstream_plugin_details.02.03.html

“The muxer pushes the batch downstream when the batch is filled or the batch formation timeout batched-pushed-timeout is reached. The timeout starts running when the first buffer for a new batch is collected.”

The streammux batch-size can’t be set dynamically.
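
The quoted rule also means that batch-size acts as an upper bound rather than a fixed count: when the timeout fires first, the batch goes downstream partially filled. A toy model of that decision is sketched below, assuming microsecond units as used by batched-push-timeout=40000 in the config above, and assuming that a negative timeout means "wait until the batch is full". should_push_batch and its parameters are illustrative names; this is not the plugin's implementation.

```cpp
#include <cstdint>

// Toy model of the muxer rule quoted above: push when the batch is
// full, or when the timeout has elapsed since the first buffer of
// the batch was collected.
bool should_push_batch(uint32_t frames_collected, uint32_t batch_size,
                       int64_t elapsed_us, int64_t timeout_us) {
    if (frames_collected == 0) {
        return false;  // the timer only starts with the first buffer
    }
    if (frames_collected >= batch_size) {
        return true;   // batch is filled
    }
    // Assumption: a negative timeout disables the timeout path.
    return timeout_us >= 0 && elapsed_us >= timeout_us;
}
```

With 12 live sources and batch-size=30, for example, batches would typically be pushed by the timeout and carry fewer than 30 frames, so downstream code should rely on num_frames_in_batch (or numFilled on the surface) rather than on the configured batch size.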

Can you share test code that reproduces the segmentation fault?

@ChrisDing:
Thank you for your reply!

  1. If the streammux batch-size can’t be set dynamically, can I just set it to a large maximum, such as 1000?
  2. The segmentation fault comes up randomly. My project is a little bit large, so I will try to write a simple pipeline for you to test. Here are some clues: my pipeline is based on “deepstream-app”, and the probe shown in #1 is inserted at the sink of secondary_gie0_sink; the segmentation fault rarely comes up when the streammux batch-size is under 10~20, and it comes up every time when the batch-size is greater than 25. You can check the core dump information in the picture in #1.
    The picture of the pipeline is attached below.

  1. 1000 is too big. I think you can set it to the maximum capacity of your platform. For example, if your platform can handle 30 sources, you can set the streammux batch-size to 30.

What’s the resolution of the sources? Can you try “enable-padding=0”?

@ChrisDing:
1. The program runs in a Docker container using a V100. Since the pipeline is configured with drop-frame-interval and without rendering, the number of sources in one pipeline is definitely larger than 30. I will work it out.
2. The resolution of the sources is 1920 x 1080. I tried “enable-padding=0, batch-size=30”, but it still failed with a segmentation fault.

It would be better if you could give us code to reproduce the “segmentation fault” issue.