Issue: Bounding Boxes and Landmarks Not Passing Downstream in DeepStream Pipeline (DeepStream 6.1, test5 SDK)

• Hardware Platform (Jetson / GPU): dGPU (RTX 3060, see nvidia-smi output below)
Mon Sep 16 11:34:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06              Driver Version: 555.42.06      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   40C    P0             26W /  80W  |      15MiB /  6144MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2187      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+

• DeepStream Version: 6.1

I am currently using a custom RetinaFace parser in my DeepStream 6.1 pipeline with the test5 SDK. While the bounding boxes and landmarks are correctly extracted in the custom parser, they are not being passed to the downstream elements in the pipeline. I can see the bounding boxes and landmarks are processed within the code, but they don’t seem to propagate further for downstream processing (e.g., metadata for Kafka or rendering on the output stream).

I would appreciate any insights or guidance on ensuring that the bounding boxes and landmarks are correctly passed and accessible to downstream plugins or metadata for further processing in the DeepStream 6.1 pipeline with the test5 SDK. Here’s a brief overview of the parsing function I’m using:

  • Bounding boxes and landmarks are decoded, and confidence thresholds are applied.
  • Landmarks are verified to lie inside their bounding boxes.
  • However, the data does not appear to reach downstream plugins or the message converter (my nvinfer configuration is sketched below for reference).
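
For context, my understanding is that a plain NvDsInferParseObjectInfo only carries the box, class and score, so landmarks can only survive past nvinfer if the instance-mask parsing path is enabled. This is a sketch of the [property] entries I believe that path needs in the nvinfer config (the library path, and whether network-type=3 is right for this model, are assumptions from my setup):

network-type=3                 # instance segmentation, so per-object mask data is emitted
cluster-mode=4                 # no built-in clustering; NMS is done inside the parser
output-instance-mask=1         # required for mask (landmark) data to reach downstream meta
parse-bbox-instance-mask-func-name=NvDsInferParseCustomRetinaface
custom-lib-path=/path/to/libnvdsinfer_custom_impl_retinaface.so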

Thank you in advance for your support!

Below is my custom parser code. Please let me know what other details I should share.

/*
 * Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

#include "nvdsinfer_custom_impl.h"
#include <algorithm>
#include <cmath>
#include <cstring>
#include <iostream>
#include <vector>
#include "cuda_runtime_api.h"
#include "logging.h"
#include "common.hpp"
#include "calibrator.h"
#include <iomanip>

// #define USE_INT8  // set USE_INT8 or USE_FP16 or USE_FP32
#define DEVICE 0  // GPU id
#define BATCH_SIZE 1
#define CONF_THRESH 0.75
#define IOU_THRESH 0.4

// stuff we know about the network and the input/output blobs
static const int INPUT_H = decodeplugin::INPUT_H; // H, W must be divisible by 32.
static const int INPUT_W = decodeplugin::INPUT_W;
static const int OUTPUT_SIZE = (INPUT_H / 8 * INPUT_W / 8 + INPUT_H / 16 * INPUT_W / 16 + INPUT_H / 32 * INPUT_W / 32) * 2 * 15 + 1;
const char* INPUT_BLOB_NAME = "data";
const char* OUTPUT_BLOB_NAME = "prob";


extern "C" bool NvDsInferParseCustomRetinaface(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList);

static constexpr int LOCATIONS = 4;
static constexpr int ANCHORS = 10;

// struct alignas(float) Detection{
// float bbox[LOCATIONS];
// float score;
// float anchor[ANCHORS];
// };

// void create_anchor_retinaface(std::vector& res, float *output, float conf_thresh, int width, int height) {
// int det_size = sizeof(Detection) / sizeof(float);
// for (int i = 0; i < output[0]; i++){
// if (output[1 + det_size * i + 4] <= conf_thresh) continue;

// Detection det;
// memcpy(&det, &output[1 + det_size * i], det_size * sizeof(float));
// det.bbox[0] = CLIP(det.bbox[0], 0, width - 1);
// det.bbox[1] = CLIP(det.bbox[1] , 0, height -1);
// det.bbox[2] = CLIP(det.bbox[2], 0, width - 1);
// det.bbox[3] = CLIP(det.bbox[3], 0, height - 1);

// res.push_back(det);

// }
// }

// bool cmp(Detection& a, Detection& b) {
// return a.score > b.score;
// }

// float iou(float lbox[4], float rbox[4]) {
// float interBox = {
// std::max(lbox[0] - lbox[2]/2.f , rbox[0] - rbox[2]/2.f), //left
// std::min(lbox[0] + lbox[2]/2.f , rbox[0] + rbox[2]/2.f), //right
// std::max(lbox[1] - lbox[3]/2.f , rbox[1] - rbox[3]/2.f), //top
// std::min(lbox[1] + lbox[3]/2.f , rbox[1] + rbox[3]/2.f), //bottom
// };

// std::cout << interBox << std::endl;

// if(interBox[2] > interBox[3] || interBox[0] > interBox[1])
// return 0.0f;

// float interBoxS =(interBox[1]-interBox[0])*(interBox[3]-interBox[2]);
// return interBoxS/(lbox[2]*lbox[3] + rbox[2]*rbox[3] -interBoxS);
// }

// void nms_and_adapt(std::vector& det, std::vector& res, float nms_thresh, int width, int height) {
// std::sort(det.begin(), det.end(), cmp);
// for (const auto& d : det) {
// std::cout << "Score: " << d.score << std::endl;

// std::cout << "Bounding Box: ";
// for (size_t i = 0; i < LOCATIONS; ++i) {
// std::cout << d.bbox[i] << " ";
// }
// std::cout << std::endl;

// std::cout << "Anchors: ";
// for (size_t i = 0; i < ANCHORS; ++i) {
// std::cout << d.anchor[i] << " ";
// }
// std::cout << std::endl;

// std::cout << "-------------------" << std::endl;
// }
// for (unsigned int m = 0; m < det.size(); ++m) {
// auto& item = det[m];
// res.push_back(item);
// for (unsigned int n = m + 1; n < det.size(); ++n) {
// if (iou(item.bbox, det[n].bbox) > nms_thresh) {
// det.erase(det.begin()+n);
// --n;
// }
// }
// }

// // crop larger area for better alignment performance
// // there I choose to crop 20 more pixel
// for (unsigned int m = 0; m < res.size(); ++m) {
// res[m].bbox[0] = CLIP(res[m].bbox[0]-10, 0, width - 1);
// res[m].bbox[1] = CLIP(res[m].bbox[1]-10, 0, height -1);
// res[m].bbox[2] = CLIP(res[m].bbox[2]+20, 0, width - 1);
// res[m].bbox[3] = CLIP(res[m].bbox[3]+20, 0, height - 1);
// }
// }

// static float iou(float lbox[4], float rbox[4]) {
// float interBox = {
// std::max(lbox[0], rbox[0]), //left
// std::min(lbox[2], rbox[2]), //right
// std::max(lbox[1], rbox[1]), //top
// std::min(lbox[3], rbox[3]), //bottom
// };

// if(interBox[2] > interBox[3] || interBox[0] > interBox[1])
// return 0.0f;

// float interBoxS = (interBox[1] - interBox[0]) * (interBox[3] - interBox[2]);
// return interBoxS / ((lbox[2] - lbox[0]) * (lbox[3] - lbox[1]) + (rbox[2] - rbox[0]) * (rbox[3] - rbox[1]) -interBoxS + 0.000001f);
// }

// static bool cmp(const decodeplugin::Detection& a, const decodeplugin::Detection& b) {
// return a.class_confidence > b.class_confidence;
// }

// static inline void nms(std::vector<decodeplugin::Detection>& res, float *output, float nms_thresh = 0.4) {
// std::vector<decodeplugin::Detection> dets;
// for (int i = 0; i < output[0]; i++) {
// if (output[15 * i + 1 + 4] <= 0.1) continue;
// decodeplugin::Detection det;
// memcpy(&det, &output[15 * i + 1], sizeof(decodeplugin::Detection));
// dets.push_back(det);
// }
// std::sort(dets.begin(), dets.end(), cmp);
// for (size_t m = 0; m < dets.size(); ++m) {
// auto& item = dets[m];
// res.push_back(item);
// //std::cout << item.class_confidence << " bbox " << item.bbox[0] << ", " << item.bbox[1] << ", " << item.bbox[2] << ", " << item.bbox[3] << std::endl;
// for (size_t n = m + 1; n < dets.size(); ++n) {
// if (iou(item.bbox, dets[n].bbox) > nms_thresh) {
// dets.erase(dets.begin()+n);
// --n;
// }
// }
// }
// }

void printLayerInfo(const std::vector<NvDsInferLayerInfo> &outputLayersInfo) {
// Iterate through each layer info in the vector
for (size_t i = 0; i < outputLayersInfo.size(); ++i) {
const NvDsInferLayerInfo &layerInfo = outputLayersInfo[i];

    // Print basic information about the layer
    std::cout << "Layer " << i << ":" << std::endl;
    std::cout << "  Layer Name: " << (layerInfo.layerName ? layerInfo.layerName : "Unknown") << std::endl;

    // Print details about the buffer contents if the buffer is not null
    if (layerInfo.buffer != nullptr) {
        float *buffer = reinterpret_cast<float*>(layerInfo.buffer);
        
        // You might need to know the size or number of elements in the buffer
        // This is a placeholder value, replace it with actual logic if available
        int numElements = 10; // You need actual size or number of elements

        std::cout << "  Buffer Contents:" << std::endl;
        for (int j = 0; j < numElements; ++j) {  // Adjust numElements as necessary
            std::cout << "    " << buffer[j] << std::endl;
        }
    } else {
        std::cout << "  Buffer is null" << std::endl;
    }
}

}
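
// Sketch: rather than the hard-coded numElements placeholder above, the
// element count can be derived from the layer's dimension info (assumes a
// flat float buffer, as in printLayerInfo).
static int numElementsOf(const NvDsInferLayerInfo &layerInfo) {
    int numElements = 1;
    for (unsigned int d = 0; d < layerInfo.inferDims.numDims; ++d) {
        numElements *= layerInfo.inferDims.d[d];
    }
    return numElements;  // NvDsInferDims also caches this as inferDims.numElements
}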

// PriorBox class definition
class PriorBox {
public:
    PriorBox(const std::vector<std::vector<int>>& min_sizes,
             const std::vector<int>& steps, bool clip,
             const std::vector<int>& image_size)
        : min_sizes(min_sizes), steps(steps), clip(clip), image_size(image_size) {
        feature_maps.resize(steps.size());
        for (size_t i = 0; i < steps.size(); ++i) {
            // cast to float so the division is not truncated before std::ceil
            feature_maps[i] = {static_cast<int>(std::ceil(static_cast<float>(image_size[0]) / steps[i])),
                               static_cast<int>(std::ceil(static_cast<float>(image_size[1]) / steps[i]))};
        }
    }

std::vector<std::vector<float>> forward() {
    std::vector<std::vector<float>> anchors;
    for (size_t k = 0; k < feature_maps.size(); ++k) {
        const auto& f = feature_maps[k];
        const auto& min_sizes = this->min_sizes[k];
        for (int i = 0; i < f[0]; ++i) {
            for (int j = 0; j < f[1]; ++j) {
                for (float min_size : min_sizes) {
                    float s_kx = min_size / image_size[1];
                    float s_ky = min_size / image_size[0];
                    float cx = (j + 0.5f) * steps[k] / image_size[1];
                    float cy = (i + 0.5f) * steps[k] / image_size[0];
                    anchors.push_back({cx, cy, s_kx, s_ky});
                }
            }
        }
    }
    if (clip) {
        for (auto& anchor : anchors) {
            anchor[0] = std::min(1.0f, std::max(0.0f, anchor[0]));
            anchor[1] = std::min(1.0f, std::max(0.0f, anchor[1]));
        }
    }
    return anchors;
}

private:
    std::vector<std::vector<int>> min_sizes;
    std::vector<int> steps;
    bool clip;
    std::vector<int> image_size;
    std::vector<std::vector<int>> feature_maps;
};
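
// Sanity check on the prior generation: with image_size = {640, 640} and
// steps = {8, 16, 32} the feature maps are 80x80, 40x40 and 20x20, and with
// two min_sizes per level that gives (80*80 + 40*40 + 20*20) * 2 = 16800
// anchors, which is where the hard-coded num_priors = 16800 further down
// comes from. Minimal usage sketch:
// std::vector<std::vector<int>> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector<int> steps = {8, 16, 32};
// PriorBox prior_box(min_sizes, steps, /*clip=*/true, /*image_size=*/{640, 640});
// auto priors = prior_box.forward();  // priors.size() == 16800
// // each prior is {cx, cy, s_kx, s_ky} in normalized [0, 1] coordinates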

// Function to scale and resize boxes
void scaleAndResizeBoxes(std::vector<std::vector<float>>& boxes, const std::vector<float>& scale, float resize) {
    for (auto& box : boxes) {
        if (box.size() != scale.size()) {
            std::cerr << "Error: Box size does not match scale size.\n";
            return;
        }
        for (size_t i = 0; i < box.size(); ++i) {
            box[i] = box[i] * scale[i] / resize;
        }
    }
}

// Function to decode bounding box predictions
std::vector<std::vector<float>> decode(const std::vector<std::vector<float>>& loc,
                                       const std::vector<std::vector<float>>& priors,
                                       const std::vector<float>& variances) {
size_t num_priors = priors.size();
size_t num_locations = loc.size();

if (num_locations != num_priors) {
    std::cerr << "Error: Number of priors does not match number of locations.\n";
    return {};
}

std::vector<std::vector<float>> boxes(num_priors, std::vector<float>(4));

for (size_t i = 0; i < num_priors; ++i) {
    const auto& prior = priors[i];
    const auto& l = loc[i];

    if (prior.size() != 4 || l.size() != 4) {
        std::cerr << "Error: Prior or location vector size is not 4.\n";
        return {};
    }

    float center_x = prior[0] + l[0] * variances[0] * prior[2];
    float center_y = prior[1] + l[1] * variances[0] * prior[3];
    float width = prior[2] * std::exp(l[2] * variances[1]);
    float height = prior[3] * std::exp(l[3] * variances[1]);

    boxes[i][0] = center_x - width / 2;
    boxes[i][1] = center_y - height / 2;
    boxes[i][2] = center_x + width / 2;
    boxes[i][3] = center_y + height / 2;
}

return boxes;

}

// Convert bounding boxes to NvDsInferParseObjectInfo format
std::vector<NvDsInferParseObjectInfo> convertToNvDsInferParseObjectInfo(const std::vector<std::vector<float>>& decoded_boxes, float vis_thresh) {
    std::vector<NvDsInferParseObjectInfo> objectList;

for (const auto& bbox : decoded_boxes) {
    // Dummy score for filtering, replace with actual score if available
    float score = 0.9f; // Replace with actual score

    // if (score <= vis_thresh) continue;

    NvDsInferParseObjectInfo oinfo;
    oinfo.classId = 0; // Assign appropriate class ID
    oinfo.left = static_cast<unsigned int>(std::round(bbox[0]));
    oinfo.top = static_cast<unsigned int>(std::round(bbox[1]));
    oinfo.width = static_cast<unsigned int>(std::round(bbox[2] - bbox[0]));
    oinfo.height = static_cast<unsigned int>(std::round(bbox[3] - bbox[1]));
    oinfo.detectionConfidence = score; // Uncomment if you have score data

    objectList.push_back(oinfo);
}

return objectList;

}

// void printObjectList(const std::vector<NvDsInferParseObjectInfo>& objectList) {
// for (const auto& obj : objectList) {
// std::cout << "Class ID: " << obj.classId << "\n";
// std::cout << "Left: " << obj.left << "\n";
// std::cout << "Top: " << obj.top << "\n";
// std::cout << "Width: " << obj.width << "\n";
// std::cout << "Height: " << obj.height << "\n";
// std::cout << "------------------\n";
// }
// }

void printObjectList(const std::vector<NvDsInferParseObjectInfo>& objectList) {
    for (const auto& obj : objectList) {
        std::cout << "Class ID: " << obj.classId << "\n";
        std::cout << "Left: " << obj.left << "\n";
        std::cout << "Top: " << obj.top << "\n";
        std::cout << "Width: " << obj.width << "\n";
        std::cout << "Height: " << obj.height << "\n";
        std::cout << "Confidence Score: " << std::setprecision(4) << obj.detectionConfidence << "\n";
        std::cout << "------------------\n";
    }
}

void CalDetectionCPU(
    const float *input_boxes,
    const float *input_conf,
    const float *input_landm,
    int num_elem,
    int step,
    int anchor,
    std::vector<decodeplugin::Detection> &output
) {
int h = decodeplugin::INPUT_H / step;
int w = decodeplugin::INPUT_W / step;
int total_grid = h * w;
int num_anchors = 2; // Assuming there are always 2 anchors

// Initialize output vector
output.clear();
output.reserve(num_elem); // Reserve space for output to avoid frequent reallocations

for (int idx = 0; idx < num_elem; ++idx) {
    int bn_idx = idx / total_grid;
    int local_idx = idx - bn_idx * total_grid;
    int y = local_idx / w;
    int x = local_idx % w;

    const float* cur_input = input_boxes + bn_idx * (4 + 2 + 10) * 2 * total_grid;
    const float *bbox_reg = &cur_input[0];
    const float *cls_reg = &cur_input[2 * 4 * total_grid];
    const float *lmk_reg = &cur_input[2 * 4 * total_grid + 2 * 2 * total_grid];

    for (int k = 0; k < num_anchors; ++k) {
        float conf1 = cls_reg[local_idx + k * total_grid * 2];
        float conf2 = cls_reg[local_idx + k * total_grid * 2 + total_grid];
        conf2 = expf(conf2) / (expf(conf1) + expf(conf2));
        if (conf2 <= 0.02) continue;

        decodeplugin::Detection det;

        float prior[4];
        prior[0] = ((float)x + 0.5f) / w;
        prior[1] = ((float)y + 0.5f) / h;
        prior[2] = (float)anchor * (k + 1) / decodeplugin::INPUT_W;
        prior[3] = (float)anchor * (k + 1) / decodeplugin::INPUT_H;

        // Location
        det.bbox[0] = prior[0] + bbox_reg[local_idx + k * total_grid * 4] * 0.1f * prior[2];
        det.bbox[1] = prior[1] + bbox_reg[local_idx + k * total_grid * 4 + total_grid] * 0.1f * prior[3];
        det.bbox[2] = prior[2] * expf(bbox_reg[local_idx + k * total_grid * 4 + total_grid * 2] * 0.2f);
        det.bbox[3] = prior[3] * expf(bbox_reg[local_idx + k * total_grid * 4 + total_grid * 3] * 0.2f);
        det.bbox[0] -= det.bbox[2] / 2;
        det.bbox[1] -= det.bbox[3] / 2;
        det.bbox[2] += det.bbox[0];
        det.bbox[3] += det.bbox[1];
        det.bbox[0] *= decodeplugin::INPUT_W;
        det.bbox[1] *= decodeplugin::INPUT_H;
        det.bbox[2] *= decodeplugin::INPUT_W;
        det.bbox[3] *= decodeplugin::INPUT_H;
        det.class_confidence = conf2;
        // std::cout << conf2 << std::endl;

        // for (int i = 0; i < 10; i += 2) {
        //     det.landmark[i] = prior[0] + lmk_reg[local_idx + k * total_grid * 10 + total_grid * i] * 0.1f * prior[2];
        //     det.landmark[i + 1] = prior[1] + lmk_reg[local_idx + k * total_grid * 10 + total_grid * (i + 1)] * 0.1f * prior[3];
        //     det.landmark[i] *= decodeplugin::INPUT_W;
        //     det.landmark[i + 1] *= decodeplugin::INPUT_H;
        // }

        // Add detection to output
        output.push_back(det);
    }
}

}
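
// Note: CalDetectionCPU reads the class scores and landmark regressions from
// offsets inside input_boxes, so input_conf and input_landm are effectively
// unused; the per-image layout it assumes is
// [2*4 box regressions | 2*2 class scores | 2*10 landmark regressions] per grid.
// E.g. for step = 8 on a 640x640 input, total_grid = 80 * 80 = 6400.
// Usage sketch (the anchor base of 16 for the step-8 branch is my assumption,
// and rawBranchOutput is a placeholder for that branch's buffer):
// std::vector<decodeplugin::Detection> dets;
// int step = 8;
// int num_elem = (decodeplugin::INPUT_H / step) * (decodeplugin::INPUT_W / step);
// CalDetectionCPU(rawBranchOutput, nullptr, nullptr, num_elem, step, /*anchor=*/16, dets);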

// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // printLayerInfo(outputLayersInfo);

// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// float* confData = reinterpret_cast<float*>(outputLayersInfo[1].buffer);
// float* landmData = reinterpret_cast<float*>(outputLayersInfo[2].buffer);

// int step = 16;
// int anchor = 8;

// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// // Convert float* loc data to std::vector<std::vector>
// size_t num_priors = 16800; // Number of prior boxes
// size_t num_values = 4; // Number of values per box

// std::vector loc_data(num_priors * num_values);
// std::vector priors_data(num_priors * num_values); // Placeholder for priors data

// // Copy data from float* to std::vector
// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// // Convert prior_data (std::vector<std::vector>) to flat vector for use
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// priors_data[i * num_values + j] = prior_data[i][j];
// }
// }

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors(num_priors, std::vector(num_values));

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// loc[i][j] = loc_data[i * num_values + j];
// priors[i][j] = priors_data[i * num_values + j];
// }
// }

// // Define variances
// std::vector variances = {0.1f, 0.2f};

// // Decode bounding box predictions
// auto decoded_boxes = decode(loc, priors, variances);
// // Decode bounding boxes on GPU
// // auto decoded_boxes = decodeGPU(loc, priors, variances);

// // Define image dimensions
// float img_width = 640.0f; // Example image width
// float img_height = 640.0f; // Example image height

// // Define scale and resize factors
// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// // Scale and resize boxes
// scaleAndResizeBoxes(decoded_boxes, scale, resize);

// // Convert to NvDsInferParseObjectInfo format
// float vis_thresh = 0.5f; // Visibility threshold
// objectList = convertToNvDsInferParseObjectInfo(decoded_boxes, vis_thresh);

// // Print object list
// std::cout << "Object List:\n";
// printObjectList(objectList);

// return true;
// }

// Assuming that the following structure exists and is correctly declared
// struct NvDsInferObjectDetectionInfo { … };

//code-1 start
// New Structure to hold bounding box information
// struct BoundingBox {
// float x1, y1, x2, y2, score;
// };

// // New Function to calculate Intersection over Union (IoU)
// float IoU(const BoundingBox &a, const BoundingBox &b) {
// float inter_x1 = std::max(a.x1, b.x1);
// float inter_y1 = std::max(a.y1, b.y1);
// float inter_x2 = std::min(a.x2, b.x2);
// float inter_y2 = std::min(a.y2, b.y2);

// float inter_area = std::max(0.0f, inter_x2 - inter_x1 + 1) * std::max(0.0f, inter_y2 - inter_y1 + 1);
// float box_a_area = (a.x2 - a.x1 + 1) * (a.y2 - a.y1 + 1);
// float box_b_area = (b.x2 - b.x1 + 1) * (b.y2 - b.y1 + 1);

// return inter_area / (box_a_area + box_b_area - inter_area);
// }

// // New Function to perform Non-Maximum Suppression (NMS)
// std::vector<BoundingBox> nms(std::vector<BoundingBox> &boxes, float iou_threshold) {
// std::vector<BoundingBox> result;

// // Sort boxes by score in descending order
// std::sort(boxes.begin(), boxes.end(), [](const BoundingBox &a, const BoundingBox &b) {
// return a.score > b.score;
// });

// std::vector<bool> suppressed(boxes.size(), false);

// for (size_t i = 0; i < boxes.size(); ++i) {
// if (suppressed[i]) continue;

// result.push_back(boxes[i]);

// for (size_t j = i + 1; j < boxes.size(); ++j) {
// if (IoU(boxes[i], boxes[j]) > iou_threshold) {
// suppressed[j] = true;
// }
// }
// }

// return result;
// }

// // New Comparator function to sort detections by score in descending order
// bool compareDetections(const BoundingBox &a, const BoundingBox &b) {
// return a.score > b.score;
// }

// // New Function to keep only the top-K detections
// std::vector<BoundingBox> keepTopKDetections(std::vector<BoundingBox> &detections, int top_k) {
// // Sort detections by score
// std::sort(detections.begin(), detections.end(), compareDetections);

// // Keep only the top-K detections
// if (detections.size() > static_cast<size_t>(top_k)) {
// detections.resize(top_k);
// }

// return detections;
// }

// // Function to clip bounding boxes to image dimensions
// void clipBoxes(std::vector<std::vector>& boxes, float img_width, float img_height) {
// for (auto& box : boxes) {
// box[0] = std::max(0.0f, std::min(box[0], img_width - 1));
// box[1] = std::max(0.0f, std::min(box[1], img_height - 1));
// box[2] = std::max(0.0f, std::min(box[2], img_width - 1));
// box[3] = std::max(0.0f, std::min(box[3], img_height - 1));
// }
// }

// // Existing NvDsInferParseRetinaface function with new additions integrated
// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // Remove unused variables if not needed
// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// float* confData = reinterpret_cast<float*>(outputLayersInfo[1].buffer);
// // float* landmData = reinterpret_cast<float*>(outputLayersInfo[2].buffer); // Commented out if not used

// // int step = 16; // Commented out if not used
// // int anchor = 8; // Commented out if not used

// // Define the prior box configuration
// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// // Debugging: Print the first few prior boxes
// std::cout << "Generated Prior Boxes (first 5):" << std::endl;
// for (size_t i = 0; i < std::min(prior_data.size(), size_t(5)); ++i) {
// std::cout << "Prior " << i << ": ("
// << prior_data[i][0] << ", "
// << prior_data[i][1] << ", "
// << prior_data[i][2] << ", "
// << prior_data[i][3] << ")" << std::endl;
// }

// // Convert raw output data to usable vectors
// size_t num_priors = 16800; // Number of prior boxes
// size_t num_values = 4; // Number of values per box

// std::vector loc_data(num_priors * num_values);
// std::vector priors_data(num_priors * num_values); // Placeholder for priors data

// // Copy data from raw output
// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// // Convert prior data to a flat vector
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// priors_data[i * num_values + j] = prior_data[i][j];
// }
// }

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors(num_priors, std::vector(num_values));

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// loc[i][j] = loc_data[i * num_values + j];
// priors[i][j] = priors_data[i * num_values + j];
// }
// }

// // Define variances
// std::vector variances = {0.1f, 0.2f};

// // Decode bounding box predictions
// auto decoded_boxes = decode(loc, priors, variances);

// // Debugging: Print the first few decoded bounding boxes
// std::cout << "Decoded Bounding Boxes (first 5):" << std::endl;
// for (size_t i = 0; i < std::min(decoded_boxes.size(), size_t(5)); ++i) {
// std::cout << "Decoded Box " << i << ": ("
// << decoded_boxes[i][0] << ", "
// << decoded_boxes[i][1] << ", "
// << decoded_boxes[i][2] << ", "
// << decoded_boxes[i][3] << ")" << std::endl;
// }

// // Define image dimensions
// float img_width = 640.0f; // Example image width
// float img_height = 640.0f; // Example image height

// // Define scale and resize factors
// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// // Scale and resize boxes
// scaleAndResizeBoxes(decoded_boxes, scale, resize);
// clipBoxes(decoded_boxes, img_width, img_height);
// // Debugging: Print the first few scaled and resized bounding boxes
// std::cout << "Scaled and Resized Bounding Boxes (first 5):" << std::endl;
// for (size_t i = 0; i < std::min(decoded_boxes.size(), size_t(5)); ++i) {
// std::cout << "Scaled Box " << i << ": ("
// << decoded_boxes[i][0] << ", "
// << decoded_boxes[i][1] << ", "
// << decoded_boxes[i][2] << ", "
// << decoded_boxes[i][3] << ")" << std::endl;
// }

// // Convert to BoundingBox structure for NMS and Top-K
// std::vector detections;
// for (size_t i = 0; i < decoded_boxes.size(); ++i) {
// BoundingBox box;
// box.x1 = decoded_boxes[i][0];
// box.y1 = decoded_boxes[i][1];
// box.x2 = decoded_boxes[i][2];
// box.y2 = decoded_boxes[i][3];
// box.score = confData[i]; // Assuming confData holds confidence scores
// detections.push_back(box);
// }

// // Apply Top-K selection
// int top_k = 750; // Adjust this value as needed
// detections = keepTopKDetections(detections, top_k);

// // Apply Non-Maximum Suppression (NMS)
// float nms_thresh = 0.4f; // Example threshold, adjust as needed
// auto nmsDetections = nms(detections, nms_thresh);

// // Debugging: Print the final detections after NMS
// std::cout << "Final Detections after NMS (first 5):" << std::endl;
// for (size_t i = 0; i < std::min(nmsDetections.size(), size_t(5)); ++i) {
// std::cout << "NMS Box " << i << ": ("
// << nmsDetections[i].x1 << ", "
// << nmsDetections[i].y1 << ", "
// << nmsDetections[i].x2 << ", "
// << nmsDetections[i].y2 << ")" << std::endl;
// }

// // Convert NMS results to NvDsInferObjectDetectionInfo format
// for (const auto &box : nmsDetections) {
// NvDsInferObjectDetectionInfo oinfo; // Fixed typo
// oinfo.classId = 0; // Assign appropriate class ID
// oinfo.left = static_cast(std::round(box.x1));
// oinfo.top = static_cast(std::round(box.y1));
// oinfo.width = static_cast(std::round(box.x2 - box.x1));
// oinfo.height = static_cast(std::round(box.y2 - box.y1));
// oinfo.detectionConfidence = box.score;

// objectList.push_back(oinfo);
// }

// // Optionally print the final object list for debugging
// std::cout << "Final Object List after NMS:\n";
// printObjectList(objectList);

// return true;
// }
// codebase-1 stop

//codebase-2 start

// Define the structure to hold bounding box information
struct BoundingBox {
float x1, y1, x2, y2, score;
};

// // Function to calculate Intersection over Union (IoU)
// float IoU(const BoundingBox &a, const BoundingBox &b) {
// float inter_x1 = std::max(a.x1, b.x1);
// float inter_y1 = std::max(a.y1, b.y1);
// float inter_x2 = std::min(a.x2, b.x2);
// float inter_y2 = std::min(a.y2, b.y2);

// float inter_area = std::max(0.0f, inter_x2 - inter_x1 + 1) * std::max(0.0f, inter_y2 - inter_y1 + 1);
// float box_a_area = (a.x2 - a.x1 + 1) * (a.y2 - a.y1 + 1);
// float box_b_area = (b.x2 - b.x1 + 1) * (b.y2 - b.y1 + 1);

// return inter_area / (box_a_area + box_b_area - inter_area);
// }

// // Function to perform Non-Maximum Suppression (NMS)
// std::vector<BoundingBox> nms(std::vector<BoundingBox> &boxes, float iou_threshold) {
// std::vector<BoundingBox> result;

// // Sort boxes by score in descending order
// std::sort(boxes.begin(), boxes.end(), [](const BoundingBox &a, const BoundingBox &b) {
// return a.score > b.score;
// });

// std::vector<bool> suppressed(boxes.size(), false);

// for (size_t i = 0; i < boxes.size(); ++i) {
// if (suppressed[i]) continue;

// result.push_back(boxes[i]);

// for (size_t j = i + 1; j < boxes.size(); ++j) {
// if (IoU(boxes[i], boxes[j]) > iou_threshold) {
// suppressed[j] = true;
// }
// }
// }

// return result;
// }

float IoU(const NvDsInferInstanceMaskInfo &a, const NvDsInferInstanceMaskInfo &b) {
float inter_x1 = std::max(a.left, b.left);
float inter_y1 = std::max(a.top, b.top);
float inter_x2 = std::min(a.left + a.width, b.left + b.width);
float inter_y2 = std::min(a.top + a.height, b.top + b.height);

float inter_area = std::max(0.0f, inter_x2 - inter_x1) * std::max(0.0f, inter_y2 - inter_y1);
float box_a_area = a.width * a.height;
float box_b_area = b.width * b.height;

return inter_area / (box_a_area + box_b_area - inter_area);

}

std::vector<NvDsInferInstanceMaskInfo> nms(std::vector<NvDsInferInstanceMaskInfo> &detections, float iou_threshold) {
    std::vector<NvDsInferInstanceMaskInfo> result;

// Sort by detection confidence in descending order
std::sort(detections.begin(), detections.end(), [](const NvDsInferInstanceMaskInfo &a, const NvDsInferInstanceMaskInfo &b) {
    return a.detectionConfidence > b.detectionConfidence;
});

std::vector<bool> suppressed(detections.size(), false);

for (size_t i = 0; i < detections.size(); ++i) {
    if (suppressed[i]) continue;

    result.push_back(detections[i]);

    for (size_t j = i + 1; j < detections.size(); ++j) {
        if (IoU(detections[i], detections[j]) > iou_threshold) {
            suppressed[j] = true;
        }
    }
}

return result;

}
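
// Concern with the NMS above: when these NvDsInferInstanceMaskInfo entries
// carry heap-allocated mask buffers (the landmarks, as in the code further
// below), suppressed entries are dropped without their masks being freed.
// Cleanup sketch, assuming each mask was allocated with new float[]:
// std::vector<NvDsInferInstanceMaskInfo> kept = nms(detections, IOU_THRESH);
// for (auto &d : detections) {
//     bool isKept = false;
//     for (const auto &k : kept) {
//         if (k.mask == d.mask) { isKept = true; break; }
//     }
//     if (!isKept && d.mask) {
//         delete[] d.mask;
//         d.mask = nullptr;
//     }
// }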

// // Function to keep only the top-K detections
// std::vector<BoundingBox> keepTopKDetections(std::vector<BoundingBox> &detections, int top_k) {
// // Sort detections by score
// std::sort(detections.begin(), detections.end(), [](const BoundingBox &a, const BoundingBox &b) {
// return a.score > b.score;
// });

// // Keep only the top-K detections
// if (detections.size() > static_cast<size_t>(top_k)) {
// detections.resize(top_k);
// }

// return detections;
// }

// Function to keep only the top-K detections
std::vector<NvDsInferInstanceMaskInfo> keepTopKDetections(std::vector<NvDsInferInstanceMaskInfo> &detections, int top_k) {
    // Sort detections by detectionConfidence (score)
    std::sort(detections.begin(), detections.end(), [](const NvDsInferInstanceMaskInfo &a, const NvDsInferInstanceMaskInfo &b) {
        return a.detectionConfidence > b.detectionConfidence;
    });

// Keep only the top-K detections
if (detections.size() > static_cast<size_t>(top_k)) {
    detections.resize(top_k);
}

return detections;

}

// Function to clip bounding boxes to image dimensions
void clipBoxes(std::vector<std::vector<float>>& boxes, float img_width, float img_height) {
for (auto& box : boxes) {
box[0] = std::max(0.0f, std::min(box[0], img_width - 1));
box[1] = std::max(0.0f, std::min(box[1], img_height - 1));
box[2] = std::max(0.0f, std::min(box[2], img_width - 1));
box[3] = std::max(0.0f, std::min(box[3], img_height - 1));
}
}

// // The main parsing function with integrated Top-K and NMS filtering- code1
// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // Step 1: Access the layer output buffers
// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// //float* confData = reinterpret_cast<float*>(outputLayersInfo[1].buffer);
// //printLayerInfo(outputLayersInfo);
// // Step 1: Access the confData buffer from Layer 2
// float* confData = reinterpret_cast<float*>(outputLayersInfo[2].buffer);

// // Step 2: Print details to ensure you're accessing the right layer
// // std::cout << "Layer 2 (conf) Buffer Contents (first 10 pairs):" << std::endl;
// // for (size_t i = 0; i < 10; i += 2) {
// // std::cout << "Background Score: " << confData[i] << ", Object Score: " << confData[i + 1] << std::endl;
// // }

// // Step 3: Extract and print the object scores
// // std::vector<float> objectScores;
// // std::cout << "Extracted Object Scores (first 10):" << std::endl;
// // for (size_t i = 0; i < 10; ++i) {
// // float objectScore = confData[i * 2 + 1]; // Every 2nd value should be the object score
// // objectScores.push_back(objectScore);
// // std::cout << "objectScores[" << i << "] = " << objectScores[i] << std::endl;
// // }

// // Step 2: Define the prior box configuration
// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// // Debugging: Print the first few prior boxes
// // std::cout << "Generated Prior Boxes (first 5):" << std::endl;
// // for (size_t i = 0; i < std::min(prior_data.size(), size_t(5)); ++i) {
// // std::cout << "Prior " << i << ": ("
// // << prior_data[i][0] << ", "
// // << prior_data[i][1] << ", "
// // << prior_data[i][2] << ", "
// // << prior_data[i][3] << ")" << std::endl;
// // }

// // Step 3: Convert raw output data to usable vectors
// size_t num_priors = 16800; // Number of prior boxes
// size_t num_values = 4; // Number of values per box

// std::vector loc_data(num_priors * num_values);
// std::vector priors_data(num_priors * num_values); // Placeholder for priors data

// // Copy data from raw output
// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// // Convert prior data to a flat vector
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// priors_data[i * num_values + j] = prior_data[i][j];
// }
// }

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors(num_priors, std::vector(num_values));

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// loc[i][j] = loc_data[i * num_values + j];
// priors[i][j] = priors_data[i * num_values + j];
// }
// }

// // Step 4: Decode bounding box predictions
// std::vector variances = {0.1f, 0.2f};
// auto decoded_boxes = decode(loc, priors, variances);

// // // Debugging: Print the first few decoded bounding boxes
// // std::cout << "Decoded Bounding Boxes (first 5):" << std::endl;
// // for (size_t i = 0; i < std::min(decoded_boxes.size(), size_t(5)); ++i) {
// // std::cout << "Decoded Box " << i << ": ("
// // << decoded_boxes[i][0] << ", "
// // << decoded_boxes[i][1] << ", "
// // << decoded_boxes[i][2] << ", "
// // << decoded_boxes[i][3] << ")" << std::endl;
// // }

// // Step 5: Define image dimensions and scale factors
// float img_width = 640.0f; // Example image width
// float img_height = 640.0f; // Example image height

// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// // Step 6: Scale and resize boxes
// scaleAndResizeBoxes(decoded_boxes, scale, resize);
// clipBoxes(decoded_boxes, img_width, img_height);

// // Debugging: Print the first few scaled and resized bounding boxes
// // std::cout << "Scaled and Resized Bounding Boxes (first 5):" << std::endl;
// // for (size_t i = 0; i < std::min(decoded_boxes.size(), size_t(5)); ++i) {
// // std::cout << "Scaled Box " << i << ": ("
// // << decoded_boxes[i][0] << ", "
// // << decoded_boxes[i][1] << ", "
// // << decoded_boxes[i][2] << ", "
// // << decoded_boxes[i][3] << ")" << std::endl;
// // }

// // Step 7: Convert decoded boxes to BoundingBox structure for NMS and Top-K filtering
// std::vector detections;
// for (size_t i = 0; i < decoded_boxes.size(); ++i) {
// BoundingBox box;
// box.x1 = decoded_boxes[i][0];
// box.y1 = decoded_boxes[i][1];
// box.x2 = decoded_boxes[i][2];
// box.y2 = decoded_boxes[i][3];

// // Extract the object confidence score (assuming alternating pattern in confData)
// float object_confidence = confData[i * 2 + 1]; // Every 2nd value is the object score

// // Assign the extracted confidence score to the box
// box.score = object_confidence;

// detections.push_back(box);
// }

// // Step 8: Apply Top-K filtering before NMS
// int top_k = 750; // Adjust this value as needed
// detections = keepTopKDetections(detections, top_k);

// // Step 9: Apply Non-Maximum Suppression (NMS)
// float nms_thresh = 0.4f; // Example threshold, adjust as needed
// auto nmsDetections = nms(detections, nms_thresh);

// // Step 10: Apply Top-K filtering after NMS (if necessary)
// int keep_top_k = 750; // Adjust this value as needed
// nmsDetections = keepTopKDetections(nmsDetections, keep_top_k);

// // Debugging: Print the final detections after NMS
// // std::cout << "Final Detections after NMS (first 5):" << std::endl;
// // for (size_t i = 0; i < std::min(nmsDetections.size(), size_t(5)); ++i) {
// // std::cout << "NMS Box " << i << ": ("
// // << nmsDetections[i].x1 << ", "
// // << nmsDetections[i].y1 << ", "
// // << nmsDetections[i].x2 << ", "
// // << nmsDetections[i].y2 << ")" << std::endl;
// // }

// // Step 11: Convert NMS results to NvDsInferObjectDetectionInfo format
// // for (const auto &box : nmsDetections) {
// // NvDsInferObjectDetectionInfo oinfo;
// // oinfo.classId = 0; // Assign appropriate class ID
// // oinfo.left = static_cast(std::round(box.x1));
// // oinfo.top = static_cast(std::round(box.y1));
// // oinfo.width = static_cast(std::round(box.x2 - box.x1));
// // oinfo.height = static_cast(std::round(box.y2 - box.y1));
// // oinfo.detectionConfidence = box.score;

// // objectList.push_back(oinfo);
// // }

// // Step 11: Convert NMS results to NvDsInferObjectDetectionInfo format, applying confidence threshold filtering
// float confidence_threshold = 0.5f; // Example threshold, adjust as needed

// for (const auto &box : nmsDetections) {
// if (box.score >= confidence_threshold) { // Filter based on confidence threshold
// NvDsInferObjectDetectionInfo oinfo;
// oinfo.classId = 0; // Assign appropriate class ID
// oinfo.left = static_cast(std::round(box.x1));
// oinfo.top = static_cast(std::round(box.y1));
// oinfo.width = static_cast(std::round(box.x2 - box.x1));
// oinfo.height = static_cast(std::round(box.y2 - box.y1));
// oinfo.detectionConfidence = box.score;

// objectList.push_back(oinfo);
// }
// }

// // Optionally print the final object list for debugging
// // std::cout << "Final Object List after NMS:\n";
// // printObjectList(objectList);

// return true;
// }

#include <vector>
#include <iostream>
#include <cmath>
#include <algorithm>

// Function to decode landmarks
std::vector<std::vector<float>> decodeLandmarks(
    const std::vector<std::vector<float>>& landm,
    const std::vector<std::vector<float>>& priors,
    const std::vector<float>& variances)
{
size_t num_priors = priors.size();
size_t num_landmarks = landm.size();

if (num_landmarks != num_priors) {
    std::cerr << "Error: Number of priors does not match number of landmarks.\n";
    return {};
}

std::vector<std::vector<float>> decoded_landmarks(num_priors, std::vector<float>(10)); // 5 landmarks (x, y)

for (size_t i = 0; i < num_priors; ++i) {
    const auto& prior = priors[i];
    const auto& l = landm[i];

    for (size_t j = 0; j < 5; ++j) {
        float center_x = prior[0] + l[2 * j] * variances[0] * prior[2];
        float center_y = prior[1] + l[2 * j + 1] * variances[0] * prior[3];

        decoded_landmarks[i][2 * j] = center_x;
        decoded_landmarks[i][2 * j + 1] = center_y;
    }
}

return decoded_landmarks;

}

// Function to scale and resize landmarks
void scaleAndResizeLandmarks(std::vector<std::vector<float>>& landmarks, const std::vector<float>& scale, float resize) {
for (auto& landmark : landmarks) {
for (size_t i = 0; i < landmark.size(); ++i) {
// Apply scaling: Even indices (x coordinates) use width scaling, odd indices (y coordinates) use height scaling
landmark[i] = landmark[i] * scale[i % 2] / resize; // Alternate between width and height for scaling
}
}
}
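
// The pattern I am attempting (see the commented variants below) is to carry
// the 10 landmark values through the instance-mask fields of
// NvDsInferInstanceMaskInfo and read them back downstream from the object
// metadata. Condensed sketch of both ends; the downstream half assumes
// gst-nvinfer copies the parser's mask into obj_meta->mask_params when
// output-instance-mask=1:
//
// // Parser side: pack 5 (x, y) landmarks of detection i into the mask field.
// NvDsInferInstanceMaskInfo bbi;
// bbi.mask_size = 10 * sizeof(float);
// bbi.mask = new float[10];
// for (size_t j = 0; j < 10; ++j) bbi.mask[j] = decoded_landmarks[i][j];
// bbi.mask_width = 5;   // 5 landmarks
// bbi.mask_height = 2;  // x and y per landmark
//
// // Downstream side (e.g., in a pad probe on the element after nvinfer):
// // float *lm = obj_meta->mask_params.data;
// // if (lm && obj_meta->mask_params.size >= 10 * sizeof(float)) {
// //     float x0 = lm[0], y0 = lm[1];  // first landmark, and so on
// // }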

// // Main parsing function
// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // Step 1: Access the layer output buffers
// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// float* confData = reinterpret_cast<float*>(outputLayersInfo[2].buffer);
// float* landmData = reinterpret_cast<float*>(outputLayersInfo[1].buffer); // Assuming landmarks in the correct layer

// // Step 2: Define the prior box configuration
// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// size_t num_priors = 16800;
// size_t num_values = 4;

// std::vector loc_data(num_priors * num_values);
// std::vector priors_data(num_priors * num_values); // Placeholder for priors data

// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// priors_data[i * num_values + j] = prior_data[i][j];
// }
// }

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors(num_priors, std::vector(num_values));

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// loc[i][j] = loc_data[i * num_values + j];
// priors[i][j] = priors_data[i * num_values + j];
// }
// }

// std::vector variances = {0.1f, 0.2f};
// auto decoded_boxes = decode(loc, priors, variances);

// float img_width = 640.0f;
// float img_height = 640.0f;

// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// scaleAndResizeBoxes(decoded_boxes, scale, resize);
// clipBoxes(decoded_boxes, img_width, img_height);

// std::vector detections;
// for (size_t i = 0; i < decoded_boxes.size(); ++i) {
// BoundingBox box;
// box.x1 = decoded_boxes[i][0];
// box.y1 = decoded_boxes[i][1];
// box.x2 = decoded_boxes[i][2];
// box.y2 = decoded_boxes[i][3];

// float object_confidence = confData[i * 2 + 1];
// box.score = object_confidence;

// detections.push_back(box);
// }

// // Decode landmarks
// std::vector<std::vector> landm(num_priors, std::vector(10)); // Each landmark has 10 values
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < 10; ++j) {
// landm[i][j] = landmData[i * 10 + j];
// }
// }
// auto decoded_landmarks = decodeLandmarks(landm, prior_data, variances);
// scaleAndResizeLandmarks(decoded_landmarks, scale, resize);

// // Optional: Print the decoded landmarks for debugging and check if they are inside the bounding box
// for (size_t i = 0; i < decoded_landmarks.size(); ++i) {
// const auto& landmark = decoded_landmarks[i];
// const auto& bbox = decoded_boxes[i];

// float bbox_left = bbox[0];
// float bbox_top = bbox[1];
// float bbox_right = bbox[2];
// float bbox_bottom = bbox[3];

// std::cout << "Landmark " << i + 1 << “:\n”;

// for (size_t j = 0; j < 5; ++j) { // 5 landmarks (x, y)
// float x = landmark[2 * j];
// float y = landmark[2 * j + 1];

// bool is_inside_bbox = (x >= bbox_left && x <= bbox_right && y >= bbox_top && y <= bbox_bottom);

// std::cout << "Point " << j + 1 << “: (” << x << ", " << y << ") ";
// if (is_inside_bbox) {
// std::cout << “is inside the bounding box.” << std::endl;
// } else {
// std::cout << “is outside the bounding box.” << std::endl;
// }
// }
// }

// int top_k = 750;
// detections = keepTopKDetections(detections, top_k);

// float nms_thresh = 0.4f;
// auto nmsDetections = nms(detections, nms_thresh);

// int keep_top_k = 750;
// nmsDetections = keepTopKDetections(nmsDetections, keep_top_k);

// float confidence_threshold = 0.5f;

// for (const auto &box : nmsDetections) {
// if (box.score >= confidence_threshold) {
// NvDsInferObjectDetectionInfo oinfo;
// oinfo.classId = 0;
// oinfo.left = static_cast(std::round(box.x1));
// oinfo.top = static_cast(std::round(box.y1));
// oinfo.width = static_cast(std::round(box.x2 - box.x1));
// oinfo.height = static_cast(std::round(box.y2 - box.y1));
// oinfo.detectionConfidence = box.score;

// objectList.push_back(oinfo);
// }
// }

// return true;
// }

// //codebase2-stop

// extern “C” bool NvDsInferParseCustomRetinaface(
// std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList)
// {
// return NvDsInferParseRetinaface(
// outputLayersInfo, networkInfo, detectionParams, objectList);
// }

// /* Check that the custom function has been defined correctly */
// CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomRetinaface);

// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // Step 1: Access the layer output buffers
// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// float* confData = reinterpret_cast<float*>(outputLayersInfo[2].buffer);
// float* landmData = reinterpret_cast<float*>(outputLayersInfo[1].buffer); // Assuming landmarks in the correct layer

// // Step 2: Define the prior box configuration
// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// size_t num_priors = 16800;
// size_t num_values = 4;

// std::vector loc_data(num_priors * num_values);
// std::vector priors_data(num_priors * num_values); // Placeholder for priors data

// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// priors_data[i * num_values + j] = prior_data[i][j];
// }
// }

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors(num_priors, std::vector(num_values));

// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < num_values; ++j) {
// loc[i][j] = loc_data[i * num_values + j];
// priors[i][j] = priors_data[i * num_values + j];
// }
// }

// std::vector variances = {0.1f, 0.2f};
// auto decoded_boxes = decode(loc, priors, variances);

// float img_width = 640.0f;
// float img_height = 640.0f;

// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// scaleAndResizeBoxes(decoded_boxes, scale, resize);
// clipBoxes(decoded_boxes, img_width, img_height);

// // Prepare for storing detections
// std::vector detections;

// // Decode landmarks
// std::vector<std::vector> landm(num_priors, std::vector(10)); // Each landmark has 10 values
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < 10; ++j) {
// landm[i][j] = landmData[i * 10 + j];
// }
// }
// auto decoded_landmarks = decodeLandmarks(landm, prior_data, variances);
// scaleAndResizeLandmarks(decoded_landmarks, scale, resize);

// // Add bounding boxes and landmarks to NvDsInferInstanceMaskInfo
// for (size_t i = 0; i < decoded_boxes.size(); ++i) {
// NvDsInferInstanceMaskInfo bbi;

// // Set bounding box
// bbi.left = decoded_boxes[i][0];
// bbi.top = decoded_boxes[i][1];
// bbi.width = decoded_boxes[i][2] - decoded_boxes[i][0];
// bbi.height = decoded_boxes[i][3] - decoded_boxes[i][1];

// // Set detection confidence (object score)
// float object_confidence = confData[i * 2 + 1];
// bbi.detectionConfidence = object_confidence;

// // Allocate memory for landmarks (10 elements: x1, y1, …, x5, y5)
// bbi.mask_size = 10 * sizeof(float);
// bbi.mask = new float[10];

// // Copy the landmarks (x1, y1, x2, y2, …, x5, y5)
// for (size_t j = 0; j < 10; ++j) {
// bbi.mask[j] = decoded_landmarks[i][j];
// }

// // Set mask dimensions to indicate it's used for landmarks
// bbi.mask_width = 5; // 5 landmarks
// bbi.mask_height = 2; // Each has x and y (2 values per landmark)

// // Add the bounding box and landmarks to the detections list
// detections.push_back(bbi);
// }

// // Perform NMS on the bounding boxes
// float nms_thresh = 0.4f;
// auto nmsDetections = nms(detections, nms_thresh);

// // Keep top-k detections after NMS
// int keep_top_k = 750;
// nmsDetections = keepTopKDetections(nmsDetections, keep_top_k);

// // Add valid detections to the final objectList
// float confidence_threshold = 0.5f;
// for (const auto &detection : nmsDetections) {
// if (detection.detectionConfidence >= confidence_threshold) {
// objectList.push_back(detection);
// }
// }

// return true;
// }

// static bool NvDsInferParseRetinaface(std::vector const &outputLayersInfo,
// NvDsInferNetworkInfo const &networkInfo,
// NvDsInferParseDetectionParams const &detectionParams,
// std::vector &objectList) {

// // Step 1: Access the layer output buffers
// float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
// float* confData = reinterpret_cast<float*>(outputLayersInfo[2].buffer);
// float* landmData = reinterpret_cast<float*>(outputLayersInfo[1].buffer); // Assuming landmarks are in this layer

// // Step 2: Define the prior box configuration
// std::vector<std::vector> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
// std::vector steps = {8, 16, 32};
// bool clip = true;
// std::vector image_size = {640, 640};

// PriorBox prior_box(min_sizes, steps, clip, image_size);
// auto prior_data = prior_box.forward();

// size_t num_priors = 16800;
// size_t num_values = 4;

// // Prepare for bounding box data
// std::vector loc_data(num_priors * num_values);
// std::copy(boxesData, boxesData + loc_data.size(), loc_data.begin());

// std::vector<std::vector> loc(num_priors, std::vector(num_values));
// std::vector<std::vector> priors = prior_data; // Assume priors are correctly generated

// // Decode bounding boxes
// std::vector variances = {0.1f, 0.2f};
// auto decoded_boxes = decode(loc, priors, variances);

// // Image dimensions and scaling
// float img_width = 640.0f;
// float img_height = 640.0f;
// std::vector scale = {img_width, img_height, img_width, img_height};
// float resize = 1.0f;

// // Scale and resize the decoded boxes
// scaleAndResizeBoxes(decoded_boxes, scale, resize);
// clipBoxes(decoded_boxes, img_width, img_height);

// // Step 3: Prepare for storing detections
// std::vector detections;

// // Decode landmarks
// std::vector<std::vector> landm(num_priors, std::vector(10)); // Each landmark has 10 values
// for (size_t i = 0; i < num_priors; ++i) {
// for (size_t j = 0; j < 10; ++j) {
// landm[i][j] = landmData[i * 10 + j];
// }
// }

// auto decoded_landmarks = decodeLandmarks(landm, priors, variances);
// scaleAndResizeLandmarks(decoded_landmarks, scale, resize);

// // Step 4: Add bounding boxes and landmarks to NvDsInferInstanceMaskInfo
// for (size_t i = 0; i < decoded_boxes.size(); ++i) {
// NvDsInferInstanceMaskInfo bbi;

// // Set bounding box
// bbi.left = decoded_boxes[i][0];
// bbi.top = decoded_boxes[i][1];
// bbi.width = decoded_boxes[i][2] - decoded_boxes[i][0];
// bbi.height = decoded_boxes[i][3] - decoded_boxes[i][1];

// // Set detection confidence (object score)
// float object_confidence = confData[i * 2 + 1];
// bbi.detectionConfidence = object_confidence;

// // Allocate memory for landmarks (10 elements: x1, y1, …, x5, y5)
// bbi.mask_size = 10 * sizeof(float);
// bbi.mask = new float[10];

// // Copy the landmarks (x1, y1, x2, y2, …, x5, y5)
// for (size_t j = 0; j < 10; ++j) {
// bbi.mask[j] = decoded_landmarks[i][j];
// }

// // Set mask dimensions to indicate it's used for landmarks
// bbi.mask_width = 5; // 5 landmarks
// bbi.mask_height = 2; // Each has x and y (2 values per landmark)

// // Add the bounding box and landmarks to the detections list
// detections.push_back(bbi);
// }

// // Step 5: Perform NMS on the bounding boxes
// float nms_thresh = 0.4f;
// auto nmsDetections = nms(detections, nms_thresh);

// // Step 6: Keep top-k detections after NMS
// int keep_top_k = 750;
// nmsDetections = keepTopKDetections(nmsDetections, keep_top_k);

// // Step 7: Add valid detections to the final objectList
// float confidence_threshold = 0.5f;
// for (const auto &detection : nmsDetections) {
// if (detection.detectionConfidence >= confidence_threshold) {
// objectList.push_back(detection);
// }
// }

// return true;
// }


static bool NvDsInferParseRetinaface(std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
                                     NvDsInferNetworkInfo const &networkInfo,
                                     NvDsInferParseDetectionParams const &detectionParams,
                                     std::vector<NvDsInferInstanceMaskInfo> &objectList) {

// Step 1: Access the layer output buffers.
// NOTE: indexing outputLayersInfo by position assumes a fixed output-layer
// order (boxes, landmarks, confidences); matching on layerName would be safer.
float* boxesData = reinterpret_cast<float*>(outputLayersInfo[0].buffer);
float* confData  = reinterpret_cast<float*>(outputLayersInfo[2].buffer);
float* landmData = reinterpret_cast<float*>(outputLayersInfo[1].buffer);  // Assuming landmarks are in this layer

// Step 2: Define the prior box configuration
std::vector<std::vector<float>> min_sizes = {{16, 32}, {64, 128}, {256, 512}};
std::vector<float> steps = {8, 16, 32};
bool clip = true;
std::vector<int> image_size = {640, 640};

PriorBox prior_box(min_sizes, steps, clip, image_size);
auto prior_data = prior_box.forward();

size_t num_priors = 16800;
size_t num_values = 4;

// Prepare the bounding-box (loc) data: reshape the flat layer output into
// num_priors rows of num_values each. Without this copy, `loc` would stay
// zero-initialized and decode() would produce degenerate boxes.
std::vector<std::vector<float>> loc(num_priors, std::vector<float>(num_values));
for (size_t i = 0; i < num_priors; ++i) {
    for (size_t j = 0; j < num_values; ++j) {
        loc[i][j] = boxesData[i * num_values + j];
    }
}

std::vector<std::vector<float>> priors = prior_data;  // Priors generated above by PriorBox

// Decode bounding boxes
std::vector<float> variances = {0.1f, 0.2f};
auto decoded_boxes = decode(loc, priors, variances);

// Image dimensions and scaling
float img_width = 640.0f;
float img_height = 640.0f;
std::vector<float> scale = {img_width, img_height, img_width, img_height};
float resize = 1.0f;

// Scale and resize the decoded boxes
scaleAndResizeBoxes(decoded_boxes, scale, resize);
clipBoxes(decoded_boxes, img_width, img_height);

// Step 3: Prepare for storing detections
std::vector<NvDsInferInstanceMaskInfo> detections;

// Decode landmarks
std::vector<std::vector<float>> landm(num_priors, std::vector<float>(10));  // Each landmark has 10 values
for (size_t i = 0; i < num_priors; ++i) {
    for (size_t j = 0; j < 10; ++j) {
        landm[i][j] = landmData[i * 10 + j];
    }
}

auto decoded_landmarks = decodeLandmarks(landm, priors, variances);
scaleAndResizeLandmarks(decoded_landmarks, scale, resize);

// Step 4: Add bounding boxes and landmarks to NvDsInferInstanceMaskInfo
for (size_t i = 0; i < decoded_boxes.size(); ++i) {
    NvDsInferInstanceMaskInfo bbi {};  // value-initialize so classId etc. do not hold garbage
    bbi.classId = 0;                   // single-class (face) model assumed

    // Set bounding box
    bbi.left = decoded_boxes[i][0];
    bbi.top = decoded_boxes[i][1];
    bbi.width = decoded_boxes[i][2] - decoded_boxes[i][0];
    bbi.height = decoded_boxes[i][3] - decoded_boxes[i][1];
    
    // Set detection confidence (object score)
    float object_confidence = confData[i * 2 + 1];
    bbi.detectionConfidence = object_confidence;

    // Apply confidence threshold: only consider boxes with a high enough confidence score
    float confidence_threshold = 0.5f;  // Adjust this value as needed
    if (bbi.detectionConfidence < confidence_threshold) {
        continue;  // Skip low-confidence detections
    }

    // Check if all landmarks are inside the bounding box
    bool all_landmarks_inside = true;

    std::cout << "Detection " << i + 1 << ":\n";
    std::cout << "  Bounding Box: "
              << "Left = " << bbi.left << ", Top = " << bbi.top
              << ", Width = " << bbi.width << ", Height = " << bbi.height
              << ", Confidence = " << bbi.detectionConfidence << "\n";

    // Allocate memory for landmarks (10 elements: x1, y1, ..., x5, y5)
    bbi.mask_size = 10 * sizeof(float);
    bbi.mask = new float[10];

    // Check and copy the landmarks (x1, y1, x2, y2, ..., x5, y5)
    std::cout << "  Landmarks:\n";
    for (size_t j = 0; j < 5; ++j) {
        float x = decoded_landmarks[i][2 * j];
        float y = decoded_landmarks[i][2 * j + 1];
        bbi.mask[2 * j] = x;
        bbi.mask[2 * j + 1] = y;

        std::cout << "    Point " << j + 1 << ": (" << x << ", " << y << ") ";

        // Check if the landmark is inside the bounding box
        bool is_inside_bbox = (x >= bbi.left && x <= (bbi.left + bbi.width) &&
                               y >= bbi.top && y <= (bbi.top + bbi.height));

        if (!is_inside_bbox) {
            all_landmarks_inside = false;  // If any point is outside, set this flag to false
        }

        std::cout << (is_inside_bbox ? "(Inside BBox)\n" : "(Outside BBox)\n");
    }

    // Only add detection to the final objectList if all landmarks are inside the bounding box
    if (all_landmarks_inside) {
        // Set mask dimensions to indicate it's used for landmarks
        bbi.mask_width = 5;  // 5 landmarks
        bbi.mask_height = 2; // Each has x and y (2 values per landmark)

        // Add the bounding box and landmarks to the detections list
        detections.push_back(bbi);
        std::cout << "  => All landmarks are inside the bounding box. Added to final detections.\n";
    } else {
        delete[] bbi.mask;  // Free memory for mask if not used
        std::cout << "  => Not all landmarks are inside the bounding box. Skipped.\n";
    }

    std::cout << std::endl;
}

// Step 5: Perform NMS on the bounding boxes
float nms_thresh = 0.4f;  // Adjust this threshold if needed
auto nmsDetections = nms(detections, nms_thresh);

// Step 6: Keep top-k detections after NMS
int keep_top_k = 750;  // You can lower this number to limit the results
nmsDetections = keepTopKDetections(nmsDetections, keep_top_k);

// Step 7: Add valid detections to the final objectList
for (const auto &detection : nmsDetections) {
    objectList.push_back(detection);
}

return true;

}

// Wrapper function for the custom parser
extern "C" bool NvDsInferParseCustomRetinaface(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList)
{
    return NvDsInferParseRetinaface(
        outputLayersInfo, networkInfo, detectionParams, objectList);
}

/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_INSTANCE_MASK_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomRetinaface);
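
For the mask (landmark) buffer to reach downstream elements at all, nvinfer has to be configured to call this instance-mask parser and to attach the mask output. A minimal nvinfer config sketch, following the keys used by the DeepStream MaskRCNN instance-segmentation sample; the library name and path here are placeholders for your build:

[property]
# instance segmentation
network-type=3
# attach the parser's mask buffer to the object metadata
output-instance-mask=1
# no clustering by nvinfer; this parser already does its own NMS
cluster-mode=4
parse-bbox-instance-mask-func-name=NvDsInferParseCustomRetinaface
custom-lib-path=/path/to/libnvds_infercustomparser_retinaface.so

If output-instance-mask stays at 0, the bounding boxes may still show up while the landmark (mask) buffer is never copied into the metadata, which is one common cause of detections appearing in the parser's logs but not downstream.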

You can refer to this community user's example.

When you store the output of the model into the std::vector<NvDsInferInstanceMaskInfo> (the objectList argument of the parser), the nvinfer element will convert it to object metadata for you.
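
To check that the conversion actually happened, you can attach a pad probe downstream of nvinfer and read the landmarks back out of obj_meta->mask_params. A minimal sketch, assuming the parser above filled bbi.mask with 10 floats; the probe name and print format are illustrative:

#include "gstnvdsmeta.h"

static GstPadProbeReturn
landmarks_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
    GstBuffer *buf = (GstBuffer *) info->data;
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
    if (!batch_meta)
        return GST_PAD_PROBE_OK;

    for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
        for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj != NULL; l_obj = l_obj->next) {
            NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
            /* The float buffer the parser put in bbi.mask should land here. */
            if (obj_meta->mask_params.data != NULL &&
                obj_meta->mask_params.size >= 10 * sizeof (float)) {
                float *lm = obj_meta->mask_params.data;
                for (int j = 0; j < 5; ++j)
                    g_print ("landmark %d: (%.1f, %.1f)\n", j, lm[2 * j], lm[2 * j + 1]);
            }
        }
    }
    return GST_PAD_PROBE_OK;
}

Note that nvmsgconv's default schema does not serialize mask_params, so getting the landmarks into the Kafka payload will likely require a custom message payload on top of this.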

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
