KLT NvMOT tracking

Hi,

After our previous question ( https://devtalk.nvidia.com/default/topic/1066252/deepstream-sdk/klt-nvmot-usage/ ) we are able to configure and process frames with the KLT tracker.

Our detector detects objects on every second frame; we would like to do image-based tracking and get the objects’ estimated positions on the intermediate frames.

NvMOT_Process has three parameters:
NvMOT_Process( NvMOTContextHandle contextHandle, NvMOTProcessParams *pParams, NvMOTTrackedObjBatch *pTrackedObjectsBatch )

pParams has a frameList member where we can specify objectsIn for each input frame.

As far as we understand, objectsIn should contain the object boundaries to be tracked.

We set frame.objectsIn.detectionDone to true if our detector ran on that frame, and to false on the intermediate frames.

We fill frame.objectsIn.list with the objects to track. However, we do not understand the object.doTracking field. Our guess is that (in our use case) it should always be true (on both the detected and the intermediate frames).

pTrackedObjectsBatch has a list member with one entry per stream in the batch (in our case 1); each entry is an NvMOTTrackedObjList, which we pre-allocate for the input object count (unfilled).

We call NvMOT_Process and it runs without error. The output object list contains exactly the same number of objects as the input. The age member of the output objects is always 0, associatedObjectIn points to one of the input objects, the confidence level is always 1.1 (?), and the bbox is exactly the same as the input bbox (even on the intermediate frames).

Could you please help us with what we should fill in (our current fill code is sketched below for reference):
-on the frames where there is no detection
-on the frames where there is detection
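
For reference, this is roughly how we fill objectsIn at the moment (a simplified sketch in terms of the inObjectVec and currObjectCount variables from our calling code pasted further below; detectorRanOnThisFrame stands in for our every-second-frame condition):

    // Sketch of our current per-frame setup: we always pass the latest known
    // detections and only toggle detectionDone.
    frame.objectsIn.detectionDone = detectorRanOnThisFrame;  // true on detection frames
    frame.objectsIn.numAllocated = inObjectVec.size();
    frame.objectsIn.numFilled = currObjectCount;             // same on both kinds of frames (is this right?)
    frame.objectsIn.list = inObjectVec.data();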

Kind regards,
Adam

Is it possible for you to extract your source so that it can run in our environment? We would like to reproduce it locally and do a further check.

Yes, of course. You will have to mock some parts of the code (the input image and the input objects). Is that OK with you?
https://pastebin.com/JQNqzCH9

Oh, I cannot open it; I will try again after I get back home.

This site can’t be reached The connection was reset.
Try:

Checking the connection
Checking the proxy and the firewall
Running Windows Network Diagnostics
ERR_CONNECTION_RESET

Oh, I do not know what happened; I can still reach the link. I will paste it here then. I’m sorry, it is a rather long snippet:

auto startPoint = std::chrono::high_resolution_clock::now();
    char path[] = "/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/tracker_config.yml";
 
    NvMOTQuery query{};
    {
        const auto status = NvMOT_Query( sizeof( path ), path, &query );
        if( status != NvMOTStatus_OK ) {
            std::cout << "Error";
        }
    }
 
    /* INIT */
 
    void* yDevMem;
    const uint32_t width = static_cast< uint32_t >( m_currentFrame.cols );
    const uint32_t height = static_cast< uint32_t >( m_currentFrame.rows );
    const uint32_t pitch = static_cast< uint32_t >( m_currentFrame.step1() );
    //cudaMallocManaged( &yDevMem, width * height, cudaMemAttachGlobal );
    yDevMem = m_currentFrame.data;
    const uint32_t fullSize = pitch * height;
 
    static bool initCompleted{ false };
    static NvMOTContextHandle pContextHandle{};
 
    if( !initCompleted ) {
        // IN params
        NvMOTPerTransformBatchConfig batchConfig[ 1 ]{};
        batchConfig->bufferType = NVBUF_MEM_CUDA_UNIFIED;
        batchConfig->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;
        batchConfig->maxHeight = height;
        batchConfig->maxPitch = pitch;
        batchConfig->maxSize = fullSize;
        batchConfig->maxWidth = width;
 
        NvMOTConfig pConfigIn{};
        pConfigIn.computeConfig = NVMOTCOMP_CPU;                    /**< Compute target. see NvMOTCompute */
        pConfigIn.maxStreams = 1;                                   /**< Maximum number of streams in a batch. */
        pConfigIn.numTransforms = 1;                                /**< Number of NvMOTPerTransformBatchConfig entries in perTransformBatchConfig */
        pConfigIn.perTransformBatchConfig = batchConfig;            /**< List of numTransform batch configs including type and resolution, one for each transform*/
        pConfigIn.miscConfig.gpuId = 0;                             /**< GPU to be used. */
        pConfigIn.miscConfig.maxObjPerBatch = 0;                    /**< Max number of objects to track per batch. 0 means no limit. */
        pConfigIn.miscConfig.maxObjPerStream = 0;                   /**< Max number of objects to track per stream. 0 means no limit. */
        pConfigIn.customConfigFilePathSize = sizeof( path ) ;       /**< The char length in customConfigFilePath */
        pConfigIn.customConfigFilePath = path;                      /**< Path to the tracker's custom config file. Null terminated */
 
        // OUT Params
        NvMOTConfigResponse pConfigResponse{};
 
        {
            const auto status = NvMOT_Init( &pConfigIn, &pContextHandle, &pConfigResponse );
            if( status != NvMOTStatus_OK ) {
                std::cout << "Error";
            } else {
                initCompleted = true;
            }
        }
    }
 
    /* PROCESS */
 
    // IN Params
    NvBufSurfaceParams bufferParam[ 1 ]{};
    bufferParam->width = width;                             /** width of buffer */
    bufferParam->height = height;                           /** height of buffer */
    bufferParam->pitch = pitch;                             /** pitch of buffer */
    bufferParam->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;        /** color format */
    bufferParam->layout = NVBUF_LAYOUT_PITCH;               /** BL or PL for Jetson, ONLY PL in case of dGPU */
    bufferParam->dataSize = fullSize;                           /** size of allocated memory */
    bufferParam->dataPtr = yDevMem;                     /** pointer to allocated memory, Not valid for NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE */
    bufferParam->planeParams.num_planes = 1;                        /** Number of planes */
    bufferParam->planeParams.width[ 0 ] = width;                /** width of planes */
    bufferParam->planeParams.height[ 0 ] = height;              /** height of planes */
    bufferParam->planeParams.pitch[ 0 ] = pitch;                /** pitch of planes in bytes */
    bufferParam->planeParams.offset[ 0 ] = 0;                       /** offsets of planes in bytes */
    bufferParam->planeParams.psize[ 0 ] = pitch * height;   /** size of planes in bytes */
    bufferParam->planeParams.bytesPerPix[ 0 ] = 1;                  /** bytes taken for each pixel */
 
    bufferParam->mappedAddr.addr[ 0 ] = yDevMem;            /** pointers of mapped buffers. Null Initialized values.*/
    bufferParam->mappedAddr.eglImage = nullptr;
    //bufferParam->bufferDesc;  /** dmabuf fd in case of NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE type memory. Invalid for other types. */
 
    NvBufSurfaceParams* bufferParamPtr{ bufferParam };
 
    static std::vector< NvMOTObjToTrack > inObjectVec{};
    static std::vector< NvMOTTrackedObj > outObjectVec{};
 
    size_t currObjectCount{ 0 };
    for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
        currObjectCount += m_objects.at( currType ).size();
    }
    if( inObjectVec.size() < currObjectCount ) {
        inObjectVec.resize( currObjectCount );
    }
    if( outObjectVec.size() < currObjectCount ) {
        outObjectVec.resize( currObjectCount );
    }
 
    size_t currInObjectIndex{ 0 };
 
    /* Iterate through the possible object types */
    /* The inner loop operates on the vector of tracked objects from that type */
    /* You can mock this section with generated objects */
    for( const TMetadataV2ObjectHeader::TObjectType currObjType: objectTypes ) {
        for( const auto& currObj: m_objects.at( currObjType ) ) {
            NvMOTObjToTrack& currInObject = inObjectVec[ currInObjectIndex++ ];
            currInObject.classId = static_cast< uint16_t >( currObjType );              /**< Class of the object to be tracked. */
            currInObject.bbox.x = currObj->getBBoxImage().x;                /**< Bounding box. */
            currInObject.bbox.y = currObj->getBBoxImage().y;
            currInObject.bbox.width = currObj->getBBoxImage().width;
            currInObject.bbox.height = currObj->getBBoxImage().height;
            currInObject.confidence = 1.f;          /**< Detection confidence of the object. */
            currInObject.doTracking = true;         /**< True: track this object.  False: do not initiate  tracking on this object. */
        }
    }
 
    NvMOTFrame frame{};
    frame.streamID = 0;                             /**< The stream source for this frame. */
    frame.frameNum = m_frameNum;                                /**< Frame number sequentially identifying the frame within a stream. */
    frame.timeStamp = m_currFrameTimestamp;                         /**< Timestamp of the frame at the time of capture. */
    frame.timeStampValid = true;                    /**< The timestamp value is properly populated. */
    frame.doTracking = true;                        /**< True: track objects in this frame; False: do not track this frame. */
    frame.reset = false;                            /**< True: reset tracking for the stream. */
    frame.numBuffers = 1;                           /**< Number of entries in bufferList. */
    frame.bufferList = &bufferParamPtr;             /**< Array of pointers to buffer params. */
    frame.objectsIn.detectionDone = ( m_frameNum % 2 == 0 ); // We detect on every second image
    frame.objectsIn.numAllocated = inObjectVec.size();
    frame.objectsIn.numFilled = currObjectCount;
    frame.objectsIn.list = inObjectVec.data();
 
    NvMOTProcessParams processParams{};
    processParams.numFrames = 1;
    processParams.frameList = &frame;
 
    // OUT Params
    NvMOTTrackedObjBatch outTrackedBatch{};
    NvMOTTrackedObjList outBatchObjects{};
    outBatchObjects.list = outObjectVec.data();
    outBatchObjects.streamID = 0;      /**< Stream associated with objects in the list. */
    outBatchObjects.frameNum = m_frameNum;    /**< Frame number for objects in the list. */
    outBatchObjects.valid = true;             /**< This entry in the batch is valid */
    outBatchObjects.numAllocated = outObjectVec.size();  /**< Number of blocks allocated for the list. */
    outBatchObjects.numFilled = /*outObjVec.size()*/0;     /**< Number of populated blocks in the list. */
 
    outTrackedBatch.numAllocated = 1;
    outTrackedBatch.numFilled = 1;
    outTrackedBatch.list = &outBatchObjects;
 
    {
        const auto status = NvMOT_Process( pContextHandle, &processParams, &outTrackedBatch );
        if( status != NvMOTStatus_OK ) {
            std::cout << "Error";
        }
    }
 
    for( size_t outIndex = 0; outIndex < outBatchObjects.numFilled; ++outIndex ) {
        const auto& currOutObj{ outObjectVec[ outIndex ] };
        const auto currOutAssociated{ currOutObj.associatedObjectIn };
        if( currOutAssociated != nullptr ) {
            std::cout << "Ref [x: " << currOutAssociated->bbox.x << " y: " << currOutAssociated->bbox.y
                      << " w: " << currOutAssociated->bbox.width << " h: " << currOutAssociated->bbox.height << "] "
                      << " Tracked [x: " << currOutObj.bbox.x << " y: " << currOutObj.bbox.y
                      << " w: " << currOutObj.bbox.width << " h: " << currOutObj.bbox.height << "]" << std::endl;
        } else {
            std::cout << "No association" << std::endl;
        }
    }
 
    auto endPoint = std::chrono::high_resolution_clock::now();
    std::chrono::duration<float, std::milli> fdur = endPoint - startPoint;
    std::cout << "Count: " << inObjectVec.size() << " Runtime: " << fdur.count() << " Avg runtime: " << fdur.count() / inObjectVec.size() << std::endl;

Thanks, but can you paste the whole thing, so that it can compile and run?

Unfortunately the remaining part is our production code, so I cannot share it. But I created a mocked version.

You need to adjust:
-the m_objects map: you can add/remove car/person typed objects (it contains the output of our object detector; we would like to do image-based tracking on them on every second frame)
-the first parameter of processFrame: it should be a frame from the video stream
-the second parameter: the frame number
-the third parameter: the frame timestamp

Is it ok with you?

#include <iostream>
#include <nvdstracker.h>
#include <cuda_runtime_api.h>
#include <chrono>
#include <vector>
#include <array>
#include <opencv2/core/mat.hpp>
#include <unordered_map>


namespace TMetadataV2ObjectHeader {
	enum class TObjectType {
		car,
		person
	};
}

struct Object {
	int x;
	int y;
	int width;
	int height;
};

static std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > m_objects{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
         Object{ 54, 34, 40, 30 }, // please fill
         Object{ 70, 30, 40, 65 }  // car typed objects
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
         Object{ 100, 320, 20, 130 }, // please fill
         Object{ 170, 130, 40, 200 }  // person typed objects
     }
    }
};

void processFrame( cv::Mat currentFrame, uint32_t frameNum, int64_t frameTimestamp )
{
	static constexpr std::array< TMetadataV2ObjectHeader::TObjectType, 2 > objectTypes{ {
		TMetadataV2ObjectHeader::TObjectType::car,
		TMetadataV2ObjectHeader::TObjectType::person
	} };

	auto startPoint = std::chrono::high_resolution_clock::now();
	char path[] = "/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/tracker_config.yml";

	NvMOTQuery query{};
	{
		const auto status = NvMOT_Query( sizeof( path ), path, &query );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	/* INIT */

	void* yDevMem;
	const uint32_t width = static_cast< uint32_t >( currentFrame.cols );
	const uint32_t height = static_cast< uint32_t >( currentFrame.rows );
	const uint32_t pitch = static_cast< uint32_t >( currentFrame.step1() );
	//cudaMallocManaged( &yDevMem, width * height, cudaMemAttachGlobal );
	yDevMem = currentFrame.data;
	const uint32_t fullSize = pitch * height;

	static bool initCompleted{ false };
	static NvMOTContextHandle pContextHandle{};

	if( !initCompleted ) {
		// IN params
		NvMOTPerTransformBatchConfig batchConfig[ 1 ]{};
		batchConfig->bufferType = NVBUF_MEM_CUDA_UNIFIED;
		batchConfig->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;
		batchConfig->maxHeight = height;
		batchConfig->maxPitch = pitch;
		batchConfig->maxSize = fullSize;
		batchConfig->maxWidth = width;

		NvMOTConfig pConfigIn{};
		pConfigIn.computeConfig = NVMOTCOMP_CPU;                    /**< Compute target. see NvMOTCompute */
		pConfigIn.maxStreams = 1;                                   /**< Maximum number of streams in a batch. */
		pConfigIn.numTransforms = 1;                                /**< Number of NvMOTPerTransformBatchConfig entries in perTransformBatchConfig */
		pConfigIn.perTransformBatchConfig = batchConfig;            /**< List of numTransform batch configs including type and resolution, one for each transform*/
		pConfigIn.miscConfig.gpuId = 0;                             /**< GPU to be used. */
		pConfigIn.miscConfig.maxObjPerBatch = 0;                    /**< Max number of objects to track per batch. 0 means no limit. */
		pConfigIn.miscConfig.maxObjPerStream = 0;                   /**< Max number of objects to track per stream. 0 means no limit. */
		pConfigIn.customConfigFilePathSize = sizeof( path ) ;       /**< The char length in customConfigFilePath */
		pConfigIn.customConfigFilePath = path;                      /**< Path to the tracker's custom config file. Null terminated */

		// OUT Params
		NvMOTConfigResponse pConfigResponse{};

		{
			const auto status = NvMOT_Init( &pConfigIn, &pContextHandle, &pConfigResponse );
			if( status != NvMOTStatus_OK ) {
				std::cout << "Error";
			} else {
				initCompleted = true;
			}
		}
	}

	/* PROCESS */

	// IN Params
	NvBufSurfaceParams bufferParam[ 1 ]{};
	bufferParam->width = width;                             /** width of buffer */
	bufferParam->height = height;                           /** height of buffer */
	bufferParam->pitch = pitch;                             /** pitch of buffer */
	bufferParam->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;        /** color format */
	bufferParam->layout = NVBUF_LAYOUT_PITCH;               /** BL or PL for Jetson, ONLY PL in case of dGPU */
	bufferParam->dataSize = fullSize;                           /** size of allocated memory */
	bufferParam->dataPtr = yDevMem;                     /** pointer to allocated memory, Not valid for NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE */
	bufferParam->planeParams.num_planes = 1;                        /** Number of planes */
	bufferParam->planeParams.width[ 0 ] = width;                /** width of planes */
	bufferParam->planeParams.height[ 0 ] = height;              /** height of planes */
	bufferParam->planeParams.pitch[ 0 ] = pitch;                /** pitch of planes in bytes */
	bufferParam->planeParams.offset[ 0 ] = 0;                       /** offsets of planes in bytes */
	bufferParam->planeParams.psize[ 0 ] = pitch * height;   /** size of planes in bytes */
	bufferParam->planeParams.bytesPerPix[ 0 ] = 1;                  /** bytes taken for each pixel */

	bufferParam->mappedAddr.addr[ 0 ] = yDevMem;            /** pointers of mapped buffers. Null Initialized values.*/
	bufferParam->mappedAddr.eglImage = nullptr;
	//bufferParam->bufferDesc;  /** dmabuf fd in case of NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE type memory. Invalid for other types. */

	NvBufSurfaceParams* bufferParamPtr{ bufferParam };

	static std::vector< NvMOTObjToTrack > inObjectVec{};
	static std::vector< NvMOTTrackedObj > outObjectVec{};

	size_t currObjectCount{ 0 };
	for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
		currObjectCount += m_objects.at( currType ).size();
	}
	if( inObjectVec.size() < currObjectCount ) {
		inObjectVec.resize( currObjectCount );
	}
	if( outObjectVec.size() < currObjectCount ) {
		outObjectVec.resize( currObjectCount );
	}

	size_t currInObjectIndex{ 0 };

	/* Iterate through the possible object types */
	/* The inner loop operates on the vector of tracked objects from that type */
	/* You can mock this section with generated objects */
	for( const TMetadataV2ObjectHeader::TObjectType currObjType: objectTypes ) {
		for( const auto& currObj: m_objects.at( currObjType ) ) {
			NvMOTObjToTrack& currInObject = inObjectVec[ currInObjectIndex++ ];
			currInObject.classId = static_cast< uint16_t >( currObjType );              /**< Class of the object to be tracked. */
			currInObject.bbox.x = currObj.x;                /**< Bounding box. */
			currInObject.bbox.y = currObj.y;
			currInObject.bbox.width = currObj.width;
			currInObject.bbox.height = currObj.height;
			currInObject.confidence = 1.f;          /**< Detection confidence of the object. */
			currInObject.doTracking = true;         /**< True: track this object.  False: do not initiate  tracking on this object. */
		}
	}

	NvMOTFrame frame{};
	frame.streamID = 0;                             /**< The stream source for this frame. */
	frame.frameNum = frameNum;                                /**< Frame number sequentially identifying the frame within a stream. */
	frame.timeStamp = frameTimestamp;                         /**< Timestamp of the frame at the time of capture. */
	frame.timeStampValid = true;                    /**< The timestamp value is properly populated. */
	frame.doTracking = true;                        /**< True: track objects in this frame; False: do not track this frame. */
	frame.reset = false;                            /**< True: reset tracking for the stream. */
	frame.numBuffers = 1;                           /**< Number of entries in bufferList. */
	frame.bufferList = &bufferParamPtr;             /**< Array of pointers to buffer params. */
	frame.objectsIn.detectionDone = ( frameNum % 2 == 0 ); // We detect on every second image
	frame.objectsIn.numAllocated = inObjectVec.size();
	frame.objectsIn.numFilled = currObjectCount;
	frame.objectsIn.list = inObjectVec.data();

	NvMOTProcessParams processParams{};
	processParams.numFrames = 1;
	processParams.frameList = &frame;

	// OUT Params
	NvMOTTrackedObjBatch outTrackedBatch{};
	NvMOTTrackedObjList outBatchObjects{};
	outBatchObjects.list = outObjectVec.data();
	outBatchObjects.streamID = 0;      /**< Stream associated with objects in the list. */
	outBatchObjects.frameNum = frameNum;    /**< Frame number for objects in the list. */
	outBatchObjects.valid = true;             /**< This entry in the batch is valid */
	outBatchObjects.numAllocated = outObjectVec.size();  /**< Number of blocks allocated for the list. */
	outBatchObjects.numFilled = /*outObjVec.size()*/0;     /**< Number of populated blocks in the list. */

	outTrackedBatch.numAllocated = 1;
	outTrackedBatch.numFilled = 1;
	outTrackedBatch.list = &outBatchObjects;

	{
		const auto status = NvMOT_Process( pContextHandle, &processParams, &outTrackedBatch );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	for( size_t outIndex = 0; outIndex < outBatchObjects.numFilled; ++outIndex ) {
		const auto& currOutObj{ outObjectVec[ outIndex ] };
		const auto currOutAssociated{ currOutObj.associatedObjectIn };
		if( currOutAssociated != nullptr ) {
			std::cout << "Ref [x: " << currOutAssociated->bbox.x << " y: " << currOutAssociated->bbox.y
					  << " w: " << currOutAssociated->bbox.width << " h: " << currOutAssociated->bbox.height << "] "
					  << " Tracked [x: " << currOutObj.bbox.x << " y: " << currOutObj.bbox.y
					  << " w: " << currOutObj.bbox.width << " h: " << currOutObj.bbox.height << "]" << std::endl;
		} else {
			std::cout << "No association" << std::endl;
		}
	}

	auto endPoint = std::chrono::high_resolution_clock::now();
	std::chrono::duration<float, std::milli> fdur = endPoint - startPoint;
	std::cout << "Count: " << inObjectVec.size() << " Runtime: " << fdur.count() << " Avg runtime: " << fdur.count() / inObjectVec.size() << std::endl;
}

int main()
{
	processFrame( cv::Mat( 640, 380, CV_8U ), 10, 1000 );
	processFrame( cv::Mat( 640, 380, CV_8U ), 11, 1100 );
	processFrame( cv::Mat( 640, 380, CV_8U ), 12, 1200 );
}

Hello customer,

The doTracking field is set for each bbox and is supposed to be true if that object is meant to be tracked. There can be cases where an object is detected but is not intended to be tracked for some reason, for example because it is not a class of interest or it is too small. So you have the freedom to use that field if there is any need.
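
For example, a possible use of the field (the size threshold below is purely illustrative, not something the tracker requires):

    // Hypothetical filter: track every detection except very small ones.
    const float kMinBoxArea = 16.f * 16.f;   // illustrative threshold only
    for( NvMOTObjToTrack& obj : inObjectVec ) {
        const float area = obj.bbox.width * obj.bbox.height;
        obj.doTracking = ( area >= kMinBoxArea );   // false: detected, but not tracked
    }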

For the frames where the detector is not run, you are supposed to set the following two fields:

frame.objectsIn.detectionDone = false;
frame.objectsIn.numFilled = 0;

It is a known bug that KLT checks only the objectsIn.numFilled field (not detectionDone). So please make sure numFilled is set properly.

In your code snippet, you could change

frame.objectsIn.numFilled = currObjectCount;

to

frame.objectsIn.numFilled = ( frameNum % 2 == 0 ) ? currObjectCount : 0;

Please try that out and report how it went.
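
Put together, the per-frame setup would then look roughly like this (a sketch written in terms of your own variables and your every-second-frame scheme):

    const bool detectorRan = ( frameNum % 2 == 0 );                  // detection on every second frame
    frame.objectsIn.detectionDone = detectorRan;
    frame.objectsIn.numFilled = detectorRan ? currObjectCount : 0;   // KLT keys off numFilled
    frame.objectsIn.list = inObjectVec.data();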

Dear @pshin,

I’ve tried the suggested modifications.
For the first frame (when we have detection) we send 4 objects in and the KLT tracker returns them properly (as previously).

On the frames where frame.objectsIn.detectionDone is false and frame.objectsIn.numFilled == 0, outBatchObjects.numFilled becomes 0, so we cannot extract the tracking information on these frames.

Hello Customer,

I noticed that you pass an empty frame as an input like below:

processFrame( cv::Mat( 640, 380, CV_8U ), 10, 1000 );

The KLT tracker tries to find a set of feature points to track. If no feature points are found, it terminates tracking for each target. I suspect that is the case in your sample.

Please try using a sample image that has some texture.
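
If you just want a quick sanity check without real footage, something like the following could replace the blank cv::Mat (this assumes cv::randu from opencv2/core.hpp; random noise only gives KLT something to initialize on, and for meaningful tracking across frames you still need consecutive frames of real footage, as done later in this thread):

    // Synthetic frame with texture so KLT can find feature points.
    cv::Mat frame( 360, 640, CV_8U );
    cv::randu( frame, cv::Scalar( 0 ), cv::Scalar( 255 ) );   // fill with uniform noise
    processFrame( frame, 10, 1000 );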

Dear @pshin,

Thank you for your help, something happened :) I extracted six 640x360 images containing a person and a car. Each image is shifted by 25 pixels in the x direction from the previous one.

The output is the following:

KLT Tracker Init
Ref [x: 261 y: 90 w: 108 h: 77]  Tracked [x: 261 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 63 y: 93 w: 110 h: 249]  Tracked [x: 63 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 2 Age: 0
Count: 2 Runtime: 1.85752 Avg runtime: 0.92876
No reference,  Tracked [x: 286 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 88 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 1.71417 Avg runtime: 0.857083
Ref [x: 313 y: 90 w: 108 h: 77]  Tracked [x: 313 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 114 y: 93 w: 110 h: 249]  Tracked [x: 114 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 2 Age: 0
Count: 2 Runtime: 2.16206 Avg runtime: 1.08103
No reference,  Tracked [x: 339 y: 90 w: 108 h: 77] Confidence: 0.956522 ID: 1 Age: 0
No reference,  Tracked [x: 140 y: 93 w: 110 h: 249] Confidence: 0.978723 ID: 2 Age: 0
Count: 2 Runtime: 0.889805 Avg runtime: 0.444903
Ref [x: 364 y: 90 w: 108 h: 77]  Tracked [x: 364 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 166 y: 93 w: 110 h: 249]  Tracked [x: 166 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 2 Age: 0
Count: 2 Runtime: 1.8243 Avg runtime: 0.912148
No reference,  Tracked [x: 389 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 191 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 0.717684 Avg runtime: 0.358842

Tracking works very well. The only interesting thing is Age: it is always 0. So basically it works, thank you :)

Just one more question: is it possible to keep objects without detection?

For instance:
-On the first frame the detector detects a person and a car
-On the second image KLT tracker tracks both of them with 0.9 confidence
-On the third frame the detector detects the car but fails to detect the person (CURRENTLY: drops the person immediately, EXPECTATION: a car with 1 confidence, a person with a lower confidence if the tracker could track it from first frame features).
-On the fourth frame we would like to track (CURRENTLY: tracks only the car, EXPECTATION: track the car from the detection on the third frame and the person from the first frame detection).

Is this use-case supported? Can we update only a subset of the objects with a new detection and keep the others tracked?

Thank you in advance: Adam
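
To make the third-frame case concrete, this is a sketch of the input we would like to pass on that frame (the box values are the car box from the log above; the person detection is simply omitted):

    // Frame 3 input in the scenario above: only the car was detected.
    NvMOTObjToTrack carOnly{};
    carOnly.classId = static_cast< uint16_t >( TMetadataV2ObjectHeader::TObjectType::car );
    carOnly.bbox = NvMOTRect{ 313.f, 90.f, 108.f, 77.f };
    carOnly.confidence = 1.f;
    carOnly.doTracking = true;

    frame.objectsIn.detectionDone = true;   // the detector did run on this frame
    frame.objectsIn.numFilled = 1;          // but it found only one of the two objects
    frame.objectsIn.list = &carOnly;

The full test program follows.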

#include <iostream>
#include <nvdstracker.h>
#include <cuda_runtime_api.h>
#include <chrono>
#include <vector>
#include <array>
#include <opencv2/core/mat.hpp>
#include <opencv2/imgcodecs.hpp>
#include <unordered_map>

namespace TMetadataV2ObjectHeader {
	enum class TObjectType {
		car,
		person
	};
}

struct Object {
	int x;
	int y;
	int width;
	int height;
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_0{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 261, 90, 108, 77 }
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 63, 93, 110, 249 }
     }
    }
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_2{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 313, 90, 108, 77 }
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 114, 93, 110, 249 }
     }
    }
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_4{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 364, 90, 108, 77 }
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 166, 93, 110, 249 }
     }
    }
};

void processFrame( cv::Mat currentFrame, uint32_t frameNum, int64_t frameTimestamp, const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > >& objects )
{
	static constexpr std::array< TMetadataV2ObjectHeader::TObjectType, 2 > objectTypes{ {
		TMetadataV2ObjectHeader::TObjectType::car,
		TMetadataV2ObjectHeader::TObjectType::person
	} };

	auto startPoint = std::chrono::high_resolution_clock::now();
	char path[] = "/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/tracker_config.yml";

	NvMOTQuery query{};
	{
		const auto status = NvMOT_Query( sizeof( path ), path, &query );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	/* INIT */

	void* yDevMem;
	const uint32_t width = static_cast< uint32_t >( currentFrame.cols );
	const uint32_t height = static_cast< uint32_t >( currentFrame.rows );
	const uint32_t pitch = static_cast< uint32_t >( currentFrame.step1() );
	//cudaMallocManaged( &yDevMem, width * height, cudaMemAttachGlobal );
	yDevMem = currentFrame.data;
	const uint32_t fullSize = pitch * height;

	static bool initCompleted{ false };
	static NvMOTContextHandle pContextHandle{};

	if( !initCompleted ) {
		// IN params
		NvMOTPerTransformBatchConfig batchConfig[ 1 ]{};
		batchConfig->bufferType = NVBUF_MEM_CUDA_UNIFIED;
		batchConfig->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;
		batchConfig->maxHeight = height;
		batchConfig->maxPitch = pitch;
		batchConfig->maxSize = fullSize;
		batchConfig->maxWidth = width;

		NvMOTConfig pConfigIn{};
		pConfigIn.computeConfig = NVMOTCOMP_CPU;                    /**< Compute target. see NvMOTCompute */
		pConfigIn.maxStreams = 1;                                   /**< Maximum number of streams in a batch. */
		pConfigIn.numTransforms = 1;                                /**< Number of NvMOTPerTransformBatchConfig entries in perTransformBatchConfig */
		pConfigIn.perTransformBatchConfig = batchConfig;            /**< List of numTransform batch configs including type and resolution, one for each transform*/
		pConfigIn.miscConfig.gpuId = 0;                             /**< GPU to be used. */
		pConfigIn.miscConfig.maxObjPerBatch = 0;                    /**< Max number of objects to track per batch. 0 means no limit. */
		pConfigIn.miscConfig.maxObjPerStream = 0;                   /**< Max number of objects to track per stream. 0 means no limit. */
		pConfigIn.customConfigFilePathSize = sizeof( path ) ;       /**< The char length in customConfigFilePath */
		pConfigIn.customConfigFilePath = path;                      /**< Path to the tracker's custom config file. Null terminated */

		// OUT Params
		NvMOTConfigResponse pConfigResponse{};

		{
			const auto status = NvMOT_Init( &pConfigIn, &pContextHandle, &pConfigResponse );
			if( status != NvMOTStatus_OK ) {
				std::cout << "Error";
			} else {
				initCompleted = true;
			}
		}
	}

	/* PROCESS */

	// IN Params
	NvBufSurfaceParams bufferParam[ 1 ]{};
	bufferParam->width = width;                             /** width of buffer */
	bufferParam->height = height;                           /** height of buffer */
	bufferParam->pitch = pitch;                             /** pitch of buffer */
	bufferParam->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;        /** color format */
	bufferParam->layout = NVBUF_LAYOUT_PITCH;               /** BL or PL for Jetson, ONLY PL in case of dGPU */
	bufferParam->dataSize = fullSize;                           /** size of allocated memory */
	bufferParam->dataPtr = yDevMem;                     /** pointer to allocated memory, Not valid for NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE */
	bufferParam->planeParams.num_planes = 1;                        /** Number of planes */
	bufferParam->planeParams.width[ 0 ] = width;                /** width of planes */
	bufferParam->planeParams.height[ 0 ] = height;              /** height of planes */
	bufferParam->planeParams.pitch[ 0 ] = pitch;                /** pitch of planes in bytes */
	bufferParam->planeParams.offset[ 0 ] = 0;                       /** offsets of planes in bytes */
	bufferParam->planeParams.psize[ 0 ] = pitch * height;   /** size of planes in bytes */
	bufferParam->planeParams.bytesPerPix[ 0 ] = 1;                  /** bytes taken for each pixel */

	bufferParam->mappedAddr.addr[ 0 ] = yDevMem;            /** pointers of mapped buffers. Null Initialized values.*/
	bufferParam->mappedAddr.eglImage = nullptr;
	//bufferParam->bufferDesc;  /** dmabuf fd in case of NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE type memory. Invalid for other types. */

	NvBufSurfaceParams* bufferParamPtr{ bufferParam };

	static std::vector< NvMOTObjToTrack > inObjectVec{};
	static std::vector< NvMOTTrackedObj > outObjectVec{};

	size_t currObjectCount{ 0 };
	for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
		currObjectCount += objects.at( currType ).size();
	}
	if( inObjectVec.size() < currObjectCount ) {
		inObjectVec.resize( currObjectCount );
	}
	if( outObjectVec.size() < currObjectCount ) {
		outObjectVec.resize( currObjectCount );
	}

	size_t currInObjectIndex{ 0 };

	/* Iterate through the possible object types */
	/* The inner loop operates on the vector of tracked objects from that type */
	/* You can mock this section with generated objects */
	for( const TMetadataV2ObjectHeader::TObjectType currObjType: objectTypes ) {
		for( const auto& currObj: objects.at( currObjType ) ) {
			NvMOTObjToTrack& currInObject = inObjectVec[ currInObjectIndex++ ];
			currInObject.classId = static_cast< uint16_t >( currObjType );              /**< Class of the object to be tracked. */
			currInObject.bbox.x = currObj.x;                /**< Bounding box. */
			currInObject.bbox.y = currObj.y;
			currInObject.bbox.width = currObj.width;
			currInObject.bbox.height = currObj.height;
			currInObject.confidence = 1.f;          /**< Detection confidence of the object. */
			currInObject.doTracking = true;         /**< True: track this object.  False: do not initiate  tracking on this object. */
		}
	}

	NvMOTFrame frame{};
	frame.streamID = 0;                             /**< The stream source for this frame. */
	frame.frameNum = frameNum;                                /**< Frame number sequentially identifying the frame within a stream. */
	frame.timeStamp = frameTimestamp;                         /**< Timestamp of the frame at the time of capture. */
	frame.timeStampValid = true;                    /**< The timestamp value is properly populated. */
	frame.doTracking = true;                        /**< True: track objects in this frame; False: do not track this frame. */
	frame.reset = false;                            /**< True: reset tracking for the stream. */
	frame.numBuffers = 1;                           /**< Number of entries in bufferList. */
	frame.bufferList = &bufferParamPtr;             /**< Array of pointers to buffer params. */
	frame.objectsIn.detectionDone = ( frameNum % 2 == 0 ); // We detect on every second image
	frame.objectsIn.numAllocated = inObjectVec.size();
	frame.objectsIn.numFilled = ( frameNum % 2 == 0 ? currObjectCount : 0 );//currObjectCount;
	frame.objectsIn.list = inObjectVec.data();

	NvMOTProcessParams processParams{};
	processParams.numFrames = 1;
	processParams.frameList = &frame;

	// OUT Params
	NvMOTTrackedObjBatch outTrackedBatch{};
	NvMOTTrackedObjList outBatchObjects{};
	outBatchObjects.list = outObjectVec.data();
	outBatchObjects.streamID = 0;      /**< Stream associated with objects in the list. */
	outBatchObjects.frameNum = frameNum;    /**< Frame number for objects in the list. */
	outBatchObjects.valid = true;             /**< This entry in the batch is valid */
	outBatchObjects.numAllocated = outObjectVec.size();  /**< Number of blocks allocated for the list. */
	outBatchObjects.numFilled = /*outObjVec.size()*/0;     /**< Number of populated blocks in the list. */

	outTrackedBatch.numAllocated = 1;
	outTrackedBatch.numFilled = 1;
	outTrackedBatch.list = &outBatchObjects;

	{
		const auto status = NvMOT_Process( pContextHandle, &processParams, &outTrackedBatch );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	for( size_t outIndex = 0; outIndex < outBatchObjects.numFilled; ++outIndex ) {
		const auto& currOutObj{ outObjectVec[ outIndex ] };
		const auto currOutAssociated{ currOutObj.associatedObjectIn };
		if( currOutAssociated != nullptr ) {
			std::cout << "Ref [x: " << currOutAssociated->bbox.x << " y: " << currOutAssociated->bbox.y
					  << " w: " << currOutAssociated->bbox.width << " h: " << currOutAssociated->bbox.height << "] ";
		} else {
			std::cout << "No reference, ";
		}
		std::cout << " Tracked [x: " << currOutObj.bbox.x << " y: " << currOutObj.bbox.y
				  << " w: " << currOutObj.bbox.width << " h: " << currOutObj.bbox.height << "]"
				  << " Confidence: " << currOutObj.confidence << " ID: " << currOutObj.trackingId << " Age: " << currOutObj.age << std::endl;
	}

	auto endPoint = std::chrono::high_resolution_clock::now();
	std::chrono::duration<float, std::milli> fdur = endPoint - startPoint;
	std::cout << "Count: " << inObjectVec.size() << " Runtime: " << fdur.count() << " Avg runtime: " << fdur.count() / inObjectVec.size() << std::endl;
}

int main()
{
	std::array< cv::Mat, 6 > images{ {
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_1.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_2.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_3.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_4.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_5.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_6.jpg", cv::IMREAD_GRAYSCALE ),
	} };

	std::array< const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > >*, 6 > objs{ {
		&objects_at_0,
		&objects_at_0,
		&objects_at_2,
		&objects_at_2,
		&objects_at_4,
		&objects_at_4
	} };

	for( size_t imgIndex = 0; imgIndex < images.size(); ++imgIndex ) {
		processFrame( images[ imgIndex ], imgIndex, imgIndex * 100, *objs[ imgIndex ] );
	}
}

Hello Adam,

-On the third frame the detector detects the car but fails to detect the person (CURRENTLY: drops the person immediately, EXPECTATION: a car with 1 confidence, a person with a lower confidence if the tracker could track it from first frame features).

[pshin] The KLT tracker has the capability of tracking objects even if there is no detection, as long as there are enough visual features to use. The reason the person was immediately dropped at the third frame is not that it was not detected; rather, it would be because KLT did not find enough interest points to track.

Dear pshin,

I created two test cases:

1. I modified the input objects on the third frame (I removed the detection of the person).
In the output we can see that the KLT tracker drops the person on Frame 3 (because there is no detection).
On Frame 4 there is no track for the person.
On Frame 5 the detector detects the person again and a new track appears (ID: 3).

KLT Tracker Init
---Frame 1---
Ref [x: 261 y: 90 w: 108 h: 77]  Tracked [x: 261 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 63 y: 93 w: 110 h: 249]  Tracked [x: 63 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 2 Age: 0
Count: 2 Runtime: 7.324 Avg runtime: 3.662
---Frame 2---
No reference,  Tracked [x: 286 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 88 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 2.59108 Avg runtime: 1.29554
---Frame 3---
Ref [x: 313 y: 90 w: 108 h: 77]  Tracked [x: 313 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Count: 2 Runtime: 2.24051 Avg runtime: 1.12025
---Frame 4---
No reference,  Tracked [x: 339 y: 90 w: 108 h: 77] Confidence: 0.971429 ID: 1 Age: 0
Count: 2 Runtime: 1.09489 Avg runtime: 0.547447
---Frame 5---
Ref [x: 364 y: 90 w: 108 h: 77]  Tracked [x: 364 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 166 y: 93 w: 110 h: 249]  Tracked [x: 166 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 3 Age: 0
Count: 2 Runtime: 1.98103 Avg runtime: 0.990516
---Frame 6---
No reference,  Tracked [x: 389 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 191 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 3 Age: 0
Count: 2 Runtime: 0.781161 Avg runtime: 0.390581

2. I modified the application to detect only on Frame 1 and then do KLT tracking on the following frames. In this use-case the tracker tracks both objects very well on all frames. This is why I think the previous version drops the track of the person on Frame 3: because there is no detection for that object on that frame.

KLT Tracker Init
---Frame 1---
Ref [x: 261 y: 90 w: 108 h: 77]  Tracked [x: 261 y: 90 w: 108 h: 77] Confidence: 1.1 ID: 1 Age: 0
Ref [x: 63 y: 93 w: 110 h: 249]  Tracked [x: 63 y: 93 w: 110 h: 249] Confidence: 1.1 ID: 2 Age: 0
Count: 2 Runtime: 1.51455 Avg runtime: 0.757275
---Frame 2---
No reference,  Tracked [x: 286 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 88 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 1.70903 Avg runtime: 0.854514
---Frame 3---
No reference,  Tracked [x: 313 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 114 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 1.13419 Avg runtime: 0.567096
---Frame 4---
No reference,  Tracked [x: 339 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 140 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 0.850839 Avg runtime: 0.42542
---Frame 5---
No reference,  Tracked [x: 364 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 166 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 0.75435 Avg runtime: 0.377175
---Frame 6---
No reference,  Tracked [x: 390 y: 90 w: 108 h: 77] Confidence: 0.96 ID: 1 Age: 0
No reference,  Tracked [x: 192 y: 93 w: 110 h: 249] Confidence: 0.977778 ID: 2 Age: 0
Count: 2 Runtime: 0.681143 Avg runtime: 0.340571

The second use case is good evidence that KLT has the capability of tracking without detection. With no detections from Frames 2 to 5, KLT was able to keep track of the objects.

I don’t know exactly what happened in your first case. How did you make a detection “drop” for the third frame? Did you set frame.objectsIn.numFilled = 1?

In the first use-case I removed the detection of the person from objects_at_2 (and yes, because of the removal, frame.objectsIn.numFilled becomes 1).

I’ve attached the full code.

#include <iostream>
#include <nvdstracker.h>
#include <cuda_runtime_api.h>
#include <chrono>
#include <vector>
#include <array>
#include <opencv2/core/mat.hpp>
#include <opencv2/imgcodecs.hpp>
#include <unordered_map>


namespace TMetadataV2ObjectHeader {
	enum class TObjectType {
		car,
		person
	};
}

struct Object {
	int x;
	int y;
	int width;
	int height;
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_0{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 261, 90, 108, 77 }
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 63, 93, 110, 249 }
     }
    }
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_2{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 313, 90, 108, 77 }
     }
    },
    { // PERSON - REMOVED DETECTION
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            //Object{ 114, 93, 110, 249 }
     }
    }
};

static const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > > objects_at_4{
    { // CAR
     TMetadataV2ObjectHeader::TObjectType::car,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 364, 90, 108, 77 }
     }
    },
    { // PERSON
     TMetadataV2ObjectHeader::TObjectType::person,	// type
     std::vector< Object >{ // vector (x,y,width,height)
                            Object{ 166, 93, 110, 249 }
     }
    }
};

void processFrame( cv::Mat currentFrame, uint32_t frameNum, int64_t frameTimestamp, const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > >& objects )
{
	static constexpr std::array< TMetadataV2ObjectHeader::TObjectType, 2 > objectTypes{ {
		TMetadataV2ObjectHeader::TObjectType::car,
		TMetadataV2ObjectHeader::TObjectType::person
	} };

	auto startPoint = std::chrono::high_resolution_clock::now();
	char path[] = "/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/tracker_config.yml";

	NvMOTQuery query{};
	{
		const auto status = NvMOT_Query( sizeof( path ), path, &query );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	/* INIT */

	void* yDevMem;
	const uint32_t width = static_cast< uint32_t >( currentFrame.cols );
	const uint32_t height = static_cast< uint32_t >( currentFrame.rows );
	const uint32_t pitch = static_cast< uint32_t >( currentFrame.step1() );
	//cudaMallocManaged( &yDevMem, width * height, cudaMemAttachGlobal );
	yDevMem = currentFrame.data;
	const uint32_t fullSize = pitch * height;

	static bool initCompleted{ false };
	static NvMOTContextHandle pContextHandle{};

	if( !initCompleted ) {
		// IN params
		NvMOTPerTransformBatchConfig batchConfig[ 1 ]{};
		batchConfig->bufferType = NVBUF_MEM_CUDA_UNIFIED;
		batchConfig->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;
		batchConfig->maxHeight = height;
		batchConfig->maxPitch = pitch;
		batchConfig->maxSize = fullSize;
		batchConfig->maxWidth = width;

		NvMOTConfig pConfigIn{};
		pConfigIn.computeConfig = NVMOTCOMP_CPU;                    /**< Compute target. see NvMOTCompute */
		pConfigIn.maxStreams = 1;                                   /**< Maximum number of streams in a batch. */
		pConfigIn.numTransforms = 1;                                /**< Number of NvMOTPerTransformBatchConfig entries in perTransformBatchConfig */
		pConfigIn.perTransformBatchConfig = batchConfig;            /**< List of numTransform batch configs including type and resolution, one for each transform*/
		pConfigIn.miscConfig.gpuId = 0;                             /**< GPU to be used. */
		pConfigIn.miscConfig.maxObjPerBatch = 0;                    /**< Max number of objects to track per batch. 0 means no limit. */
		pConfigIn.miscConfig.maxObjPerStream = 0;                   /**< Max number of objects to track per stream. 0 means no limit. */
		pConfigIn.customConfigFilePathSize = sizeof( path ) ;       /**< The char length in customConfigFilePath */
		pConfigIn.customConfigFilePath = path;                      /**< Path to the tracker's custom config file. Null terminated */

		// OUT Params
		NvMOTConfigResponse pConfigResponse{};

		{
			const auto status = NvMOT_Init( &pConfigIn, &pContextHandle, &pConfigResponse );
			if( status != NvMOTStatus_OK ) {
				std::cout << "Error";
			} else {
				initCompleted = true;
			}
		}
	}

	/* PROCESS */

	// IN Params
	NvBufSurfaceParams bufferParam[ 1 ]{};
	bufferParam->width = width;                             /** width of buffer */
	bufferParam->height = height;                           /** height of buffer */
	bufferParam->pitch = pitch;                             /** pitch of buffer */
	bufferParam->colorFormat = NVBUF_COLOR_FORMAT_GRAY8;        /** color format */
	bufferParam->layout = NVBUF_LAYOUT_PITCH;               /** BL or PL for Jetson, ONLY PL in case of dGPU */
	bufferParam->dataSize = fullSize;                           /** size of allocated memory */
	bufferParam->dataPtr = yDevMem;                     /** pointer to allocated memory, Not valid for NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE */
	bufferParam->planeParams.num_planes = 1;                        /** Number of planes */
	bufferParam->planeParams.width[ 0 ] = width;                /** width of planes */
	bufferParam->planeParams.height[ 0 ] = height;              /** height of planes */
	bufferParam->planeParams.pitch[ 0 ] = pitch;                /** pitch of planes in bytes */
	bufferParam->planeParams.offset[ 0 ] = 0;                       /** offsets of planes in bytes */
	bufferParam->planeParams.psize[ 0 ] = pitch * height;   /** size of planes in bytes */
	bufferParam->planeParams.bytesPerPix[ 0 ] = 1;                  /** bytes taken for each pixel */

	bufferParam->mappedAddr.addr[ 0 ] = yDevMem;            /** pointers of mapped buffers. Null Initialized values.*/
	bufferParam->mappedAddr.eglImage = nullptr;
	//bufferParam->bufferDesc;  /** dmabuf fd in case of NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE type memory. Invalid for other types. */

	NvBufSurfaceParams* bufferParamPtr{ bufferParam };

	static std::vector< NvMOTObjToTrack > inObjectVec{};
	static std::vector< NvMOTTrackedObj > outObjectVec{};

	size_t currObjectCount{ 0 };
	for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
		currObjectCount += objects.at( currType ).size();
	}
	if( inObjectVec.size() < currObjectCount ) {
		inObjectVec.resize( currObjectCount );
	}
	if( outObjectVec.size() < currObjectCount ) {
		outObjectVec.resize( currObjectCount );
	}

	size_t currInObjectIndex{ 0 };

	/* Iterate through the possible object types */
	/* The inner loop operates on the vector of tracked objects from that type */
	/* You can mock this section with generated objects */
	for( const TMetadataV2ObjectHeader::TObjectType currObjType: objectTypes ) {
		for( const auto& currObj: objects.at( currObjType ) ) {
			NvMOTObjToTrack& currInObject = inObjectVec[ currInObjectIndex++ ];
			currInObject.classId = static_cast< uint16_t >( currObjType );              /**< Class of the object to be tracked. */
			currInObject.bbox.x = currObj.x;                /**< Bounding box. */
			currInObject.bbox.y = currObj.y;
			currInObject.bbox.width = currObj.width;
			currInObject.bbox.height = currObj.height;
			currInObject.confidence = 1.f;          /**< Detection confidence of the object. */
			currInObject.doTracking = true;         /**< True: track this object.  False: do not initiate  tracking on this object. */
		}
	}

	NvMOTFrame frame{};
	frame.streamID = 0;                             /**< The stream source for this frame. */
	frame.frameNum = frameNum;                                /**< Frame number sequentially identifying the frame within a stream. */
	frame.timeStamp = frameTimestamp;                         /**< Timestamp of the frame at the time of capture. */
	frame.timeStampValid = true;                    /**< The timestamp value is properly populated. */
	frame.doTracking = true;                        /**< True: track objects in this frame; False: do not track this frame. */
	frame.reset = false;                            /**< True: reset tracking for the stream. */
	frame.numBuffers = 1;                           /**< Number of entries in bufferList. */
	frame.bufferList = &bufferParamPtr;             /**< Array of pointers to buffer params. */
	frame.objectsIn.detectionDone = ( frameNum % 2 == 0 ); // We detect on every second image
	frame.objectsIn.numAllocated = inObjectVec.size();
	frame.objectsIn.numFilled = ( frameNum % 2 == 0 ? currObjectCount : 0 );//currObjectCount;
	frame.objectsIn.list = inObjectVec.data();

	NvMOTProcessParams processParams{};
	processParams.numFrames = 1;
	processParams.frameList = &frame;

	// OUT Params
	NvMOTTrackedObjBatch outTrackedBatch{};
	NvMOTTrackedObjList outBatchObjects{};
	outBatchObjects.list = outObjectVec.data();
	outBatchObjects.streamID = 0;      /**< Stream associated with objects in the list. */
	outBatchObjects.frameNum = frameNum;    /**< Frame number for objects in the list. */
	outBatchObjects.valid = true;             /**< This entry in the batch is valid */
	outBatchObjects.numAllocated = outObjectVec.size();  /**< Number of blocks allocated for the list. */
	outBatchObjects.numFilled = /*outObjVec.size()*/0;     /**< Number of populated blocks in the list. */

	outTrackedBatch.numAllocated = 1;
	outTrackedBatch.numFilled = 1;
	outTrackedBatch.list = &outBatchObjects;

	{
		const auto status = NvMOT_Process( pContextHandle, &processParams, &outTrackedBatch );
		if( status != NvMOTStatus_OK ) {
			std::cout << "Error";
		}
	}

	for( size_t outIndex = 0; outIndex < outBatchObjects.numFilled; ++outIndex ) {
		const auto& currOutObj{ outObjectVec[ outIndex ] };
		const auto currOutAssociated{ currOutObj.associatedObjectIn };
		if( currOutAssociated != nullptr ) {
			std::cout << "Ref [x: " << currOutAssociated->bbox.x << " y: " << currOutAssociated->bbox.y
					  << " w: " << currOutAssociated->bbox.width << " h: " << currOutAssociated->bbox.height << "] ";
		} else {
			std::cout << "No reference, ";
		}
		std::cout << " Tracked [x: " << currOutObj.bbox.x << " y: " << currOutObj.bbox.y
				  << " w: " << currOutObj.bbox.width << " h: " << currOutObj.bbox.height << "]"
				  << " Confidence: " << currOutObj.confidence << " ID: " << currOutObj.trackingId << " Age: " << currOutObj.age << std::endl;
	}

	auto endPoint = std::chrono::high_resolution_clock::now();
	std::chrono::duration<float, std::milli> fdur = endPoint - startPoint;
	std::cout << "Count in: " << frame.objectsIn.numFilled << " Count out: " << outBatchObjects.numFilled << " Runtime: " << fdur.count() << " Avg runtime: " << fdur.count() / inObjectVec.size() << std::endl;
}

int main()
{
	std::array< cv::Mat, 6 > images{ {
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_1.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_2.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_3.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_4.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_5.jpg", cv::IMREAD_GRAYSCALE ),
		cv::imread( "/home/balazad/Development/Cpp/DSTest/img_6.jpg", cv::IMREAD_GRAYSCALE ),
	} };

	std::array< const std::unordered_map< TMetadataV2ObjectHeader::TObjectType, std::vector< Object > >*, 6 > objs{ {
		&objects_at_0,
		&objects_at_0,
		&objects_at_2,
		&objects_at_2,
		&objects_at_4,
		&objects_at_4
	} };

	for( size_t imgIndex = 0; imgIndex < images.size(); ++imgIndex ) {
		std::cout << "---Frame " << imgIndex + 1 << "---" << std::endl;
		processFrame( images[ imgIndex ], imgIndex, imgIndex * 100, *objs[ imgIndex ] );
	}
}

Hello Adam,

Sorry for the delay.

I removed the detection of the person from objects_at_2 (because of the removal, frame.objectsIn.numFilled becomes 1)

In your full code, you leave the Object entry blank in objects_at_2, intending to delete the Person object. However, because you still have the TObjectType::person key entry in the unordered_map data structure, your input still has two objects. See your code below:

for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
	currObjectCount += objects.at( currType ).size();
}

currObjectCount will still be 2 (instead of 1) for the 3rd and 4th frames.

Also

-outTrackedBatch.numFilled = 1;
+outTrackedBatch.numFilled = 0;

, because the outTrackedBatch.numFilled field is supposed to be filled in and incremented inside the low-level tracker.

These are the things that caught my eye while looking at your code. If the issue still persists, please talk to amycao to open a ticket so we can reproduce the issue internally.

Dear pshin,

I do not leave an Object blank; I leave the std::vector< Object > under the person key empty.

for( const TMetadataV2ObjectHeader::TObjectType currType: objectTypes ) {
	currObjectCount += objects.at( currType ).size();
}

objects.at( currType ), i.e. objects.at( TMetadataV2ObjectHeader::TObjectType::person ), gives us that empty vector, so its size is 0. currObjectCount is therefore incremented only once, for the car.

I’ve double checked it.
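
A tiny standalone check of the same logic (with the types shortened, just to illustrate the point):

    std::unordered_map< int, std::vector< int > > m{ { 0, { 1 } }, { 1, {} } };
    size_t count = 0;
    for( const auto& kv : m ) {
        count += kv.second.size();   // the empty vector under key 1 contributes 0
    }
    // count == 1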

I modified the outTrackedBatch.numFilled field to be 0, but NvMOT_Process left it at 0 (although it filled outBatchObjects.numFilled and the other fields; I do not know whether that is a bug).

I will open a ticket, thank you for your help!

Hi,
Sorry for the late reply; I will try to reproduce your issue.

Also, can you share your input pictures? Thanks.

Dear Amycao,

Thank you for your reply. I’ve opened a ticket on your partner portal and attached the images and the complete source code there.