Bounding box data in deepstream_test_2

Please provide complete information as applicable to your setup.

• Jetson Nano B01
• DeepStream 5.1
• JetPack Version 4.5
• TensorRT Version 7.1.3
• Question type
• Requirement details: Get bounding box data (XY coordinates for each ID object)

Hi,

I am reusing one of the DeepStream samples with the Python API.

The sample I am using is deepstream_test_2, located in the /deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-test2 directory.

By default, this example reports only the quantity of people/vehicles being tracked. The output message format is something like:

Frame Number=402 Number of Objects=3 Vehicle_count=1 Person_count=2

What I would additionally like to get is the XY coordinates of each tracked object ID, but I do not know where I can report this data while the video stream is running. For example, I want a report message like:

Frame Number=402 Number of Objects=3 Vehicle_count=1 Person_count=2
                 > ID=1  Object=Person Centroid(x,y)=250,100
                 > ID=3  Object=Person Centroid(x,y)=300,650
                 > ID=15 Object=Car    Centroid(x,y)=600,500

This is an output example, just to illustrate the additional information that I would like to acquire.
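
A minimal sketch of how such a report could be assembled, assuming each tracked object is already available as an ID, a label and a bounding box given as left, top, width and height in pixels. The TrackedObject class and the format_report helper are hypothetical names used only for illustration. In the sample's probe these values would presumably come from each object's metadata (for example object_id and rect_params on NvDsObjectMeta), but that mapping is an assumption and is not confirmed in this thread.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class TrackedObject:
    """Hypothetical stand-in for the per-object values read from the metadata."""
    object_id: int
    label: str
    left: float
    top: float
    width: float
    height: float


def centroid(obj: TrackedObject) -> Tuple[float, float]:
    """Return the (x, y) centre of the bounding box in pixels."""
    return obj.left + obj.width / 2.0, obj.top + obj.height / 2.0


def format_report(summary: str, objects: Iterable[TrackedObject]) -> List[str]:
    """Build the summary line followed by one indented line per tracked object."""
    lines = [summary]
    for obj in objects:
        cx, cy = centroid(obj)
        lines.append(
            f"    > ID={obj.object_id:<3} Object={obj.label:<7} "
            f"Centroid(x,y)={cx:.0f},{cy:.0f}"
        )
    return lines


if __name__ == "__main__":
    # Made-up boxes, chosen only to exercise the formatting.
    objs = [
        TrackedObject(1, "Person", left=230, top=50, width=40, height=100),
        TrackedObject(15, "Car", left=500, top=450, width=200, height=100),
    ]
    summary = "Frame Number=402 Number of Objects=2 Vehicle_count=1 Person_count=1"
    for line in format_report(summary, objs):
        print(line)
```

The centroid is simply the top-left corner shifted by half the width and half the height, so no extra information beyond the bounding box is needed.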

In which file or function can I acquire this information, and how can I report it?

Hey, you can get the info from the metadata; please refer to "MetaData in the DeepStream SDK" in the DeepStream 5.1 Release documentation.

If you want to dump this info to a file, you can refer to deepstream-app.c → write_kitti_track_output

Right, but I am using the Python API in the DeepStream samples.

According to the Python script in deepstream-test2, I think I could get the desired information from what is printed by these few lines:

for trackobj in pyds.NvDsPastFrameObjBatch.list(pPastFrameObjBatch):
    print("streamId=",trackobj.streamID)
    print("surfaceStreamID=",trackobj.surfaceStreamID)
    for pastframeobj in pyds.NvDsPastFrameObjStream.list(trackobj):
        print("numobj=",pastframeobj.numObj)
        print("uniqueId=",pastframeobj.uniqueId)
        print("classId=",pastframeobj.classId)
        print("objLabel=",pastframeobj.objLabel)
        for objlist in pyds.NvDsPastFrameObjList.list(pastframeobj):
            print('frameNum:', objlist.frameNum)
            print('tBbox.left:', objlist.tBbox.left)
            print('tBbox.width:', objlist.tBbox.width)
            print('tBbox.top:', objlist.tBbox.top)
            # Note: the label says "right", but the value printed is the box height.
            print('tBbox.right:', objlist.tBbox.height)
            print('confidence:', objlist.confidence)
            print('age:', objlist.age)

I already enabled it and ran a sample .h264 file that has cars and people in every frame.

The output was something like this:

Frame Number=705 Number of Objects=3 Vehicle_count=0 Person_count=3
streamId= 0
surfaceStreamID= 0
Frame Number=706 Number of Objects=3 Vehicle_count=0 Person_count=3
streamId= 0
surfaceStreamID= 0
Frame Number=707 Number of Objects=3 Vehicle_count=0 Person_count=3
streamId= 0
surfaceStreamID= 0
Frame Number=708 Number of Objects=4 Vehicle_count=0 Person_count=4
streamId= 0
surfaceStreamID= 0
numobj= 24
uniqueId= 10721585435768782869
classId= 2
objLabel= Person
frameNum: 684
tBbox.left: 1104.0
tBbox.width: 48.0
tBbox.top: 396.1956481933594
tBbox.right: 146.7391357421875
confidence: 1.0
age: 1
frameNum: 685
tBbox.left: 1097.4107666015625
tBbox.width: 46.17839431762695
tBbox.top: 397.36956787109375
tBbox.right: 145.85870361328125
confidence: 1.0
age: 2
frameNum: 686
tBbox.left: 1100.14697265625
tBbox.width: 46.145877838134766
tBbox.top: 397.2381896972656
tBbox.right: 145.7559814453125
confidence: 0.8343579769134521
age: 3
frameNum: 687
tBbox.left: 1101.922119140625
tBbox.width: 46.11335754394531
tBbox.top: 397.60369873046875
tBbox.right: 145.65325927734375
confidence: 0.7782335877418518
age: 4
frameNum: 688
tBbox.left: 1101.685302734375
tBbox.width: 46.080833435058594
tBbox.top: 397.3454895019531
tBbox.right: 145.550537109375
confidence: 0.7782335877418518
age: 5
frameNum: 689
tBbox.left: 1101.4322509765625
tBbox.width: 46.048316955566406
tBbox.top: 397.267333984375
tBbox.right: 145.4478302001953
confidence: 0.7782335877418518
age: 6
frameNum: 690
tBbox.left: 1101.340087890625
tBbox.width: 46.01579284667969
tBbox.top: 397.43109130859375
tBbox.right: 145.34510803222656
confidence: 0.7782335877418518

According to this part of the output, we can see that:

  • Case 1: The majority of output frames do not present the complete information:

    Frame Number=705 Number of Objects=3 Vehicle_count=0 Person_count=3
    streamId= 0
    surfaceStreamID= 0

  • Case 2: Only some frames output the full set of data variables:

    Frame Number=708 Number of Objects=4 Vehicle_count=0 Person_count=4
    streamId= 0
    surfaceStreamID= 0
    numobj= 24
    uniqueId= 10721585435768782869
    classId= 2
    objLabel= Person
    frameNum: 684
    tBbox.left: 1104.0
    tBbox.width: 48.0
    tBbox.top: 396.1956481933594
    tBbox.right: 146.7391357421875
    confidence: 1.0
    age: 1
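
To make the two cases easier to compare, the printed output can be grouped by uniqueId and each box turned into a centroid. The following is only a sketch that parses the text exactly as it is printed above; parse_past_frames and the FIELDS mapping are hypothetical helpers, not part of the sample. It treats the value printed under tBbox.right as the box height, because the print statement passes objlist.tBbox.height.

```python
from typing import Dict, List, Tuple

# Labels as printed by the sample; "tBbox.right" actually carries the height.
FIELDS = {
    "frameNum": "frame",
    "tBbox.left": "left",
    "tBbox.width": "width",
    "tBbox.top": "top",
    "tBbox.right": "height",
}


def parse_past_frames(log_lines: List[str]) -> Dict[int, List[dict]]:
    """Group past-frame boxes by the uniqueId line that precedes them."""
    tracks: Dict[int, List[dict]] = {}
    current_id = None
    record = None
    for raw in log_lines:
        line = raw.strip()
        if line.startswith("uniqueId="):
            current_id = int(line.split("=", 1)[1])
            tracks.setdefault(current_id, [])
            record = None
            continue
        key, sep, value = line.partition(":")
        if not sep or key not in FIELDS or current_id is None:
            continue  # summary lines, streamId, confidence, age, ...
        if key == "frameNum":
            # A frameNum line starts a new box record for the current object.
            record = {"frame": int(value)}
            tracks[current_id].append(record)
        elif record is not None:
            record[FIELDS[key]] = float(value)
    return tracks


def centroid(rec: dict) -> Tuple[float, float]:
    """Centre of a parsed box: top-left corner plus half the size."""
    return rec["left"] + rec["width"] / 2.0, rec["top"] + rec["height"] / 2.0


if __name__ == "__main__":
    sample = """uniqueId= 10721585435768782869
frameNum: 684
tBbox.left: 1104.0
tBbox.width: 48.0
tBbox.top: 396.1956481933594
tBbox.right: 146.7391357421875""".splitlines()
    for uid, recs in parse_past_frames(sample).items():
        for rec in recs:
            print(uid, rec["frame"], centroid(rec))
```

Grouping this way shows at a glance how many distinct uniqueId values appear in a given burst of output and which earlier frames each box belongs to.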

First, I do not know whether it is normal that the majority of frames do not print all parameters.
Second, I do not know why the values correspond to just one object and not to all objects identified in the same frame, according to their assigned unique IDs.