Saving meta data matching SmartRecord files

Hi all,

I would like suggestions and/or guidance on how to implement the following requirement. @miguel.taylor, @mdegans, @rjhowell44, maybe you have some suggestions.

The overall requirement is this:
1. Multiple live sources (RTSP) with different resolutions and frame rates should be supported
2. For each source do inferencing, tracking and analytics
3. For each source, use the object and analytics meta data to apply user defined rules to trigger actions (e.g. If a person crosses a line or enters a certain region, save a recording and notify an external system)
4. Recordings
4.1 The recorded file should be the original stream (resolution, framerate) with no overlays
4.2 The meta data for each frame of the recorded file should be saved to file or database (this is to allow for drawing the detections onto the original streams during playback, and also to allow for searching for objects/events)

Most of this can be done currently with SmartRecord as it is saving the original streams as per the image

The issue I am facing is with requirement 4.2. I am not sure if it is even possible to save the meta data for the exact frames that SmartRecord saves.

I would think that one would need a plugin to save the meta data, with the exact same logic as SmartRecord, after the Deepstream components to enable saving the correct metadata (e.g. SmartMetaRecord). Unless there is a different way of matching up meta data with the recorded file, e.g. a frame timestamp instead of frame index.

Any suggestions would be much appreciated.


Hi @jlerasmus, In your deepstream app you need code to actually trigger smart record. So one possibility of what you can do is place a probe on a pad downstream of where your meta is written. If you have something like nvinfer ! tracker ! analytics, you could put it on the analytics src pad (i.e. after all the interesting stuff is written to the meta) for example.

In the probe callback function you put your logic to determine when to start smart-record and start it with NvDsSRStart(). At this point you can also do whatever you want with the metadata, such as write it to a file.
Or… you could wait till the end of the smart record… smart record will call a callback when it’s finished writing the video file. In this callback you have access to the smart record filename and can use that as well…

I’m using the above techniques in my own apps and they work well.

ps. Your requirement of allowing sources on different framerates: the deepstream streammuxer won’t work with that, so you will have to align the framerates somehow before the stream muxer. You might be able to use the videorate plugin but I haven’t used it myself.

Thanks @jasonpgf2a for your feedback. Conceptually I am with you on where and how to hook up and start smart record, although I still need to implement some of these.

The issue I am struggling with is how to go about “syncing” the meta data to the actual recorded file’s frames.

For example:

  • Smart record is configured to pre-record for 5 seconds. From my understanding that might not be precisely 5 seconds, but rather depends on where the keyframe occurred
  • When an object crosses a line, the rule engine (running in the probe) triggers which:
    – Starts smart-record
    – Starts writing the meta data, or starts buffering and then writes when smart-record finishes

In this case I would only have meta data from the event onward, and not the meta data before the event to match the smart-record pre-buffer (e.g. object moving towards the line which triggered the event). Maybe I am complicating things.
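One way to also have the pre-event meta data is to keep a rolling buffer of the last few seconds of per-frame meta data in the probe, and flush it when the rule fires. Below is a minimal sketch in plain Python, with no DeepStream calls. `MetaRingBuffer` and the dict layout are hypothetical names for illustration, and it assumes the probe can hand over a PTS in nanoseconds for each frame. Because the smart record pre-buffer starts on a keyframe, the window here would probably need to be somewhat longer than the configured pre-record time.

```python
from collections import deque

NS_PER_SEC = 1_000_000_000


class MetaRingBuffer:
    """Keeps roughly the last `window_sec` seconds of per-frame metadata."""

    def __init__(self, window_sec):
        self.window_ns = int(window_sec * NS_PER_SEC)
        self.frames = deque()  # (pts_ns, metadata) in arrival order

    def push(self, pts_ns, metadata):
        self.frames.append((pts_ns, metadata))
        # Drop entries that are older than the window, measured from the newest frame.
        while self.frames and pts_ns - self.frames[0][0] > self.window_ns:
            self.frames.popleft()

    def snapshot(self):
        """Returns a copy of the buffered frames, oldest first."""
        return list(self.frames)


# Usage: simulate 10 seconds of a 30 fps source, then "trigger" on the last frame.
buf = MetaRingBuffer(window_sec=5)
for i in range(300):
    pts = i * NS_PER_SEC // 30
    buf.push(pts, {"frame": i, "objects": []})

pre_event = buf.snapshot()  # metadata for about the last 5 seconds before the trigger
```

After the trigger you would keep appending frames to the same list until smart record reports that it has finished, and then write everything out next to the video file.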

“ps. Your requirement of allowing sources on different framerates: the deepstream streammuxer won’t work with that, so you will have to align the framerates somehow before the stream muxer. You might be able to use the videorate plugin but I haven’t used it myself.”

I was not aware of this and will need to look further into this as this might cause problems with RTSP cameras having different frame rates. This might even complicate “syncing” meta data further if the streammuxer’s framerate is different from the source.

I know you’d prefer not to, but you can run smart record after the nvinfer, tracker, analytics and osd elements so that the bounding boxes are drawn and show in the recorded files, avoiding all the synchronisation issues…

Hey Jaco, for DSL, my current thought is to add the Smart Record as a new Sink Component with the Start-record as a new ODE Action. And as Jason mentioned, in this way all meta data will be included in the recorded stream. Not sure if this will meet your requirements, but I will know more in the coming weeks.

Perhaps we should continue this discussion on our other GitHub thread, but having given this a little thought… I currently have ODE Actions for logging the event to file, and Dumping in KITTI format, as examples… I could add a new Action type that saves the ODE data to a standard SQL database file. You could Tee the stream prior to OSD (box overlay) with its own Record Sink (or not use an OSD)… perhaps that would work for you?

I haven’t used SmartRecordBin yet, but I know @jasonpgf2a has, so he’d be better to ask your question. If it were me, I’d attach a subtitle track and stick the metadata in it. Containers like matroska, webm, and mp4 support subtitle tracks.

You can serialize your detections/whatever in your preferred format (e.g. json) and have your custom player draw the metadata (if that’s your desire) at the player end. No re-encode is necessary if you simply remux, so there is no degradation in quality from your source. You can also still split the video or start/stop recording. I am not sure if SmartRecordBin supports this. Somebody from Nvidia or @jasonpgf2a would have to answer that.

Another advantage to a subtitle track would be that you can use standard gstreamer elements to broker it (eg. to a sidecar file like .srt) instead of rolling your own solution for that. Alternatively, if you want to go direct to database, you might want to check out nvmsgconv and nvmsgbroker.
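To make the sidecar idea concrete, here is a rough sketch of turning per-frame detections into SubRip (.srt) cues with a JSON payload, one cue per frame. It is plain Python and does not touch GStreamer. The function names and the detection dict layout are made up for illustration, and it assumes the timestamps are already relative to the start of the recorded file.

```python
import json


def srt_timestamp(ns):
    """Formats a nanosecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = ns // 1_000_000
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(frames, frame_duration_ns):
    """frames: list of (pts_ns, detections), pts relative to the file start."""
    cues = []
    for index, (pts_ns, detections) in enumerate(frames, start=1):
        start = srt_timestamp(pts_ns)
        end = srt_timestamp(pts_ns + frame_duration_ns)
        # Compact, single-line JSON so each cue's text stays on one line.
        payload = json.dumps(detections, separators=(",", ":"))
        cues.append(f"{index}\n{start} --> {end}\n{payload}\n")
    return "\n".join(cues)


# Usage: two frames of a 25 fps stream (40 ms per frame).
frames = [
    (0, [{"id": 1, "label": "person", "box": [10, 20, 50, 80]}]),
    (40_000_000, []),
]
srt_text = to_srt(frames, frame_duration_ns=40_000_000)
```

A custom player would parse the JSON in each cue and draw the boxes itself, so the video stream stays untouched.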

This made me think… maybe an alternative is to have the SmartRecord before the inferencing (as in the screenshot), which allows saving the source streams, and then have an additional SmartRecord after the inferencing and osd, with a video resize in between, saving smaller previews. This still does not solve the issue of having meta data matching the full pre- and post-record frames.

Do you foresee any issues with such a setup?

It makes sense to Tee the stream prior to the OSD, if only SmartRecord then had an option to also write the attached meta data to file. Maybe someone from Nvidia might help, or tag the right person @dusty_nv, @bcao, @mchi

With regards to the SQL database file, are you referring to a SQLite database file?

Thanks @mdegans, this totally makes sense and might be a viable solution - especially if SmartRecord could add an option of saving the attached meta data to file, or maybe more open ended if a callback could be specified.

This made me think: if SmartRecord does not provide such an option, would it be possible to “inject” elements into the SmartRecord “pipeline” so that one could Tee the stream just before the file sink, and in that way add elements to write the meta data matching those frames streamed to the file sink?

I don’t know that you can insert elements inside the smart record bin. It’s probably technically possible but you’d have to unlink what’s in there and insert your own and manage their lifecycle. Your code would probably have to change in every deep stream release as they update smart record as well.

I think it would be easier to just write your own “smart record” functionality. Before nvidia brought out smart record I used to do just that via dynamically adding and removing elements. The only real tricky bit (that took me close to 6 months to understand) is handling the eos properly so that your pipeline keeps playing but your recorded files are finalised properly.
There’s another thread in here where I described the process to someone else. I’m on my phone right now so it’s not easy to search.

On the other issue of timing, so that your saved metadata files line up with the frames in your video files… you have the pts available when saving the meta. I haven’t looked at this myself, but will that line up with the time stamps of the frames in the file? If so, that would be your link.
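If the pts does turn out to be usable as the link, the playback side could look up meta data by timestamp instead of by frame index. A minimal sketch follows, in plain Python. `MetaIndex` and `file_start_pts_ns` are hypothetical names, and it is an unverified assumption that the timestamps in the recorded file are the pipeline pts shifted by the pts of the first recorded frame.

```python
import bisect


class MetaIndex:
    """Nearest-timestamp lookup over saved per-frame metadata."""

    def __init__(self, records):
        # records: iterable of (pts_ns, metadata); sorted here by PTS.
        self.records = sorted(records, key=lambda r: r[0])
        self.keys = [pts for pts, _ in self.records]

    def lookup(self, pts_ns, tolerance_ns):
        """Returns the metadata closest to pts_ns, or None if nothing is within tolerance."""
        i = bisect.bisect_left(self.keys, pts_ns)
        best = None
        for j in (i - 1, i):  # the neighbours on either side of the insertion point
            if 0 <= j < len(self.keys):
                if best is None or abs(self.keys[j] - pts_ns) < abs(self.keys[best] - pts_ns):
                    best = j
        if best is None or abs(self.keys[best] - pts_ns) > tolerance_ns:
            return None
        return self.records[best][1]


# Usage: metadata saved with pipeline PTS, player asks with file-relative time.
MS = 1_000_000
index = MetaIndex([(1000 * MS, "frame A"), (1040 * MS, "frame B"), (1080 * MS, "frame C")])
file_start_pts_ns = 1000 * MS  # hypothetical: pipeline PTS of the first recorded frame
player_time_ns = 42 * MS       # position reported by the player
match = index.lookup(file_start_pts_ns + player_time_ns, tolerance_ns=20 * MS)
```

The tolerance of half a frame interval keeps a lookup from silently returning meta data that belongs to a neighbouring frame when a frame was dropped.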

How much of the meta data are you looking to write? Do you need to save all of it, meaning every Object meta, from every Frame meta, from every batch meta? Or just the Object and Frame meta that triggered the recording? Or perhaps other detection events?

Saving the data should be fairly straightforward; it is more a matter of what format and what you want to do with it. SQLite could be an option. I mentioned SQL as meaning “in a standard format sense”, such that the data could be easily queried…
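As an illustration of the SQLite option, here is a small sketch using Python’s built-in sqlite3 module. The table layout, the column names and the helper functions are all hypothetical. It only shows that one row per object per frame, keyed by recording name and pts, is enough to answer “which frames of this recording contain a person”.

```python
import sqlite3


def open_store(path=":memory:"):
    """Opens (or creates) the metadata database and its single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS object_meta (
               recording TEXT, source_id INTEGER, pts_ns INTEGER,
               object_id INTEGER, label TEXT, confidence REAL,
               bbox_left REAL, bbox_top REAL, bbox_width REAL, bbox_height REAL)"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_rec_pts ON object_meta (recording, pts_ns)"
    )
    return conn


def save_frame(conn, recording, source_id, pts_ns, objects):
    """Writes one row per detected object in a frame."""
    rows = [
        (recording, source_id, pts_ns, o["id"], o["label"], o["confidence"], *o["box"])
        for o in objects
    ]
    conn.executemany("INSERT INTO object_meta VALUES (?,?,?,?,?,?,?,?,?,?)", rows)
    conn.commit()


def find_frames(conn, recording, label):
    """Returns the PTS of every frame in a recording that contains `label`."""
    cur = conn.execute(
        "SELECT DISTINCT pts_ns FROM object_meta "
        "WHERE recording = ? AND label = ? ORDER BY pts_ns",
        (recording, label),
    )
    return [row[0] for row in cur]


# Usage with made-up values.
conn = open_store()
save_frame(conn, "cam0_0001.mp4", 0, 0,
           [{"id": 7, "label": "person", "confidence": 0.91, "box": [10, 20, 50, 80]}])
save_frame(conn, "cam0_0001.mp4", 0, 40_000_000,
           [{"id": 7, "label": "person", "confidence": 0.88, "box": [12, 20, 50, 80]},
            {"id": 9, "label": "car", "confidence": 0.75, "box": [200, 90, 120, 60]}])
person_frames = find_frames(conn, "cam0_0001.mp4", "person")
```

Committing once per frame is the simplest thing to show here; for many sources it would likely be better to batch several frames per transaction.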

Also, keep in mind that if you’re saving the original video stream, replaying the stream through the same pipeline “should” produce the same inference/tracking/display/etc ( i.e. meta data), along with the same detection events.

Again, it really depends on whether you want to save all of the meta data, or just the information that would allow you to locate/seek the frames of interest during playback. Perhaps just the Frame/Object meta data and jpeg image for a preview, for each triggered event.

@jasonpgf2a, somewhat off topic, but I’ve been struggling with that same EOS issue after dynamic source removal. If you could share that link when you have a chance, I would very much appreciate it.

Re: bin surgery, you can do it with signals as you suggest. When an element is added to a bin, including by the bin itself, an “element-added” signal is emitted. The callback signature includes the bin, the element that’s just been added, and anything else you want as user data. You could use this to add a probe anywhere inside the smart record bin if you wanted to, or even relink things to add another element.

So the idea is that in this on_element_added callback you change whatever properties you want on the element or bin. You don’t need access to the bin source code, but it might be handy in this case to know exactly how the bin links things (otherwise you can deduce this with the log and some trial and error). Nvidia uses this technique in one of their examples to override the properties on their decoder (to set max performance) when it’s added to decodebin.

Re: timing. I suspect that’s why GoPro embeds their metadata in a data track and why DJI uses a subtitle sidecar file for its drone flight metadata. I didn’t invent the approach.

Do you mean that the nvds metadata is saved in a container file like mp4? Or that you can run inference again and obtain the metadata? I think the latter is undesirable, as an obvious waste of resources.

Anyhow, I’m actually wondering how is the nvds metadata actually encoded and transported across bins. Is it using the standard GstMeta structure with proprietary encoding on top of it, or is it using an entirely different proprietary mechanism?

I’m by no means an expert here, so please take what I say with a block of salt… but if the playback is “frame-by-frame” identical, and the settings remain unchanged, the GIEs should produce the same meta data for the same frame each time. Perhaps the confidence can fluctuate? The unique object tracking Id can change depending on where you start the playback.

Others with more experience, please correct me if I’m wrong.

If you haven’t already, take a look at

NvDsFrameMeta and NvDsObjectMeta in /opt/nvidia/deepstream/deepstream-X.0/sources/includes/nvdsmeta.h

to see what meta data is produced… and I believe Deepstream uses the standard/scalable approach built into GStreamer. That said, my experience with GStreamer has come solely from working with Deepstream, so I’m probably not your best source.

Re: how metadata is transported across bins: There are various metadata structures that are attached to the buffer. As the buffer travels down the pipeline, so goes the metadata with it. At the end of the pipeline, it’s freed. Various nvidia metadata examples show how to attach your own and parse the existing. Additionally, gstreamer has its own facilities to accomplish the same thing (which is probably what Nvidia is using).

Re: metadata saved in a container: it’s not done like this currently but it could be. There are various subtitle elements in gstreamer that show how this can be done, and containers like mkv, mp4, and webm support this.

Regarding keeping the meta muxed into the recorded files… There do seem to be plugins and ways to read a .srt file with a filesrc element and then display it over the playing video or mux it into a file with the video stream.
In this case, though, you would be generating a text stream inside the pipeline with the .srt contents instead of reading it from a file. This may mean creating your own component to generate the text stream, or using appsrc somehow.

My initial thought was to save all meta data for each frame matching the frames in the recorded files from SmartRecord; this could even include the OSD meta from the NVAnalytics plugin. Having all the meta data would allow playing back the original streams and re-drawing the OSD stuff. My thought was to have a plugin read the meta data from file and populate the frame’s meta data, so that the OSD could just redraw the stuff (e.g. FileSrc > ReadMeta > NvOsd > Display). This should be pretty fast as no inferencing needs to be done.

As you mentioned, an alternative might be to store meta data and a jpeg snapshot just for events (and record the original streams). This would definitely solve the part where one wants to search for events (e.g. show me persons and vehicles for yesterday). These previews could show the bounding boxes. Having a GIF instead of a single jpeg would be even nicer. Playback would then be the original stream without bounding boxes. It would be nice if the approach above also made it possible to display all the meta data (bounding boxes again) during playback.

Thanks for confirming that this is possible. I will investigate.

Thanks for pointing these out.

Have a look at my response above to Robert on how I imagined reading and displaying the meta data again with OSD. Would this work?