Working with gxf entities video

Hello,
I am new to Holoscan and have some questions about the GXF entities video files.

  1. Do I understand correctly that this is a video that has been split into parts so it can be processed in a graph?
  2. There is a Python script for producing the GXF entities video from the original one. Is there a C++ implementation?
  3. How can I restore the video from the GXF entities format back to its original format?
  4. If I want to process the video with a library (to work with the video frames) after reading it from disk with the video replayer, how can I do this? Do I need to create a C++ operator and receive the output from the video_replayer? In that case, can I treat the input as frames or not?

Best wishes,
Valeriy

Hi Valeriy, just to make sure we’re on the same page, we’re talking about file pairs such as {surgical_video.gxf_entities, surgical_video.gxf_index}.

  1. The original video is converted to GXF entities for playback with the stream_playback operator. There is also a new video playback operator that takes H.264 input, so converting to GXF entities is no longer required if you choose to use it; for an example, please see holohub/applications/h264_endoscopy_tool_tracking in the nvidia-holoscan/holohub repository on GitHub (main branch). Please note that the H.264 video decode operator does not adjust the frame rate as it reads the elementary stream input, so the video stream is displayed as quickly as decoding can be performed. Frame rate adjustment will be coming soon in a new version of the operator.
  2. The convert_video_to_gxf_entities.py script is available only in Python, but running it should be a one-time step for each video you want to convert to GXF entities, and the conversion happens ahead of time rather than during app runtime.
  3. A reverse script for converting GXF entities back to video is coming at the end of July.
  4. You can treat the input from a video replayer as frames and write your own processing logic in a native Holoscan operator in either C++ or Python. I would suggest looking at the C++ and Python examples in holoscan-sdk/examples/tensor_interop (nvidia-holoscan/holoscan-sdk on GitHub, main branch), as well as the other examples in the holoscan-sdk repo. HoloHub's applications (holohub/applications in nvidia-holoscan/holohub on GitHub, main branch) also show custom processing on frames: for example, holohub/applications/ssd_detection_endoscopy_tools demonstrates custom processing, and holohub/applications/ssd_detection_endoscopy_tools/ssd_step2_route1.py shows the torchvision library used in a native Holoscan operator.
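
To make point 4 a bit more concrete, here is a minimal sketch of what a native Python operator receiving frames from the replayer could look like. It is not taken from the linked examples. The names replayer_kwargs, invert_frame and FrameInvertOp are hypothetical, the inversion is only a placeholder for real processing, and the operator part assumes a recent SDK release where received messages are dict-like and the frames sit in host memory.

```python
from pathlib import Path

import numpy as np

try:
    import holoscan as hs
    from holoscan.core import Operator, OperatorSpec
except ImportError:
    # Fallback so the plain helper functions below can be tried without the SDK.
    hs, Operator, OperatorSpec = None, object, None


def replayer_kwargs(directory, basename):
    """Check the .gxf_entities/.gxf_index pair and return replayer arguments."""
    for suffix in (".gxf_entities", ".gxf_index"):
        path = Path(directory) / (basename + suffix)
        if not path.is_file():
            raise FileNotFoundError(f"missing {path}")
    return {"directory": str(directory), "basename": basename}


def invert_frame(frame):
    """Placeholder processing: invert an 8-bit (height, width, channels) frame."""
    return 255 - np.asarray(frame, dtype=np.uint8)


class FrameInvertOp(Operator):
    """Hypothetical native operator: receives frames, emits processed frames."""

    def setup(self, spec: OperatorSpec):
        spec.input("in")
        spec.output("out")

    def compute(self, op_input, op_output, context):
        # Assumption: the message is dict-like (tensor name -> tensor), as in
        # recent SDK releases; older releases return an entity object instead.
        message = op_input.receive("in")
        # Assumption: frames are in host memory; for device memory, CuPy would
        # take the place of NumPy inside invert_frame.
        processed = {
            name: hs.as_tensor(invert_frame(tensor))
            for name, tensor in message.items()
        }
        op_output.emit(processed, "out")
```

In an application's compose(), the replayer could then be created as VideoStreamReplayerOp(self, name="replayer", **replayer_kwargs("/path/to/data", "surgical_video")) (the directory is a placeholder) and connected with add_flow(replayer, invert, {("output", "in")}), with the "out" port of the custom operator going to a downstream operator such as a visualizer.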

Thank you very much for your explanations!