Heyho,
I’d like to use the jetson-inference library for a project, and I’m stuck on using already-installed IP cameras with a 1440p image format. I’m trying this workaround for RTSP image input.
Do I have to preprocess the images from the camera before I can label them? I plan to label them using labelImg, like someone does in this post.
Does the transfer-learning feature of jetson-inference take images of any resolution and do the resizing by itself?
Same question for the camera input: do I have to resize it?
Hi @CN_Jetter, it automatically handles the resizing internally (both for training and inference), so it doesn’t matter what the camera resolution is. For example, with SSD-Mobilenet the video feed will automatically be resized down to 300x300 before being fed into the DNN (you don’t need to worry about this or take it into consideration during labelling).
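To illustrate why the annotation resolution doesn’t matter, here’s a quick sketch (plain Python, with hypothetical box coordinates) of the scaling that effectively happens when a 1440p frame is squashed to a 300x300 network input: the bounding boxes scale along with the image, so their relative positions are preserved.

```python
def scale_box(box, src_w, src_h, dst_w=300, dst_h=300):
    """Map a bounding box (x1, y1, x2, y2) from the source frame
    resolution to the resized DNN input resolution."""
    sx, sy = dst_w / src_w, dst_h / src_h
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)

# A (hypothetical) box annotated on a 2560x1440 frame...
box_1440p = (640, 360, 1280, 720)
# ...lands proportionally in the 300x300 network input:
print(scale_box(box_1440p, 2560, 1440))  # (75.0, 75.0, 150.0, 150.0)
```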
After trying labelImg and CVAT, I prefer the CVAT tool because it exports the dataset with the Pascal VOC directory structure, whereas labelImg just gives you a bunch of XML files. In theory either can be made to work, though.
I was kind of put off by the rather complex interface of CVAT (I couldn’t find the screen where I can create the bboxes with the labels).
I should give it a second chance. Thank you for pointing out the benefits of CVAT.
Yea, the CVAT interface is more complex. With any tool, I recommend doing a small subset of annotations first (say, just 10 images), and then making sure you can run that through train_ssd.py before you spend a lot of time annotating.
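Before running that small subset through train_ssd.py, a quick offline check can catch the most common failure: class names appearing in the annotation XMLs that aren’t in labels.txt. Here’s a minimal sketch (stdlib only; the function names are mine, not part of jetson-inference):

```python
import xml.etree.ElementTree as ET

def classes_in_annotation(xml_text):
    """Return the set of class names used in one Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    return {obj.findtext("name") for obj in root.iter("object")}

def missing_labels(xml_texts, labels):
    """Report class names used in the annotations but absent from
    labels.txt -- a common reason train_ssd.py chokes on a dataset."""
    used = set().union(*(classes_in_annotation(t) for t in xml_texts))
    return sorted(used - set(labels))

# Hypothetical annotation with two classes, but labels.txt only lists one:
sample = """<annotation>
  <object><name>cat</name></object>
  <object><name>dog</name></object>
</annotation>"""
print(missing_labels([sample], ["cat"]))  # ['dog']
```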
The only modification needed with CVAT was to make a labels.txt for the dataset (the class names, one per line). The text file it exports at the root level of the dataset is not that.
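If you’d rather not type labels.txt by hand, it can be generated from the exported Pascal VOC XML files. A hedged sketch (stdlib only; assumes the usual `Annotations/` directory of the VOC export, and the helper name is mine):

```python
import glob
import os
import xml.etree.ElementTree as ET

def write_labels(annotations_dir, out_path):
    """Collect every class name used in the Pascal VOC XML files under
    annotations_dir and write them to out_path, one name per line."""
    names = set()
    for xml_file in glob.glob(os.path.join(annotations_dir, "*.xml")):
        for obj in ET.parse(xml_file).getroot().iter("object"):
            names.add(obj.findtext("name"))
    with open(out_path, "w") as f:
        f.write("\n".join(sorted(names)) + "\n")
    return sorted(names)
```

Usage would be something like `write_labels("my-dataset/Annotations", "my-dataset/labels.txt")`; sorting keeps the class order stable across re-exports.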