I’m working on a Jetson Nano with DeepStream 6.0 and handling a 3264x2464 RTSP input.
I’ve trained a detectnet_v2 resnet_18 model using the TAO Toolkit.
What resolutions should I use for training and inference? Is it necessary to resize all the images to 960x544 during training?
And when running inference, can I provide the input as 3264x2464 and expect it to be automatically resized?
The training resolution and the inference resolution are both determined by the model's input layer resolution.
The larger the model input resolution, the larger the model size.
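To make the resolution constraint concrete, here is a small sketch. It assumes the standard DetectNet_v2 behavior documented for TAO: the ResNet-18 backbone downsamples by a stride of 16, so input width and height must be multiples of 16, and the detection grid (and with it activation memory and compute) grows with the input resolution.

```python
# Sketch: why 960x544 is a common detectnet_v2 input choice.
# Assumption: DetectNet_v2 uses a downsampling stride of 16, so input
# dimensions must be multiples of 16 and the output grid shrinks by 16x.

STRIDE = 16

def output_grid(width: int, height: int, stride: int = STRIDE):
    """Return the detection grid size for a given input resolution."""
    if width % stride or height % stride:
        raise ValueError("input dims must be multiples of the stride")
    return width // stride, height // stride

print(output_grid(960, 544))    # (60, 34)
# A larger input, e.g. 1920x1088, quadruples the grid cells (and the
# activation memory), which is why a larger input resolution means a
# larger, slower model:
print(output_grid(1920, 1088))  # (120, 68)
```

Note that 3264x2464 is not a multiple of 16 in height, so the camera resolution itself would not be a valid training resolution without resizing or padding.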
DeepStream SDK is only an inference framework. Since you trained the model, you should already know its input resolution.
For inference with DeepStream SDK, you can use gst-nvinfer to deploy your trained model. Resizing, format conversion, etc. are done inside DeepStream; the only thing you have to do is fill in the configuration file with the proper parameters. Please refer to the DeepStream samples: C/C++ Sample Apps Source Details — DeepStream 6.2 Release documentation
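As an illustration, a gst-nvinfer configuration for a TAO detectnet_v2 model might look roughly like the fragment below. File names, the model key, and the class count are placeholders for your own export; the `infer-dims` value must match your model's input layer (channels;height;width), and DeepStream scales the decoded 3264x2464 frames to it automatically.

```
[property]
gpu-id=0
# Placeholder paths/key for your exported TAO model:
tlt-encoded-model=resnet18_detector.etlt
tlt-model-key=<your-tao-key>
labelfile-path=labels.txt
# Must match the model input layer, e.g. 3x544x960 for a 960x544 model:
infer-dims=3;544;960
uff-input-blob-name=input_1
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
net-scale-factor=0.00392156862745098
batch-size=1
num-detected-classes=3
```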
Thanks,
Do you think it would be effective to train the TrafficCamNet model using images captured from a height of 12 feet in order to detect cars and pedestrians?
Or should I train another architecture from scratch?
In our experience, if your model will be used to run inference on images captured from a 12-foot height, it is better to include such images in your training dataset. The TAO Toolkit provides a pre-trained TrafficCamNet model; you can retrain it with your own dataset. Overview - NVIDIA Docs
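For reference, retraining from the pre-trained weights rather than from scratch is mostly a matter of pointing the detectnet_v2 training spec at them. The fragment below is illustrative only; the path is a placeholder for wherever you download the TrafficCamNet weights from NGC.

```
# Fragment of a detectnet_v2 training spec (illustrative; path is a placeholder).
model_config {
  pretrained_model_file: "/workspace/pretrained/trafficcamnet/resnet18_trafficcamnet.tlt"
  arch: "resnet"
  num_layers: 18
}
```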