I’ve created a custom tiny YOLOv4 TensorRT model, which I’m running for live inference on a 4 GB Jetson Nano development board using this Python repository.
My input size is 416x416 (needed because I’m trying to detect relatively small objects within the frame), and I’m using INT8 precision. The rest of my setup details can be seen in this jtop output:
I’m only managing a maximum throughput of 2–2.3 fps, and I need to get this to at least 15 fps.
I was wondering whether there are any further steps I can take (beyond TensorRT, tiny YOLO and INT8) to improve the frame rate on the Nano, and what speed-up I might expect from moving to one of the more powerful Jetson boards?
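To put numbers on the gap (my own rough arithmetic, not measured), going from ~2.3 fps to 15 fps needs about a 6.5x speed-up, and since I need 416x416 for the small objects, I can't simply shrink the input to close it:

```python
# Rough back-of-envelope numbers. Assumptions: throughput scales with the
# inverse of per-frame compute, and compute scales with input area.
current_fps = 2.3
target_fps = 15.0
speedup_needed = target_fps / current_fps
print(f"speed-up needed: {speedup_needed:.1f}x")  # roughly 6.5x

# Even dropping the input from 416x416 to 320x320 (which would hurt
# small-object detection) only cuts per-frame compute to about 60%:
area_ratio = (320 / 416) ** 2
print(f"compute at 320x320 vs 416x416: {area_ratio:.0%}")
```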
Could you check the Makefile in that folder first? Since it links the DeepStream header with a relative path, you’ll need to update it if you copy the folder elsewhere.
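For anyone hitting the same problem, the change looks something like this (the variable name and paths here are illustrative, not copied from the repo; check your own Makefile and DeepStream install location):

```makefile
# Before: a relative include path that breaks once the folder is copied
# CFLAGS += -I../../includes

# After: point at the DeepStream headers by absolute path (adjust to
# wherever DeepStream is installed on your system)
CFLAGS += -I/opt/nvidia/deepstream/deepstream/sources/includes
```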
Fantastic, thanks very much! I now have a working model running at 35 fps, which is great.
But while it’s running I get warnings like: Num classes mismatch. Configured: 2, detected by network: 80.
My original darknet cfg file definitely had only 2 classes, as does my labels.txt, and DeepStream fails to run the model at all if I set num-detected-classes=80 in config_infer_primary_yoloV4.txt.
Can you suggest anywhere else where the mismatch might have occurred?
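For reference, the class-related part of my config_infer_primary_yoloV4.txt looks roughly like this (abbreviated to the keys relevant to the warning; other properties omitted):

```ini
[property]
# Label file with my 2 class names
labelfile-path=labels.txt
# Matches my darknet cfg, but the built engine apparently reports 80
num-detected-classes=2
```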