Hello All,
I am trying object detection using YOLO (deep learning). On my PC (16 GB RAM, i7 processor, CUDA 7.0, 3007 MiB GPU memory) the performance is 10 fps. I wanted to improve the performance of the algorithm and hence used the Jetson TX1 developer kit. I have followed Dustin Franklin's GitHub repository for object detection using TensorRT. But with the developer kit my performance dropped to 5-6 fps. Kindly help me with the following queries:
- Does TensorRT support YOLO? (I have read that TensorRT works with Caffe.)
- Is there any way to convert the YOLO files (weights and cfg files) to the equivalent Caffe files (caffemodel and prototxt)?
- If yes to (1), what could be the reason for the poor performance on the TX1 developer kit, and how can we resolve this issue?
- If no to (1), what would be the alternative?
A big thanks in advance for taking the time to help me. I have attached the system details of my PC.
Hi,
Please maximize the CPU/GPU frequency first to get the best performance:
sudo ./jetson_clocks.sh
TensorRT supports YOLO, but it needs a workaround for leaky ReLU.
For more details, please check this: https://devtalk.nvidia.com/default/topic/990426/tensorrt-yolo-inference-error/
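To see why a workaround is possible at all: leaky ReLU can be rewritten exactly in terms of layers TensorRT does support (a plain ReLU, a Scale, and an Eltwise sum), since leaky(x) = (1 − α)·relu(x) + α·x. A minimal NumPy sketch verifying the identity (the function names and α = 0.1 are illustrative, not from the linked thread):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # Reference leaky ReLU as used in YOLO/darknet (alpha = negative slope)
    return np.where(x > 0, x, alpha * x)

def leaky_relu_decomposed(x, alpha=0.1):
    # The same function built only from natively supported ops:
    # a plain ReLU, a Scale (constant multiply), and an Eltwise SUM.
    relu = np.maximum(x, 0.0)                 # ReLU layer
    return (1.0 - alpha) * relu + alpha * x   # Scale + Eltwise SUM

x = np.linspace(-3, 3, 13)
assert np.allclose(leaky_relu(x), leaky_relu_decomposed(x))
print("decomposition matches reference leaky ReLU")
```

For x > 0 the sum gives (1 − α)x + αx = x, and for x ≤ 0 it gives αx, so the two forms agree everywhere; in a Caffe prototxt the same structure is expressed with ReLU, Scale, and Eltwise layers.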
Hi AastaLLL,
Thank you very much for sharing your knowledge.
To my knowledge, sudo ./jetson_clocks.sh raises the clock limits to increase the performance of the TX1. I have already tried that, but it didn't make much difference. I am currently using YOLO (YOLO: Real-Time Object Detection) for object detection. The link you provided about TensorRT supporting YOLO is helpful, but I have a query: we need to convert the network structure of our neural network into the equivalent Caffe format.
As described in the link, that person used YOLO, but their network structure is in prototxt format and the weights and biases are in caffemodel format. In YOLO, however, the network structure is in a cfg file and the weights and biases are in a weights file. To make it run, we need to convert the cfg file to prototxt and the weights file to caffemodel.
Is there any script to convert the cfg file to a prototxt file, or do we need to create the network structure in the prototxt file manually?
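For context on what such a converter has to do: the darknet .cfg format is a flat INI-style file of `[section]` headers followed by `key=value` lines. A minimal sketch of the first step, parsing the cfg into (section, params) pairs, is below; the harder part, mapping each section to the matching Caffe layer and reordering the weights, is deliberately left out (the function name and example cfg are illustrative only):

```python
def parse_darknet_cfg(text):
    """Parse darknet .cfg text into a list of (section_name, params) pairs.

    This is only the first step of a cfg -> prototxt converter: each
    parsed section still has to be mapped to its Caffe layer by hand.
    """
    sections = []
    for line in text.splitlines():
        line = line.split('#')[0].strip()   # drop comments and blank lines
        if not line:
            continue
        if line.startswith('[') and line.endswith(']'):
            sections.append((line[1:-1], {}))   # start a new section
        elif '=' in line and sections:
            key, _, value = line.partition('=')
            sections[-1][1][key.strip()] = value.strip()
    return sections

example = """
[net]
width=416
height=416

[convolutional]
filters=16
size=3
activation=leaky
"""
for name, params in parse_darknet_cfg(example):
    print(name, params)
```

Each `[convolutional]` section would then become a Convolution layer (plus the leaky-ReLU workaround) in the prototxt, with the weights copied over from the .weights file in the order darknet stores them.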
Hi,
Thanks for your feedback.
As you said, jetson_clocks.sh maximizes the CPU and GPU frequency.
Generally, it should reach around 10-11 fps at the maximum frequency.
If you can’t reach this performance, please let us know.
For the model format, we use this public script to convert the model into Caffe (YOLOv1 only):
And as mentioned in the issue in #2, please apply our workaround (WAR) for leaky ReLU to make the network run correctly.
Here you can find the supported layers of TensorRT:
https://devblogs.nvidia.com/parallelforall/production-deep-learning-nvidia-gpu-inference-engine/
Hi,
Thank you once again.
I have followed all the links mentioned above, but the performance of the algorithm has not improved much. It only reached 7 fps, and with low accuracy.
I just wanted to ask: is there any way to reach a performance of at least 10 fps with good accuracy?
Hi,
Thanks for your feedback.
May I know whether the 7 fps result is from YOLO or detectNet?
We have previously verified that detectNet can reach around 10 fps on the TX1 at maximum frequency.
If you want to solve a detection problem, it’s recommended to use detectNet.