I converted a TensorFlow model to UFF. However, the inference time is slower than the original TensorFlow model. My GPU is a GTX 1070.
Could you share more information about your use-case?
- Which network do you use?
- What is the environment of your use-case?
Is it TensorRT on Jetson and TensorFlow on GTX-1070?
My network is MobileNet; both TensorRT and TensorFlow run on the GTX 1070.
My environment is Ubuntu 16.04 with CUDA 8.0.
We found a public mobilenet repository.
Is this the model you are using? We want to check it further.
It is exactly what I use.
By the way, I found another confusing problem.
I followed your lenet.py example. When I convert the whole graph to UFF, TensorRT is faster than TensorFlow. However, when the graph is only a single convolution layer, TensorRT is much slower than TensorFlow, and I don't know why.
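One possible explanation (an assumption, not something confirmed in this thread) is that for a tiny graph, fixed per-inference overhead such as host-device copies and API call cost dominates, so an optimized kernel barely changes the total. A minimal sketch with purely hypothetical timings, not measured TensorRT numbers:

```python
# Illustrative cost model only: all numbers below are made up to show
# how fixed per-call overhead can dominate a tiny workload.

def total_time_ms(fixed_overhead_ms, compute_ms, n_calls):
    """Total wall time if every inference pays a fixed overhead plus compute."""
    return n_calls * (fixed_overhead_ms + compute_ms)

# Whole LeNet graph (hypothetical): compute dominates, so a faster kernel wins.
full_graph = total_time_ms(fixed_overhead_ms=0.5, compute_ms=5.0, n_calls=1000)

# Single conv layer (hypothetical): overhead dominates, so kernel speed barely matters.
single_conv = total_time_ms(fixed_overhead_ms=0.5, compute_ms=0.05, n_calls=1000)

print(full_graph)   # overhead is ~9% of the total
print(single_conv)  # overhead is ~91% of the total
```

Under this model, speeding up the conv kernel itself would cut the single-layer total by at most a few percent, which is consistent with small graphs not benefiting from TensorRT.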
Hey, I also have some trouble with the latency of TensorRT.
My UFF model (converted from TensorFlow) running in TensorRT (with Docker) takes 300 s for 1000 inferences, while native TensorFlow 1.8 only takes about 30 s.
I'm not familiar with CUDA. So, is the main advantage of TensorRT throughput, or is there likely a bug in my code?
P.S. both were tested on a P100.
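300 s for 1000 inferences is 300 ms each, which seems high for a P100, so one thing worth checking (an assumption about the measurement, not a diagnosis) is whether the timing includes one-time setup such as engine/context creation or first-call allocations. A hedged sketch of a timing harness that excludes warm-up iterations; `infer` is a placeholder for the actual TensorRT execution call:

```python
import time

def benchmark(infer, n_warmup=10, n_runs=100):
    """Mean seconds per call, excluding warm-up iterations.

    `infer` is a stand-in for your real inference call; the first few
    calls often include lazy initialization (context setup, memory
    allocation) and should not be counted toward steady-state latency.
    """
    for _ in range(n_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return elapsed / n_runs

# Example with a dummy workload standing in for the real model:
mean_s = benchmark(lambda: sum(range(10000)))
print(f"{mean_s * 1000:.3f} ms per call")
```

If the warm-up-excluded number is much lower than 300 ms, the original figure was mostly setup cost rather than per-inference latency.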
Do you use TensorRT API or TensorFlow-TensorRT interface?
I encountered the same problem as cosin7877. I use the TensorRT API and converted the TensorFlow model to a UFF file. What are the usual causes of these problems?
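One common factor behind "TensorRT looks slow" reports (again an assumption, since the thread doesn't show the benchmark code) is measuring with batch size 1 while paying a fixed per-call cost (launch overhead, host-device copies) on every call; larger batches amortize that cost, which is where throughput gains show up. A toy model with hypothetical numbers, where `per_batch_overhead_ms` and `per_image_compute_ms` are made-up parameters:

```python
def throughput_imgs_per_s(batch, per_batch_overhead_ms, per_image_compute_ms):
    """Hypothetical model: each batched call pays one fixed overhead,
    plus a per-image compute cost; returns images per second."""
    latency_ms = per_batch_overhead_ms + batch * per_image_compute_ms
    return batch / (latency_ms / 1000.0)

# With a made-up 2 ms fixed cost and 0.5 ms/image of compute,
# throughput rises sharply with batch size:
for b in (1, 8, 32):
    print(b, round(throughput_imgs_per_s(b, 2.0, 0.5), 1))
```

So if both frameworks were benchmarked at batch size 1 with synchronous copies, the comparison may say more about per-call overhead than about the optimized kernels.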