But even after searching for a long time, it is still not clear to me how to do that.
I can see that there are all these tools for training or inference, like the Transfer Learning Toolkit, the TLT computer vision pipeline, TensorRT, JetPack and the DeepStream SDK, and some of them seem to run in Docker containers. Then there are the conversion tools to convert models from/to .tlt, .etlt, .trt and so on.
How does any of these bring me closer to my goal of doing inference on the Jetson Nano, or, for now, just on an x86 PC?
If you could just let me know if this is possible and if so what is the way to go, that would be great. Thanks!
“tao gazenet inference xxx”. See Gaze Estimation — TAO Toolkit 3.22.05 documentation. For this approach, I suggest running the officially released notebook as a starting point. The notebook downloads a public dataset and runs training and inference.
This approach runs on an x86 PC only.
I tried running the Facial Landmarks Estimation app and it works on a single image. However, if I input a video, it is extremely slow; sometimes it takes a minute for a single frame. If I also try to write the results to an output video, the app gets stuck after 6-7 frames and does not continue even after waiting 10 minutes.
Gaze estimation works on one specific image now. For all other images I get a segmentation fault, even though they come from the same camera and have the same size.
And in this case, too, it is extremely slow.
DeepStream can generate an engine from such models, but the buffer allocation implementation has some problems. So if you run the GazeNet sample application without an existing engine, it will fail with a core dump on the first run. The engine is still generated during that first run, so when you run the application again, it will work.
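Given that behavior, a small wrapper that retries the launch once can paper over the first-run core dump until a proper fix is released. A sketch in Python; the `deepstream-gaze-app` binary and config file names are placeholders, not the sample's actual names:

```python
import subprocess

def run_with_engine_rebuild(cmd):
    """Run a command once (the first run may core-dump while the TensorRT
    engine is being generated) and retry a single time if it failed."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        print("first run failed (possibly during engine generation); retrying...")
        result = subprocess.run(cmd)
    return result.returncode

# Usage, with placeholder binary/config names:
# run_with_engine_rebuild(["./deepstream-gaze-app", "-c", "gaze_config.txt"])
```

The wrapper retries exactly once, matching the reported behavior that the second run succeeds after the engine file exists.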
That sounds as if it sometimes works and sometimes doesn't. Does that mean this software is just not ready for real usage, or could there still be an issue on my side?
The internal team is working on that. It will be available in a future release.
Adding the gaze estimation values as a text overlay was at least straightforward.
However, gaze estimation does not seem to work well with the infrared images I use. At least it does not react to pupil movements. Face detection and face alignment work fine, just not on the bright white pupils produced by the infrared illumination.
That is a pity, but I assume the only way to fix it would be to retrain the model with data from this camera.
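Before going as far as collecting training data, one cheap experiment would be to preprocess the IR frames so they look more like visible-light input, for example by inverting intensities so the bright IR pupil becomes dark. This is only a guess about why the model fails, not a documented fix; a minimal NumPy sketch of the idea:

```python
import numpy as np

def invert_ir_frame(frame):
    """Invert an 8-bit grayscale IR frame so the bright IR pupil becomes
    dark, closer to the visible-light images the gaze model was
    presumably trained on. Assumes a uint8 single-channel frame."""
    return (255 - frame.astype(np.int16)).astype(np.uint8)

# Tiny synthetic frame: 250 stands in for a bright pupil pixel.
frame = np.array([[250, 40], [200, 30]], dtype=np.uint8)
print(invert_ir_frame(frame))  # the 250-valued pixel becomes 5
```

Whether the inverted frames actually help would have to be checked against the gaze model; if not, retraining with camera-specific data remains the fallback.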