I am interested in running the GoogleNet car detection sample included with the Multimedia API. I would like to test it on a real-time video stream, as detailed here: https://devblogs.nvidia.com/parallelforall/jetpack-doubles-jetson-tx1-deep-learning-inference/
What optimizations do I need to apply? And how do I translate the GoogleNet output into bounding boxes? It is unclear to me from the .prototxt and .caffemodel files alone how to do this. Could somebody share an example program?
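For reference, my current understanding is that the detection variant of GoogleNet (DetectNet-style) emits a coverage grid plus a per-cell bounding-box grid, and the post-processing thresholds the coverage and scales offsets by the network stride. This is only a sketch under assumptions (the stride of 16, the threshold, and the offset convention of x1,y1,x2,y2 relative to each cell's top-left corner are my guesses, not taken from the sample):

```python
# Hypothetical sketch of DetectNet-style post-processing; the stride,
# threshold, and offset convention below are assumptions, not confirmed
# against the actual sample code.
STRIDE = 16               # assumed network downsampling factor
COVERAGE_THRESHOLD = 0.5  # assumed detection threshold

def grid_to_boxes(coverage, bboxes, stride=STRIDE, threshold=COVERAGE_THRESHOLD):
    """coverage: [H][W] floats in [0,1]; bboxes: [4][H][W] pixel offsets
    (x1, y1, x2, y2) relative to each grid cell's top-left corner."""
    boxes = []
    for gy in range(len(coverage)):
        for gx in range(len(coverage[0])):
            if coverage[gy][gx] < threshold:
                continue
            # Map the grid cell back to input-image coordinates.
            cx, cy = gx * stride, gy * stride
            x1 = cx + bboxes[0][gy][gx]
            y1 = cy + bboxes[1][gy][gx]
            x2 = cx + bboxes[2][gy][gx]
            y2 = cy + bboxes[3][gy][gx]
            boxes.append((x1, y1, x2, y2, coverage[gy][gx]))
    return boxes
```

Overlapping boxes from neighboring cells would presumably still need clustering or non-maximum suppression afterwards; is that roughly what the sample does?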