The recently released JetPack 2.3 includes improved deep learning performance, and last week at GTC Europe the latest GPU-equipped deep-learning robotic technology was unveiled. Curious how to deploy neural networks? Start developing applications with advanced AI and computer vision today using NVIDIA’s high-performance deep learning tools. Join us for this online webinar being held on 10/12, where we’ll share and discuss:
How to use NVIDIA’s deep learning tools, such as DIGITS and TensorRT.
Various types of neural network-based primitives available as building blocks, deployable onboard intelligent robots and drones using NVIDIA’s Jetson Embedded Platform.
Real-time deep-learning solutions for image recognition, object localization, and segmentation. (GitHub)
Training workflows for customizing network models with new training datasets, and emerging approaches to automation like deep reinforcement learning and simulation.
Thanks everyone for joining us for the webinar earlier today! For those of you who would like to watch at another time, the recording is available. The slides can be found here:
Thanks for the Webinar @dusty_nv
I could hear the audio yesterday, but there were no visuals.
Checking it now.
P.S.:
The presenter said an email would be sent after the webinar, but that never arrived either.
Is the problem at my end? Did anyone else receive an email about this?
I am receiving all my other emails, so it isn’t a problem with my filter.
Thanks for your presentation.
I am testing TensorRT and your demos available on GitHub.
I would like to ask you two questions related to my TensorRT use.
1/ TensorRT supported layers
You explained that any Caffe prototxt can be converted to TensorRT format for fast inference.
You mentioned that the TensorNet class can do this conversion for the imageNet or detectNet network types.
How can I do this for my own custom-designed nets, since it seems not all layer types are supported for the conversion? What is the exact list of supported layers? Will this list be extended?
2/ FP16 usage
As I did not manage to convert my network and get the full benefit of TensorRT’s optimizations, I wanted at least to use FP16 instead of FP32 and see how much of a speed improvement I could get. Unfortunately, using the experimental nvcaffe FP16 branch, I ran into the same problem: the branch is not maintained, and recent layers such as dilated convolutions are not supported. Is there an up-to-date FP16-compatible Caffe version available somewhere? If not, what is the methodology for making my layers FP16 compatible?
Hi Alex, the exact list of supported layers is in the TensorRT documentation. To obtain it, you can either run JetPack 2.3 on a host PC, which downloads the packages to a folder underneath the JetPack install that you can unzip to read the docs, or you can get the desktop version through the NVIDIA website. Offhand, here are some supported layer types, with more being added in the future in addition to support for adding custom layers:
Convolution: 2D
Activation: ReLU, tanh and sigmoid
Pooling
ElementWise
LRN
Fully-connected
SoftMax
Deconvolution
In the future the nvcaffe/fp16_experimental branch should be updated; however, for experimenting with the latest layers and FP16 support today, I tend to use Torch on the TX1, which includes working FP16 in cutorch and the latest cuDNN 5.1 bindings. The latest Caffe layers would need to be optimized for FP16 in a similar fashion to the existing layers in nvcaffe/fp16_experimental. My understanding is that Torch somewhat skirts the need for per-layer FP16 optimizations, because the underlying tensor operations in Torch have already been accelerated with FP16. Hope that helps!
Hello again!
Thanks for your quick answer.
I will take a deeper look at the TensorRT documentation.
Can you give me the exact path to access this documentation on my host PC?
Last question, do you have any idea when custom layers will be supported ?
Alex
On your host PC you would either install the host version, or, if you ran JetPack 2.3 on your host, it would have downloaded the GIE/TensorRT package for ARM and you could retrieve it from there. I’m not sure when the next version will be out, but since TensorRT 1.0 lets you bind to any output blobs, in theory you could intercept the data and run the custom layers elsewhere (e.g. in Caffe at FP32).
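To illustrate the interception idea: once the feature map from a bound output blob has been copied back to the host, the unsupported layer can be applied in plain CPU code. This is a hypothetical sketch, not code from TensorRT or the jetson-inference repo, using a single-channel 2D dilated convolution (one of the layer types mentioned as unsupported) with no padding and unit stride:

```cpp
#include <vector>

// Apply a KxK kernel with the given dilation to an HxW single-channel map.
// 'in' would be the host copy of a TensorRT output binding.
std::vector<float> dilated_conv2d(const std::vector<float>& in, int H, int W,
                                  const std::vector<float>& k, int K, int dilation) {
    int span = dilation * (K - 1) + 1;               // receptive field of the dilated kernel
    int outH = H - span + 1, outW = W - span + 1;    // valid (no-padding) output size
    std::vector<float> out(outH * outW, 0.0f);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
            for (int ky = 0; ky < K; ++ky)
                for (int kx = 0; kx < K; ++kx)
                    out[y * outW + x] +=
                        in[(y + ky * dilation) * W + (x + kx * dilation)] * k[ky * K + kx];
    return out;
}
```

The result could then be fed into a second TensorRT engine covering the remainder of the network, at the cost of the extra device-to-host round trip.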
How can I use TensorRT without nvcaffeparser? In my application, the caffemodel and prototxt files are encrypted.
Should I use the INetworkDefinition class to define my network builder from the encrypted files?
Hi Renbo, yes, you should be able to do something like that. TensorRT includes C/C++ interfaces for configuring layers in addition to the caffeparser. I believe you may also be able to pass nvcaffeparser a string that your application decrypted in memory.
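A minimal sketch of the decrypt-in-memory step, under stated assumptions: XOR stands in for a real cipher here purely for illustration, and whether your nvcaffeparser version accepts an in-memory buffer (rather than a file path) needs checking against its headers; if it does not, the fallback is building the layers directly through INetworkDefinition:

```cpp
#include <string>
#include <cstdint>

// Hypothetical scheme: the prototxt/caffemodel bytes are stored encrypted on
// disk and decrypted into memory, so the plaintext never touches the filesystem.
// XOR with a single-byte key is a placeholder -- use a real cipher in practice.
std::string xor_decrypt(const std::string& cipher, uint8_t key) {
    std::string plain(cipher);
    for (char& c : plain)
        c = static_cast<char>(static_cast<uint8_t>(c) ^ key);
    return plain;  // hand this buffer to the parser, or parse it yourself
}
```

Since XOR is its own inverse, the same function encrypts and decrypts; a production version would use an authenticated cipher and keep the key out of the binary.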
I checked out nvidia.com/DLI for the Deep Learning Institute, and couldn’t find much there aside from the Udacity course. Do you have links to the other self-paced courses and hands-on labs mentioned?
I would love some links to a tutorial on DIGITS, and how to get started to train object recognition - similar to the TX1 robot that would find bananas and avoid apples.
I WAS very excited to sign up for the “Introduction to Deep Learning Quest” on QwikLabs.
However, I was disappointed when I logged in today and found that the quest has disappeared from the site, along with the labs on DIGITS, Caffe, Theano, and Torch.
Any idea who can help, or why these courses were pulled? The “Hands On Lab” links work, but when clicking “Select Lab” within the page, QwikLabs doesn’t show the course.
I have successfully trained a convolutional model for classifying small images (64x64). I now want to deploy it on my Jetson TX1 using TensorRT, taking larger images as input, slicing them, and feeding the network with batches of 64x64 images. I saw your code on GitHub; thank you very much for that, it’s really helpful.
Sadly, I’m not very familiar with CUDA, and I was wondering whether you have some example code for running inference on batches of images, or could point out how to achieve this by editing portions of your code.
Hi Martin, thanks for the suggestion, I plan to add support for batches starting with imageNet. I am also adding an API to imageNet which returns the top N outputs instead of just the maximum output. I’ll post here when the patch for batch processing has been committed.
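The top-N part of such an API can be sketched independently of TensorRT. This is a hypothetical illustration (not the committed imageNet patch): given the classifier’s score vector, return the N best class indices with their scores, sorted descending, using std::partial_sort so only the top N are fully ordered:

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

// Return the n highest-scoring (classIndex, score) pairs, best first.
std::vector<std::pair<int, float>> topN(const std::vector<float>& scores, int n) {
    std::vector<int> idx(scores.size());
    std::iota(idx.begin(), idx.end(), 0);        // 0, 1, 2, ... one per class
    int k = std::min<int>(n, static_cast<int>(idx.size()));
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](int a, int b) { return scores[a] > scores[b]; });
    std::vector<std::pair<int, float>> out;
    out.reserve(k);
    for (int i = 0; i < k; ++i)
        out.emplace_back(idx[i], scores[idx[i]]);
    return out;
}
```

partial_sort is O(C log N) for C classes, which matters little at 1000 classes but keeps the per-tile cost negligible when classifying many batched crops.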