HELLO AI WORLD | Deploying Edge AI Web Apps on Jetson Orin Nano

This post covers recent updates to Hello AI World (https://github.com/dusty-nv/jetson-inference).

Whether you’re building an autonomous robot, an AI agent, or a smart vision system, deploying it with a live web interface undoubtedly improves its accessibility and practicality for everyday use. From any phone, tablet, or PC, you can instantly view dashboards with realtime analytics, monitor video streams remotely, and receive alerts & script actions when events occur. Web interfaces also make for great interactive learning and development tools.


^ Flask-based DNN Playground for experimenting with various models running in parallel on Jetson Orin Nano.

Taking this a step further, you can also enable users to intuitively tag/label incoming data and dynamically tune models in the background, forming a powerful feedback loop where the application adapts to their needs over time (see the Recognizer for an example of this). These days almost everyone has a camera phone in their pocket to collect data with, and can train their own AIs to address unique use cases and the specific tasks at hand.


^ Self-hosted interactive training/inference workflow from the Recognizer webapp.

Alas, delivering a frictionless user experience (which ultimately impacts the quality of the data and models) isn’t trivial, especially given the inherent challenges of edge ML, low-latency video streaming to clients, and the compute constraints of embedded systems.

Fortunately, Jetson Orin Nano has 40 TOPS of horsepower onboard, enough not only to perform DNN inferencing across many streams and sensor modalities, but also to retrain/fine-tune models and host web services locally. It’s essentially a miniature, energy-efficient AI server that you can deploy into the field for local processing of your data at the point of origin.


^ Jetson Orin Nano Developer Kit and compute module with 1024-core Ampere GPU, 6-core Arm CPU, and 8GB 128-bit LPDDR5 memory. Available for $499 or $399 with EDU discount.

In a set of recent updates to Hello AI World, I added WebRTC support for low-latency video streaming to/from browsers (with Python/C++ on the backend and JavaScript on the frontend), along with several examples of building edge AI web apps. It also works as-is with existing applications.

WebRTC has been seamlessly integrated into the video streaming APIs from jetson-utils, so from a developer’s perspective it acts like any other input or output stream. All you have to do is create a stream with the webrtc:// protocol and it will spawn a WebRTC server:

$ detectnet.py --model=peoplenet /dev/video0 webrtc://@:8554/output
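
The same pipeline can be built programmatically with the Python APIs; here’s a minimal sketch along the lines of detectnet.py (the model name and camera device are placeholders for whatever your build supports):

from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput

net = detectNet("peoplenet")                    # detection model (assumes peoplenet is available in your build)
camera = videoSource("/dev/video0")             # V4L2 camera input
output = videoOutput("webrtc://@:8554/output")  # spawns the built-in WebRTC server

while output.IsStreaming():
    img = camera.Capture()
    if img is None:                             # capture timeout
        continue
    detections = net.Detect(img)                # overlays the detections on the image
    output.Render(img)                          # streams the frame to connected clients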

Then just navigate your browser to your Jetson’s IP on port 8554:


^ Built-in WebRTC server in Hello AI World can be used from existing inferencing applications or webapps.

It supports multiple clients (sharing the encoded video between them), multiple streams, and send/receive, so you can both send video from your Jetson and receive video from clients’ cameras. On a local wireless network, the latency typically isn’t noticeable, even in full-duplex mode where the browser’s webcam is sent to the Jetson for processing and the Jetson streams the results back to the browser.
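
For example, a full-duplex pipeline follows the same URL convention for both the input and output streams (the stream names here are arbitrary paths, and this invocation is illustrative):

$ detectnet.py --model=peoplenet webrtc://@:8554/input webrtc://@:8554/output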

Several examples have also been added showing how to build frontends around these capabilities with various web frameworks, ranging from vanilla JavaScript/HTML5/CSS to Python Flask and Plotly Dash.
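
As a taste of the Flask route, here’s a minimal, hypothetical sketch of a page wrapped around the stream. It assumes the detectnet pipeline above is already serving webrtc://@:8554/output, and the /static/webrtc.js path is a stand-in for the client-side negotiation script that ships with the examples:

from flask import Flask

app = Flask(__name__)

# the <video> element gets connected to the Jetson's WebRTC endpoint by
# client-side JS like the helper from the jetson-inference Flask example
# (script path below is illustrative, not the repo's actual layout)
PAGE = """
<html>
  <body>
    <h1>DNN Playground</h1>
    <video id="video-output" autoplay playsinline muted></video>
    <script src="/static/webrtc.js"></script>
  </body>
</html>
"""

@app.route('/')
def index():
    return PAGE

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8050)   # reachable from other devices on the LAN

The nice part of this split is that Flask only serves the page, while the browser connects directly to the WebRTC endpoint, so the heavy lifting stays on the Jetson.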

We’ll dig into each of these in future posts in the series. Not only are they useful for creating AI-powered webapps that tie into the real world, but they also make great interactive tools for learning and experimentation. Don’t miss the educational discount now available for Jetson Orin devkits. Happy building!

For the sake of completeness, a number of other new features have also been added to Hello AI World and jetson-inference/jetson-utils.

Thanks to everyone who has tried this project over the years. If you ever need support, feel free to post an issue on GitHub or the Jetson forums.