How I used Jetson Nano and Vertex AI to catch a bus

I am excited to share a personal project that I have been working on in my spare time for the last three months or so.

My first goal was to get my hands dirty with things I had little understanding of, my second was to solve a real-world problem, and my third was to encourage everyone to chase their ideas, even if those ideas don’t seem likely to have a great impact on the world.

Problem: There is a bus stop near my place, and since I depended on the bus schedule, I wondered whether I could build a systematic solution that predicts when the next bus will arrive.

I decided to come up with my own solution, based on machine learning, that predicts the arrival time of the next bus.

First of all, to predict bus arrival times we need to train a model on arrival times recorded over a considerable period. Rather than reinvent the wheel or sit next to the window tracking buses by hand, I decided to use a security camera and image recognition software.

As with any complex problem, I split this one into smaller parts to reach an effective solution. (You can check the final infrastructure diagram at the bottom of the page.)

Part 1: Use image recognition to track bus arrival times and store them in a database
Obviously we need a security camera and a system that processes the video stream. For the camera, I used a Dahua IP camera. For the processing, I initially thought of using Vertex AI, which provides solutions for image classification, object detection and a lot more. Given possible network and electricity issues, though, I decided to process the stream locally on the NVIDIA® Jetson Nano™ Developer Kit, a “small, powerful computer that lets you run multiple neural networks in parallel for applications like image classification, object detection . . .”. It comes with libraries and trained models, which you can check here.

On the left you can see the router with a PoE adapter and the Jetson Nano. On the right is the Dahua IP camera mounted on the ceiling of my balcony.

After mounting the security camera, I was able to connect to its stream from the Jetson Nano via RTSP. Using the imagenet class and a pretrained model from the previously mentioned repo, I got basic classifications for the stream on the first run.

But we still need to detect the bus, don’t we? As with any model we need sufficient data to train it.

To describe it briefly, I used an already trained model to take a screenshot from the stream every time it detected a bus. Once I had almost 100 pictures of a bus, I trained my own model.
It would be wrong to say that things were perfect at first. I clearly needed more pictures to make my model more precise, but after reaching 300 pictures and retraining repeatedly, the system got better and better.
At the moment the model has been trained on more than 1300 pictures, and it detects arriving and departing buses even in different weather conditions. It also separates the scheduled bus from random buses.
I trained three classes: arriving_bus, background (everything that is not the scheduled bus), and departing_bus.

If the arriving_bus class prediction is greater than or equal to 92% for 15 frames, the system writes the arrival time to a CSV file.
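The logging rule can be sketched roughly like this. It is a minimal sketch, not the project's actual code: it assumes the 15 frames must be consecutive (the rule above does not say so), and `ArrivalDetector` and `update` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

ARRIVING = "arriving_bus"  # class name from the trained model


@dataclass
class ArrivalDetector:
    """Fires once when arriving_bus stays above the threshold long enough."""

    threshold: float = 0.92  # minimum confidence (92%)
    min_frames: int = 15     # number of qualifying frames required
    streak: int = 0          # current run of qualifying frames (assumed consecutive)
    fired: bool = False      # True once the current run has been logged

    def update(self, label: str, confidence: float) -> bool:
        """Feed one frame's top prediction; return True when an arrival should be logged."""
        if label == ARRIVING and confidence >= self.threshold:
            self.streak += 1
        else:
            # Any non-qualifying frame ends the run and re-arms the detector.
            self.streak = 0
            self.fired = False
        if self.streak >= self.min_frames and not self.fired:
            self.fired = True
            return True
        return False
```

Keeping the `fired` flag means a bus that stays in view for many frames produces one log entry instead of one per frame.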

Here are small but important things:

  • I am tracking bus arrivals from 9:00 to 18:00, Monday to Friday (thanks to a cron job)
  • As the buses follow a fixed round trip, I am tracking both “arriving” and “departing” buses near my place. We are going to focus on arriving buses only
  • The system takes a screenshot from the stream every time it detects a bus, which helps with future model retraining and with finding false positive detections
  • Instead of writing the clock time to the CSV, the system writes the number of seconds elapsed since a base time of 9:00 sharp, e.g. 9:05:00 is stored as 300
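The seconds-since-9:00 encoding from the last bullet is simple arithmetic. Here is a minimal sketch, assuming naive local timestamps; the function name `seconds_since_base` is hypothetical and not taken from the project.

```python
from datetime import datetime, time

BASE_TIME = time(9, 0, 0)  # tracking starts at 9:00 sharp


def seconds_since_base(ts: datetime, base: time = BASE_TIME) -> int:
    """Return whole seconds elapsed between `base` on ts's date and `ts`."""
    base_dt = datetime.combine(ts.date(), base)
    return int((ts - base_dt).total_seconds())


# 9:05:00 is stored as 300, matching the example in the list above.
print(seconds_since_base(datetime(2022, 3, 1, 9, 5, 0)))
```

Storing a plain integer like this gives a regression model a single numeric target instead of a formatted time string.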

Recap: At this point we have a system that detects arriving buses and logs their arrival times to a local CSV file. However, it would be great to store the data in the cloud, which gives a more flexible and sustainable solution that can accommodate future enhancements. As you may guess from the title, we are going to use GCP, and the data will be stored in BigQuery rather than only in a CSV file. To write the data to BigQuery, I decided to use the Google IoT service.

Part 2: Connect Jetson Nano to Google IoT Core and write data to BigQuery

For this, I used the google-python-repo_iot file, which uses a private key to create a JWT and communicate with GCP IoT Core.

Following best practices, I created infrastructure with Pub/Sub and Dataflow to write data to BigQuery. I used Dataflow to parse the data into BigQuery format.
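To give an idea of what that parsing step does, here is a minimal sketch in plain Python rather than an actual Dataflow pipeline. The JSON message shape and the field names (`date`, `direction`, `seconds`, and the output column names) are assumptions made up for illustration; the real payload and table schema are not shown in this post.

```python
import json


def payload_to_row(payload: bytes) -> dict:
    """Turn one telemetry message into a flat dict shaped like a table row.

    Assumed (hypothetical) message shape:
    {"date": "2022-03-01", "direction": "arriving", "seconds": 300}
    """
    message = json.loads(payload.decode("utf-8"))
    return {
        "event_date": str(message["date"]),
        "direction": str(message["direction"]),
        # Cast explicitly so a value sent as a string still lands as an integer.
        "seconds_since_9am": int(message["seconds"]),
    }
```

The explicit casts are the useful part: a row whose types do not match the table schema is the kind of thing a parsing step like this is meant to catch early.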

As you can see from the code, I am still writing the bus arrival time to a CSV file and sending the same data to BigQuery.
This gives me reliability: even if I lose my Internet connection, I won’t lose data.
For data resilience I have been thinking about ways to sync the CSV and BigQuery at the end of the day, but that’s a topic for another time.
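The write-locally-first idea can be sketched as follows. This is a minimal sketch under assumptions: `record_arrival` is a hypothetical helper, and `publish` stands in for whatever function actually sends the row to the cloud.

```python
import csv
from typing import Callable, List


def record_arrival(csv_path: str, row: List, publish: Callable[[List], None]) -> bool:
    """Append `row` to the local CSV, then try to publish it.

    Returns True if the cloud publish succeeded, False if only the CSV has the row.
    """
    # Local write comes first, so the record survives a network outage.
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow(row)
    try:
        publish(row)
    except Exception:
        # Cloud delivery failed; the CSV still holds the row for a later sync.
        return False
    return True
```

The boolean return value makes it easy to count or flag rows that never reached the cloud, which is exactly the set an end-of-day sync would need to resend.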

Recap: So we have a system that detects the bus and writes data to BigQuery. Now we need to create predictions based on the data we have.

Part 3: Create a model that will predict when the next bus will arrive.

To identify buses and to be able to predict their arrival times, I started labeling buses in order of arrival from 9:00, from bus_1 to bus_n.
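A minimal sketch of that labeling, assuming one day's arrivals are available as seconds since 9:00; `label_arrivals` is a hypothetical name, not a function from the project.

```python
from typing import List, Tuple


def label_arrivals(arrival_seconds: List[int]) -> List[Tuple[str, int]]:
    """Label one day's arrivals bus_1..bus_n in chronological order."""
    ordered = sorted(arrival_seconds)
    return [(f"bus_{i}", seconds) for i, seconds in enumerate(ordered, start=1)]


# Three arrivals logged out of order still get labels by time of day.
print(label_arrivals([1500, 300, 900]))
```

One consequence of labeling by position is that a single missed or false detection shifts every later label that day.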

The data looks like this.

To predict the next bus arrival time, I created a model with the Vertex AI regression service. It’s pretty easy to do, especially when your data is already in BigQuery. You can check this video to see how it’s done.

After training, the model was deployed to an endpoint. It accepts the bus number as input and returns the predicted arrival time as output. Moreover, Vertex AI provides an API to query the model.

Recap: We have a complete system that detects buses and predicts their arrival times. I needed some interface that would tell me when the next bus will arrive or when the last bus arrived. Instead of building a website that makes the respective API calls, I decided to use a voice assistant.

Part 4: Use a voice assistant to get info on the last arrived bus or when the next bus will arrive

Before configuring the voice assistant, we need to get the time of the last arrived bus and the predicted arrival time of the next bus. For that we are going to use Cloud Functions and create two separate functions.
The first cloud function queries BigQuery and outputs the arrival time of the last bus.
The second one looks up the latest bus label so that it can query the Vertex AI API for the next bus and output the predicted arrival time.
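The glue logic around the second function can be sketched without any cloud calls. This is a minimal sketch under assumptions: `next_bus_label` and `format_prediction` are hypothetical helpers, and it assumes the model returns seconds since 9:00, matching how arrivals are stored.

```python
from datetime import date, datetime, time, timedelta


def next_bus_label(latest_label: str) -> str:
    """Given the latest logged label, e.g. 'bus_4', return the next one, 'bus_5'."""
    prefix, _, number = latest_label.rpartition("_")
    return f"{prefix}_{int(number) + 1}"


def format_prediction(predicted_seconds: float) -> str:
    """Convert predicted seconds since 9:00 into an HH:MM:SS clock time."""
    # The date is arbitrary; only the time of day matters here.
    base = datetime.combine(date(2000, 1, 1), time(9, 0))
    return (base + timedelta(seconds=round(predicted_seconds))).strftime("%H:%M:%S")
```

A voice assistant needs a speakable clock time rather than a raw number, which is why the conversion back from seconds belongs in this layer.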

Initially, I hoped to use Google Assistant (GA), as all my services were on GCP, but it turned out to be harder than I thought. First of all, GA Actions support only Node.js. Secondly, I had a really hard time trying to connect GA and Cloud Functions. I even tried to use it with IFTTT, but it didn’t help much . . .

To avoid learning Node.js, I decided to build a skill for Alexa, Amazon’s voice assistant. Alexa skills support Python, Alexa is accessible from my phone, and I also have an Alexa speaker at my place.

So how can Alexa communicate with my infrastructure on GCP (in this case, with Cloud Functions)? A possible solution is a peer-to-peer connection between AWS and GCP, but I just used publicly accessible Cloud Functions. (I do not recommend making your Cloud Functions publicly accessible; restrict who can invoke them wherever you can.)

I created an Alexa skill that calls the respective Cloud Functions when triggered by voice invocations.
Hopefully this will help me catch the bus, although at this point I am not using the bus at all :)
Here is the final demo!

The final architecture looks like this

This is the model’s prediction (Column A) next to the actual results from a random day (Column B).
As you can see, the first two hours look pretty good, but after 11am things go bananas.

I never expected to get 99% accuracy, given the time I had and the single source of data.

This was an awesome project to work on. I really enjoyed the process and learned a lot.

I am already thinking about possible improvements, such as using traffic congestion data for prediction and using solar panels to power the whole system and make it autonomous. I also think the project would benefit from introducing DevOps practices.

Check the GitHub repo

P.S. Thanks to all the people who supported and encouraged me while I was working on this project!

P.P.S. As I mentioned before, I did this to learn new stuff and clear my backlog, but the reason I published it is to encourage YOU to bring your ideas to life, no matter how crazy/hard/impossible they sound to you and/or others . . .

I would also like to thank the NVIDIA Team for making this possible!

