Jetson Nano and Pedestrian detection

Hello, I need your expertise.
I’m trying to build an intelligent video surveillance system.
I have four cameras (and eight later on), and the goal is to detect pedestrians.
I use jetson-inference DetectNet. It works, but I need more accuracy… Some chairs are detected as a person.

I tried SSD-ResNet-v2, but I got some false predictions, like a chair detected as a person.
Inception-v2 is not really more accurate.

PedNet is better but still not really accurate, and a little too slow. I still get my chair detected as a person.
I tried changing the threshold.
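Besides moving the threshold, a cheap post-filter on the raw detections can reject many chair false positives: pedestrians are usually much taller than wide, while chairs are roughly square. A plain-Python sketch of the idea (the detection tuple layout here is illustrative, not the jetson-inference API):

```python
def keep_likely_pedestrians(detections, min_conf=0.6, min_aspect=1.2):
    """Filter raw detections: keep only confident, tall boxes.

    detections: list of (confidence, left, top, right, bottom) tuples
    (an illustrative layout, not the jetson-inference API).
    Pedestrians are usually taller than wide; chairs are roughly square.
    """
    kept = []
    for conf, left, top, right, bottom in detections:
        width, height = right - left, bottom - top
        if conf >= min_conf and height / width >= min_aspect:
            kept.append((conf, left, top, right, bottom))
    return kept
```

Tuning `min_conf` and `min_aspect` on a few recorded frames with the chair in view is enough to see whether the heuristic helps.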

Sometimes the pedestrian is not spotted.

Do you have any tips to improve accuracy?
I tried ImageNet, but the results were not encouraging.

Thank you !


Do you have a dataset for your use case?
If yes, it’s recommended to use the Transfer Learning Toolkit below:

There is an in-loop pruning stage that can lower the model complexity while preserving accuracy.

Hello, I don’t have a dataset. But I’m looking into DeepStream, and I’m interested in PeopleNet-ResNet34.
I will try to use it in a few days.

Thank you for your answer! Have a nice day.

Hi Charly,

I am working on the same application.
Have you tried the resnet10 model that comes with DeepStream?
One of its 4 classes is “person”.
I am currently running it on a Jetson Nano with 2 RTSP streams.

A few months ago, the demo of a Jetson Nano running inference on 8 parallel streams from 1080p cameras at 30 fps was everywhere:

Do you plan to leave your system running 24 hours a day?

Have a nice day :)

Hello borreli.g92.
I will take a look at DeepStream next weekend. It seems promising!
Where are you from?
We could work together on this project!


Yes, I believe that if you are looking for an optimized computer vision pipeline for object detection, DeepStream is the way to go. Within the DeepStream material, I personally found the sample apps “test3” (multistream) and “test4” (message broker libraries) very interesting.

I am currently dealing with some issues due to crashes after long runs on the Nano.

That surely needs to be taken into account in a Nano-based surveillance system.

I would like to try the “test3” example. I have already taken a look.
You could try putting the OS on a USB drive to avoid corruption problems.
I found a tutorial; I can give you the link.

Hello borreli.g92. I added swap. It looks like it works fine.
But I want to use the PeopleNet model.
I don’t really understand where to put the model downloaded from NGC, or how to configure the config file.
I tried test3, but it doesn’t detect anything at night.

Hi @Charly,

Don’t you have a higher SD corruption risk if you enable swap memory?

Here you can find the specification of the config file:
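For orientation, a pruned-PeopleNet `[property]` section for nvinfer typically looks like the sketch below; every value (paths, the model key, input dimensions, output blob names) has to be taken from the PeopleNet model card and your own file layout, so treat them all as placeholders:

```
[property]
gpu-id=0
net-scale-factor=0.0039215686
tlt-model-key=<key from the PeopleNet model card>
tlt-encoded-model=<your-path>/resnet34_peoplenet_pruned.etlt
labelfile-path=<your-path>/labels.txt
uff-input-blob-name=input_1
uff-input-dims=3;544;960;0
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
num-detected-classes=3
```

The downloaded .etlt file can live anywhere, as long as the paths in this section point to it.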

The model from test3 most probably fails to detect at night because you are passing it a grayscale image.
I was trying to solve this issue by putting an OpenCV preprocessing step before the detector model. I have seen that a simple sepia filter improves the detection rate significantly.
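For reference, here is what that sepia filtering does per pixel, written framework-free in Python. The coefficients mirror the classic sepia kernel used with cv::transform; whether they need reordering for BGR vs RGB input depends on your pipeline:

```python
# Classic 3x3 sepia kernel: each output channel is a weighted
# sum of the input channels, then clamped to the 0-255 range.
SEPIA_KERNEL = [
    (0.272, 0.534, 0.131),
    (0.349, 0.686, 0.168),
    (0.393, 0.769, 0.189),
]

def sepia_pixel(pixel):
    """Apply the sepia kernel to one 3-channel pixel of 0-255 ints,
    mirroring what cv::transform computes for each pixel."""
    return tuple(
        min(255, int(sum(k * c for k, c in zip(row, pixel))))
        for row in SEPIA_KERNEL
    )
```

For example, a mid-grey pixel (100, 100, 100) becomes (93, 120, 135), i.e. a warm brownish tone.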

Maybe you have some ideas on how to solve the point that is blocking me:


Yes, of course, it will corrupt the SD card.

But I plan to remove the SD card and run from a USB key.

I want to finalize my program first.

I will read your link. Thank you.

OK for the sepia filter.
Is it possible to get raw data from streammux? If it is, it will be easy to apply the filter before inference.
I’m using Python.
Your example looks like C++?

If it is not possible to get raw data, I think the best way is to use jetson-inference.
I read some topics saying it is possible to use the PeopleNet model with it.
I’m at work today, but I will take a look when I get back home!

Hello, I just took a look at the deepstream-test3 Python code.
I think the processing (grab image from camera, preprocess, inference, etc.) is done in a loop that is not accessible.
It is done in a compiled library of the DeepStream SDK (I think).
So the easiest way to preprocess images is to use the PeopleNet pre-trained model in jetson-inference.
Here: Jetson inference
I’m looking for a way to convert the pruned TLT PeopleNet model to a TensorRT model.

I will try to use tlt-converter.
Here: Tlt-converter

So, in jetson-inference you have a script named
It is easy to manually grab frames from the camera and apply the sepia filter with OpenCV.
I will come back to you when I have done it.

Hi @Charly,

Thanks for your message.

I am not sure that jetson-inference + PeopleNet would achieve the FPS performance needed for my idea. I would like to have 6 RTSP inputs processed at 15 FPS each. The resolution will be low (e.g. 640×480).
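Quantifying that worry helps: the per-frame time budget for such a target is simple arithmetic (plain Python; the numbers are the ones from this thread):

```python
streams = 6          # planned RTSP inputs
fps_per_stream = 15  # target frame rate per stream
total_fps = streams * fps_per_stream   # inferences the pipeline must run per second
budget_ms = 1000.0 / total_fps         # wall-clock budget per single inference
```

That is 90 inferences per second, about 11 ms each, so only a small, TensorRT-optimized network can keep up on a Nano.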

How many cameras do you plan to use at the same time?

In any case, I understand that DeepStream is the most efficient way to create an inference pipeline on the Jetson Nano, but I might be wrong.

When it comes to the pre-processing, I tried to create a custom GStreamer plugin following the gst-dsexample from the DeepStream sources.

I am able to do some easy stuff (e.g. drawing rectangles), and I have also found examples of saving images to disk.
However, I am not able to run an OpenCV sepia filter in the most simple way:

cv::Mat kernel =
    (cv::Mat_<float>(3, 3) <<
        0.272, 0.534, 0.131,
        0.349, 0.686, 0.168,
        0.393, 0.769, 0.189);
cv::transform(input_img, output_img, kernel);

I have also posted some code snippets in my messages here: Adding Preprocessing to Frames RTSP

Have a nice day :)

Of course, you are right. With ssd-mobilenet-v2 you can get 25 fps for just one camera.
At the moment I have 4 cameras, and 6 or 8 later on.

I plan for 3-4 fps per camera. My images have a large field of view, so by the time a pedestrian crosses the frame, he will have been spotted.
With jetson-inference it will be possible for 4 cameras. But 8…
DeepStream is optimized for this job, so…
But I don’t have the skills to modify DeepStream.
I will look at your links.
Thank you.
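The “spotted while crossing the field of view” reasoning above can be made concrete: if each frame detects the pedestrian independently with probability p, the chance of at least one detection over n frames is 1 - (1 - p)^n. A quick check in Python, with purely illustrative numbers (the per-frame probability and crossing time are assumptions, not measurements):

```python
p_frame = 0.5    # assumed probability that a single frame detects the pedestrian
fps = 3          # planned frame rate per camera
crossing_s = 5   # assumed seconds the pedestrian stays in the field of view
n_frames = fps * crossing_s
p_spotted = 1 - (1 - p_frame) ** n_frames   # at least one detection while crossing
```

Even a mediocre per-frame detector compounds quickly: with these numbers the pedestrian is missed on only about 1 crossing in 32,000.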