detectnet-camera : usage with customized model

Hi,

With detectnet-camera and “googlenet” as the network, I am able to see this working: different objects are detected and bounding boxes are drawn in the images.
./detectnet-camera --network=googlenet --camera=0

However, when I tried the same with the cat/dog model, the detections did not work. I downloaded the cat/dog model and exported it to resnet18.onnx. Then I used it with the command below…

./detectnet-camera --model=/home/jetbot/shankar/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog_limited/labels.txt --camera=0

My understanding is that if I supply my customized model, detectnet-camera should use it and detect the objects that the model was trained on.

What is the difference between passing --network as an argument versus --model?

Please let me know how I can use a customized model to detect objects live from the camera.

Regards,
Shankar

Hi Shankar, you should be using the imagenet-camera application, as these are classification models.

I think the first one ran because with “googlenet” it probably fell back to the default network (ssd-mobilenet-v2). In the second case, however, when you specified the full path to your custom model, detectNet would fail to load the classification model because it expects a detection model.

Instead, your command line should be:

$ imagenet-camera --model=/home/jetbot/shankar/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog_limited/labels.txt --camera=0

Hi Dusty,

With imagenet-camera the object classification works. But I also want the coordinates of the object, which is why I want to use detectnet-camera with my customized model.

I have combined the detectnet and imagenet code into one file (imagenet-console_camera.py), as below…


import sys
import argparse

import jetson.inference
import jetson.utils

# parse the command line (flags like --model, --labels, --input_blob and
# --output_blob are consumed by the network constructors via sys.argv)
parser = argparse.ArgumentParser()
parser.add_argument("--network", type=str, default="googlenet", help="model to load")
parser.add_argument("--camera", type=str, default="0", help="camera device to use")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold")
opt = parser.parse_known_args()[0]

# create the camera and display
font = jetson.utils.cudaFont()
camera = jetson.utils.gstCamera(1280, 720, opt.camera)
display = jetson.utils.glDisplay()

# load the recognition network
net = jetson.inference.imageNet(opt.network, sys.argv)

# load the object detection network
net_detect = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# process frames until the user exits
while display.IsOpen():
    # capture the image (into shared CPU/GPU memory)
    img, width, height = camera.CaptureRGBA()

    # detect objects in the image (with overlay)
    detections = net_detect.Detect(img, width, height, opt.overlay)

    # classify the image
    class_idx, confidence = net.Classify(img, width, height)

    # find the object description
    class_desc = net.GetClassDesc(class_idx)

    # print the detections
    print("detected {:d} objects in image".format(len(detections)))

    for detection in detections:
        print(detection)

    # overlay the classification result on the image
    if confidence > 0.4:
        font.OverlayText(img, width, height, "{:05.2f}% {:s}".format(confidence * 100, class_desc), 5, 5, font.White, font.Gray40)

    # render the image and update the title bar
    display.RenderOnce(img, width, height)
    display.SetTitle("{:s} | Network {:.0f} FPS".format(net.GetNetworkName(), net.GetNetworkFPS()))

The command I am using is:
python3.6 imagenet-console_camera.py --model=/home/jetbot/jetson-inference/python/training/classification/cat_dog/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=/home/jetbot/datasets/cat_dog/labels.txt --camera=0

With this, the object classification works… but I am not getting the object detection coordinates properly; the bounding boxes are not correct.

The classification model can’t be used for object detection - their DNN architectures are inherently different. The classification model wasn’t trained to detect bounding boxes, only to recognize the object class.

The project does come with pre-trained SSD-Mobilenet and SSD-Inception models that were trained on the MS COCO dataset, which includes classes for cat and dog (see the COCO class list here). So if desired, you could use the ssd-mobilenet-v2 model to get those coordinates.

In that case, you would want to create your detectNet instance like below, instead of from the classification network given on the command line:

net_detect = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=opt.threshold)
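To then pull the coordinates out of each result, here is a minimal sketch, assuming the same legacy jetson.inference/jetson.utils Python API used in the code above (the camera setup mirrors your script):

# minimal sketch, assuming the legacy jetson.inference/jetson.utils API from this thread
import jetson.inference
import jetson.utils

net_detect = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.gstCamera(1280, 720, "0")

img, width, height = camera.CaptureRGBA()
detections = net_detect.Detect(img, width, height)

# each Detection object carries its class and its pixel coordinates
for d in detections:
    print("class {:d} '{:s}' at {:.1f}% confidence".format(d.ClassID, net_detect.GetClassDesc(d.ClassID), d.Confidence * 100))
    print("   box: Left={:.0f} Top={:.0f} Right={:.0f} Bottom={:.0f}".format(d.Left, d.Top, d.Right, d.Bottom))
    print("   size: {:.0f}x{:.0f}  center: {}".format(d.Width, d.Height, d.Center))

Note that with ssd-mobilenet-v2 the cat/dog classes come from MS COCO, so the labels are the COCO ones rather than those from your labels.txt.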

Hi Dusty,

What I understand from your comments is that I cannot use my customized model, which was created for classification, for detection.

How can I generate my own customized model for detection?
Can the same model be used for both classification and detection?

Regards,
Shankar

Hi Shankar, see this post for links to re-training SSD-Mobilenet using TensorFlow Object Detection API:
https://devtalk.nvidia.com/default/topic/1070225/jetson-nano/digits-or-somthing-else/post/5421938/#5421938

Detection performs many potential classifications per image (e.g. multiple bounding boxes, each classified as cat/dog/etc.), whereas typical image classification classifies the entire image as one thing. If you are doing detection, you typically don't need or want the whole-image classification as well.
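
As a concrete illustration of that difference, here is a sketch assuming the same legacy jetson.inference API as above, with a hypothetical test image example.jpg: imageNet returns a single (class, confidence) pair for the whole frame, while detectNet returns a list with one entry per detected object.

# sketch contrasting the two outputs, assuming the legacy jetson.inference API
import jetson.inference
import jetson.utils

net_cls = jetson.inference.imageNet("googlenet")
net_det = jetson.inference.detectNet("ssd-mobilenet-v2")

# hypothetical test image
img, width, height = jetson.utils.loadImageRGBA("example.jpg")

# classification: exactly one answer for the entire image
class_idx, confidence = net_cls.Classify(img, width, height)
print("image -> {:s} ({:.1f}%)".format(net_cls.GetClassDesc(class_idx), confidence * 100))

# detection: zero or more answers, each with its own class and bounding box
for d in net_det.Detect(img, width, height):
    print("object -> {:s} ({:.1f}%) box=({:.0f},{:.0f},{:.0f},{:.0f})".format(
        net_det.GetClassDesc(d.ClassID), d.Confidence * 100, d.Left, d.Top, d.Right, d.Bottom))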