Could someone please look at my questions and give me some hints? I know 6 questions is a lot, but maybe you can help with at least a few.
I think these questions and their answers could be useful to other people as well.
For 8 months I have been developing a solution to detect defects on white wooden details (blanks), and I have learned a lot about machine vision, Python and Dusty's jetson-inference along the way. A few questions have stayed on my mind, though, and I would appreciate any hints.
- I'd like to run Jetson Inference at 1920x1080 and 60 FPS with 4 cameras, but I get this notice:
[gstreamer] gstBufferManager – map buffer size was less than max size (1382400 vs 1382407)
[gstreamer] gstBufferManager recieve caps: video/x-raw, width=(int)1280, height=(int)720, framerate=(fraction)30/1, format=(string)NV12
[gstreamer] gstBufferManager – recieved first frame, codec=raw format=nv12 width=1280 height=720 size=1382407
Any ideas how to solve this? With 1280x720, 30 FPS and 4 cameras it works.
When I open the Jetson Power GUI and run detectnet.py (from dusty-nv/jetson-inference, the Hello AI World guide on github.com), I don't see engines such as "dla0" and "dla1" being used. All engines appear to be offline. What does that mean? Am I using all of the NX's resources or not? Or do these stats simply not show the right information?
In my script I only run detection, with no other heavy processing, yet GPU usage jumps to 80% from time to time. It's not a problem, but it seems strange and I'd like to understand why. The script:
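As a sanity check on the log above, the buffer size can be reproduced by hand. NV12 stores a full-resolution luma plane plus a half-size interleaved chroma plane, so a tightly packed frame takes width x height x 1.5 bytes. The sketch below (the helper name `nv12_frame_bytes` is made up for illustration, and it assumes no row padding) shows that 1382400 bytes matches 1280x720, which agrees with the caps line: the pipeline is running at 720p, not at 1080p.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one tightly packed NV12 frame (12 bits per pixel, no padding)."""
    luma = width * height          # Y plane: 1 byte per pixel
    chroma = width * height // 2   # interleaved UV plane, subsampled 2x2
    return luma + chroma


print(nv12_frame_bytes(1280, 720))   # 1382400, the "max size" from the log
print(nv12_frame_bytes(1920, 1080))  # 3110400, what a 1080p NV12 frame needs
```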
import time
from datetime import datetime

import jetson_inference
import jetson_utils

# Excerpt: config, dsid, did, mvid, today, image_folder, detection_array, empty,
# emptycounter, emulate_empty, saveimage and saveImageRGBA are set up elsewhere in the script.

# Configure the AI network
net = jetson_inference.detectNet(argv=[
    '--model=/home/visioline/install/jetson-inference/python/training/detection/ssd/models/jw3/ssd-mobilenet.onnx',
    '--labels=/home/visioline/install/jetson-inference/python/training/detection/ssd/models/jw3/labels.txt',
    '--input-blob=input_0', '--output-cvg=scores', '--output-bbox=boxes',
    '--confidence=0.7',
    '--input-width=1980', '--input-height=1080', '--input-rate=60'])

# Configure the cameras
camera1 = jetson_utils.videoSource("csi://0")  # open camera 1
camera2 = jetson_utils.videoSource("csi://4")  # open camera 2
camera3 = jetson_utils.videoSource("csi://2")  # open camera 3
camera4 = jetson_utils.videoSource("csi://1")  # open camera 4

while config.run == 1:
    # dsid is the detection series ID: items sharing a dsid were found in the
    # same image set (images taken from multiple cameras)
    dsid += 1
    start_time = time.time()  # time now
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    try:
        img1 = camera1.Capture('rgba32f')   # capture an image from camera 1
        bimg1 = camera1.Capture('rgba32f')  # capture another image from camera 1 to save to file later
    except Exception:
        print("Camera 1 capture error")
    if config.detector == 1:
        dcounter = 0  # reset the detection counter
        # overlay controls how the defect is annotated on the final image
        detections1 = net.Detect(img1, overlay="box,labels,conf")
        for detection1 in detections1:
            dcounter += 1  # count this detection
            mvcam = 1      # camera ID 1
            dheight = int(detection1.Top)
            dright = int(detection1.Right)
            dleft = int(detection1.Left)
            dbottom = int(detection1.Bottom)
            dclassid = int(detection1.ClassID)
            class_name = net.GetClassDesc(dclassid)
            dconfidence = round(detection1.Confidence, 0)  # integers only (no point in being too precise)
            if dclassid != 99 or emulate_empty != 1:  # save to the array unless this is an empty sight
                filename = f'{image_folder}/{mvid}-1-{did}-{dsid}-{dclassid}.jpg'
                filename_to_db = f'/data/defect/{today}/{mvid}-1-{did}-{dsid}-{dclassid}.jpg'
                detection_array.append([mvid, mvcam, dsid, did, dcounter, dclassid, dconfidence, dleft, dheight, dright, dbottom, 0, 0, filename_to_db, "", img1, filename])  # add the detection to the array
                array_members = len(detection_array)  # number of members in the array
                if array_members > 500:  # clear the array if it gets too big (not sure what a good limit is)
                    detection_array.clear()
            if dclassid == 99 or emulate_empty == 1:
                empty += 1
                emptycounter += 1  # count how many empty sights we have found
            if saveimage == 1:  # if we found something, save the picture as well
                try:
                    saveImageRGBA(filename, img1, 1280, 720)
                except Exception:
                    print(current_time + ": error saving image 1")
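The per-detection bookkeeping in the script above could be pulled into a small helper, which also makes it easy to test without a camera. This is only a sketch: `detection_to_row` is a hypothetical name, and a `SimpleNamespace` stands in for a real detection object with the same attribute names the script reads (`ClassID`, `Confidence`, `Left`, `Top`, `Right`, `Bottom`). It also assumes `Confidence` is a 0..1 fraction, which is my understanding of jetson-inference; if so, `round(detection1.Confidence, 0)` can only give 0.0 or 1.0, so the sketch scales to a percentage before rounding.

```python
from types import SimpleNamespace


def detection_to_row(det, cam_id, series_id):
    """Flatten one detection into a plain dict of integers."""
    return {
        "camera": cam_id,
        "series": series_id,
        "class_id": int(det.ClassID),
        # Assumption: Confidence is in the range 0..1, so scale to percent first.
        "confidence_pct": round(det.Confidence * 100),
        "left": int(det.Left),
        "top": int(det.Top),
        "right": int(det.Right),
        "bottom": int(det.Bottom),
    }


# Stand-in for a real detection, for illustration only.
fake = SimpleNamespace(ClassID=3, Confidence=0.874,
                       Left=10.2, Top=20.7, Right=110.9, Bottom=220.1)
row = detection_to_row(fake, cam_id=1, series_id=42)
print(row)
```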
If I build with "cmake -DENABLE_NVMM=off ../", what disadvantages does that bring? I don't think I do anything besides detection.
I use an ONNX model. Are there any other scripts or solutions I can use for detection in Python? Basically I only want to run detection frame by frame and get the coordinates, confidence (%) and class ID as output.
Is there a good tool for testing camera parameters with CSI cameras on Jetson? I mean adjusting brightness, saturation, resolution and so on while watching the image in real time. It would also be fun to play with an application where I can switch Jetson VPI functions such as "erode" and "dilate" on and off, for example the ones listed here: VPI - Vision Programming Interface: Algorithms (nvidia.com)
At the moment I initialize one detection network, and in the while loop I feed it one frame from each of the 4 cameras in turn. That is how I avoid using too many hardware resources. The question is: if I detect defects from different viewing angles with the same network, does that bring any disadvantages? In other words, does detectNet somehow use previous detections to detect better (does it learn in real time)?
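For clarity, the setup described above (one shared network, one frame per camera per pass) can be sketched as follows. The names are hypothetical: `capture` stands in for something like `camera.Capture` and `detect` for something like `net.Detect`; plain Python callables are used here so the sketch runs without any hardware.

```python
def run_round(cameras, detect):
    """Grab one frame per camera and run the same detector on each of them."""
    results = {}
    for cam_id, capture in cameras.items():
        frame = capture()                # stand-in for camera.Capture(...)
        results[cam_id] = detect(frame)  # stand-in for net.Detect(frame)
    return results


# Dummy cameras and detector, for illustration only.
cameras = {cam_id: (lambda cam_id=cam_id: f"frame-from-{cam_id}") for cam_id in (1, 2, 3, 4)}
detect = lambda frame: [frame.upper()]

results = run_round(cameras, detect)
print(results)
```

Keeping the cameras in a dict like this would also avoid repeating the same block of code once per camera.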
The last question: let's imagine that we have 4 or even 6 Full HD 60 FPS camera streams. How can I detect objects in them without overloading the hardware? Any hints on how to speed up the process? I use 512x512 detection because some of the defects I need to detect are quite small.
Can you suggest how I can use NVENC0 and NVENC1 in Python to encode a real-time video stream? (Let's assume I have a frame: img1 = camera1.Capture('rgba32f').) Any examples?
Does SSD v1, or its training (I use the training that comes with jetson-inference), apply some kind of augmentation? For example, does it use the image histogram, tilt or rotate the images, or augment them in some other way during training and during detection?
Do you know of a baseboard for Jetson Orin or AGX that has 4 or 6 CSI connectors?
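To put rough numbers on this question, the sketch below works out how many inferences per second the streams would need and how much a Full HD frame shrinks when it is fed to a 512x512 network. It assumes the whole frame is resized (stretched) to the network input and that every frame from every camera is processed; both are assumptions for illustration, and `stream_load` is a made-up helper name.

```python
def stream_load(cameras, fps, frame_w, frame_h, net_w, net_h):
    """Return (inferences per second, time budget per inference in ms,
    horizontal downscale factor, vertical downscale factor)."""
    inferences_per_s = cameras * fps
    budget_ms = 1000.0 / inferences_per_s
    scale_x = frame_w / net_w
    scale_y = frame_h / net_h
    return inferences_per_s, budget_ms, scale_x, scale_y


ips, budget_ms, sx, sy = stream_load(4, 60, 1920, 1080, 512, 512)
print(f"{ips} inferences/s, {budget_ms:.2f} ms per inference")
print(f"downscale: {sx:.2f}x horizontally, {sy:.2f}x vertically")

# A defect 10 pixels wide in the camera image would be this wide at the network input:
print(f"{10 / sx:.2f} px")
```

Under these assumptions, 4 cameras at 60 FPS require 240 inferences per second, and small defects lose most of their pixels before the network sees them, which is the trade-off behind the question.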
