Jetson Nano OpenCV latency problem

I am using an NVIDIA Jetson Nano Developer Kit (4 GB) with a PiCam v2.1, Python 3.6.9 and OpenCV 4.5.3. The goal of this code is to detect a red circle and find its pixel center relative to the origin. When I run this code on a Windows PC there is no FPS drop, but on the Jetson Nano there is 2-3 seconds of latency despite Power Mode MAXN. How can I prevent this latency? Thanks.

import numpy as np
import cv2

GSTREAMER_PIPELINE = 'nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1920, height=1080, format=(string)NV12, framerate=21/1 ! nvvidconv flip-method=0 ! video/x-raw, width=640, height=480, format=(string)BGRx ! videoconvert ! videobalance saturation=0.24 brightness=0.5 contrast=0.0 hue=-0.1 ! video/x-raw, format=(string)BGR ! appsink'

cap = cv2.VideoCapture(GSTREAMER_PIPELINE, cv2.CAP_GSTREAMER)

while True:
    # Capture frame-by-frame
    ret, captured_frame = cap.read()
    if not ret:
        break
    output_frame = captured_frame.copy()
    # The appsink caps already request BGR, so no extra color conversion is needed
    captured_frame_bgr = captured_frame
    # First blur to reduce noise prior to color space conversion
    captured_frame_bgr = cv2.medianBlur(captured_frame_bgr, 7)
    # Convert to Lab color space, we only need to check one channel (a-channel) for red here
    captured_frame_lab = cv2.cvtColor(captured_frame_bgr, cv2.COLOR_BGR2Lab)
    # Threshold the Lab image, keep only the red pixels
    # Possible yellow threshold: [20, 110, 170][255, 140, 215]
    # Possible blue threshold: [20, 115, 70][255, 145, 120]
    captured_frame_lab_red = cv2.inRange(captured_frame_lab, np.array([135,135,130]), np.array([255, 255, 255]))
    # Second blur to reduce more noise, easier circle detection
    captured_frame_lab_red = cv2.GaussianBlur(captured_frame_lab_red, (5, 5), 2, 2)
    # Use the Hough transform to detect circles in the image
    circles=cv2.HoughCircles(captured_frame_lab_red, cv2.HOUGH_GRADIENT, 1, captured_frame_lab_red.shape[0] / 8, param1=100, param2=18, minRadius=10, maxRadius=100)

    # If we have extracted a circle, draw an outline
    # We only need to detect one circle here, since there will only be one reference object
    if circles is not None:
        circles = np.round(circles[0, :]).astype("int")
        circle_center=(circles[0, 0], circles[0, 1])
        circle_radius=circles[0, 2]
        camera_center = (320,240)
        top_center = (320,0)
        bottom_center= (320,480)
        right_center = (0,240) 
        left_center = (640,240)
        cv2.circle(output_frame, circle_center, circle_radius, color=(255, 0, 0), thickness=2)
        cv2.putText(output_frame,"Target", circle_center ,cv2.FONT_HERSHEY_COMPLEX, 1, (255,0,0),2)
        cv2.line(output_frame, top_center, bottom_center, (0,255,0), 2)
        cv2.line(output_frame, right_center, left_center, (0,255,0), 2)
        cv2.line(output_frame, camera_center, circle_center, (255,0,0), 2)
        if(circle_center[0] >= 320 and circle_center[1] <= 240):
         quadrant="First quadrant"
         updated_circle_center= (circle_center[0]-320,240-circle_center[1])
         if(circle_center[0] == 320 and circle_center[1] == 240):
          updated_circle_center= (0, 0)
          quadrant="Origin"
         elif(circle_center[0] == 320):
          quadrant="Positive y-axes"
         elif(circle_center[1] == 240):
          quadrant="Positive x-axes"

        elif(circle_center[0] < 320 and circle_center[1]<240):
         quadrant="Second quadrant"
         updated_circle_center= (circle_center[0]-320,240-circle_center[1])

        elif(circle_center[0] <= 320 and circle_center[1] >= 240):
         quadrant="Third quadrant"
         updated_circle_center= (circle_center[0]-320,240-circle_center[1])
         if(circle_center[0] == 320):
          quadrant="Negative y-axes"
         elif(circle_center[1] == 240):
          quadrant="Negative x-axes"

        elif(circle_center[0] > 320 and circle_center[1]>240):
         quadrant="Fourth quadrant"
         updated_circle_center= (circle_center[0]-320,240-circle_center[1])
        cv2.putText(output_frame,"X: "+str(updated_circle_center[0]), (40,42) ,cv2.FONT_HERSHEY_COMPLEX, 0.5, (0,0,255))
        cv2.putText(output_frame,"Y: "+str(updated_circle_center[1]), (40,70) ,cv2.FONT_HERSHEY_COMPLEX, 0.5, (0,0,255))
        cv2.putText(output_frame,"Quadrant: "+quadrant, (40,98) ,cv2.FONT_HERSHEY_COMPLEX, 0.5, (0,0,255))
        cv2.putText(output_frame,"FPS: "+ str(cap.get(cv2.CAP_PROP_FPS)), (40,126) ,cv2.FONT_HERSHEY_COMPLEX, 0.5, (0,0,255))
        cv2.putText(output_frame,"Radius(px): "+ str(circle_radius), (40,154) ,cv2.FONT_HERSHEY_COMPLEX, 0.5, (0,0,255))
    # Display the resulting frame, quit with q
    cv2.imshow('red', captured_frame_lab_red)
    cv2.imshow('frame', output_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
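As an aside, the quadrant bookkeeping in the code above can be condensed into one small helper. This is a sketch in plain Python (the function name `classify_point` and its defaults are mine, assuming the same 640x480 frame and the usual top-left image origin):

```python
def classify_point(cx, cy, width=640, height=480):
    """Map a pixel coordinate (origin at top-left, y down) to coordinates
    centered on the frame center (y up) and name the quadrant or axis."""
    ox, oy = width // 2, height // 2
    x, y = cx - ox, oy - cy  # shift origin to frame center, flip y
    if x == 0 and y == 0:
        region = "Origin"
    elif x == 0:
        region = "Positive y-axes" if y > 0 else "Negative y-axes"
    elif y == 0:
        region = "Positive x-axes" if x > 0 else "Negative x-axes"
    elif x > 0 and y > 0:
        region = "First quadrant"
    elif x < 0 and y > 0:
        region = "Second quadrant"
    elif x < 0 and y < 0:
        region = "Third quadrant"
    else:
        region = "Fourth quadrant"
    return (x, y), region
```

With this, the big if/elif ladder reduces to `updated_circle_center, quadrant = classify_point(*circle_center)`, which also removes the risk of `updated_circle_center` being unset in some branch.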
You may consider running the gstreamer pipeline and mapping the buffers to cv::gpu::gpuMat. Please refer to this sample:
Nano not using GPU with gstreamer/python. Slow FPS, dropped frames - #8 by DaneLLL

See if you can run your use-case based on it.


Hi, I didn’t understand how I can use C++ code inside a Python script. I think you are suggesting that the whole code be written in C++, but I can’t do that because I will use other Python libraries in this code. Is there an alternative way to decrease the latency? (Note: the GitHub link you sent didn’t work.) Thanks.

For Python your code looks optimal. Python is not as flexible as C code and requires certain buffer copies, which may increase latency and reduce performance.

You may run sudo nvpmodel -m 0 and sudo jetson_clocks. These steps enable all CPU cores at maximum clock and give optimal throughput for buffer copies.
And please set appsink sync=false.
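Applied to the pipeline from the question, that means ending the appsink element with sync=false. The drop=true and max-buffers=1 properties shown here are an extra suggestion of mine (not from the post above); they make appsink discard stale frames instead of queueing them:

```python
# Original pipeline with appsink sync disabled, so frames are delivered as
# soon as they arrive rather than paced by buffer timestamps.
# drop=true / max-buffers=1 keep only the newest frame in the sink queue.
GSTREAMER_PIPELINE = (
    'nvarguscamerasrc ! '
    'video/x-raw(memory:NVMM), width=1920, height=1080, '
    'format=(string)NV12, framerate=21/1 ! '
    'nvvidconv flip-method=0 ! '
    'video/x-raw, width=640, height=480, format=(string)BGRx ! '
    'videoconvert ! '
    'videobalance saturation=0.24 brightness=0.5 contrast=0.0 hue=-0.1 ! '
    'video/x-raw, format=(string)BGR ! '
    'appsink sync=false drop=true max-buffers=1'
)
```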

The script for building OpenCV has been updated:

JEP/ at master · AastaNV/JEP · GitHub
