cudaToNumpy -> cv2.imshow not responding, no video output, no Error - csi camera

I’m trying to feed the CUDA detection output to OpenCV and having no luck!
Using: Python 3, CSI RPi camera, Jetson Nano
Detection works fine, and I can save the NumPy image to disk, but I cannot get it to display via cv2. I don’t get any errors, and the process never completes.

See images
(1) no error, stuck at “RingBuffer – allocated 4 buffers (14745600 bytes each, 58982400 bytes total)”
(2) the printed output of the captured frame and cudaToNumpy objects

What am I doing wrong here?

Here is my code:

import jetson.inference
import jetson.utils
import cv2

net = jetson.inference.detectNet("pednet", threshold=0.4)
camera = jetson.utils.gstCamera(1280, 720, "0")
display = jetson.utils.videoOutput("display://0")

while display.IsStreaming():
    img, width, height = camera.CaptureRGBA(zeroCopy=1)
    jetson.utils.cudaDeviceSynchronize()
    detections = net.Detect(img)
    cv2imgRGBA = jetson.utils.cudaToNumpy(img, width, height, 4)
    cv2img = cv2.cvtColor(cv2imgRGBA, cv2.COLOR_RGBA2BGR)
    cv2.imshow('showDetectNet', cv2img)

Hi,
Could you check whether you get a valid image when calling

cv2.imwrite("/tmp/dump.jpg", cv2img)

When you are using floating-point images, you need to normalize the pixel values to the range [0, 1], as this is what OpenCV expects. Try using the jetson.utils.cudaNormalize() function:

img, width, height = camera.CaptureRGBA(zeroCopy=1)
detections = net.Detect(img)
jetson.utils.cudaNormalize(img, (0.0,255.0), img, (0.0, 1.0))
jetson.utils.cudaDeviceSynchronize()
cv2imgRGBA = jetson.utils.cudaToNumpy(img, width, height, 4)
cv2img = cv2.cvtColor(cv2imgRGBA, cv2.COLOR_RGBA2BGR)
cv2.imshow('showDetectNet', cv2img)

Also, you would want to move cudaDeviceSynchronize() down so that it is called after the normalize, as shown above.

BTW, if you use the new videoSource API instead, the frames will be captured in 8-bit uchar3 format, so you wouldn’t need to normalize them.
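
To illustrate the normalization point on the NumPy side: cv2.imshow() treats floating-point images as being in [0, 1], while 8-bit images are shown as-is. The sketch below is only an illustration on a synthetic array, not part of jetson-utils; the helper names float_rgba_to_bgr8 and float_rgba_to_bgr_unit are made up for this example, and it assumes the input is an HxWx4 float32 RGBA array with values in [0, 255], like the one cudaToNumpy() returns for CaptureRGBA() frames.

```python
import numpy as np


def float_rgba_to_bgr8(rgba):
    """Convert HxWx4 float RGBA in [0, 255] to HxWx3 uint8 BGR (hypothetical helper)."""
    bgr = rgba[..., 2::-1]  # take channels 2, 1, 0 (B, G, R) and drop alpha
    return np.ascontiguousarray(np.clip(bgr, 0, 255).astype(np.uint8))


def float_rgba_to_bgr_unit(rgba):
    """Convert HxWx4 float RGBA in [0, 255] to HxWx3 float32 BGR in [0, 1] (hypothetical helper)."""
    return (rgba[..., 2::-1] / 255.0).astype(np.float32)


# Synthetic 1x2 "frame": one pure red pixel, one bluish pixel.
frame = np.array(
    [[[255.0, 0.0, 0.0, 255.0], [0.0, 127.5, 255.0, 255.0]]],
    dtype=np.float32,
)
bgr8 = float_rgba_to_bgr8(frame)
bgr_unit = float_rgba_to_bgr_unit(frame)
```

Either result can be passed to cv2.imshow(); the uint8 version also avoids the extra cudaNormalize() call if the conversion is done on the CPU anyway.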

Thanks, yes, I tried that and I’m able to write the image.
The problem seems to be in displaying the image feed.

Dusty, I tried the suggested changes, but still no luck; the terminal window freezes at the same line of the print output.
I don’t know what else to try. My goal is to pass the CUDA object to OpenCV after detection; I don’t know if there’s any workaround to achieve that.

Here is the print output; at the end I pressed Ctrl+C:

jetson.inference – detectNet loading build-in network ‘pednet’

detectNet – loading detection network model from:
– prototxt networks/ped-100/deploy.prototxt
– model networks/ped-100/snapshot_iter_70800.caffemodel
– input_blob ‘data’
– output_cvg ‘coverage’
– output_bbox ‘bboxes’
– mean_pixel 0.000000
– mean_binary NULL
– class_labels networks/ped-100/class_labels.txt
– threshold 0.500000
– batch_size 1

[TRT] TensorRT version 7.1.3
[TRT] loading NVIDIA plugins…
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] detected model format - caffe (extension ‘.caffemodel’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /usr/local/bin/networks/ped-100/snapshot_iter_70800.caffemodel.1.1.7103.GPU.FP16.engine
[TRT] loading network plan from engine cache… /usr/local/bin/networks/ped-100/snapshot_iter_70800.caffemodel.1.1.7103.GPU.FP16.engine
[TRT] device GPU, loaded /usr/local/bin/networks/ped-100/snapshot_iter_70800.caffemodel
[TRT] Deserialize required 6995085 microseconds.
[TRT]
[TRT] CUDA engine context initialized on device GPU:
[TRT] – layers 68
[TRT] – maxBatchSize 1
[TRT] – workspace 0
[TRT] – deviceMemory 61315072
[TRT] – bindings 3
[TRT] binding 0
– index 0
– name ‘data’
– type FP32
– in/out INPUT
– # dims 3
– dim #0 3 (SPATIAL)
– dim #1 512 (SPATIAL)
– dim #2 1024 (SPATIAL)
[TRT] binding 1
– index 1
– name ‘coverage’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 1 (SPATIAL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT] binding 2
– index 2
– name ‘bboxes’
– type FP32
– in/out OUTPUT
– # dims 3
– dim #0 4 (SPATIAL)
– dim #1 32 (SPATIAL)
– dim #2 64 (SPATIAL)
[TRT]
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=512 w=1024) size=6291456
[TRT] binding to output 0 coverage binding index: 1
[TRT] binding to output 0 coverage dims (b=1 c=1 h=32 w=64) size=8192
[TRT] binding to output 1 bboxes binding index: 2
[TRT] binding to output 1 bboxes dims (b=1 c=4 h=32 w=64) size=32768
[TRT]
[TRT] device GPU, /usr/local/bin/networks/ped-100/snapshot_iter_70800.caffemodel initialized.
[TRT] detectNet – number object classes: 1
[TRT] detectNet – maximum bounding boxes: 2048
[TRT] detectNet – loaded 1 class info entries
[TRT] detectNet – number of object classes: 1
[gstreamer] initialized gstreamer, version 1.14.5.0
[gstreamer] gstCamera – attempting to create device csi://0
[gstreamer] gstCamera pipeline string:
[gstreamer] nvarguscamerasrc sensor-id=0 ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, framerate=30/1, format=(string)NV12 ! nvvidconv flip-method=2 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera successfully created device csi://0
[OpenGL] glDisplay – X screen 0 resolution: 1920x1080
[OpenGL] glDisplay – X window resolution: 1920x1080
[OpenGL] glDisplay – display device initialized (1920x1080)
[gstreamer] opening gstCamera for streaming, transitioning pipeline to GST_STATE_PLAYING
[gstreamer] gstreamer changed state from NULL to READY ==> mysink
[gstreamer] gstreamer changed state from NULL to READY ==> capsfilter1
[gstreamer] gstreamer changed state from NULL to READY ==> nvvconv0
[gstreamer] gstreamer changed state from NULL to READY ==> capsfilter0
[gstreamer] gstreamer changed state from NULL to READY ==> nvarguscamerasrc0
[gstreamer] gstreamer changed state from NULL to READY ==> pipeline0
[gstreamer] gstreamer changed state from READY to PAUSED ==> capsfilter1
[gstreamer] gstreamer changed state from READY to PAUSED ==> nvvconv0
[gstreamer] gstreamer changed state from READY to PAUSED ==> capsfilter0
[gstreamer] gstreamer stream status CREATE ==> src
[gstreamer] gstreamer changed state from READY to PAUSED ==> nvarguscamerasrc0
[gstreamer] gstreamer changed state from READY to PAUSED ==> pipeline0
[gstreamer] gstreamer stream status ENTER ==> src
[gstreamer] gstreamer message new-clock ==> pipeline0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> capsfilter1
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> nvvconv0
[gstreamer] gstreamer message stream-start ==> pipeline0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> capsfilter0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> nvarguscamerasrc0
GST_ARGUS: Creating output stream
CONSUMER: Waiting until producer is connected…
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 3264 x 2464 FR = 21.000000 fps Duration = 47619048 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;

GST_ARGUS: 3264 x 1848 FR = 28.000001 fps Duration = 35714284 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;

GST_ARGUS: 1920 x 1080 FR = 29.999999 fps Duration = 33333334 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;

GST_ARGUS: 1280 x 720 FR = 59.999999 fps Duration = 16666667 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;

GST_ARGUS: 1280 x 720 FR = 120.000005 fps Duration = 8333333 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;

GST_ARGUS: Running with following settings:
Camera index = 0
Camera mode = 4
Output Stream W = 1280 H = 720
seconds to Run = 0
Frame Rate = 120.000005
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
[gstreamer] gstCamera – onPreroll
[gstreamer] gstCamera – map buffer size was less than max size (1382400 vs 1382407)
[gstreamer] gstCamera recieve caps: video/x-raw, width=(int)1280, height=(int)720, framerate=(fraction)30/1, format=(string)NV12
[gstreamer] gstCamera – recieved first frame, codec=raw format=nv12 width=1280 height=720 size=1382407
RingBuffer – allocated 4 buffers (1382407 bytes each, 5529628 bytes total)
[gstreamer] gstreamer changed state from READY to PAUSED ==> mysink
[gstreamer] gstreamer message async-done ==> pipeline0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> mysink
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> pipeline0
RingBuffer – allocated 4 buffers (14745600 bytes each, 58982400 bytes total)
^CTraceback (most recent call last):
  File "project19.py", line 44, in <module>
    cv2.imshow('showDetectNet', cv2img)
KeyboardInterrupt
[gstreamer] gstCamera – stopping pipeline, transitioning to GST_STATE_NULL
CONSUMER: Done Success
GST_ARGUS: Cleaning up
GST_ARGUS: Done Success
[gstreamer] gstCamera – pipeline stopped

I’m not sure why cv2.imshow() would freeze, but since you are able to use cv2.imwrite() OK, that would indicate that OpenCV is getting the image correctly.

Are you able to use cv2.imshow() in your script on an unrelated image - i.e. one that was loaded with cv2.imread()?

Also, if you put a print statement after cv2.imshow(), does it ever print?

What happens if you put a cv2.waitKey() after cv2.imshow()?

Yes, I’m able to cv2.imshow() from a local file, but when I run the code below I get an error.
It looks like we are getting closer to the root cause, but I still have no idea how to fix it…
I’ve tried different versions of OpenCV and get the same results…

import cv2
print(cv2.__version__)

cap = cv2.VideoCapture("csi://0")
while True:
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Error:

OpenCV Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /build/opencv-XDqSFW/opencv-3.2.0+dfsg/modules/imgproc/src/color.cpp, line 9748
Traceback (most recent call last):
  File "covid19.py", line 48, in <module>
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cv2.error: /build/opencv-XDqSFW/opencv-3.2.0+dfsg/modules/imgproc/src/color.cpp:9748: error: (-215) scn == 3 || scn == 4 in function cvtColor

Yes, it prints. I was printing the NumPy array with no problem; it’s almost as if cv2.imshow() is not there…
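
A note on the cvtColor assertion above: `scn == 3 || scn == 4` usually means the input image was empty. A plausible cause here (an assumption, not confirmed in this thread) is that OpenCV's VideoCapture does not understand the csi://0 URI, since that scheme belongs to jetson-utils, so cap.read() returns (False, None). The sketch below shows one common workaround, assuming an OpenCV build with GStreamer support. The helper names csi_gstreamer_pipeline and open_csi_camera are made up for illustration, and the pipeline elements mirror the nvarguscamerasrc pipeline printed in the log earlier in this thread.

```python
def csi_gstreamer_pipeline(sensor_id=0, width=1280, height=720, fps=30, flip_method=0):
    """Build a GStreamer pipeline string for a CSI camera (hypothetical helper)."""
    return (
        f"nvarguscamerasrc sensor-id={sensor_id} ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"framerate={fps}/1, format=NV12 ! "
        f"nvvidconv flip-method={flip_method} ! video/x-raw, format=BGRx ! "
        "videoconvert ! video/x-raw, format=BGR ! appsink"
    )


def open_csi_camera(sensor_id=0):
    """Open the CSI camera through OpenCV's GStreamer backend (hypothetical helper)."""
    import cv2  # imported lazily; needs an OpenCV build with GStreamer enabled

    cap = cv2.VideoCapture(csi_gstreamer_pipeline(sensor_id), cv2.CAP_GSTREAMER)
    if not cap.isOpened():
        raise RuntimeError("could not open CSI camera; check GStreamer support in OpenCV")
    return cap


pipeline = csi_gstreamer_pipeline()
```

Whichever way the capture is opened, checking `ret` after `cap.read()` before calling cv2.cvtColor() turns the opaque assertion into an explicit "no frame" condition.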

OK, I did a test, and what I found is that if there is no call to cv2.waitKey(), then cv2.imshow() never shows a window. However, if you just call cv2.waitKey(1), it returns almost immediately (after 1 ms), without actually waiting for the user to press a key.

This camera test works for me:

import cv2

import jetson.inference
import jetson.utils

input_stream = jetson.utils.videoSource("csi://0")

while True:
	cuda_img = input_stream.Capture()
	jetson.utils.cudaDeviceSynchronize()
	print(cuda_img)
	cv_img_rgb = jetson.utils.cudaToNumpy(cuda_img)
	cv_img_bgr = cv2.cvtColor(cv_img_rgb, cv2.COLOR_RGB2BGR)
	cv2.imshow("Video Feed", cv_img_bgr)
	cv2.waitKey(1)
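
Building on the test above, the return value of cv2.waitKey() can also be used to exit the loop cleanly. This is a minimal sketch rather than anything from jetson-utils: should_quit and show_frames are hypothetical helper names, and `frames` is assumed to be any iterable of BGR NumPy images (for example, a generator wrapping the capture loop above).

```python
def should_quit(key_code, quit_key="q"):
    """Return True if the key code returned by cv2.waitKey() matches quit_key."""
    # waitKey() returns -1 when no key was pressed within the delay;
    # masking with 0xFF drops platform-specific high bits from real key codes.
    return key_code != -1 and (key_code & 0xFF) == ord(quit_key)


def show_frames(frames, window_name="Video Feed", delay_ms=1):
    """Display each frame, pumping the HighGUI event loop via waitKey()."""
    import cv2  # imported lazily so should_quit() is usable without OpenCV

    try:
        for frame in frames:
            cv2.imshow(window_name, frame)
            # Without this call the window is never drawn.
            if should_quit(cv2.waitKey(delay_ms)):
                break
    finally:
        cv2.destroyAllWindows()
```

Calling show_frames() with a generator of frames keeps the display logic in one place, and pressing q in the window ends the loop instead of requiring Ctrl+C.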

It worked!!!
Thank you so much for the support!

AFAIK, OpenCV highgui has a drawing thread that only gets scheduled when waitKey() is called, as @dusty_nv mentioned.

Furthermore, be aware that this drawing thread on the CPU is not very efficient on Jetson. You may be better off using a jetson-utils videoOutput (say, to “display://0”).

You would keep the RGB to BGR conversion for using most OpenCV CPU algorithms, and convert BGR back to RGB before display.
You may have a look at this C++ example (not using the CPU).
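
As a rough Python sketch of that round trip (not the C++ example referred to above): process in BGR with OpenCV, then swap back to RGB and hand the frame to a jetson-utils videoOutput. The helper names swap_red_blue and render_bgr are made up for illustration; `display` is assumed to be a jetson.utils.videoOutput such as the one created for "display://0", and the frame is assumed to be an HxWx3 uint8 array.

```python
import numpy as np


def swap_red_blue(frame):
    """Reverse the channel order of an HxWx3 image (RGB <-> BGR)."""
    # ascontiguousarray makes a packed copy instead of a negative-stride view.
    return np.ascontiguousarray(frame[..., ::-1])


def render_bgr(display, bgr_frame):
    """Convert a BGR NumPy frame to RGB and render it on a videoOutput (hypothetical helper)."""
    import jetson.utils  # imported lazily; only available on a Jetson with jetson-utils

    rgb_frame = swap_red_blue(bgr_frame)
    cuda_img = jetson.utils.cudaFromNumpy(rgb_frame)  # copies the array into CUDA memory
    display.Render(cuda_img)


# Synthetic 2x2 RGB frame to check the channel swap without any hardware.
rgb = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
bgr = swap_red_blue(rgb)
```

The swap is its own inverse, so the same helper covers both directions of the conversion.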


You could also use the GPU to perform the RGB->BGR conversion - see here for a demonstration:

Although in this case I thought it most straightforward just to get it working with cv2.cvtColor().
