CUDA Error Prev

jasper · January 29, 2021, 11:12am

hello,

I am currently attempting to get a CUDA-enabled Docker container working on my Jetson Nano (for balenaOS). I am using 2 different detection networks: one with darknet and one with OpenCV + cuDNN.

The detections themselves work fine, but when I first run a detection with OpenCV and then a second one with darknet, I get the following error:

 CUDA Error Prev: operation not supported
python3:

anybody know what could cause this?

-L4T 32.4.2
-opencv 4.5.0

AastaLLL · February 1, 2021, 2:31am

Hi,

Usually, “operation not supported” comes from an implementation issue.
For example, running inference on a CPU buffer with the GPU.

Could you do the following experiments first?

1. Please check whether the GPU works correctly within the Docker container.
You can test this with the deviceQuery app in the CUDA samples folder.

2. Could you check whether your detection runs well on the GPU outside of Docker?

Thanks.
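
For a quick version of check 1 that does not require building the samples, something like the Python sketch below may help. It loads the CUDA runtime through ctypes and calls cudaGetDeviceCount, which is a real CUDA runtime function. The helper name cuda_device_count is hypothetical, and the sketch assumes libcudart can be found on the loader search path inside the container, which may not be true for every image.

```python
import ctypes
import ctypes.util


def cuda_device_count(lib_name="cudart"):
    """Return the number of CUDA devices, or None if the runtime is unusable.

    Hypothetical helper: assumes the CUDA runtime library is discoverable
    by ctypes.util.find_library (i.e. it is on the loader search path).
    """
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None  # library not found on the search path
    try:
        cudart = ctypes.CDLL(path)
        count = ctypes.c_int(0)
        # cudaGetDeviceCount(int*) returns 0 (cudaSuccess) on success.
        status = cudart.cudaGetDeviceCount(ctypes.byref(count))
    except (OSError, AttributeError):
        return None  # library failed to load or lacks the symbol
    return count.value if status == 0 else None


if __name__ == "__main__":
    n = cuda_device_count()
    print("CUDA runtime unavailable" if n is None else f"{n} CUDA device(s)")
```

A result of None inside the container but a device count outside it would point at the container setup rather than at the detection code.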

jasper · February 1, 2021, 11:40am

Thanks. Something does seem to be faulty with my CUDA installation: building the sample fails.

root@7c839ba:/usr/local/cuda/samples/1_Utilities/deviceQuery# make
/usr/local/cuda-10.2/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_32,code=sm_32 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o deviceQuery deviceQuery.o 
/usr/bin/ld: cannot find -lcudadevrt
/usr/bin/ld: cannot find -lcudart_static
collect2: error: ld returned 1 exit status
Makefile:303: recipe for target 'deviceQuery' failed
make: *** [deviceQuery] Error 1

note: this might only be relevant to deviceQuery

jasper · February 1, 2021, 2:35pm

My fault, I deleted some files to free up disk space.
./deviceQuery returns:

root@7c839ba:/usr/local/cuda/samples/1_Utilities/deviceQuery# ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.2 / 10.2
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3961 MBytes (4153769984 bytes)
  ( 1) Multiprocessors, (128) CUDA Cores/MP:     128 CUDA Cores
  GPU Max Clock rate:                            922 MHz (0.92 GHz)
  Memory Clock rate:                             13 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS

I also ran the detection using --runtime nvidia, and the detection frequency is the same.
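
If the deviceQuery check needs to be repeated (for example after each image rebuild), its output can be summarized with a few regular expressions. The sketch below is only an illustration: the function name summarize_device_query is hypothetical, and the patterns are based solely on the output format shown above, so they may need adjusting for other CUDA versions.

```python
import re


def summarize_device_query(output):
    """Extract device name, compute capability and final result.

    Hypothetical helper; patterns follow the deviceQuery text shown above.
    """
    name = re.search(r'^Device \d+: "(.+)"', output, re.MULTILINE)
    cap = re.search(
        r"CUDA Capability Major/Minor version number:\s+(\d+\.\d+)", output
    )
    result = re.search(r"^Result = (\w+)", output, re.MULTILINE)
    return {
        "device": name.group(1) if name else None,
        "capability": cap.group(1) if cap else None,
        "passed": bool(result) and result.group(1) == "PASS",
    }
```

The text to pass in could be captured with subprocess.run(["./deviceQuery"], capture_output=True, text=True).stdout from the sample directory.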

jasper · February 2, 2021, 3:26pm

Well, this seems to fix my problem as well.
I used "rm -rf /usr/local/cuda/targets/aarch64-linux/lib/*.a" to clear some space, and never had the issue when using --runtime nvidia.
thanks for the help
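
For anyone who hits the same "cannot find -lcudadevrt" / "cannot find -lcudart_static" linker errors after trimming an image, a small check like the sketch below can confirm whether the static libraries are still present. The library file names are derived from the linker flags in the build log and the directory comes from the rm command above; the function name missing_static_libs is hypothetical.

```python
from pathlib import Path

# File names implied by the linker flags -lcudadevrt and -lcudart_static.
REQUIRED_STATIC_LIBS = ("libcudadevrt.a", "libcudart_static.a")


def missing_static_libs(lib_dir, names=REQUIRED_STATIC_LIBS):
    """Return the entries of `names` that are not regular files in lib_dir."""
    lib_dir = Path(lib_dir)
    return [n for n in names if not (lib_dir / n).is_file()]


if __name__ == "__main__":
    # Directory taken from the rm command in this thread.
    missing = missing_static_libs("/usr/local/cuda/targets/aarch64-linux/lib")
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result means the deviceQuery link step should at least find both archives; it says nothing about whether the GPU itself is usable.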

AastaLLL · February 3, 2021, 6:52am

Good to know this.
Thanks for the feedback.
