Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other
Target Operating System
Linux
QNX
other
Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure of its number)
other
SDK Manager Version
1.9.2.10884
other
Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other
Hi,
We want to use the 6.0.6 Docker container on top of the DRIVE AGX Orin, but we see that its size is 14.7 GB. Is there a minimal version, or do we have to install this heavy image?
Could you share how you got the size? It would be helpful to know which specific Docker container you are referring to and what your specific use cases are. Thanks.
Vick,
The first link in your reply says: “you can pull target-side Docker images from NGC or Docker Hub, and run GPU-accelerated containers on the target right”
Can you please point me to the link for the reference target-side Docker image? I am not able to find it.
We have a Docker environment that we want to merge with the NVIDIA Docker (CUDA and driver support) and run on top of the Orin. Can you give any pointers on how this can be accomplished?
In any case, we do not provide a target-side Docker image, only the Docker runtime with the NVIDIA Container Toolkit (nvidia-docker) stack to facilitate running GPU-accelerated applications on the Tegra SoC.
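As a quick sanity check that this runtime piece is in place (a standard Docker command, nothing DRIVE-specific), you can list the registered runtimes on the target and confirm that nvidia appears among them:
$ sudo docker info | grep -i runtime
If nvidia is not listed, the NVIDIA Container Toolkit packages are likely not installed, or the Docker daemon has not been restarted since installation.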
To create a container on the target, you only need to follow standard Docker practices: write your Dockerfiles and workflows to build ARM64 images (if you are building on an x86 host) or a native Docker image (if you are building target-side).
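To make that concrete, here is a minimal sketch; the base image, paths, and tag (my-av-stack) are placeholders for illustration, not anything shipped with DRIVE OS:
# Dockerfile (placeholder contents)
FROM arm64v8/ubuntu:20.04
# Copy your application into the image
COPY ./app /opt/app
CMD ["/opt/app/run"]

$ # On an x86 host, cross-build for ARM64 with buildx (requires QEMU/binfmt support):
$ docker buildx build --platform linux/arm64 -t my-av-stack:latest .
$ # Building natively on the target, a plain docker build is enough:
$ docker build -t my-av-stack:latest .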
To have NVIDIA Container Toolkit (nvidia-docker) support (for CUDA and driver support), you simply need to pass --runtime nvidia --gpus all when executing a docker run command. If you have any dependencies that need to be mounted, the blog provides guidance on how to modify the drivers.csv and devices.csv files to specify them; those mounts are then handled by the NVIDIA Container Toolkit (nvidia-docker) stack.
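For reference, those CSV files use a simple "type, path" format (types include dev, lib, dir, and sym). The entries below are only illustrative, and the exact paths vary by release, so check the files shipped under /etc/nvidia-container-runtime/host-files-for-container.d/ on your target:
lib, /usr/lib/aarch64-linux-gnu/libcuda.so
dev, /dev/nvhost-ctrl-gpu
dir, /usr/local/cuda-11.4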
Depending on your use case, I may also recommend mounting the CUDA directory into the container at runtime, using -v /usr/local/cuda-<version>:/usr/local/cuda.
In total, to run your image after having modified the drivers.csv and devices.csv files appropriately, you will likely end up with a docker run command similar to the following (where <image> is your image name or ID):
$ sudo docker run --rm --runtime nvidia --gpus all -v /usr/local/cuda-11.4:/usr/local/cuda <image>
I am trying to run the container with this command:
sudo docker run --rm --network host --runtime=nvidia --gpus all -v /usr/local/cuda-11.4:/usr/local/cuda-11.4/ --name av-stack ed9214aa0b8e
but I still see that the NVIDIA Container Toolkit is not available, and nvidia-smi returns nothing.
It worked well on my devkit with DRIVE OS 6.0.6. FYI.
$ cd /usr/local/cuda-11.4/samples/0_Simple/matrixMul && sudo make
>>> GCC Version is greater or equal to 4.7.0 <<<
/usr/local/cuda-11.4/bin/nvcc -ccbin g++ -I../../common/inc -m64 --threads 0 --std=c++11 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o matrixMul.o -c matrixMul.cu
/usr/local/cuda-11.4/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o matrixMul matrixMul.o
mkdir -p ../../bin/aarch64/linux/release
cp matrixMul ../../bin/aarch64/linux/release
nvidia@tegra-ubuntu:/usr/local/cuda-11.4/samples/0_Simple/matrixMul$ sudo docker run --rm --runtime nvidia --gpus all -v $(pwd):$(pwd) -w $(pwd) ubuntu:20.04 ./matrixMul
WARNING: IPv4 forwarding is disabled. Networking will not work.
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ampere" with compute capability 8.7
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 617.89 GFlop/s, Time= 0.212 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
I tried to manage and steer a little bit. I see that the veth driver is not loaded by default, so I had to install it manually. But then I hit this CUDA error:
@tegra-ubuntu:/usr/local/cuda-11.4/samples/0_Simple/matrixMul$ sudo docker run --rm --runtime nvidia --gpus all -v $(pwd):$(pwd) -w $(pwd) ubuntu:20.04 ./matrixMul
WARNING: IPv4 forwarding is disabled. Networking will not work.
CUDA error at ../../common/inc/helper_cuda.h:781 code=801(cudaErrorNotSupported) "cudaGetDeviceCount(&device_count)"
[Matrix Multiply Using CUDA] - Starting...
Any pointers on why I see this issue with CUDA? What drivers might be missing in this case?
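(For reference, the networking side of this — the missing veth module and Docker's "IPv4 forwarding is disabled" warning — can typically be handled with standard Linux commands like the following; this is a sketch, assuming a veth module built for the running kernel is available:)
$ # Load the veth module needed by Docker's bridge networking
$ sudo modprobe veth
$ # Enable IPv4 forwarding for the current boot
$ sudo sysctl -w net.ipv4.ip_forward=1
$ # Persist the setting across reboots, if desired
$ echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/99-ip-forward.conf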
It appears that you’re encountering an issue with the CUDA library, specifically when calling the cudaGetDeviceCount() function. Before we can provide you with specific guidance, could you please provide some additional information?
Have you reflashed the devkit before this try? Which version of DRIVE OS are you currently using? Furthermore, it would be helpful if you could share the complete output of building the 'matrixMul' application.
A clean installation of 6.0.6 on the Orin resolved the issue.
Now I am able to get CUDA tested in Docker on the Orin.
tegra-ubuntu:/usr/local/cuda/samples/0_Simple/matrixMul# ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ampere" with compute capability 8.7
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 617.98 GFlop/s, Time= 0.212 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.