recorder-tui crashes with dw::core::CudaException

Hello,
I am trying to record video from a single GMSL camera.
I followed the tutorial for basic recording.
Unfortunately, the recorder-tui tool exits with this error message:

Recorder @ release_nvidia: 
Last output: Rig: release_nvidia.json recorder-tui: started

Last error: Aborted (core dumped)

Press s<Enter> to start/stop. Press q<Enter> to quit.
Error: one of the recorders abruptly exited!
WARNING: Forcibly exited. Some data might have been lost!
WARNING: Use q<Enter> to exit gracefully, next time.

Shortly before that message, this output also appears:

Recorder @ release_nvidia:
Last output: terminate called after throwing an instance of 'dw::core::CudaException'

I made sure that the sample application ./sample_camera_gmsl works and displays the image from my camera.
I only have an HDD at hand, so I used that. Might that be the cause of the problem?
Here is the line from /proc/mounts:

/dev/sda1 /media/nvidia/HDDExt4 ext4 rw,nosuid,nodev,relatime,stripe=8191,data=ordered 0 0

The HDD is 2 TB and completely empty.
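
To help rule the HDD in or out, here is a minimal sketch that reports free space and measures sustained sequential write speed on the mount point. The function names are made up for illustration, and the thread does not state what write bandwidth recorder-tui needs for one H.264 camera, so the number is only useful for comparison against another drive.

```python
import os
import shutil
import tempfile
import time


def free_space_gb(path):
    """Return the free space of the filesystem containing `path`, in GB."""
    return shutil.disk_usage(path).free / 1e9


def measure_write_mb_per_s(directory, total_mb=64, chunk_mb=4):
    """Write about `total_mb` MB into `directory` and return the MB/s achieved.

    The file is flushed and fsync'ed before the clock stops, so the page
    cache does not hide a slow disk. The temporary file is always removed.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = max(1, total_mb // chunk_mb)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".bin")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(tmp_path)
    return (n_chunks * chunk_mb) / elapsed
```

Calling `measure_write_mb_per_s("/media/nvidia/HDDExt4")` as the user that runs the recorder also confirms that this user is allowed to write to the disk at all.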

The rig file I am using looks like this:

{
    "rig": {
        "sensors": [
            {
                "name": "camera:front:center:120fov",
                "nominalSensor2Rig": {
                    "quaternion": [
                        -0.502444,
                        0.507493,
                        -0.497444,
                        0.492494
                    ],
                    "t": [
                        1.749,
                        -0.1,
                        1.47
                    ]
                },
                "parameter": "camera-type=ar0231-rccb-bae-sf3324,csi-port=a,camera-count=1,format=h264,output-format=yuv",
                "properties": {
                    "Model": "ftheta",
                    "bw-poly": "0.0 0.000545421498827636 -1.6216719633103e-10 -4.64720492990289e-12 2.85224527762934e-16",
                    "cx": "960",
                    "cy": "604",
                    "height": "1208",
                    "width": "1920"
                },
                "protocol": "camera.gmsl",
                "sensor2Rig": {
                    "quaternion": [
                        -0.502444,
                        0.507493,
                        -0.497444,
                        0.492494
                    ],
                    "t": [
                        1.749,
                        -0.1,
                        1.47
                    ]
                }
            }
        ],
        "vehicle": {
            "valid": true,
            "value": {
                "COMMENT": "steeringCoefficient is not validated",
                "axlebaseFront": 1.582,
                "axlebaseRear": 1.575,
                "bumperFront": 0.912,
                "bumperRear": 1.109,
                "centerOfMassToRearAxle": 1.564,
                "frontCorneringStiffness": 30654.0,
                "height": 1.473,
                "inertia": 1780.8,
                "length": 4.872,
                "mass": 1779.4,
                "rearCorneringStiffness": 36407.0,
                "steeringCoefficient": 14.8,
                "wheelDiameter": 0.673,
                "wheelbase": 2.85,
                "width": 1.852,
                "widthWithMirrors": 2.121
            }
        }
    },
    "version": 2
}
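
Before suspecting the hardware, it can be worth checking that the rig file itself is well formed. The sketch below is not the DriveWorks rig schema; the set of required keys (`name`, `protocol`, `parameter`) and the unit-quaternion check are assumptions chosen for illustration, and both function names are hypothetical.

```python
import json
import math


def parse_parameter(parameter):
    """Split a 'key=value,key=value' sensor parameter string into a dict."""
    result = {}
    for item in parameter.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed parameter entry: {item!r}")
        result[key.strip()] = value.strip()
    return result


def check_rig(rig_text):
    """Return a list of human-readable problems found in a rig JSON string."""
    problems = []
    rig = json.loads(rig_text)  # raises ValueError on malformed JSON
    sensors = rig.get("rig", {}).get("sensors", [])
    if not sensors:
        problems.append("no sensors defined")
    for sensor in sensors:
        name = sensor.get("name", "<unnamed>")
        # Assumed minimum set of keys; not taken from any official schema.
        for key in ("name", "protocol", "parameter"):
            if key not in sensor:
                problems.append(f"{name}: missing '{key}'")
        # A rotation quaternion should have a norm close to 1.
        for pose_key in ("nominalSensor2Rig", "sensor2Rig"):
            quat = sensor.get(pose_key, {}).get("quaternion")
            if quat is not None:
                norm = math.sqrt(sum(q * q for q in quat))
                if abs(norm - 1.0) > 1e-3:
                    problems.append(
                        f"{name}: {pose_key} quaternion norm is {norm:.4f}")
    return problems
```

Reading the file with `open("release_nvidia.json").read()` and passing the text to `check_rig` returns an empty list when none of these checks fail. `parse_parameter` makes it easy to inspect individual settings such as `format` or `csi-port`.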

nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sun_Apr__1_21:15:38_CDT_2018
Cuda compilation tools, release 9.2, V9.2.78

Thanks

Dear Sebastian6j1z2,

Could you please check and try the following items?

- Could you please check whether the HDD file system is ext4?
- Power off the Drive PX2, wait 1 minute, and power it on again.
- Run "export CUDA_VISIBLE_DEVICES=0" (to select the dGPU) before starting the recorder-tui tool.
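
One thing to watch with the last item: the variable only takes effect if it is present in the environment of the process that launches the recorder, for example when it is exported in the same shell. A minimal sketch of setting it for a single child process and confirming that the child sees it follows; the helper names are hypothetical.

```python
import os
import subprocess
import sys


def run_with_visible_devices(command, devices="0", **kwargs):
    """Run `command` (a list of arguments) with CUDA_VISIBLE_DEVICES set.

    Only the child's environment is changed; the calling process is untouched.
    Extra keyword arguments are passed through to subprocess.run.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = devices
    return subprocess.run(command, env=env, **kwargs)


def child_sees(devices):
    """Return the CUDA_VISIBLE_DEVICES value observed by a child process."""
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES', ''))"]
    result = run_with_visible_devices(
        probe, devices, stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout.strip()
```

Passing the recorder's own command line (as a list of arguments) to `run_with_visible_devices` starts it with the variable in place, without changing the environment of the calling process.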

When using lsblk -f the output is:

sda                                                             
└─sda1      ext4   HDDExt4 395acd52-019b-4e7b-85f7-350461e4fa89 /media/nvidia/HDDExt4
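
The same check can be made from /proc/mounts, which also shows whether the disk is mounted read-write. This is a small sketch with hypothetical helper names; it relies only on the standard six-field layout of a mounts line.

```python
def parse_mounts_line(line):
    """Split one /proc/mounts line into its six whitespace-separated fields."""
    device, mount_point, fs_type, options, _dump, _fsck_pass = line.split()
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": options.split(","),
    }


def find_mount(mount_point, mounts_text):
    """Return the parsed entry for `mount_point`, or None if it is not mounted."""
    for line in mounts_text.splitlines():
        if not line.strip():
            continue
        entry = parse_mounts_line(line)
        if entry["mount_point"] == mount_point:
            return entry
    return None


def is_writable_ext4(entry):
    """True if the entry is an ext4 filesystem mounted read-write."""
    return (entry is not None
            and entry["fs_type"] == "ext4"
            and "rw" in entry["options"])
```

With `mounts_text = open("/proc/mounts").read()`, the call `is_writable_ext4(find_mount("/media/nvidia/HDDExt4", mounts_text))` covers both the file system type and the mount mode.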

I restarted the device multiple times and tried on both Tegras, without luck.
The only thing that comes to mind is that the dGPUs on my PX2 are damaged and therefore not working.
Setting CUDA_VISIBLE_DEVICES to 0 didn't help either.

Dear Sebastian6j1z2,

Did you set up the CUDA environment properly?
If not, please refer to the link and check it.

After finishing the CUDA setup, please run the steps below.

$ /usr/local/cuda-8.0/bin/cuda-install-samples-8.0.sh ~/
$ cd ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery

nvidia@tegra-ubuntu:~/NVIDIA_CUDA-9.2_Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "DRIVE PX 2 AutoChauffeur"
CUDA Driver Version / Runtime Version 9.2 / 9.2
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 3840 MBytes (4026466304 bytes)
( 9) Multiprocessors, (128) CUDA Cores/MP: 1152 CUDA Cores
GPU Max Clock rate: 1290 MHz (1.29 GHz)
Memory Clock rate: 3003 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 4 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "NVIDIA Tegra X2"
CUDA Driver Version / Runtime Version 9.2 / 9.2
CUDA Capability Major/Minor version number: 6.2
Total amount of global memory: 6402 MBytes (6712545280 bytes)
( 2) Multiprocessors, (128) CUDA Cores/MP: 256 CUDA Cores
GPU Max Clock rate: 1275 MHz (1.27 GHz)
Memory Clock rate: 1600 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Peer access from DRIVE PX 2 AutoChauffeur (GPU0) -> NVIDIA Tegra X2 (GPU1) : No
Peer access from NVIDIA Tegra X2 (GPU1) -> DRIVE PX 2 AutoChauffeur (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 2
Result = PASS
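
For reference, the output above shows both GPUs being detected, with device 0 as the discrete GPU ("Integrated GPU sharing Host Memory: No") and device 1 as the integrated Tegra GPU. A minimal sketch for pulling that mapping out of saved deviceQuery output is shown below; the function names are hypothetical and the parser only looks at the two lines it needs.

```python
import re


def parse_device_query(output):
    """Extract index, name and integrated flag for each device listed."""
    devices = []
    current = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        # Accept straight or typographic quotes around the device name.
        header = re.match(r'Device (\d+): ["“](.+?)["”]$', line)
        if header:
            current = {"index": int(header.group(1)),
                       "name": header.group(2),
                       "integrated": None}
            devices.append(current)
            continue
        flag = re.match(r"Integrated GPU sharing Host Memory:\s*(Yes|No)$", line)
        if flag and current is not None:
            current["integrated"] = flag.group(1) == "Yes"
    return devices


def discrete_gpu_indices(devices):
    """Return the indices of devices reported as not integrated."""
    return [d["index"] for d in devices if d["integrated"] is False]
```

Applied to the output above, `discrete_gpu_indices` identifies which index to pass in CUDA_VISIBLE_DEVICES when the dGPU is wanted, assuming the recorder enumerates devices in the same order as deviceQuery.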