I was able to work around this. Of the sample projects, the LaneDetection sample in particular crashes when I run it remotely on my GPU server. The failure appears to be related to X11, which should be working:
~/build/src/laneDetection$ ./sample_lane_detection
WindowGLFW: Failed create window
terminate called after throwing an instance of 'std::exception'
what(): std::exception
Aborted (core dumped)
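Since the failure happens when the window is created, one thing that may be worth checking in the remote session is whether an X display is configured at all. The snippet below is only a minimal sketch: `check_display` is a hypothetical helper, not part of DriveWorks, and it assumes the sample's GLFW window needs the `DISPLAY` environment variable to point at a reachable X server. A set `DISPLAY` does not by itself guarantee that window creation will succeed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DriveWorks SDK): report whether an
# X display is configured for the current shell session.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        # No display configured, e.g. an SSH session without X forwarding.
        echo "DISPLAY is not set"
        return 1
    fi
    echo "DISPLAY=${DISPLAY}"
    return 0
}

# Run the check; print a hint instead of aborting when no display is found.
check_display || echo "No X display configured in this session"
```

If the check reports that `DISPLAY` is not set, the session has no X server to talk to, which would be consistent with the window creation failure above.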
For reference, the hello_world sample runs fine on the host system and shows:
~/build/src/hello_world$ ./sample_hello_world
Welcome to Driveworks SDK
[13-07-2020 16:50:45] Platform: Detected Generic x86 Platform
[13-07-2020 16:50:45] TimeSource: monotonic epoch time offset is 1588339735176813
[13-07-2020 16:50:45] Platform: number of GPU devices detected 4
[13-07-2020 16:50:45] Platform: currently selected GPU device discrete ID 0
[13-07-2020 16:50:45] SDK: Resources mounted from /usr/local/driveworks-2.2/data/
[13-07-2020 16:50:45] TimeSource: monotonic epoch time offset is 1588339735176813
[13-07-2020 16:50:45] Initialize DriveWorks SDK v2.2.3136
[13-07-2020 16:50:45] Release build with GNU 7.4.0 from heads/buildbrain-branch-0-gca7b4b26e65
Context of Driveworks SDK successfully initialized.
Version: 2.2.3136
GPU devices detected: 4
[13-07-2020 16:50:45] Platform: currently selected GPU device discrete ID 0
Device: 0, Tesla V100-SXM2-16GB
CUDA Driver Version / Runtime Version : 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory in MBytes:16160.5
Memory Clock rate Khz: 877000
Memory Bus Width bits: 4096
L2 Cache Size: 6291456
Maximum 1D Texture Dimension Size (x): 131072
Maximum 2D Texture Dimension Size (x,y): 131072, 65536
Maximum 3D Texture Dimension Size (x,y,z): 16384, 16384, 16384
Maximum Layered 1D Texture Size, (x): 32768 num: 2048
Maximum Layered 2D Texture Size, (x,y): 32768, 32768 num: 2048
Total amount of constant memory bytes: 65536
Total amount of shared memory per block bytes: 49152
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): 1024,1024,64
Max dimension size of a grid size (x,y,z): 2147483647,65535,65535
Maximum memory pitch bytes: 2147483647
Texture alignment bytes: 512
Concurrent copy and kernel execution: Yes, copy engines num: 5
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID: 0, Device PCI Bus ID: 4, Device PCI location ID: 0
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
Concurrent kernels: 1
Concurrent memory: 1
[13-07-2020 16:50:45] Platform: currently selected GPU device discrete ID 1
Device: 1, Tesla V100-SXM2-16GB
CUDA Driver Version / Runtime Version : 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory in MBytes:16160.5
Memory Clock rate Khz: 877000
Memory Bus Width bits: 4096
L2 Cache Size: 6291456
Maximum 1D Texture Dimension Size (x): 131072
Maximum 2D Texture Dimension Size (x,y): 131072, 65536
Maximum 3D Texture Dimension Size (x,y,z): 16384, 16384, 16384
Maximum Layered 1D Texture Size, (x): 32768 num: 2048
Maximum Layered 2D Texture Size, (x,y): 32768, 32768 num: 2048
Total amount of constant memory bytes: 65536
Total amount of shared memory per block bytes: 49152
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): 1024,1024,64
Max dimension size of a grid size (x,y,z): 2147483647,65535,65535
Maximum memory pitch bytes: 2147483647
Texture alignment bytes: 512
Concurrent copy and kernel execution: Yes, copy engines num: 5
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID: 0, Device PCI Bus ID: 6, Device PCI location ID: 0
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
Concurrent kernels: 1
Concurrent memory: 1
[13-07-2020 16:50:45] Platform: currently selected GPU device discrete ID 2
Device: 2, Tesla V100-SXM2-16GB
CUDA Driver Version / Runtime Version : 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory in MBytes:16160.5
Memory Clock rate Khz: 877000
Memory Bus Width bits: 4096
L2 Cache Size: 6291456
Maximum 1D Texture Dimension Size (x): 131072
Maximum 2D Texture Dimension Size (x,y): 131072, 65536
Maximum 3D Texture Dimension Size (x,y,z): 16384, 16384, 16384
Maximum Layered 1D Texture Size, (x): 32768 num: 2048
Maximum Layered 2D Texture Size, (x,y): 32768, 32768 num: 2048
Total amount of constant memory bytes: 65536
Total amount of shared memory per block bytes: 49152
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): 1024,1024,64
Max dimension size of a grid size (x,y,z): 2147483647,65535,65535
Maximum memory pitch bytes: 2147483647
Texture alignment bytes: 512
Concurrent copy and kernel execution: Yes, copy engines num: 5
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID: 0, Device PCI Bus ID: 7, Device PCI location ID: 0
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
Concurrent kernels: 1
Concurrent memory: 1
[13-07-2020 16:50:45] Platform: currently selected GPU device discrete ID 3
Device: 3, Tesla V100-SXM2-16GB
CUDA Driver Version / Runtime Version : 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory in MBytes:16160.5
Memory Clock rate Khz: 877000
Memory Bus Width bits: 4096
L2 Cache Size: 6291456
Maximum 1D Texture Dimension Size (x): 131072
Maximum 2D Texture Dimension Size (x,y): 131072, 65536
Maximum 3D Texture Dimension Size (x,y,z): 16384, 16384, 16384
Maximum Layered 1D Texture Size, (x): 32768 num: 2048
Maximum Layered 2D Texture Size, (x,y): 32768, 32768 num: 2048
Total amount of constant memory bytes: 65536
Total amount of shared memory per block bytes: 49152
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): 1024,1024,64
Max dimension size of a grid size (x,y,z): 2147483647,65535,65535
Maximum memory pitch bytes: 2147483647
Texture alignment bytes: 512
Concurrent copy and kernel execution: Yes, copy engines num: 5
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID: 0, Device PCI Bus ID: 8, Device PCI location ID: 0
Compute Mode: Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
Concurrent kernels: 1
Concurrent memory: 1
[13-07-2020 16:50:45] Releasing Driveworks SDK Context
Happy autonomous driving!