Please provide the following info (check/uncheck the boxes after clicking “+ Create Topic”):
Software Version
DRIVE OS Linux 5.2.0
DRIVE OS Linux 5.2.0 and DriveWorks 3.5
NVIDIA DRIVE™ Software 10.0 (Linux)
NVIDIA DRIVE™ Software 9.0 (Linux)
other DRIVE OS version
other
Target Operating System
Linux
QNX
other
Hardware Platform
NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)
other
SDK Manager Version
1.4.0.7363
other
Host Machine Version
native Ubuntu 18.04
other
Hello,
First, I would like to mention that I am not sure whether this issue belongs to TensorFlow or the Drive AGX, but I believe this is an appropriate place to seek a solution. I have seen a couple of similar reports on various forums, but none of them solved my problem. I am also aware that TensorFlow isn’t officially supported on the AGX.
I am having a problem building TensorFlow from source on my AGX. I am attempting to build TensorFlow 2.4, and I have Bazel 3.7.2 installed and working well. I am able to run the configure script inside the tensorflow directory, and I was able to complete the configuration without CUDA support; the issue only comes up when I enable CUDA support. This is my output:
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]: 10.2
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.6
Please specify the TensorRT version you want to use. [Leave empty to default to TensorRT 6]: 6.3
Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]:
Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: /usr/include/linux/
Traceback (most recent call last):
  File "third_party/gpus/find_cuda_config.py", line 653, in <module>
    main()
  File "third_party/gpus/find_cuda_config.py", line 645, in main
    for key, value in sorted(find_cuda_config().items()):
  File "third_party/gpus/find_cuda_config.py", line 583, in find_cuda_config
    result.update(_find_cuda_config(cuda_paths, cuda_version))
  File "third_party/gpus/find_cuda_config.py", line 257, in _find_cuda_config
    get_header_version)
  File "third_party/gpus/find_cuda_config.py", line 244, in _find_header
    required_version, get_version)
  File "third_party/gpus/find_cuda_config.py", line 233, in _find_versioned_file
    actual_version = get_version(file)
  File "third_party/gpus/find_cuda_config.py", line 250, in get_header_version
    version = int(_get_header_version(path, "CUDA_VERSION"))
ValueError: invalid literal for int() with base 10: ''
Asking for detailed CUDA configuration...
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]:
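From the traceback, the script is pulling the CUDA_VERSION macro out of a cuda.h and getting back an empty string, which suggests the header it found under /usr/include/linux/ is not the real CUDA header. To check a candidate cuda.h by hand, I wrote a small probe that mimics (my reconstruction, not the exact TensorFlow code) what find_cuda_config.py does:

```python
import os
import re
import tempfile

def cuda_version_from_header(path):
    """Pull the CUDA_VERSION macro out of a cuda.h file, the way
    find_cuda_config.py's version probe does (10020 means CUDA 10.2).
    A missing macro is what triggers the ValueError in the traceback."""
    with open(path) as f:
        for line in f:
            m = re.match(r"\s*#define\s+CUDA_VERSION\s+(\d+)", line)
            if m:
                return int(m.group(1))
    return None  # macro not found: this is the wrong header

# Demo with a synthetic header so the sketch runs anywhere:
with tempfile.NamedTemporaryFile("w", suffix=".h", delete=False) as f:
    f.write("#define CUDA_VERSION 10020\n")
    demo = f.name
print(cuda_version_from_header(demo))  # prints 10020, i.e. CUDA 10.2
os.remove(demo)
```

Pointing this at the cuda.h under the base path I gave returns None, which matches the empty-string error above.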
I think this might also be an issue with Python. I tried hardcoding the configuration script to force it to accept the cuda.h file it found, but that approach did not seem very effective.
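Rather than hardcoding the script, I also tried scanning the usual install locations for a cuda.h that actually defines the macro, so I can feed its base path back to ./configure. A sketch of that search (the candidate paths are my guesses for a Drive/Jetson-style install; adjust for your own setup):

```python
import glob
import os
import re

def find_cuda_headers(patterns):
    """Return (header, version, base_path) for every cuda.h matching
    the glob patterns that defines CUDA_VERSION. base_path is the
    prefix above include/, which is what ./configure asks for."""
    hits = []
    for pattern in patterns:
        for header in glob.glob(pattern):
            with open(header) as f:
                m = re.search(r"#define\s+CUDA_VERSION\s+(\d+)", f.read())
            if m:
                base = os.path.dirname(os.path.dirname(header))
                hits.append((header, int(m.group(1)), base))
    return hits

# Candidate locations (assumptions, not verified for every install):
for header, version, base in find_cuda_headers([
        "/usr/local/cuda*/include/cuda.h",
        "/usr/local/cuda*/targets/*/include/cuda.h"]):
    print(f"{header}: CUDA_VERSION={version} -> base path {base}")
```

Whatever base path this prints is what I would try giving to the "comma-separated list of base paths" prompt instead of /usr/include/linux/.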
Has anyone encountered an issue like this before? Does anyone know how to fix this? Any advice would be greatly appreciated. Thank you,
Zeus