Problems in building Google Ceres -- gpu architecture not supported (but it is)

My system:
Ubuntu 20.04, CUDA 12.3, CMake 3.22.5, RTX A2000.

Device 0: "NVIDIA RTX A2000 Laptop GPU"
CUDA Driver Version / Runtime Version 12.3 / 12.3
CUDA Capability Major/Minor version number: 8.6

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.3, CUDA Runtime Version = 12.3, NumDevs = 1
Result = PASS

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0

Background:

I downloaded the example of using CMake with CUDA that is found at :

This works fine for me. I can also write the following in CMakeLists.txt, and it still works:

set(CMAKE_CUDA_COMPILER_WORKS 1)
set(CMAKE_CUDA_FLAGS "-gencode arch=compute_86,code=sm_86")
project(cmake_cuda LANGUAGES CUDA)
add_executable(cmake_cuda src/knn.cu)
set_target_properties(cmake_cuda PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
set_target_properties(cmake_cuda PROPERTIES CUDA_ARCHITECTURES "86")
install(TARGETS cmake_cuda)

Note that I am disabling the CUDA compiler test, which persistently wants to compile for sm_30 and triggers an error.

Problem:

The problem appears when I build Ceres Solver. It comes with a lengthy CMakeLists.txt that includes a USE_CUDA switch. Surprisingly, turning that switch on is not enough.

At the 'cmake --build' step, I get this message (which is not correct):

nvcc fatal : Unsupported gpu architecture 'compute_80'

I then went a step further and tried to compile for my actual architecture, which is compute_86.
Ceres has two CMakeLists.txt files: one in its root and one in 'internal'. My versions are attached.
My modifications to the outer Ceres CMakeLists.txt look like this:

set(CMAKE_CUDA_COMPILER_WORKS 1)
set(CMAKE_CUDA_FLAGS "-gencode arch=compute_86,code=sm_86")

and my modifications to the inner one look like this:

set(CMAKE_CUDA_COMPILER_WORKS 1)
set(CMAKE_CUDA_FLAGS "-gencode arch=compute_86,code=sm_86")

add_library(ceres $<TARGET_OBJECTS:ceres_internal> ${CERES_LIBRARY_SOURCE})
set_target_properties(ceres PROPERTIES CUDA_ARCHITECTURES "86")

Accordingly, I get

nvcc fatal : Unsupported gpu architecture 'compute_86'

Ceres CMakeLists can be found in the repo at https://github.com/ceres-solver/ceres-solver and I downloaded this recently (December 2023).

In summary:
We know full well that architecture 86 is supported, and only CUDA 12.3 is installed. A simple example build works fine. For Ceres, however, CMake gives an error saying that compute_86 is not supported.

I realize that debugging this is probably not going to be easy. Any suggestions welcome.

Any suggestions to improve the question also welcome.

CMake is finding another version of CUDA besides 12.3. Yes I realize that you said:

but I think you will eventually discover that is not the case.

You might wish to do a search for nvcc that is exhaustive, such as sudo find / -name nvcc or similar.

Thanks.

The reason I believe it is the only version installed is that I completely wiped this machine and installed CUDA only recently. I installed only 12.3; nothing was installed before that. My best guess is that the build somehow ends up using a version that I never installed.

$ sudo find / -name nvcc
/usr/bin/nvcc
/usr/local/cuda-12.3/bin/nvcc
/usr/lib/nvidia-cuda-toolkit/bin/nvcc

Thank you

Well, your test indicates 3 instances of nvcc.

A typical/proper CUDA install will only install 1.

One of those is old enough that it doesn’t support compute capability 8.0.

Try:

/usr/bin/nvcc --version

and

/usr/lib/nvidia-cuda-toolkit/bin/nvcc --version

One of those two is older and is also what CMake is finding and using when it produces those unsupported messages.
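To compare the candidates in one pass, something like the following sketch may help. It walks PATH in lookup order and prints the release line of every nvcc it finds. The function name list_nvcc_versions is made up for illustration, and it assumes a shell with `local` plus GNU readlink. It only covers directories that are on PATH, so an nvcc elsewhere (as reported by find) would still need to be checked by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every nvcc reachable through PATH, in lookup
# order, together with the file it resolves to and its "release" line.
list_nvcc_versions() {
    local found=0 dir candidate release
    local IFS=':'                      # split PATH on colons
    for dir in $PATH; do
        candidate="${dir:-.}/nvcc"     # an empty PATH entry means the current directory
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            release=$("$candidate" --version 2>/dev/null | grep -i 'release' || true)
            printf '%s -> %s (%s)\n' \
                "$candidate" "$(readlink -f "$candidate")" "${release:-version unknown}"
            found=1
        fi
    done
    [ "$found" -eq 1 ] || echo "no nvcc found on PATH"
}

list_nvcc_versions
```

The first line printed is the nvcc that a plain `nvcc` command resolves to. CMake may or may not pick the same one, so this is only a starting point.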

You’ll need to clean that up. CMake is not an NVIDIA product, and I personally don’t have instructions for how CMake chooses its nvcc (i.e., where it looks in your filesystem). But there are numerous CMake questions on various forums, and a bit of searching or reading the CMake documentation will probably uncover the next steps for you.

If you do actually do a clean OS install followed by (a single) installation of CUDA, you will only get one instance of nvcc. I realize you dispute that, perhaps we can agree to disagree.

Thanks. I don’t know how it came to be this way, in that case. Hopefully it will not require reinstalling Linux.

I’m not suggesting you have to reinstall the OS to clean this up. I don’t have a specific clean up recipe for you, but if you’re in a hurry, you could just try deleting the nvcc files at each of these 2 locations:

leaving only the one that is obviously the 12.3 version. Then rerun your CMake build process and see what happens. If that works, it means that CMake has its own search heuristic/priority/order. If it doesn’t work, it means that CMake must require an indication of where the nvcc is. The standard CUDA way to do this is defined in the CUDA Linux installation guide, and it includes setting the PATH environment variable correctly. You seem to have already done that, as indicated by your nvcc -V test.

I want to close the topic by noting that this fix worked, alongside adding

set (CMAKE_CUDA_COMPILER /usr/local/cuda-12.3/bin/nvcc)

to CMakeLists.
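For anyone landing here later, a quick way to confirm which nvcc CMake actually settled on is to read it back from the build directory's CMakeCache.txt. The sketch below assumes the build directory is called build and that the cache stores the compiler under a CMAKE_CUDA_COMPILER entry. The helper name cached_cuda_compiler is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the CUDA compiler recorded in a build
# directory's CMakeCache.txt (the directory defaults to ./build).
cached_cuda_compiler() {
    cache="${1:-build}/CMakeCache.txt"
    if [ ! -f "$cache" ]; then
        echo "no CMakeCache.txt in ${1:-build}" >&2
        return 1
    fi
    # Cache entries look like NAME:TYPE=VALUE; keep only the value.
    sed -n 's/^CMAKE_CUDA_COMPILER:[A-Z]*=//p' "$cache"
}

# Example usage (commented out so that sourcing this file has no side effects):
#   cached_cuda_compiler build
#
# If the path printed is not the 12.3 nvcc, one option is to configure again
# in a fresh build directory and pass the compiler explicitly:
#   cmake -S . -B build-cuda123 -DUSE_CUDA=ON \
#         -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.3/bin/nvcc
```

A fresh build directory matters because the compiler choice is cached on the first configure. Likewise, a set(CMAKE_CUDA_COMPILER ...) line in CMakeLists.txt needs to come before the project() or enable_language() call that enables CUDA in order to take effect.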