__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg are undefined

Hi everyone. I am currently trying to compile the CUDA project HareInWeed/gec from GitHub (elliptic curve cryptography with GPU acceleration) on my Ubuntu 24.04 laptop using “Cuda compilation tools, release 12.4, V12.4.131” and gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4). After enabling the CUDA option in the top-level CMakeLists.txt file, I get a compile error:

[  2%] Building CUDA object CMakeFiles/secp256k1.dir/src/secp256k1.cpp.o
[  5%] Building CUDA object CMakeFiles/sm2.dir/src/sm2.cpp.o
/usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -DGEC_ENABLE_CUDA --options-file CMakeFiles/secp256k1.dir/includes_CUDA.rsp -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75,sm_75]" -MD -MT CMakeFiles/secp256k1.dir/src/secp256k1.cpp.o -MF CMakeFiles/secp256k1.dir/src/secp256k1.cpp.o.d -x cu -rdc=true -c /.../gec/src/secp256k1.cpp -o CMakeFiles/secp256k1.dir/src/secp256k1.cpp.o
/usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -DGEC_ENABLE_CUDA --options-file CMakeFiles/sm2.dir/includes_CUDA.rsp -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75,sm_75]" -MD -MT CMakeFiles/sm2.dir/src/sm2.cpp.o -MF CMakeFiles/sm2.dir/src/sm2.cpp.o.d -x cu -rdc=true -c /.../gec/src/sm2.cpp -o CMakeFiles/sm2.dir/src/sm2.cpp.o
/usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
    __builtin_ia32_ldtilecfg (__config);
    ^

/usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
    __builtin_ia32_sttilecfg (__config);
    ^
2 errors detected in the compilation of "/.../gec/src/sm2.cpp".

Does anyone know how to fix this error?

Thanks in advance.

Check the suggestions here.

Try switching to CUDA 12.4 update 1 or newer.
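
Before switching versions, it may help to confirm which toolkit release and host compiler are actually being picked up. The following is only a rough sketch: `nvcc_full_version` is a made-up helper name, and it assumes the `nvcc --version` output has the same "Cuda compilation tools, release X, VX.Y.Z" line shown elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` text on stdin and print the
# full toolkit version (the part after "V"), e.g. 12.4.131.
nvcc_full_version() {
    sed -n 's/^Cuda compilation tools, release [0-9.]*, V\([0-9.]*\).*$/\1/p'
}

# Report the toolkit version if nvcc is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc: $(nvcc --version | nvcc_full_version)"
else
    echo "nvcc: not found on PATH"
fi

# Report the default host compiler version if g++ is on PATH.
# -dumpfullversion is needed on Ubuntu, where -dumpversion alone
# prints only the major version.
if command -v g++ >/dev/null 2>&1; then
    echo "g++:  $(g++ -dumpfullversion -dumpversion 2>/dev/null)"
else
    echo "g++:  not found on PATH"
fi
```

Comparing the two printed versions against the host compilers listed as supported in the installation guide for that CUDA release is probably the quickest way to spot a mismatch.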

I’ve switched to CUDA 12.5 (now available for Ubuntu 24.04) and checked the suggestions in your link. Unfortunately, the compiler error described above persists with gcc 13.2.0 (and with 13.3.0, which I also tested).

Now I have switched to Ubuntu 22.04, installing CUDA 11.5 via ‘sudo apt install nvidia-cuda-toolkit’ and compiling with gcc-11. The above error is gone, but instead I get: “identifier “__builtin_ia32_serialize” is undefined”. Is there a known combination of CUDA toolkit and gcc that can compile the sources in the “HareInWeed/gec” repository with CUDA enabled?

Any help is appreciated.

I have CUDA 12.4.99 (12.4.0) and g++ 11.4 (the default for 22.04, I believe) installed on Ubuntu 22.04. I set the CUDA option in CMakeLists.txt to “ON” and had no trouble compiling that module (sm2.cpp):

login as: bob
bob@192.168.1.109's password:
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 6.5.0-26-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

 * Introducing Expanded Security Maintenance for Applications.
   Receive updates to over 25,000 software packages with your
   Ubuntu Pro subscription. Free for personal use.

     https://ubuntu.com/pro

Expanded Security Maintenance for Applications is not enabled.

99 updates can be applied immediately.
To see these additional updates run: apt list --upgradable

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status


The list of available updates is more than a week old.
To check for new updates run: sudo apt update
Last login: Tue Jul  9 15:31:07 2024
bob@bob-Precision-WorkStation-T7500:~$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

bob@bob-Precision-WorkStation-T7500:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0

bob@bob-Precision-WorkStation-T7500:~$ cd junk
bob@bob-Precision-WorkStation-T7500:~/junk$ git clone https://github.com/HareInWeed/gec
Cloning into 'gec'...
remote: Enumerating objects: 1374, done.
remote: Counting objects: 100% (1374/1374), done.
remote: Compressing objects: 100% (346/346), done.
remote: Total 1374 (delta 927), reused 1374 (delta 927), pack-reused 0
Receiving objects: 100% (1374/1374), 300.22 KiB | 209.00 KiB/s, done.
Resolving deltas: 100% (927/927), done.
bob@bob-Precision-WorkStation-T7500:~/junk$ ls
gec
bob@bob-Precision-WorkStation-T7500:~/junk$ cd gec
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ ls
bench           docs     LICENSE    src    TODO.md
CMakeLists.txt  include  README.md  tests  vcpkg.json
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ vi CMakeLists.txt
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ cmake .
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 12.4.99
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.4.99")
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at tests/CMakeLists.txt:3 (find_package):
  Could not find a package configuration file provided by "Catch2" with any
  of the following names:

    Catch2Config.cmake
    catch2-config.cmake

  Add the installation prefix of "Catch2" to CMAKE_PREFIX_PATH or set
  "Catch2_DIR" to a directory containing one of the above files.  If "Catch2"
  provides a separate development package or SDK, be sure it has been
  installed.


-- Configuring incomplete, errors occurred!
See also "/home/bob/junk/gec/CMakeFiles/CMakeOutput.log".
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ sudo apt install catch2
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  catch2
0 upgraded, 1 newly installed, 0 to remove and 210 not upgraded.
Need to get 490 kB of archives.
After this operation, 2,693 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu jammy/universe amd64 catch2 amd64 2.13.8-1 [490 kB]
Fetched 490 kB in 7s (71.3 kB/s)
Selecting previously unselected package catch2.
(Reading database ... 224928 files and directories currently installed.)
Preparing to unpack .../catch2_2.13.8-1_amd64.deb ...
Unpacking catch2 (2.13.8-1) ...
Setting up catch2 (2.13.8-1) ...
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ cmake .
-- Configuring done
CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "secp256k1".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "secp256k1".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "secp256k1".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "sm2".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "sm2".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "sm2".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in tests/cuda/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "cu_unit_test".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in tests/cuda/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "cu_unit_test".
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Generating done
-- Build files have been written to: /home/bob/junk/gec
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ ls
bench                CMakeLists.txt         include    src      vcpkg.json
CMakeCache.txt       CTestTestfile.cmake    LICENSE    Testing
CMakeFiles           DartConfiguration.tcl  Makefile   tests
cmake_install.cmake  docs                   README.md  TODO.md
bob@bob-Precision-WorkStation-T7500:~/junk/gec$ make
[  2%] Building CUDA object CMakeFiles/secp256k1.dir/src/secp256k1.cpp.o
/home/bob/junk/gec/src/secp256k1.cpp(20): warning #23-D: integer constant is too large
  const FBase RR(
                ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/home/bob/junk/gec/src/secp256k1.cpp(34): warning #23-D: integer constant is too large
     const FBase d_RR(
                     ^

/home/bob/junk/gec/src/secp256k1.cpp(20): warning #23-D: integer constant is too large
  const FBase RR(
                ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/home/bob/junk/gec/src/secp256k1.cpp(34): warning #23-D: integer constant is too large
     const FBase d_RR(
                     ^

[  5%] Linking CUDA device code CMakeFiles/secp256k1.dir/cmake_device_link.o
[  8%] Linking CUDA static library libsecp256k1.a
[  8%] Built target secp256k1
[ 11%] Building CUDA object CMakeFiles/sm2.dir/src/sm2.cpp.o
[ 14%] Linking CUDA device code CMakeFiles/sm2.dir/cmake_device_link.o
[ 17%] Linking CUDA static library libsm2.a
[ 17%] Built target sm2
[ 20%] Building CXX object tests/CMakeFiles/unit_test_shared.dir/test_main.cpp.o
[ 22%] Linking CXX static library libunit_test_shared.a
[ 22%] Built target unit_test_shared
[ 25%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/common.cpp.o
[ 28%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/curve.cpp.o
[ 31%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_arithmetic.cpp.o
[ 34%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_bigint.cpp.o
[ 37%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_curve.cpp.o
[ 40%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_dlp.cpp.o
[ 42%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_field.cpp.o
[ 45%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_lift_x.cpp.o
[ 48%] Building CXX object tests/cpu/CMakeFiles/unit_test.dir/test_misc.cpp.o
[ 51%] Linking CXX executable unit_test
...

OK, I switched to the suggested versions (CUDA 12.4.1 with g++ 11.4), and now it compiles without error. Due to runtime errors, I had to update my NVIDIA driver, but after that all ‘CUDA enabled’ test cases passed.

Thank you for your support.
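
For anyone landing here with a newer default gcc: instead of changing the whole distribution, it may be enough to point nvcc at an older host compiler installed alongside the default one. This is a sketch, not something tested in this thread. `pick_host_compiler` is a made-up helper, and it assumes a `g++-11` binary is installed on PATH. `CMAKE_CUDA_HOST_COMPILER` is the CMake variable that selects the host compiler nvcc uses.

```shell
#!/bin/sh
# Hypothetical helper: print the full path of the first candidate
# compiler found on PATH, or return non-zero if none is available.
pick_host_compiler() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# g++-11 is the host compiler version that worked in this thread.
if host_cxx=$(pick_host_compiler g++-11); then
    # Only print the configure command; run it from the project root
    # after enabling the CUDA option in CMakeLists.txt.
    echo "cmake -DCMAKE_CUDA_HOST_COMPILER=$host_cxx ."
else
    echo "g++-11 not found on PATH"
fi
```

The printed command is probably best run in a fresh build tree, since compiler choices are cached in CMakeCache.txt after the first configure. When calling nvcc by hand, the `-ccbin` option serves the same purpose.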