__nv_bool mixup with bool

sls.cu(200): error: no instance of overloaded function "std::condition_variable::wait" matches the argument list
            argument types are: (std::lock_guard<std::mutex>, lambda []()->__nv_bool)
            object type is: std::condition_variable

I have a ‘.cu’ file and part of the code in it is host code; why shouldn’t I be able to do this? (It’s in a host function, HV_rdy is in host memory, everything here is host-side.)

bool HV_rdy = false;
...
cv.wait(lk, []{return HV_rdy;});

So you cannot even do std::sort with a custom comparator :(
Minimal reproducible example: Pastebin.com [Thanks to @blelbach for pointing out this example is wrong because I forgot an iterator argument, ARGHHH! Sorry!] But here is another example with the original cv.wait case: Compiler Explorer
It works with g++/clang, but spits out the error at the top of this post. My guess is that during SFINAE it mixes up the resolution, and the return type ends up being __nv_bool instead of bool.

I still can’t reproduce this. Can you please provide a minimal example showing the issue on Godbolt, with NVCC?

This works just fine: Compiler Explorer

I’m not sure what the relation between this CV code and std::sort is.


They aren’t related directly; it’s a big program that throws multiple errors, and since both had the same “__nv_bool” / “bool” in their diagnostics, I thought they were the same issue.
I’m confused: I see that godbolt throws no error, but godbolt’s “NVCC 11.3.0 sm_52” seems to only emit device code (and in this case there isn’t any).
This is the verbose output from my machine (nvcc -v):

#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_= 
#$ _CUDART_=cudart
#$ _HERE_=/usr/local/cuda-11.3/bin
#$ _THERE_=/usr/local/cuda-11.3/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_DIR_=targets/x86_64-linux
#$ TOP=/usr/local/cuda-11.3/bin/..
#$ NVVMIR_LIBRARY_DIR=/usr/local/cuda-11.3/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/usr/local/cuda-11.3/bin/../lib:/usr/local/cuda-11.3/lib64:/usr/local/cuda-11.3/lib64
#$ PATH=/usr/local/cuda-11.3/bin/../nvvm/bin:/usr/local/cuda-11.3/bin:/usr/local/cuda-11.3/bin:/home/iman/.vscode-server/bin/8dfae7a5cd50421d10cd99cb873990460525a898/bin/remote-cli:/home/iman/.cargo/bin:/home/iman/.local/bin:/usr/local/cuda-11.3/bin:/home/iman/.nvm/versions/node/v16.6.2/bin:/home/iman/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/go/bin:/opt/gradle/gradle-6.0.1/bin:/home/iman/.cargo/bin/:/opt/gradle/gradle-6.0.1/bin:/home/iman/.cargo/bin/
#$ INCLUDES="-I/usr/local/cuda-11.3/bin/../targets/x86_64-linux/include"  
#$ LIBRARIES=  "-L/usr/local/cuda-11.3/bin/../targets/x86_64-linux/lib/stubs" "-L/usr/local/cuda-11.3/bin/../targets/x86_64-linux/lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -std=c++17 -D__CUDA_ARCH__=520 -E -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__  "-I/usr/local/cuda-11.3/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=3 -D__CUDACC_VER_BUILD__=58 -D__CUDA_API_VER_MAJOR__=11 -D__CUDA_API_VER_MINOR__=3 -include "cuda_runtime.h" -m64 "cv.cu" -o "/tmp/tmpxft_00213fc9_00000000-9_cv.cpp1.ii" 
#$ cicc --c++17 --gnu_version=90400 --orig_src_file_name "cv.cu" --allow_managed   -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name "tmpxft_00213fc9_00000000-3_cv.fatbin.c" -tused --gen_module_id_file --module_id_file_name "/tmp/tmpxft_00213fc9_00000000-4_cv.module_id" --gen_c_file_name "/tmp/tmpxft_00213fc9_00000000-6_cv.cudafe1.c" --stub_file_name "/tmp/tmpxft_00213fc9_00000000-6_cv.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_00213fc9_00000000-6_cv.cudafe1.gpu"  "/tmp/tmpxft_00213fc9_00000000-9_cv.cpp1.ii" -o "/tmp/tmpxft_00213fc9_00000000-6_cv.ptx"
cv.cu(22): error: no instance of overloaded function "std::condition_variable::wait" matches the argument list
            argument types are: (std::lock_guard<std::mutex>, lambda []()->__nv_bool)
            object type is: std::condition_variable

1 error detected in the compilation of "cv.cu".
# --error 0x1 --

here is the cv.cpp1.ii: https://gist.github.com/ImanHosseini/1c45a8d07288701a9d3cabf5d52dc282
Update: it’s not due to the g++ version either; I tried “-ccbin /usr/bin/g++-10” to no avail.
Update II: I tried multiple versions of nvcc, only this one fails:

Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

Every other version I tried has no issue, and I don’t know why it can’t be reproduced in godbolt’s 11.3.0.
Here is where I got this version from hell: https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda_11.3.0_465.19.01_linux.run
A couple of GitHub issues with a similar error: Compile error in core/context_gpu.cu · Issue #278 · facebookarchive/caffe2 · GitHub. But I don’t know if they are related or not.

So my problem is actually fixed by simply using another nvcc/CUDA version, but I really had an itch to see what the problem with that bad build/version was. I took up the proverbial drill (strace), and:

strace -f -s 1000000 nvcc -std=c++17 cv.cu 2>trace.txt

Once for the working nvcc, and once for the one that did not work. The dumps are huge, but here is the main difference: in the correct version, we have this:

openat(AT_FDCWD, "/tmp/tmpxft_001b51eb_00000000-6_cv.cudafe1.gpu", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
...
write(4, "typedef char __nv_bool;\nstruct __EDG_type_info;struct __class_type_info;\n ...")

Whereas in the broken version, this typedef is nowhere to be found. Since we know the name of the file, we should be able to look into it with ‘-keep’, right? (More fun than extracting it from the strace output!) Nope: even with ‘-keep’, nvcc calls unlink on those tmp files. But fear not: I made a dummy unlink implementation that doesn’t actually unlink and just prints out which file unlink was called on: dummy unlink · GitHub. Now I can LD_PRELOAD it:

LD_PRELOAD=./myunlink.so nvcc -std=c++17 cv.cu

Now those files aren’t going anywhere! Here is what should have been generated (from the version that worked): correct & troublesome cudafe1.gpu · GitHub
So the error stems from the fact that cicc fails, and the cudafe1.gpu file, which typedefs __nv_bool, never gets made. We could have deduced as much just with ‘-v’, but now we can also swap in our own tmpxft files (which I don’t recommend).
So for some odd reason, the cicc that I have is busted. → Nope, it wasn’t that. It’s not related to the nvcc version at all: I have 2 machines, and on one of them this occurs for every version. So whatever the reason is, it’s something weird unrelated to cicc itself. But what else, besides the cicc binary, matters here?

So that cicc command that took in a ‘.ii’ file was leading to the __nv_bool error, and it was happening only on one of my machines (for any CUDA version). Something else on my system must be broken. I traced the “openat” syscalls that cicc makes (filtering out the ENOENTs):

openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 3
openat(AT_FDCWD, "/tmp/tmpxft_00249675_00000000-9_cv-9245cd..lgenfe.bc", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 3
openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/tmp/tmpxft_00249675_00000000-9_cv.cpp1.ii", O_RDONLY) = 3
cv.cu(22): error: no instance of overloaded function "std::condition_variable::wait" matches the argument list
            argument types are: (std::lock_guard<std::mutex>, lambda []()->__nv_bool)
            object type is: std::condition_variable

1 error detected in the compilation of "cv.cu".
+++ exited with 1 +++

hmm…

What if I copy the “cpp1.ii” file from the machine that works to the one that doesn’t, and overwrite it? I did this and it worked → so if cicc is fed the correct “.ii” file, it does not throw that error; the issue is in the “cpp1.ii” file itself. (The bad and correct .ii files, for reference: https://gist.github.com/ImanHosseini/d1b3c85690ab505de642e23266cfdbfb)
Where does it come from? Here:

gcc -std=c++17 -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__  "-I/usr/local/cuda-11.6/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=6 -D__CUDACC_VER_BUILD__=124 -D__CUDA_API_VER_MAJOR__=11 -D__CUDA_API_VER_MINOR__=6 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "cv.cu" -o "/tmp/tmpxft_00279afd_00000000-9_cv.cpp1.ii"

So: on “gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0”, this leads to the problematic ‘.ii’ file. (By the way, this is the default gcc version for CUDA 11.6.)
godbolt repro on cuda 11.5: Compiler Explorer
OK, this is weird. This is slightly different from the other test case I had, so is this a super-sly bug? Nope: you can’t pass a lock_guard wrapper to cv.wait(…). It was a simple C++ bug after all! So I guess the moral is: if you rely too much on g++ diagnostics to catch these, with nvcc you don’t get those nice diagnostics.
I had 2 different .cu files on the 2 machines; they were identical except one used “lock_guard”, and I didn’t realize it! I thought something was wrong with one of the systems. It really pays off to separate the host code into separate files as much as possible: you get better errors if you are doing something wrong like this.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.