I am also getting an error I saw quoted in the thread relating to CUDA 2.3a and Thrust, but I get it without even using Thrust. It doesn’t seem to take much to produce it.
Unfortunately, this workaround does not always work. Some external libraries, e.g. the boost shared_ptr from version 1.40.0, seem to require some functions which are disabled by nvcc. Compiling code that includes it, for example, produces:
[codebox]/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(75): warning: variable "tmp" was set but never used
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/spinlock_sync.hpp(40): error: identifier "__sync_lock_test_and_set" is undefined
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/spinlock_sync.hpp(54): error: identifier "__sync_lock_release" is undefined[/codebox]
Compiling the same code with g++ does not even generate a warning. Does anybody have an idea what to do about this, other than wait for a fix from NVIDIA?
Why are you compiling C++ code with nvcc? I only compile CUDA kernels and C host functions that call them with nvcc. I compile all of my C++ code with gcc and then link to the C functions from the nvcc compile.
If you’ve carefully separated out all of your host code into its own files, and made a pure C interface between them so they don’t need to see any C++ data types or call any C++ functions or methods, that’s great. Not all CUDA programs are written that way!
Maybe I just like to live dangerously, but I actually quite like the runtime API :)
It’s true that the runtime API has some shortcomings and limitations. On the other hand, achieving close coupling between host and device code is a hard problem, and significantly more ambitious than the driver API. Consider how hard it would be to implement something like Thrust [1] with only the CUDA driver API (or OpenCL, for that matter). At minimum you’d need a C++ compiler front-end that was able to instantiate templates for both the host and device.
In other words, you’d need something roughly equivalent to nvcc :)
Anyway, I don’t mean to dismiss or trivialize the criticism of the runtime API. We’ve certainly encountered our fair share of nvcc bugs in developing Thrust. However, as we reported these issues, support for templates, namespaces, and other C++ features rapidly improved. This bug on Snow Leopard notwithstanding, nvcc is now a fairly competent C++ compiler and the preferred interface for many smaller-scale CUDA developers.
I see a big flaw with CUDA in that nvcc is invoked at compile time, so no matter how nice the support for templates etc. is, you always need to recompile your code. That makes it impossible to generalize complex algorithms, and it makes it impossible to adapt to the capabilities of the actual GPU, no matter which API you use.
So much development time has been spent on CUDA, but nobody saw that as CUDA’s major shortcoming?
OpenCL has gone a much better route here. Because kernels are compiled at run time, you can use #defines to write algorithms in OpenCL that are not possible in CUDA right now. OpenCL is certainly missing a nice template preprocessor to make it even more flexible, but that should be a minor task to develop.
You can argue that the runtime API makes some things easier to develop. But does that justify the added complexity of having to maintain two different APIs?
I really dislike all these SDK samples that ship in a single .cu file. All of them would be much easier to understand if device and host code were never mixed into one file, which is exactly what dwalthour suggests.
In my opinion the runtime API makes no sense at all, and I can imagine how much development time it takes from the CUDA development team. I don’t want NVIDIA compiling my host code; I trust other tools for that. I want nvcc to excel at device code generation, nothing else.
It will be interesting to see how many CUDA developers leave for OpenCL in the coming months and years.
Could you elaborate? What exactly is the feature we’re not supposed to use in .cu files? The code that Brett posted as triggering the error was little more than an empty main(). Was it the use of std::string? Something else?