CUDA 2.3a/nvcc frustrations

I am also getting an error I saw quoted in the thread relating to CUDA 2.3a and Thrust, but I get it without even using Thrust. It doesn’t seem to take much to produce it.

This code bit will cause it:

[codebox]
#include <string>

int main()
{
    std::string s = "Hello, world.";
    return 0;
}
[/codebox]

This results in:

[codebox]
/usr/include/c++/4.2.1/ext/atomicity.h(51): error: identifier "__sync_fetch_and_add" is undefined
/usr/include/c++/4.2.1/ext/atomicity.h(55): error: identifier "__sync_fetch_and_add" is undefined
[/codebox]

Any ideas on how to fix this?

Not using std::string is not an option. :)

-Brett

As a workaround, try making this the first line of your program:

[codebox]
#undef _GLIBCXX_ATOMIC_BUILTINS
[/codebox]
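Applied to the snippet above, the file would then start like this (same minimal program, with the #undef as the very first line):

[codebox]
#undef _GLIBCXX_ATOMIC_BUILTINS

#include <string>

int main()
{
    std::string s = "Hello, world.";
    return 0;
}
[/codebox]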

Unfortunately, this workaround does not always work. Some external libraries, e.g. boost::shared_ptr from Boost 1.40.0, seem to require functions which are disabled by nvcc. This code, for example,

[codebox]
#undef _GLIBCXX_ATOMIC_BUILTINS

#include <iostream>
#include <boost/shared_ptr.hpp>

int main() {
    boost::shared_ptr< int > i( new int ) ;
    *i = 10 ;
    std::cout << *i << std::endl ;
    return 0 ;
}
[/codebox]

produces

[codebox]
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(49): warning: "cc" clobber ignored
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(65): warning: "cc" clobber ignored
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(91): warning: "cc" clobber ignored
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(75): warning: variable "tmp" was set but never used
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/spinlock_sync.hpp(40): error: identifier "__sync_lock_test_and_set" is undefined
/opt/boost/boost_1_40_0/include/boost/smart_ptr/detail/spinlock_sync.hpp(54): error: identifier "__sync_lock_release" is undefined
[/codebox]

Compiling the same code with g++ does not even generate a warning. Does anybody have an idea what to do about this, other than wait for a fix from NVIDIA?

Why are you compiling C++ code with nvcc? I only compile CUDA kernels and C host functions that call them with nvcc. I compile all of my C++ code with gcc and then link to the C functions from the nvcc compile.
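A rough sketch of what I mean (the file name, kernel, and wrapper below are made up just to show the layout, not code from this thread):

[codebox]
// kernel.cu -- everything that needs nvcc lives in this file
#include <cuda_runtime.h>

__global__ void scale_kernel(float *d, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        d[i] *= factor;
}

// pure C entry point; this declaration is all the C++ side ever sees
extern "C" void scale_on_gpu(float *host_data, float factor, int n)
{
    float *d = 0;
    cudaMalloc((void **)&d, n * sizeof(float));
    cudaMemcpy(d, host_data, n * sizeof(float), cudaMemcpyHostToDevice);
    scale_kernel<<<(n + 255) / 256, 256>>>(d, factor, n);
    cudaMemcpy(host_data, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
}
[/codebox]

The rest of the program is compiled with g++ and only contains the declaration extern "C" void scale_on_gpu(float *, float, int); so it never needs to see a CUDA header or go through nvcc.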

If you’ve carefully separated out all of your host code into its own files, and made a pure C interface between them so they don’t need to see any C++ data types or call any C++ functions or methods, that’s great. Not all CUDA programs are written that way!

Peter

I think one of NVIDIA's greatest errors was creating two APIs for using CUDA: driver and runtime.

The runtime API is really just an invitation to trouble.

Maybe I just like to live dangerously, but I actually quite like the runtime API :)

It's true that the runtime API has some shortcomings and limitations. On the other hand, achieving close coupling between host/device code is a hard problem, and significantly more ambitious than the driver API. Consider how hard it would be to implement something like Thrust [1] with only the CUDA driver API (or OpenCL, for that matter). At minimum you'd need a C++ compiler front-end that was able to instantiate templates for both the host and device.

In other words, you’d need something roughly equivalent to nvcc :)

Anyway, I don't mean to dismiss or trivialize the criticism of the runtime API. We've certainly encountered our fair share of nvcc bugs in developing Thrust. However, as we reported these issues, support for templates, namespaces, and other C++ features rapidly improved. This bug on Snow Leopard notwithstanding, nvcc is now a fairly competent C++ compiler and the preferred interface for many smaller-scale CUDA developers.

[1] http://code.google.com/p/thrust/
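For a sense of what that buys you, here is a small, standard Thrust sketch (a generic example, not code from this thread) that nvcc compiles as-is, instantiating the same templates for host and device:

[codebox]
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main()
{
    // generate some data on the host
    thrust::host_vector<int> h(1 << 20);
    for (size_t i = 0; i < h.size(); ++i)
        h[i] = rand();

    // transfer to the device and sort there
    thrust::device_vector<int> d = h;
    thrust::sort(d.begin(), d.end());

    // copy the sorted result back to the host
    thrust::copy(d.begin(), d.end(), h.begin());
    return 0;
}
[/codebox]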

This will be fixed in the 3.0 release.

Until then, not using this feature in .cu files is the only workaround (WAR).

I see a big flaw with CUDA in that nvcc is called at compile time, so no matter how nice the support for templates etc. is, you always need to recompile your code. That makes it impossible to generalize complex algorithms. It also makes it impossible to take advantage of the capabilities of the GPU that is actually present, no matter which API you use.

So much development time spent on CUDA, but nobody saw that as CUDA's major shortcoming?

OpenCL has gone a much better route here. Through the use of #defines you can write algorithms in OpenCL that are not possible in CUDA right now. OpenCL sure is missing a nice template preprocessor to make it even more flexible - but that should be a minor task to develop.
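For example, something along these lines (only a fragment to show the idea; it assumes an already-created cl_context ctx, a cl_device_id dev, and a run-time value problem_size; on the Mac the header is <OpenCL/opencl.h> instead of <CL/cl.h>):

[codebox]
#include <CL/cl.h>
#include <stdio.h>

/* Kernel source compiled at run time; N is injected through a -D build
   option, so the same source adapts to the problem size at hand. */
static const char *src =
    "__kernel void scale(__global float *d, float f) {\n"
    "    int i = get_global_id(0);\n"
    "    if (i < N) d[i] *= f;\n"
    "}\n";

cl_program build_scale_program(cl_context ctx, cl_device_id dev, int problem_size)
{
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);

    char options[64];
    sprintf(options, "-DN=%d", problem_size);   /* value chosen at run time */
    clBuildProgram(prog, 1, &dev, options, NULL, NULL);
    return prog;
}
[/codebox]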

You can argue that the runtime API makes some things easier to develop. But does that justify the added complexity of having to maintain two different APIs?

I really dislike all these SDK samples that ship in a single .cu file. All of them would be much easier to understand if device and host code were never mixed into one file, which is exactly what dwalthour suggests.

In my opinion the runtime API makes no sense at all, and I can imagine how much development time it is taking from the CUDA development team. I don't want NVIDIA compiling my host code; I trust other tools for that. I want nvcc excelling at device code generation - nothing else.

It will be interesting to see how many CUDA developers will move to OpenCL in the coming months and years.

Could you elaborate? What exactly is the feature we’re not supposed to use in .cu files? The code that Brett posted as triggering the error was little more than an empty main(). Was it the use of std::string? Something else?

Peter

Including standard C++ headers in .cu files on Snow Leopard is problematic. The workarounds are to

(1) add #undef _GLIBCXX_ATOMIC_BUILTINS to the top of your .cu files, or

(2) move the C++ code to a .cpp file, or

(3) revert to an earlier compiler (i.e. not GCC 4.2 on Snow Leopard).

Left as an exercise for the reader?

Of course - mine is working fine already :)

Thanks! That was the clue I needed to get things working. I created a symbolic link to gcc 4.0 in my build directory:

ln -s /usr/bin/gcc-4.0 gcc

I then used the --compiler-bindir option to nvcc to tell it to use that as the compiler. I can finally do CUDA development on my Mac again!
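In case it helps anyone else, the resulting invocation looks roughly like this (kernel.cu is just a placeholder file name):

[codebox]
nvcc --compiler-bindir . -c kernel.cu -o kernel.o
[/codebox]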

Peter