Want to integrate C++ with CUDA? You'd better not do that..

Sarnath · November 14, 2008, 1:38pm

Guys,

Just to inform you all: trying to integrate C++ and CUDA is a nightmare. Just keep them separate and be happy. God bless you…

MisterAnderson42 · November 14, 2008, 2:33pm

And this is news? It has been this way since the first public beta release of CUDA a year and a half ago. There is a reason the cppIntegration sample in the SDK compiles the C++ code with the host compiler and the CUDA C code with nvcc.

Nor is it really that much of a pain to keep them separate, as all the produced object files can be linked together without any problems.

Sarnath · November 15, 2008, 3:27am

Earlier, someone had said that they had success integrating template library code into CUDA, etc… And that classes like

class xxx {
public:
    __device__ __global__ void mykernel(float *);
};

were possible…

But yeah, it all works fine with VS2005 SP1 when you start with minimal code. As you add complexity, though, everything breaks down. For example, when I added std::vector to the classes, I started getting unidentifiable linker errors, making life horrible…

This may not be news to people like you… Yet it might help someone (especially since the forum search is not working as well as before).

I have just started admiring the beauty of C++ and I found a pitfall like this… The frustration is what made me post it to the forums… I just wanted to crib to someone… This forum makes a nice place for that :)

Sarnath · November 15, 2008, 3:37am

My guess:

I think it all has to do with cudafe. It just spits out warnings even if things are OK. For example, I had declared a struct like “AcquireRequest_T” inside a class and it says “Multiply defined”. That is surely NOT the case… I think this is also the reason for all the linker errors.

If NVIDIA fixes cudafe, C++ support could be available, I suppose…

By the way,
what is NVIDIA’s official stand on this issue? Tmurray, can you say something on this?

Thank you
Best Regards,
Sarnath

alex_dubinsky · November 15, 2008, 10:55pm

Were you using std::vector in device code, or host code?

tmurray · November 16, 2008, 5:39pm

Putting kernels inside classes is specifically not supported.

FROL · November 16, 2008, 9:25pm

You can look at the MGML library: http://ray-tracing.ru/upload/free/MGML_MATH/MGML_MATH_EXAMPLE_17_nov_2008.zip
It should work under CUDA 1.1 but not under CUDA 2.0 yet. Yes, there are problems with C++ in CUDA 2.0 :(

But in general C++ works fine. Even with template metaprogramming.
Most people who see my library believe that MGML is a real C++ nightmare. But nvcc works fine with it.
You can even see in the PTX code that there is no difference between hand-written code and code generated from templates.

Of course, only a subset of C++ can be used. But I think it is sufficient. So C++ works satisfactorily, I think.

Sarnath · November 17, 2008, 5:55am

Host code. Actually, I had used it in a normal “class” and used it in a “.cu” file in a pure CPP function…

It was like this: you have a “.cu” file containing a kernel along with a class member function (C++) which does some std::vector operations and calls this kernel (via a GPUWorker-like setup). Even this did NOT work. It was giving strange linker problems.

Separating out the code worked fine. I am using CUDA 2.0

Now my “.cu” file just contains the kernel and the kernel caller (a small stub that calls the kernel so that I could use GPUWorker like setup)

and the class member functions using vector are in a separate CPP file.
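For anyone finding this thread later, here is a minimal sketch of the split described above, under some assumptions. The names `Signal`, `launch_scale` and `scale_kernel` are made up for illustration and are not from the code discussed in this thread. Both "files" are shown in one listing; the part guarded by `__CUDACC__` is what would go in the “.cu” file, and a CPU stand-in is included only so the listing also builds without nvcc. Error checking on the CUDA calls is left out to keep it short.

```cpp
#include <cstddef>
#include <vector>

// ---- Part 1: would live in a ".cu" file, compiled by nvcc ----
#ifdef __CUDACC__
#include <cuda_runtime.h>

// Free-function kernel (not a class member): multiplies each element by factor.
__global__ void scale_kernel(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Small stub with a plain signature. Only this function crosses the
// file boundary, so no STL types are ever seen by the ".cu" file.
void launch_scale(float *host_data, int n, float factor) {
    float *dev = 0;
    std::size_t bytes = n * sizeof(float);
    cudaMalloc((void **)&dev, bytes);
    cudaMemcpy(dev, host_data, bytes, cudaMemcpyHostToDevice);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_kernel<<<blocks, threads>>>(dev, n, factor);
    cudaMemcpy(host_data, dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
}
#else
// CPU stand-in (assumption for this sketch only) so it builds without nvcc.
void launch_scale(float *host_data, int n, float factor) {
    for (int i = 0; i < n; ++i) host_data[i] *= factor;
}
#endif

// ---- Part 2: would live in a ".cpp" file, compiled by the host compiler ----
// Declaration of the stub; normally this sits in a small shared header.
void launch_scale(float *host_data, int n, float factor);

// Ordinary C++ class using std::vector; it never mentions a kernel directly.
class Signal {
public:
    Signal(std::size_t n, float value) : samples_(n, value) {}

    void scale(float factor) {
        if (!samples_.empty())
            launch_scale(&samples_[0], static_cast<int>(samples_.size()), factor);
    }

    const std::vector<float> &samples() const { return samples_; }

private:
    std::vector<float> samples_;
};
```

The idea is that the class and its std::vector stay entirely on the host-compiler side, and the two object files are then linked together as usual.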

FROL,

Thanks for the link. I downloaded your code and browsed through it. I saw the template calls that you make from your CUDA kernel. That looks really cool. I have seen that NVCC actually inlines all the calls that you make when you deal with objects. This might increase your register count a lot, I believe. Is there any specific reason why you use a “struct” instead of a “class”?

Tmurray,

Thanks for your inputs. Is usage of classes and objects inside a kernel officially supported? I played around with it and found it to be working. Also, looking at FROL’s code, it all seems to work nicely. Can you comment on this? Also, can you tell us what the exact drawbacks of using objects inside a kernel are? When I played around with it, I found that NVCC actually inlines all the calls being made and holds the object’s data members in registers (with the little data that I used to test). Will this push the register count of the kernel way too high?

Thanks,

Best Regards,

Sarnath

tmurray · November 17, 2008, 6:28am

No C++ bits are officially supported inside of kernels. Even templates! Templates generally work, but there are a few weird corner cases where they can break so we don’t officially support them at the moment. Official template support is coming, though, as well as better explanations of what we do actually support right now.
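To make the discussion concrete, here is a rough sketch of the kind of code being talked about: a small struct with member functions plus a function template, both callable from a kernel. Everything here (`Vec3`, `mix`, `dot`, `mix_kernel`) is a hypothetical example, not taken from MGML or from anyone's code in this thread, and per the post above none of it should be read as officially supported at the time. The empty fallback macros are an assumption made so that the same listing also compiles with a plain host compiler.

```cpp
// When not compiled by nvcc, make the CUDA qualifiers expand to nothing.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Plain struct with all data public; member functions are small enough
// that the compiler would be expected to inline them.
struct Vec3 {
    float x, y, z;

    __host__ __device__ Vec3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}

    __host__ __device__ Vec3 operator+(const Vec3 &o) const {
        return Vec3(x + o.x, y + o.y, z + o.z);
    }

    __host__ __device__ Vec3 operator*(float s) const {
        return Vec3(x * s, y * s, z * s);
    }
};

__host__ __device__ inline float dot(const Vec3 &a, const Vec3 &b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Function template: linear interpolation for any type that supports
// operator*(float) and operator+.
template <typename T>
__host__ __device__ T mix(const T &a, const T &b, float t) {
    return a * (1.0f - t) + b * t;
}

#ifdef __CUDACC__
// Kernel that uses the struct and the template. The local Vec3 objects
// are ordinary local variables of the thread.
__global__ void mix_kernel(const float *t, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        Vec3 a(0.0f, 0.0f, 0.0f);
        Vec3 b(2.0f, 4.0f, 6.0f);
        Vec3 p = mix(a, b, t[i]);
        out[i] = dot(p, Vec3(1.0f, 1.0f, 1.0f));
    }
}
#endif
```

Whether the data members of such local objects end up in registers is up to the compiler; looking at the generated PTX, as done elsewhere in this thread, is the way to check for a given kernel.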

Sarnath · November 17, 2008, 9:00am

Thanks for the clarifications.

Assuming that C++ would get supported one day — I am placing some more questions for you. Hope you can give some perspective.

–

When I declare an object in shared memory, all the data-members find a place in shared memory & All function calls are just plain inlined. Is that right?

Similarly for global and local spaces… Am I right?

And, declaring an object as a local variable – would just drink registers (depending on how big your object data is…)… Is that right?

Thanks.

FROL · November 17, 2008, 11:35am

Yes, maybe. On some simple tests (like ray-triangle intersection) there is no register count increase compared to hand-written code. But maybe we need more complex tests.

As for “class” and “struct”: no reason and no difference. I just want to make all data public so that people can use them as plain structures if they don’t like C++.

Sarnath · November 17, 2008, 12:02pm

Thanks,

I just examined PTX and realized that registers are being reserved to hold data-members. But mine was a very simple test. Hence…

But NVCC could also repeat a set of operations to generate a data-member as it is needed to free up registers… possible… not seen such a behaviour though…

Also, nvcc was not compiling objects with complex constructors… (like a for loop inside a constructor).