GPUWorker master/slave multi-gpu approach

Just a question: if I wanted to run the test bench that you posted a link to, what else do I need? I have installed boost (and verified it works), but is anything else required just to run the test bench here? Sorry, I've just never used boost before and I'm having problems compiling. Should I be using a Makefile for this?

Thanks!

Compiling that unit test would be complicated, I think. It depends on a number of parameters set in the main HOOMD build system, which has grown to have a lot of options over the years. To compile simpler programs, you just need to link with the boost_thread library, I think:

[codebox]nvcc -c kernel.cu
g++ -o myexec main.cc kernel.o -lboost_thread[/codebox]

CMake makes generating makefiles/visual studio projects that do this much easier.
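For reference, a minimal CMakeLists.txt along those lines might look like the sketch below (untested; the file and target names are placeholders, and it assumes a FindCUDA.cmake that provides CUDA_ADD_EXECUTABLE is already included):

```cmake
# Sketch only: locate boost's thread library and link it in.
FIND_PACKAGE(Boost COMPONENTS thread REQUIRED)
INCLUDE_DIRECTORIES(${Boost_INCLUDE_DIRS})

# FindCUDA's CUDA_ADD_EXECUTABLE compiles .cu files with nvcc
# and the remaining sources with the host compiler.
CUDA_ADD_EXECUTABLE(myexec main.cc kernel.cu)
TARGET_LINK_LIBRARIES(myexec ${Boost_THREAD_LIBRARY})
```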

I will be making some free time in the next several weeks to create a separate web page just for GPUWorker and will document a build strategy using CMake.

Thank you for the time you’re putting into this.

From my own experience integrating GPUWorker into a larger project, I found that looking at the structure of HOOMD helped me separate the code compiled with nvcc from the code compiled with gcc (necessary because GPUWorker includes Boost::bind/function).

Perhaps along with the build strategy you could create a small sample project showing how to structure code properly along the lines of HOOMD.

Btw, I’m happy to report that GPUWorker also works nicely in a setup with multiple master threads calling a single GPUWorker (an extra mutex does the job), along with heavy usage of the asynchronous stream API.

What is the easiest way to link the Boost libraries, then? I appear not to be doing so, because upon running the makefile (generated by CMake), I get a host of errors starting with:

Linking CXX executable test

CMakeFiles/test.dir/GPUWorker.cc.o: In function `GPUWorker::~GPUWorker()':
GPUWorker.cc:(.text+0x67f): undefined reference to `boost::thread::join()'

My CMakeLists.txt file in my project directory (also place where GPUWorker is) looks like

[codebox]# Add current directory to the nvcc include line.
CUDA_INCLUDE_DIRECTORIES(
${CMAKE_CURRENT_SOURCE_DIR}
${FOUND_CUT_INCLUDE}
)

INCLUDE_DIRECTORIES(${FOUND_CUDA_NVCC_INCLUDE} ${FOUND_CUT_INCLUDE})

##############################################################################
# Use one executable only.
CUDA_ADD_EXECUTABLE(test
HierarchicalPM.cc
PMModule.cc
GPUWorker.cc
hpm.cu
)

##################################################
TARGET_LINK_LIBRARIES(test
${CUDA_TARGET_LINK} ${FOUND_CUT}
)
[/codebox]

And the code in my main CMakeLists.txt file looks like

[codebox]cmake_minimum_required (VERSION 2.6)
project (HPM)

INCLUDE(${CMAKE_SOURCE_DIR}/CMake/cuda/FindCUDA.cmake)
INCLUDE(/home/nere/cmake-2.6.3/Modules/FindThreads.cmake)
INCLUDE(/home/nere/cmake-2.6.3/Modules/FindBoost.cmake)

SUBDIRS(
src
)
[/codebox]

[codebox]-lboost_thread-mt[/codebox]

might do the trick.

FindBoost found boost, but you actually need to link it yourself.

TARGET_LINK_LIBRARIES(… ${Boost_THREAD_LIBRARY})

Is this supposed to be as easy as including Boost_THREAD_LIBRARY in the CMakeLists.txt file in my source code (project) directory? Like:

[codebox]CUDA_INCLUDE_DIRECTORIES(
${CMAKE_CURRENT_SOURCE_DIR}
${FOUND_CUT_INCLUDE}
)

INCLUDE_DIRECTORIES(${FOUND_CUDA_NVCC_INCLUDE} ${FOUND_CUT_INCLUDE})

##############################################################################
# Use one executable only.
CUDA_ADD_EXECUTABLE(test
PMModule.cc
GPUWorker.cc
hpm.cu
)

##################################################
TARGET_LINK_LIBRARIES(test
${CUDA_TARGET_LINK} ${FOUND_CUT} ${Boost_THREAD_LIBRARY}
)
[/codebox]

I tried the solution above and got the same errors.

Yes, it is supposed to be that easy.

Check to make sure that boost is actually being found. Open up ccmake (or cmake-gui) and display the advanced variables. Scroll down and see if Boost_THREAD_LIBRARY is being assigned correctly.

Please if anyone can offer advice:

I have been happily using GPUWorker (thank you, MrAnderson) with Visual Studio 2008, CUDA 2.2, Windows XP 32. I’m trying to switch over to Vista 64, and am having trouble with boost bind and the fact that stdcall turns into cdecl, causing many “function template has already been defined” problems.

Do I have to use a 64 bit build of boost? Will this help? Or is there a simple wrapper / workaround / setting that someone could suggest? It seems that people are using GPUWorker successfully on Vista 64 machines.

Any help would be much appreciated.

Yes, you need to build a 64-bit version of boost. I include an outline of the procedure in my documentation for building HOOMD: http://www.ameslab.gov/hoomd/doc/page_soft…ll_windows.html

The other thing you need to do is define the symbol BOOST_BIND_ENABLE_STDCALL in the preprocessor. This fixes up the stdcall vs. cdecl calling-convention mess (that took me more than a week to find: stupid 64-bit Windows and barely documented boost library build procedures…).
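If the build is driven by CMake, a one-line sketch of adding that define project-wide (the symbol name is from the post above) is:

```cmake
# Per the suggestion above: define BOOST_BIND_ENABLE_STDCALL everywhere
# so boost::bind copes with the CUDA runtime's calling convention on
# 64-bit Windows builds.
ADD_DEFINITIONS(-DBOOST_BIND_ENABLE_STDCALL)
```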

Mr. Anderson,

Thank you for your speedy response, it was very helpful. And thank you again for GPUWorker, it’s great that you have made it available.

Mr. Anderson,

Thanks, the suggestion worked and I finally got GPUWorker working with CMake. Appreciated the help, just forgot to post a “thanks”!

If anyone is having problems with this, I can post the CMake code for the fix. Just let me know.

Please post it if you have time. It might be useful to someone.

Very nice solution… I always thought about something like that, but I never implemented it… thank you for that!!!

Unfortunately, there are two drawbacks to this solution:

  • texture declaration and access
  • template kernel functions

Both problems stem from the fact that the corresponding functions have to be called from within a .cu file. To be more specific, a texture bind/unbind API call has to be made inside a .cu file, so the .cu file would need to know the GPUWorker class and therefore would have to parse some Boost headers (maybe not a good idea). And as far as I know, template kernels can only be launched via the runtime API (bla<<<block, grid>>>) and not via cudaLaunch (ptx-hacking is not an option for me, sorry).

Is there a solution for these drawbacks, or do I have to accept these limitations?

Marcel
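One possible workaround for the texture case (a sketch, not an answer from the thread; the texture and function names are made up): keep the bind/unbind calls behind plain extern "C" wrappers inside the .cu file, so the .cu file never has to see GPUWorker or Boost:

```cuda
// my_kernels.cu -- compiled with nvcc; knows nothing of GPUWorker/Boost.
texture<float, 1, cudaReadModeElementType> my_tex;   // hypothetical texture

extern "C" cudaError_t bind_my_tex(float *d_data, size_t bytes)
    {
    return cudaBindTexture(0, my_tex, d_data, bytes);
    }

extern "C" cudaError_t unbind_my_tex()
    {
    return cudaUnbindTexture(my_tex);
    }
```

On the host side these wrappers can then be queued like any other runtime call, e.g. worker.call(boost::bind(bind_my_tex, d_data, nbytes)). Template kernels can be handled the same way, by instantiating each needed specialization behind its own extern "C" launcher function in the .cu file.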

MisterAnderson: Did you find a solution to using GPUWorker with CUDA 2.3 and its new templated malloc functions? You mentioned that it was a problem in the CUDA 2.3 announcement thread.

I implemented a hack workaround by just creating simple C prototypes (e.g., cudaMallocHack) that call the actual functions, and then having the app call the hack functions. It is not pretty, but it works.
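Roughly, such a hack might look like this (a sketch; only cudaMallocHack is named in the post, the file layout is assumed):

```cuda
// hacks.cu -- compiled by nvcc. A plain C prototype hides CUDA 2.3's
// templated cudaMalloc overloads so boost::bind can resolve it again.
extern "C" cudaError_t cudaMallocHack(void **devPtr, size_t size)
    {
    return cudaMalloc(devPtr, size);
    }
```

The application then queues worker.call(boost::bind(cudaMallocHack, (void **) &d_data, nbytes)) instead of binding cudaMalloc directly.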

Alright, thanks.

Hi.

When I try to download the code, I get a "URL not found" error message. Has the URL changed?

Uh, I suppose it did. HOOMD has moved to a new server and GPUWorker is just a part of that project. Here are new links for now:
https://codeblue.umich.edu/hoomd-blue/trac/…ils/GPUWorker.h
https://codeblue.umich.edu/hoomd-blue/trac/…ls/GPUWorker.cc

I’ll get it up on a separate URL at some point (really) before it is removed from hoomd completely.

Can it be compiled on Windows Vista?
Has anybody tried it with the simpleMultiGPU sample and made it faster than using one GPU?