Integrating CUDA kernels into an existing application

Hi,

I’m trying to call a CUDA kernel from a C++ source file in order to replace existing CPU code. As I’m new to CUDA, problems are inevitable. I’ve tried to follow some examples found on the internet, but most were demo-like and I was not able to adapt them to my problem.

In my current attempt, I have the following files:

kernel_nomm.h, which contains the necessary includes:

#ifndef KERNEL_NOMM_H_
#define KERNEL_NOMM_H_

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime.h>

//extern void kernel_wrapper(int*);

#endif /* KERNEL_NOMM_H_ */

kernel_nomm.cu, which contains a demo kernel with dummy functionality:

#include "kernel_nomm.h"
#include <stdio.h>

extern "C" void kernel_wrapper(int *a);

// Demo kernel: thread 0 adds 2 to a[0], thread 1 adds 3 to a[1].
__global__ void kernel(int *a)
{
    int tx = threadIdx.x;

    switch( tx )
    {
        case 0:
            a[tx] = a[tx] + 2;
            break;
        case 1:
            a[tx] = a[tx] + 3;
            break;
    }
}

// Host wrapper: copies two ints to the device, runs the kernel, copies them back.
void kernel_wrapper(int *a)
{
    int *d_a;
    dim3 threads( 2, 1 );
    dim3 blocks( 1, 1 );

    cudaMalloc( (void **)&d_a, sizeof(int) * 2 );
    cudaMemcpy( d_a, a, sizeof(int) * 2, cudaMemcpyHostToDevice );

    printf( "Kernel starts\n" );
    kernel<<< blocks, threads >>>( d_a );
    printf( "Kernel ends\n" );

    cudaMemcpy( a, d_a, sizeof(int) * 2, cudaMemcpyDeviceToHost );
    printf( "Finish kernel wrapper\n" );

    cudaFree(d_a);
}

/*
int main(int argc, char *argv[])
{
    int *a = (int *)malloc(sizeof(int) * 2);
    a[0] = 2;
    a[1] = 3;

    printf( "a[0]: %d, a[1]: %d\n", a[0], a[1] );
    kernel_wrapper(a);
    printf( "a[0]: %d, a[1]: %d\n", a[0], a[1] );

    free(a);
    return 0;
}*/

And a C++ source file that should call the function:

ebwt_search.cpp (snippets only)

//include kernels
#include "kernel_nomm.h"

#ifdef CHUD_PROFILING
#include <CHUD/CHUD.h>
#endif

using namespace std;
using namespace seqan;

extern void kernel_wrapper(int *a);

template<typename TStr>
static void driver(const char * type,
                   const string& ebwtFileBase,
                   const string& query,
                   const vector<string>& queries,
                   const vector<string>& qualities,
                   const string& outfile)
{
...

    //#### MH Test Kernel to run ###
    int *a = (int *)malloc(sizeof(int) * 2);
    a[0] = 2;
    a[1] = 3;
    kernel_wrapper(a);

Eclipse underlines the last line and reports: undefined reference to `kernel_wrapper(int*)', although Ctrl+Click still jumps to the definition. When I try to compile, I get:

ebwt_search.cpp:2511: undefined reference to `kernel_wrapper(int*)'

Because the Makefile is very complex, I would prefer to build something like a library or object file that I can pass to g++ as an argument, so that the current Makefile stays as unchanged as possible. My very simple (and maybe stupid) approach was to tell g++ to use the object file as a library with -L kernel_nomm.o (generated by nvcc -c kernel_nomm.cu).

Any ideas are welcome,

thx miccim

######################

Solved - see the other posts

Leave out extern "C" (or add it to the prototype in the calling file).
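To illustrate why the two declarations have to agree, here is a minimal CPU-only sketch. It is not the poster's actual code: the CUDA launch is replaced by a plain stub so the file builds with g++ alone, and run_wrapper_demo is a hypothetical helper added for illustration. The point it shows is that a function defined with extern "C" exports the unmangled symbol kernel_wrapper, whereas a caller that declares it without extern "C" asks the linker for the mangled C++ name (reported as `kernel_wrapper(int*)'), so the reference stays undefined. Putting one prototype with a fixed linkage into the shared header avoids the mismatch.

```cpp
#include <cassert>
#include <cstdio>

// --- What a shared header such as kernel_nomm.h could declare ---
// The __cplusplus guard keeps the header usable from C and C++ callers.
#ifdef __cplusplus
extern "C" {
#endif

void kernel_wrapper(int *a);   // C linkage: exported symbol is "kernel_wrapper"

#ifdef __cplusplus
}
#endif

// --- Stand-in for the definition in kernel_nomm.cu ---
// Assumption: a CPU stub replaces cudaMalloc/cudaMemcpy and the kernel launch,
// so this sketch needs no GPU. It applies the same arithmetic as the demo kernel.
extern "C" void kernel_wrapper(int *a)
{
    assert(a != nullptr);
    a[0] += 2;   // what thread 0 of the demo kernel does
    a[1] += 3;   // what thread 1 of the demo kernel does
}

// --- Stand-in for the caller in ebwt_search.cpp ---
// Hypothetical helper: it sees the same prototype as the definition above,
// so the compiler emits a reference to the unmangled symbol.
int run_wrapper_demo()
{
    int a[2] = {2, 3};
    kernel_wrapper(a);
    std::printf("a[0]: %d, a[1]: %d\n", a[0], a[1]);
    return a[0] * 10 + a[1];
}
```

In a real build the header part would live in kernel_nomm.h and be included by both kernel_nomm.cu and ebwt_search.cpp, instead of repeating a hand-written extern line in each file. Note that matching linkage only fixes the symbol name; the object file produced by nvcc still has to be passed to the final link step.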

Thank you for your reply.

Either

extern void kernel_wrapper(int *a);

or

extern "C" void kernel_wrapper(int *a);

in both the .cu and the .cpp file resulted in the same error.

After some hours of irritation I think the problem is the Makefile. If I use the code (from http://forums.nvidia.com/index.php?showtopic=62601) in a separate project with the given Makefile, it works.

The proposed Makefile:

run: a.o b.o
	gcc -L /usr/local/cuda/lib -lcudart -o run a.o b.o

a.o: a.c b.h
	gcc -I /usr/local/cuda/include -c -o a.o a.c

b.o: b.cu b.h
	nvcc -c -o b.o b.cu

But I do not know how to apply this to my Makefile (attached). I think I understand what is done and why, but scaling it up to the full Makefile (from the original source, not written by me) seems difficult.

Solved. Here are the important lines of the Makefile:

bowtie-debug: kernel_nomm.o $(OTHER_O) $(SEARCH_O) $(SEARCH_FRAGMENTS) $(SEARCH_MAIN_O) ewbt.h
	$(CXX) $(DEBUG_FLAGS) \
		$(DEBUG_DEFS) $(ALL_FLAGS) \
		$(DEFS) -Wall \
		$(INC) \
		-o bowtie-debug \
		ebwt_search.cpp $(OTHER_O) $(SEARCH_O) $(SEARCH_FRAGMENTS_O) $(SEARCH_MAIN_O) \
		kernel_nomm.o \
		$(LIBS) $(CUDA_LIBS) $(SEARCH_LIBS)

%.o: %.cpp
	$(COMPILE) $(DEBUG_FLAGS) $(DEBUG_DEFS) $(CFLAGS) $(CXXFLAGS) $(INC) \
	$(DEFS) \
	-o $@ $<

kernel_nomm.o: kernel_nomm.cu kernel_nomm.h
	nvcc -c -o kernel_nomm.o kernel_nomm.cu $(INC)
	#nvcc -c -o kernel_nomm.o kernel_nomm.cu