Using CUBLAS and CUDA: calling a custom kernel among BLAS calls

Hello,

I have a fairly straightforward application for CUDA BLAS, except for one thing: I need to apply a piecewise hyperbolic tangent operation to a vector. This should be a trivial kernel to write (I’ve already written it), but integrating it into my code has proven problematic.

Here’s where I am so far:

- I put the kernel code in a .cu file, within my project.

- I downloaded the custom build rule for CUDA, and made sure that MSVC associates that rule with the .cu file (it compiles without any problems).

- I added the following lines to my code:

#include <cutil.h>

__global__ void tanhGPU(int size, float *vec); // forward declaration

Since I’m calling cublasInit(), I’m assuming that I don’t need to call CUT_DEVICE_INIT(). I can’t call it anyway, since the compiler complains about all sorts of stuff inside that macro.

Even if I leave it out, I still get the following error:

error C2059: syntax error : ‘<’

…Referencing the call to:

tanhGPU<<1, 32>>(32, test_D);

Any ideas as to what I’m doing wrong?

Thanks!

I don’t know anything about compiling CUDA on WinXP, but the syntax for kernel calls uses triple angle brackets around the grid and block dimensions:

tanhGPU<<<1,32>>>(32, test_D);
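For reference, here is a minimal sketch of a complete launch around that syntax. The kernel body is an assumption (the thread never shows the real one) and simply applies tanhf to each element; launchTanh and the block size of 32 are illustrative choices, not taken from the original post. It also assumes a reasonably recent CUDA runtime, since it uses cudaDeviceSynchronize().

```cuda
#include <cuda_runtime.h>

// Assumed kernel body: applies tanhf in place to each element.
__global__ void tanhGPU(int size, float *vec)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size) {                 // guard threads past the end of the vector
        vec[i] = tanhf(vec[i]);
    }
}

// Hypothetical helper: runs tanhGPU over `size` floats that already
// live in device memory at vec_D.
cudaError_t launchTanh(int size, float *vec_D)
{
    if (size <= 0) {
        return cudaSuccess;         // nothing to do; avoids a zero-block launch
    }

    const int threads = 32;
    const int blocks = (size + threads - 1) / threads;  // round up

    // First argument is the grid size, second is the block size.
    tanhGPU<<<blocks, threads>>>(size, vec_D);

    cudaError_t err = cudaGetLastError();   // catches bad launch configurations
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();         // catches errors raised while the kernel ran
}
```

With size = 32 this reduces to the tanhGPU<<<1, 32>>>(32, test_D) call above. Rounding the block count up lets the same helper handle vector lengths that are not a multiple of 32.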

My mistake. I’m posting and developing on different machines, so I have to transcribe. My problem code has triple brackets.
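One thing that may be worth checking, offered as a guess since the build setup is only partly described: the <<<...>>> launch syntax is parsed only by nvcc. If the call sits in a .cpp file that MSVC compiles itself, C2059 at '<' is the error one would expect even with the correct triple brackets, and the __global__ forward declaration would not compile there either. A common arrangement is to keep both the kernel and its launch in the .cu file and expose a plain function to the rest of the project. A minimal sketch, where tanhOnDevice and the file name are hypothetical and the kernel body is assumed:

```cuda
// tanh_kernel.cu (hypothetical file name), compiled by nvcc
#include <cuda_runtime.h>

// Assumed kernel body: applies tanhf in place to each element.
__global__ void tanhGPU(int size, float *vec)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size) {
        vec[i] = tanhf(vec[i]);
    }
}

// Plain C-linkage entry point. Nothing CUDA-specific appears in the
// signature, so a .cpp file compiled by the host compiler can declare
// and call it. Returns 0 on success, otherwise the cudaError_t value.
extern "C" int tanhOnDevice(int size, float *vec_D)
{
    if (size <= 0) {
        return 0;                   // nothing to do
    }

    const int threads = 32;
    const int blocks = (size + threads - 1) / threads;  // round up

    tanhGPU<<<blocks, threads>>>(size, vec_D);

    cudaError_t err = cudaGetLastError();   // launch errors
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();      // errors from the kernel itself
    }
    return (int)err;                        // cudaSuccess is 0
}
```

The .cpp side would then need only the declaration extern "C" int tanhOnDevice(int size, float *vec_D); and a call such as tanhOnDevice(32, test_D); placed between the BLAS calls. Neither cutil.h nor CUT_DEVICE_INIT() is required for this. If test_D was allocated with cublasAlloc(), it is an ordinary device pointer and can be passed to the wrapper directly.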