GPU VSIPL library problems

I hope someone can help me with an issue I have been having. I am using the GPU VSIPL library, which came out of a GT research grant project (link to project) and lets you use the CUDA libraries without knowing all of the CUDA programming techniques.

I am receiving the following error:

First-chance exception at 0x7c812a6b in sp.exe: Microsoft C++ exception: cudaError_enum at memory location 0x0012f79c…

This has been reported in other posts on the forum, but I haven’t found a clear answer as to the cause. I have read that it is how VS2005 reports an unknown error code returned by CUDA, but I have not found much confirmation of this. My functions for basic vector operations in VSIPL work fine; I only ran into this problem when I wrote a function for a complex vector multiplication.

The strange thing is that the error does not show up right away. The exception only begins to appear in the debug output after the function has been called about 6000 times. If I let the program keep running (the function is called about 12000 times in total), the output is correct, but these exceptions fill the debug output window in VS2005.

Below is the source code for the function I have written. If anyone has ideas for fixing this issue, please let me know.

void zvzsmlx(COMPLEX_SPLIT *A, int I, COMPLEX_SPLIT *B, COMPLEX_SPLIT *C, int K, int N, int X, char * File, int Line)
{
	//vsip_cblockbind_f - creates a VSIPL complex data block and binds it to user memory
	vsip_cblock_f *ablk = vsip_cblockbind_f(A->realp, A->imagp, N*I, VSIP_MEM_NONE);
	vsip_cblock_f *bblk = vsip_cblockbind_f(B->realp, B->imagp, 2, VSIP_MEM_NONE);
	vsip_cblock_f *cblk = vsip_cblockbind_f(C->realp, C->imagp, N*K, VSIP_MEM_NONE);

	//Admit the created blocks for use by VSIPL operation
	//(the admit calls were lost from the listing as posted; VSIP_TRUE as the update flag is an assumption)
	vsip_cblockadmit_f(ablk, VSIP_TRUE);
	vsip_cblockadmit_f(bblk, VSIP_TRUE);
	vsip_cblockadmit_f(cblk, VSIP_TRUE);

	//create a Vector View object to point to the data blocks admitted to VSIPL and store
	//the length, stride, and offset attributes of the data.
	//vsip_vview_f *vsip_vbind_f(
	//const vsip_block_f *block, vsip_offset offset, vsip_stride stride, vsip_length length);
	vsip_cvview_f *a = vsip_cvbind_f(ablk, (vsip_offset)0, (vsip_stride)I, (vsip_length)N);
	vsip_cvview_f *b = vsip_cvbind_f(bblk, (vsip_offset)0, (vsip_stride)1, (vsip_length)2);
	vsip_cvview_f *c = vsip_cvbind_f(cblk, (vsip_offset)0, (vsip_stride)K, (vsip_length)N);

	//Perform the multiplication operation of scalar b and vector a and store the result in c
	vsip_csvmul_f(vsip_cvget_f(b,0), a, c);

	//Release the blocks from VSIPL operations
	vsip_cblockrelease_f(bblk, VSIP_FALSE, &(B->realp), &(B->imagp));
	vsip_cblockrelease_f(ablk, VSIP_FALSE, &(A->realp), &(A->imagp));
	vsip_cblockrelease_f(cblk, VSIP_FALSE, &(C->realp), &(C->imagp));

	//destroy the vector views and any associated blocks
	//(the destroy calls were lost from the listing as posted; vsip_cvalldestroy_f is assumed from the comment above)
	vsip_cvalldestroy_f(a);
	vsip_cvalldestroy_f(b);
	vsip_cvalldestroy_f(c);
}
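
For checking the GPU result, the same operation can be written as a plain C loop on the host. This is only a sketch: the COMPLEX_SPLIT definition is assumed from how realp and imagp are used above, zvzsmlx_ref is a hypothetical name, and the strides I and K are assumed to be positive element strides.

```c
#include <assert.h>
#include <stddef.h>

/* Split-complex layout: separate real and imaginary arrays.
   Field names follow the post; the struct definition itself is assumed. */
typedef struct {
    float *realp;
    float *imagp;
} COMPLEX_SPLIT;

/* Host reference (hypothetical helper): C[n*K] = B[0] * A[n*I] for n = 0..N-1. */
static void zvzsmlx_ref(const COMPLEX_SPLIT *A, int I,
                        const COMPLEX_SPLIT *B,
                        COMPLEX_SPLIT *C, int K, int N)
{
    assert(I > 0 && K > 0 && N >= 0);

    /* The scalar is the first element of B. */
    const float br = B->realp[0];
    const float bi = B->imagp[0];

    for (int n = 0; n < N; ++n) {
        /* Read both parts before writing so in-place use (A == C) is safe. */
        const float ar = A->realp[(size_t)n * (size_t)I];
        const float ai = A->imagp[(size_t)n * (size_t)I];
        C->realp[(size_t)n * (size_t)K] = br * ar - bi * ai;
        C->imagp[(size_t)n * (size_t)K] = br * ai + bi * ar;
    }
}
```

Comparing the output of this loop with the output of zvzsmlx on the same inputs is one way to confirm that the results stay correct after the exceptions start appearing.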

Please submit a bug report. We will do our best to locate and resolve the bug with you. A complete test program that reproduces the behavior is a big plus.
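
As a rough idea of the shape such a test program can take, the sketch below calls a routine repeatedly and reports the first call that fails, which is useful when the problem only shows up after thousands of calls. Everything here is hypothetical: kernel_fn, first_failing_call and standin_kernel are made-up names, and the stand-in kernel only imitates a routine that starts failing after a fixed number of calls. In a real report the kernel would wrap the actual GPU VSIPL call and a check of its result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for the routine under test: returns 0 on success. */
typedef int (*kernel_fn)(const float *in, float *out, size_t n);

/* Calls `kernel` up to `iterations` times and returns the zero-based index
   of the first failing call, or -1 if every call succeeded. */
static long first_failing_call(kernel_fn kernel, long iterations,
                               const float *in, float *out, size_t n)
{
    assert(kernel != NULL && iterations >= 0);
    for (long i = 0; i < iterations; ++i) {
        if (kernel(in, out, n) != 0) {
            return i;
        }
    }
    return -1;
}

/* Stand-in for illustration only: copies the input and starts reporting
   failure once a fixed budget of successful calls is used up. */
static long calls_left = 6000;

static int standin_kernel(const float *in, float *out, size_t n)
{
    if (calls_left <= 0) {
        return 1;
    }
    --calls_left;
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i];
    }
    return 0;
}
```

A driver would then call first_failing_call(standin_kernel, 12000, in, out, n) and print the returned index, so the report states exactly how many calls it takes to trigger the behavior.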

This failure was due to a bug in the GPU VSIPL Library. A fix has been incorporated into the most recent release, posted to the GPU VSIPL website on 11 August 2009.