Segmentation fault with kernel: weird error

Hi,

I have two kernels defined in one file:

#define B2G_Q 2
#define B2G_HASHSHIFT 4
#define B2G_SCAN 0x02
#define B2G_HASH16(a,B) (((a)<<B2G_HASHSHIFT) | (B))
#define u8_tolower(c) g_u8_lowercasetable[(c)]

__kernel void prepare_search_match_array(__constant char *ptns,
__constant uint *plen,
__constant uchar *nop,
__constant uchar *flags,
__constant uchar *g_u8_lowercasetable,
__global uint *search_B2G)
{
// carry out some computation here
return;
}

__kernel void prepare_scan_match_array(__constant uchar *ptns,
__constant uint *plen,
__constant uchar *nop,
__constant uchar *flags,
__constant uchar *g_u8_lowercasetable,
__global uint *scan_B2G)

{
// carry out some computation here
return;
}

The first kernel is called “prepare_search_match_array” and the second one is called “prepare_scan_match_array”.

When I create a kernel for “prepare_scan_match_array” and enqueue it with clEnqueueNDRangeKernel, I get a segmentation fault. If I rename the first kernel to “prepare_search_match_arra”, the program works perfectly. I don’t see what the first kernel’s name has to do with enqueuing the second kernel, or why dropping the “y” from “prepare_search_match_array” would fix anything. Can someone please help me with this?

Another issue I hit recently: inside the kernel I do

plen[0] = 50;

and I get a crash with a backtrace.

I inserted an int temp = 50; and then did plen[0] = temp; and it worked perfectly.

After that I removed the int temp = 50; and went back to the original plen[0] = 50; and that worked perfectly too.

Is the driver buggy, or does my card have a problem?

Thanks
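
One thing that may be worth checking for the plen[0] = 50; crash: if plen in that kernel is the same __constant uint *plen as in the two signatures above, the write itself is not allowed, because OpenCL C treats __constant memory as read-only inside a kernel. Below is a minimal sketch of routing the write through a __global pointer instead. The kernel name set_first_length, its parameter names, and the host-side shim macros are all hypothetical, and this is not a diagnosis of the driver behaviour described above.

```c
/* Shim so the snippet also compiles as plain C on the host (assumption, for
   illustration only). A real OpenCL compiler defines __OPENCL_VERSION__ and
   provides these qualifiers itself, so the shim is skipped there. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __constant const
#endif

/* Hypothetical kernel: plen_in is read-only input, plen_out is writable output. */
__kernel void set_first_length(__constant unsigned int *plen_in,
                               __global unsigned int *plen_out)
{
    /* plen_in[0] = 50;  would be rejected: __constant memory is read-only */
    plen_out[0] = 50;           /* writes go through the __global pointer */
    plen_out[1] = plen_in[1];   /* reading from __constant memory is fine */
}
```

With the shim, __constant maps to const, so uncommenting the first assignment fails to compile on the host as well, which mirrors the OpenCL rule.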

I had a similar problem with multiple kernels in one program using constant memory. By tweaking the name of each kernel I could either get a segfault or an internal compiler error.

In the end I separated them into two separate programs and the kernels worked fine. A driver bug, I guess.

This isn’t the only place where I get a crash or a segfault. The latest issue was with constant memory: with constant I get the wrong output, and when I changed it to global it worked perfectly. I didn’t exceed the number of constant arguments that can be passed to a kernel function. I assumed it was a driver bug and fell back to global, although constant is what I actually need, because its caching speeds up data retrieval in a way that global does not. Hopefully the next release fixes these bugs.
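
On the constant-argument limit: OpenCL exposes two separate device limits, the number of __constant arguments per kernel (CL_DEVICE_MAX_CONSTANT_ARGS) and the maximum size of a single constant buffer (CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE). Both can be read with clGetDeviceInfo. Here is a rough sketch of checking a kernel's constant arguments against both. The function name constant_args_fit and the sizes in the usage note are hypothetical, and the limits are passed in as plain numbers rather than queried from a device.

```c
#include <stddef.h>

/* Returns 1 if the __constant arguments respect both limits, 0 otherwise.
   max_args and max_bytes stand in for the values a host program would read
   via clGetDeviceInfo with CL_DEVICE_MAX_CONSTANT_ARGS and
   CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE. */
int constant_args_fit(const size_t *arg_bytes, size_t arg_count,
                      size_t max_args, size_t max_bytes)
{
    size_t i;

    /* Limit 1: how many __constant arguments the kernel declares. */
    if (arg_count > max_args)
        return 0;

    /* Limit 2: the size in bytes of each individual constant buffer. */
    for (i = 0; i < arg_count; ++i) {
        if (arg_bytes[i] > max_bytes)
            return 0;
    }
    return 1;
}
```

For the kernels in this thread, arg_count would be 5 (ptns, plen, nop, flags, g_u8_lowercasetable). Passing the argument count alone does not rule out one buffer being too large, so both checks are worth running before concluding it is a driver bug.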

How do I file a bug? I sent them a mail at opencl@nvidia.com. I didn’t see an option for OpenCL on their bug entry page.