The fragment below fails at runtime. cuda-memcheck reports that the variable cand is NULL, so nothing can be copied into it. The error does not occur in every thread, but some threads do report it.
__global__ void join_kernel(int* d_candidate...)
{
    int* cand = NULL;
    cand = new int[num];
    memcpy(cand, d_candidate, sizeof(int)*num);
}
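
One possible explanation, offered as a sketch rather than a confirmed diagnosis: device-side new allocates from a fixed-size device heap (8 MB by default) and yields NULL once that heap is exhausted. That would match the symptom of only some threads failing. The version below checks the pointer, frees the buffer, and raises the heap limit with cudaDeviceSetLimit before the first launch. The parameter list in the original is elided, so the signature here (num, d_failed), the helper count_failed_allocations, and the launch sizes are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical signature: `num` and `d_failed` are assumptions, since the
// original parameter list is elided.
__global__ void join_kernel(const int* d_candidate, int num, int* d_failed)
{
    // Device-side new draws from the device heap and gives NULL when it is full.
    int* cand = new int[num];
    if (cand == NULL) {
        atomicAdd(d_failed, 1);   // record the failure instead of dereferencing NULL
        return;
    }
    memcpy(cand, d_candidate, sizeof(int) * num);
    // ... work on cand would go here ...
    delete[] cand;                // return the buffer to the device heap
}

// Hypothetical helper: launches the kernel and returns how many threads
// failed to allocate, or -1 if the launch itself reported an error.
int count_failed_allocations(int num, int blocks, int threads)
{
    int* d_candidate = NULL;
    int* d_failed = NULL;
    cudaMalloc(&d_candidate, sizeof(int) * num);
    cudaMemset(d_candidate, 0, sizeof(int) * num);
    cudaMalloc(&d_failed, sizeof(int));
    cudaMemset(d_failed, 0, sizeof(int));

    join_kernel<<<blocks, threads>>>(d_candidate, num, d_failed);

    int failed = -1;
    if (cudaDeviceSynchronize() == cudaSuccess)
        cudaMemcpy(&failed, d_failed, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_candidate);
    cudaFree(d_failed);
    return failed;
}

int main()
{
    const int num = 1024, blocks = 64, threads = 256;   // illustrative sizes

    // Conservative sizing: room for every thread's buffer at once, doubled as
    // an assumed margin for allocator overhead. The limit has to be set
    // before the first kernel that uses device-side new or malloc is launched.
    size_t heap_bytes = 2 * (size_t)blocks * threads * num * sizeof(int);
    cudaError_t err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, heap_bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSetLimit: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("threads with failed allocation: %d\n",
           count_failed_allocations(num, blocks, threads));
    return 0;
}
```

If the buffer size is the same for every thread, allocating a single array with cudaMalloc on the host and giving each thread its own slice avoids the device heap entirely, and is usually faster than per-thread new.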