I developed a CUDA library that is called from C, built with VS2008 on 64-bit Windows 7.
I found that it does not work with large arrays, such as 3000*3000 ints. For instance:
cudaMalloc((void**) &g, sizeof(int)*size);
cudaMemcpy(g, a, sizeof(int)*size, cudaMemcpyHostToDevice);
<<<>>> // assign each member to be 4
With a larger array, every element of g comes back as zero; only with smaller arrays, such as 2850*2850, do all elements of g come back as 4.
Has anyone run into this kind of problem?
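
For reference, the sizes involved can be worked out on the host. The following is a minimal sketch in plain C, not the actual library code. It assumes a one-dimensional launch with one thread per element and a limit of 65535 blocks per grid dimension, which applies to devices of compute capability below 3.0. The helper names and the block size of 128 used in the note below are hypothetical, since the real launch configuration is not shown above.

```c
#include <stddef.h>

/* Assumed limit: devices of compute capability below 3.0 accept at most
   65535 blocks in each grid dimension. */
#define MAX_BLOCKS_PER_DIM 65535u

/* Bytes needed for n ints, matching sizeof(int)*size in the calls above. */
static size_t bytes_for_ints(size_t n)
{
    return n * sizeof(int);
}

/* Blocks needed to cover n elements with one thread per element
   (ceiling division). */
static size_t blocks_needed(size_t n, size_t threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

/* Returns 1 if a one-dimensional grid can cover n elements, 0 otherwise. */
static int fits_in_1d_grid(size_t n, size_t threads_per_block)
{
    return blocks_needed(n, threads_per_block) <= MAX_BLOCKS_PER_DIM;
}
```

With 4-byte ints, 3000*3000 elements take 36,000,000 bytes, which is small for device memory, so the allocation itself may not be the problem. With a hypothetical 128 threads per block, 2850*2850 elements need 63458 blocks, which fits, while 3000*3000 elements need 70313 blocks, which exceeds the assumed limit. In that case the kernel would not launch and g would keep its old contents. Checking the return value of cudaMalloc and cudaMemcpy, and calling cudaGetLastError() after the launch, would show whether this is what happens.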