Hello!
I am starting out with CUDA to speed up a pattern-matching process, but I have run into an unexpected problem. I began with a very basic implementation of Gram-Schmidt orthonormalization without parallelization (this is not the step I want to parallelize, but it comes before the large calculation).
My problem is that a float value inside an array is never updated and always comes back as 0. I do not understand what is happening. For basic development I currently have an NVIDIA GPU with compute capability 1.1, and I do not know whether that could be the cause. The line that returns 0 is:
[b]for (int j = 0; j < i; j++) { prod_scal[j] = prod_scal[j] + Q[k*n_base + i] * A[k*p + base[j]]; }} // scalar products among vectors[/b]
I am attaching the complete file.
Thank you for your time and your answers; until I solve this I cannot continue with the calculations.
Mattia