Moving a Householder transformation from CPU to GPU

Hi, I am a basic CUDA user and I have a problem. I am using the CUBLAS library to perform Householder transformations, but I don’t know how to rewrite the function below so that the calculations are performed on the GPU.

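[codebox]
// k is the current column, N is the matrix size; h_A is indexed column by column
int i;
float O = 0.0f, S, R;

// Sum of squares of the entries below the diagonal in column k
for (i = k*(N+1)+1; i < N*(k+1); ++i)
{
  O += h_A[i] * h_A[i];
}

// copysignf is the standard C99 float version of MSVC's _copysign
S = copysignf(1.0f, h_A[k*(N+1)+1]) * sqrtf(O);
R = sqrtf(2*S*(S + h_A[k*(N+1)+1]));

// Entries 0..k of the Householder vector are zero
for (i = 0; i < (k+1); ++i)
{
  h_B[i] = 0;
}

// First nonzero entry gets the extra S term
h_B[k+1] = (S + h_A[N*k+(k+1)])/R;

// Remaining entries are the column entries scaled by 1/R
for (i = (k+2); i < N; ++i)
{
  h_B[i] = h_A[N*k+i]/R;
}
[/codebox]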

Can anyone help me?