Hi,

I want to implement the Fibonacci sequence in CUDA. But in the Fibonacci sequence

a(0) = 0, a(1) = 1 and a(n) = a(n-1) + a(n-2).

I cannot implement it properly because each term depends on the two previous terms.

How can it be implemented in CUDA?

The Fibonacci sequence has a closed form. You can calculate it without recurrence in parallel using something like

a(n) = floor((g^n)/sqrt(5) + 0.5)

where g is the Golden ratio (1+sqrt(5))/2.

The largest number calculable by this method is limited by floating point precision, but it is a start.
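As a rough illustration of the formula above, here is a minimal sketch in which each thread computes one term on its own, so no thread waits on another. The names `fib_closed_form` and `fib_on_gpu` are made up for this example, and it assumes n is positive and small enough that double precision still rounds to the right integer.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Each thread evaluates a(i) = floor(g^i / sqrt(5) + 0.5) independently.
__global__ void fib_closed_form(unsigned long long *out, int n)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        const double sqrt5 = sqrt(5.0);
        const double g = (1.0 + sqrt5) / 2.0;  // golden ratio
        out[i] = (unsigned long long)floor(pow(g, (double)i) / sqrt5 + 0.5);
    }
}

// Hypothetical host wrapper: returns a(0)..a(n-1), or an empty vector on a CUDA error.
// Assumes n > 0.
std::vector<unsigned long long> fib_on_gpu(int n)
{
    std::vector<unsigned long long> host(n);
    const size_t bytes = n * sizeof(unsigned long long);
    unsigned long long *dev = nullptr;
    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return {};

    const int threads = 64;
    const int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    fib_closed_form<<<blocks, threads>>>(dev, n);

    // cudaMemcpy waits for the kernel and also reports earlier launch errors.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}
```

Calling `fib_on_gpu(10)` from `main` and printing the vector is enough to try it out. For large n the rounded result will eventually drift from the true value, as noted above.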

Thanks for the reply.

Hi,

I tried the same thing in CUDA 6.5.

But I am not getting the correct answer. Could you please help me with this?

```
// I wanted to print the Fibonacci sequence.
#include <stdio.h>
#include <math.h>
#include <cuda.h>

#define N 10

__global__ void Fibonacci(double *ga, double *gb, double sqrt_five, double phi1, double phi2)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < N)
    {
        gb[i] = (pow((double)phi1, ga[i]) - pow((double)phi2, ga[i])) / sqrt_five;
    }
}

int main()
{
    double ha[N];    // Host variable
    double *ga, *gb; // For GPU use
    double sqrt_five, phi1, phi2, result, fibo_result;

    sqrt_five = sqrt(5);
    phi1 = (sqrt_five + 1) / 2;
    phi2 = (sqrt_five - 1) / 2;

    // Initialize array on CPU
    for (int i = 0; i < N; i++)
    {
        ha[i] = i;
    }

    // Allocate memory on GPU
    cudaMalloc((void**)&ga, N * sizeof(double));
    cudaMalloc((void**)&gb, N * sizeof(double));

    // Copy array from CPU to GPU
    cudaMemcpy(ga, ha, N * sizeof(double), cudaMemcpyHostToDevice);

    // Kernel launching
    Fibonacci<<<1, 64>>>(ga, gb, sqrt_five, phi1, phi2);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("Error: %s\n", cudaGetErrorString(err));

    // Copy results from GPU to CPU
    cudaMemcpy(ha, gb, N * sizeof(double), cudaMemcpyDeviceToHost);

    // Print output
    for (int j = 0; j < N; j++)
    {
        printf("\n%lf", ha[j]);
    }
    getchar();
    return 0;
}
```

How exactly is the answer not correct? What have you tried to debug this code? What happens if you add proper error checking, or run the app under control of cuda-memcheck?

The code given above prints a series of 0.000000 values.

When I run the code you have posted, I don’t get a sequence of 0. I get:

```
$ cuda-memcheck ./t15
========= CUDA-MEMCHECK
0.000000
0.447214
1.000000
1.788854
3.000000
4.919350
8.000000
12.969194
21.000000
33.988233
========= ERROR SUMMARY: 0 errors
$
```

So there is probably something wrong with your machine setup.
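Separately from the all-zeros problem, the odd-indexed values in the output above (0.447214, 1.788854, 4.919350, ...) are not Fibonacci numbers. Binet's formula is a(n) = (phi^n - psi^n)/sqrt(5) with psi = (1 - sqrt(5))/2, which is negative, whereas the posted code sets phi2 = (sqrt_five - 1) / 2. With the positive value the second term has the wrong sign for odd n. Below is a minimal sketch with the sign corrected, keeping the posted kernel's input and output arrays. The names `fib_binet` and `fib_binet_on_gpu` are made up for this example, and it assumes count is positive.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Same structure as the posted kernel; only the constant passed as psi differs.
__global__ void fib_binet(const double *n_in, double *out, int count,
                          double sqrt5, double phi, double psi)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < count) {
        // pow() accepts a negative base when the exponent is integer-valued.
        out[i] = (pow(phi, n_in[i]) - pow(psi, n_in[i])) / sqrt5;
    }
}

// Hypothetical host wrapper: returns a(0)..a(count-1) as doubles,
// or an empty vector on a CUDA error. Assumes count > 0.
std::vector<double> fib_binet_on_gpu(int count)
{
    const double sqrt5 = sqrt(5.0);
    const double phi = (1.0 + sqrt5) / 2.0;
    const double psi = (1.0 - sqrt5) / 2.0;  // negative, unlike phi2 in the posted code

    std::vector<double> host(count);
    for (int i = 0; i < count; ++i) host[i] = i;  // the indices n = 0, 1, 2, ...

    const size_t bytes = count * sizeof(double);
    double *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc((void **)&d_in, bytes) != cudaSuccess) return {};
    if (cudaMalloc((void **)&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return {}; }

    cudaError_t err = cudaMemcpy(d_in, host.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        fib_binet<<<(count + 63) / 64, 64>>>(d_in, d_out, count, sqrt5, phi, psi);
        err = cudaMemcpy(host.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return {};
    return host;
}
```

The results are doubles that may differ from the exact integers by a tiny rounding error, so round them before printing them as integers.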

Any time you are having trouble with a CUDA code, you should always add proper cuda error checking, and run your code with cuda-memcheck, *before* asking for help.

If you don’t know what proper cuda error checking is, google “proper cuda error checking” and take the first hit. Study that, and add it to your code.
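For reference, the usual pattern looks roughly like the sketch below: wrap every runtime API call in a macro that checks the returned `cudaError_t`, and after a kernel launch check both `cudaGetLastError()` and `cudaDeviceSynchronize()`. The macro name `CUDA_CHECK`, the kernel `add_one` and the function `checked_roundtrip` are made up for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with file and line if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Trivial kernel used only to have something to launch.
__global__ void add_one(int *x)
{
    *x += 1;
}

// Copies a value to the GPU, increments it there, and copies it back,
// checking every step along the way.
int checked_roundtrip(int value)
{
    int *d = nullptr;
    CUDA_CHECK(cudaMalloc((void **)&d, sizeof(int)));
    CUDA_CHECK(cudaMemcpy(d, &value, sizeof(int), cudaMemcpyHostToDevice));

    add_one<<<1, 1>>>(d);
    CUDA_CHECK(cudaGetLastError());       // catches launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during kernel execution

    CUDA_CHECK(cudaMemcpy(&value, d, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d));
    return value;
}
```

The code posted earlier checks only the kernel launch. Checking the `cudaMalloc` and `cudaMemcpy` calls as well might show where things go wrong on a machine that prints only zeros.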

Thank you for the solution.