Hello, I am a beginner in CUDA. Today I ran into a problem: when I rewrote my code in CUDA to accelerate it, I found that the reversed (mirrored) copy from my C++ code does not seem easy to reproduce in CUDA.
My original C++ code looks like this:
if (Nodd == 1)
{
    /* h = [a(L+1:-1:2)/2; a(1); a(2:L+1)/2]'; */
    a[0] /= 2;
    for (__int64 i = length; i >= 1; i--) // assume L = 21
        result[length - i] = ((double)(a[i] / 2.0));
    result[length] = (a[0]);
    for (__int64 i = length + 1; i <= length + length; i++)
        result[i] = ((double)(a[i - length] / 2.0));
}
else
{
    for (__int64 i = length; i >= 0; i--)
        result[length - i] = ((double)(a[i] / 2.0)); // result[length +1]
    for (__int64 i = length + 1; i <= length + length + 1; i++)
        result[i] = ((double)(a[i - (length + 1)] / 2.0));
}
I rewrote it in CUDA like this:
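double temp = 0;
if (Nodd == 1)
{
    // first half: d_coefficients[0..length2] = d_a[length2..0] / 2
    if (i < (length2 + 1))
    {
        temp = d_a[length2 - i] / 2;
        d_coefficients[i] = temp;
    }
    // second half: d_coefficients[length2+1..2*length2] = d_a[1..length2] / 2
    if (length2 < i && i < (length2 + length2 + 1))
    {
        d_coefficients[i] = d_a[i - length2] / 2;
    }
}
else
{
    // first half: d_coefficients[0..length2] = d_a[length2..0] / 2
    if (i < (length2 + 1))
    {
        temp = d_a[length2 - i] / 2;
        d_coefficients[i] = temp;
    }
    // second half: d_coefficients[length2+1..2*length2+1] = d_a[0..length2] / 2
    if (length2 < i && i < (length2 + length2 + 1 + 1))
    {
        d_coefficients[i] = d_a[i - (length2 + 1)] / 2;
    }
}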
However, the result is wrong. For example, when Nodd == 1, elements 0 through length2 of d_coefficients are all zero, while the remaining elements are correct.
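For reference, the mapping that both versions are meant to implement can be written as a per-index function on the host, so that each output element can be checked on its own. This is only a sketch under assumptions: the names expected_coefficient and expected_coefficients are hypothetical and not part of the code above, a is assumed to hold at least length + 1 elements, and a[0] is assumed not to have been halved in place (the a[0] /= 2 step of the C++ version is folded in as a[0] / 2).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original code): returns the value that
// output index i should hold, using the same branch conditions as the kernel.
// Assumes a.size() >= length + 1 and that a[0] has NOT been halved in place.
double expected_coefficient(const std::vector<double>& a, std::int64_t length,
                            bool nodd, std::int64_t i)
{
    if (i <= length) // mirrored first half: a[length], ..., a[0]
        return a[static_cast<std::size_t>(length - i)] / 2.0;

    // forward second half: starts at a[1] when nodd is true, at a[0] otherwise
    const std::int64_t offset = nodd ? length : length + 1;
    return a[static_cast<std::size_t>(i - offset)] / 2.0;
}

// Builds the whole expected output: 2*length + 1 values when nodd is true,
// 2*length + 2 values otherwise.
std::vector<double> expected_coefficients(const std::vector<double>& a,
                                          std::int64_t length, bool nodd)
{
    const std::int64_t n = 2 * length + (nodd ? 1 : 2);
    std::vector<double> out(static_cast<std::size_t>(n));
    for (std::int64_t i = 0; i < n; ++i)
        out[static_cast<std::size_t>(i)] = expected_coefficient(a, length, nodd, i);
    return out;
}
```

Comparing the array copied back from the GPU against expected_coefficients(a, length, Nodd == 1) element by element would show exactly which indices differ.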
Why does this happen, and how can I fix it? Any help is appreciated, thanks!