Sum from thrust::reduce does not match the CPU result

I am trying to implement code that calls thrust::reduce on a thrust::device_ptr, and the result is not consistent with my CPU implementation when the values are large. I cannot avoid large values, so is there a way around this?

CODE

I am compiling with nvcc, and my graphics card is an NVIDIA GTX 1650 with compute capability 7.5.

If anyone has any idea, please help me.
