The GPU is a GTX 1050 Ti with compute capability 6.1, running CUDA 8.0 on 64-bit Windows 10.
The error is on the "min" side; I had a similar problem with thrust::reduce().
This happens with certain datasets of floats coming from image luma values (an HDR app).
Just to verify, I have written a very simple single-threaded kernel to perform the min-reduction; its result agrees with the CPU.
The floats are not huge (or very small) values: the two disagreeing min results are around -6.61f vs. -6.049f, and the max is about 2.5f.
Has anyone else had a similar experience with Thrust?
Thanks in advance
Can you provide a self-contained reproducer code?
Thanks for the response.
A reproducer may not be applicable, because I only come across this behavior with certain (not all) data sets coming from images.
But here is a code snippet:
// luminance is a float device pointer allocated with cudaMalloc() and filled by a custom kernel
// length is the number of floats in the device memory pointed to by luminance
thrust::pair<thrust::device_ptr<float>, thrust::device_ptr<float>> tuple;
tuple = thrust::minmax_element(thrust::device,
    thrust::device_pointer_cast((float*)luminance),
    thrust::device_pointer_cast((float*)luminance) + length);
CHECK_CUDA_ERROR(errCode, functionName, "thrust::minmax_element failed");
minVal = *(tuple.first);
maxVal = *(tuple.second);
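For what it's worth, a self-contained reproducer along the lines of the snippet above could look like the sketch below (requires nvcc with the CUDA 8 toolkit; the buffer size and synthetic fill pattern are placeholders, not the failing image data):

```cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/extrema.h>
#include <thrust/execution_policy.h>
#include <cstdio>

int main() {
    const int length = 1 << 20;

    // Synthetic luma-like values spanning roughly the range reported
    // in the thread (about -6.61f .. 2.5f); the real failing data
    // comes from HDR images.
    thrust::host_vector<float> h(length);
    for (int i = 0; i < length; ++i)
        h[i] = -6.61f + 9.11f * (float)i / (float)(length - 1);

    thrust::device_vector<float> d = h;
    float* luminance = thrust::raw_pointer_cast(d.data());

    thrust::pair<thrust::device_ptr<float>, thrust::device_ptr<float>> tuple =
        thrust::minmax_element(thrust::device,
            thrust::device_pointer_cast(luminance),
            thrust::device_pointer_cast(luminance) + length);

    printf("min = %f, max = %f\n", (float)*tuple.first, (float)*tuple.second);
    return 0;
}
```

Substituting a dump of one of the failing image buffers for the synthetic fill would make this a real reproducer.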