Hello, I know this topic has come up here several times, but as I am a beginner, I am not able to understand the examples in the threads I've found, and I don't want to spend too much time on this particular part of my code (it's just a free-time project for fun :)). I've found complete, working code that finds the min and max using a reduction; it can be found here: cuda_minmax/kernel.cu at master · AJcodes/cuda_minmax · GitHub
I've tried a simple modification to store the array element's index instead of the min value, by replacing this part of the code in both functions:
if (maxtile[tid + s] > maxtile[tid])
    maxtile[tid] = maxtile[tid + s];
if (maxtile[tid + s] > maxtile[tid])
    mintile[tid] = tid + s;
But I am evidently not understanding the code and/or missing something, as I always get index 2 as the final index of the max value. Could someone please help? Thank you.
This algorithm finds the minimum in each subarray, and then finds the smallest of those minimums. By storing the index instead of the value, you break the second stage, so you need to store both.
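To make "store both" concrete, here is a minimal sketch for the max case. It is not the code from the linked repository: the names `argmax_kernel`, `argmax`, `sval`, `sidx` and `BLOCK` are hypothetical, it uses sequential addressing (`s` halves each step) instead of the strided indexing in the question, and it omits CUDA error checking. The key point is that the index always travels with the value that won the comparison, so `sidx[tid] = sidx[tid + s]` rather than `i` or `tid + s`. If the maximum occurs more than once, this sketch returns the index of one of the occurrences, not necessarily the first.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // hypothetical block size; must be a power of two

// Reduces each block's slice to one (max value, index of that value) pair.
// in_idx == nullptr means "first pass": the index is the global position i.
// On later passes in_idx carries the original indices forward.
__global__ void argmax_kernel(const float* in_val, const int* in_idx,
                              unsigned int n, float* out_val, int* out_idx)
{
    __shared__ float sval[BLOCK];
    __shared__ int   sidx[BLOCK];

    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n) {
        sval[tid] = in_val[i];
        sidx[tid] = in_idx ? in_idx[i] : (int)i;
    } else {
        sval[tid] = -FLT_MAX;  // padding: never greater than a finite input
        sidx[tid] = -1;
    }
    __syncthreads();

    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s && sval[tid + s] > sval[tid]) {
            sval[tid] = sval[tid + s];
            sidx[tid] = sidx[tid + s];  // the index follows the winning value
        }
        __syncthreads();
    }

    if (tid == 0) {
        out_val[blockIdx.x] = sval[0];
        out_idx[blockIdx.x] = sidx[0];
    }
}

// Host wrapper: repeats the kernel until a single pair is left.
// Returns the index of a maximum element, or -1 for empty input.
int argmax(const std::vector<float>& h)
{
    if (h.empty()) return -1;
    unsigned int n = (unsigned int)h.size();
    unsigned int blocks = (n + BLOCK - 1) / BLOCK;

    float* d_in;
    float* d_val[2];
    int*   d_idx[2];
    cudaMalloc(&d_in, n * sizeof(float));
    for (int k = 0; k < 2; ++k) {
        cudaMalloc(&d_val[k], blocks * sizeof(float));
        cudaMalloc(&d_idx[k], blocks * sizeof(int));
    }
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // First pass: one (value, index) pair per block.
    argmax_kernel<<<blocks, BLOCK>>>(d_in, nullptr, n, d_val[0], d_idx[0]);

    // Later passes: reduce the per-block pairs, ping-ponging between buffers.
    int cur = 0;
    unsigned int m = blocks;
    while (m > 1) {
        unsigned int b = (m + BLOCK - 1) / BLOCK;
        argmax_kernel<<<b, BLOCK>>>(d_val[cur], d_idx[cur], m,
                                    d_val[1 - cur], d_idx[1 - cur]);
        cur = 1 - cur;
        m = b;
    }

    int result = -1;
    cudaMemcpy(&result, d_idx[cur], sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    for (int k = 0; k < 2; ++k) { cudaFree(d_val[k]); cudaFree(d_idx[k]); }
    return result;
}
```

The min case works the same way with the comparison flipped and `FLT_MAX` as padding.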
That said, if you just need to solve the problem, by all means use Thrust. It closely mimics the STL, so with Thrust you will need just one line of code.
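For reference, a minimal sketch of what that one line could look like, assuming the data starts in a host `std::vector<float>`. The wrapper name `argmax_thrust` is hypothetical; `thrust::max_element` is the Thrust counterpart of `std::max_element` and returns an iterator, so the index is its distance from the beginning.

```cuda
#include <vector>
#include <thrust/device_vector.h>
#include <thrust/extrema.h>

// Hypothetical helper: index of the largest element of h, computed on the GPU.
int argmax_thrust(const std::vector<float>& h)
{
    // Copy the input to device memory.
    thrust::device_vector<float> d(h.begin(), h.end());

    // The "one line": iterator to the largest element.
    auto it = thrust::max_element(d.begin(), d.end());

    return static_cast<int>(it - d.begin());
}
```

`thrust::min_element` and `thrust::minmax_element` follow the same pattern if the minimum is needed as well.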
Hello, thank you, the Thrust library worked fine, but unfortunately in my case it is almost twice as slow. Now I understand that I made a mistake in the second function, but I still don't understand what to put in the first one to get the proper index of the max value (the mintile[tid] = assignment):
unsigned int tid = threadIdx.x;
unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
maxtile[tid] = a[i];
mintile[tid] = a[i];
__syncthreads();
// strided index and non-divergent branch
for (unsigned int s = 1; s < blockDim.x; s *= 2) {
    int index = 2 * s * tid;
    if (index < blockDim.x) {
        if (maxtile[tid + s] > maxtile[tid]) {
            maxtile[tid] = maxtile[tid + s];
            mintile[tid] = i; // what to put here? I cannot get to the right values
        }
    }
    __syncthreads();
}
and in the second function I leave:
if (index < blockDim.x) {
    if (maxtile[tid + s] > maxtile[tid])
        maxtile[tid] = maxtile[tid + s];
    if (mintile[tid + s] < mintile[tid])
        mintile[tid] = mintile[tid + s];
}
__syncthreads();
If I understand it correctly, the first function assigns the index and the second just sorts it.
I definitely can't know what the first and second functions of your code are doing :) Use a pastebin-like service, or put the code here using the last button above the edit box to format it. And remove unused code.
With Thrust, you may be using it in a suboptimal way, so it's better to start with the Thrust code you wrote.
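The Thrust code in question was not posted, so the following is only a guess at one common source of overhead: copying the array into a `thrust::device_vector` on every call. If the data already lives in device memory, a sketch like this runs the search in place. The names `argmax_on_device`, `d_a` and `n` are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/extrema.h>

// Hypothetical helper: d_a is a raw pointer to n floats already on the GPU.
int argmax_on_device(const float* d_a, int n)
{
    // Wrap the raw pointer so Thrust knows it refers to device memory;
    // no data is copied here.
    thrust::device_ptr<const float> first = thrust::device_pointer_cast(d_a);

    thrust::device_ptr<const float> best = thrust::max_element(first, first + n);

    return static_cast<int>(best - first);
}
```

Only the resulting index comes back to the host, which keeps the transfer cost independent of the array size.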