How to set the priority for threads?

Hello all,

I have a data_array of size 256.
Each value of data_array is 1.
I created 256 threads, and each thread reads one element.
The values read by the 256 threads are summed up and stored in Result_array[1] (Result_array is in global memory).

After testing it, the result is 1 instead of 256.
What is the main reason for this problem?

Does CUDA provide any way for users to assign priorities to different threads?

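You don't want priorities. What you're hitting is classic thread contention (a race condition): all 256 threads read, modify, and write Result_array[1] at the same time with no synchronization, so their updates overwrite each other and, most likely, only one thread's addition survives.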

The proper way to sum up values in parallel is to use a "reduction": the threads add values pairwise in steps, halving the number of partial sums each time. Look at the SDK examples for a very good introduction.
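To make the reduction idea concrete, here is a minimal sketch for the 256-element case in the question. It is not the SDK's code. It assumes a single block whose size is a power of two, and the names sum_reduce and sum_on_gpu are made up for illustration. Error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 256  // number of elements and threads, as in the question

// Assumed single-block kernel: each thread loads one element into shared
// memory, then the number of active threads is halved at every step.
__global__ void sum_reduce(const int *data, int *result)
{
    __shared__ int partial[N];
    unsigned int tid = threadIdx.x;

    partial[tid] = data[tid];
    __syncthreads();  // all loads must finish before anyone adds

    for (unsigned int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();  // finish this step before starting the next
    }

    if (tid == 0)
        result[1] = partial[0];  // one thread writes Result_array[1]
}

// Hypothetical host helper: copies N ints to the GPU and returns their sum.
int sum_on_gpu(const int *h_data)
{
    int h_result[2] = {0, 0};
    int *d_data, *d_result;

    cudaMalloc(&d_data, N * sizeof(int));
    cudaMalloc(&d_result, 2 * sizeof(int));
    cudaMemcpy(d_data, h_data, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_result, h_result, 2 * sizeof(int), cudaMemcpyHostToDevice);

    sum_reduce<<<1, N>>>(d_data, d_result);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(h_result, d_result, 2 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_result);
    return h_result[1];
}

int main()
{
    int h_data[N];
    for (int i = 0; i < N; ++i)
        h_data[i] = 1;  // every value is 1, as in the question

    printf("sum = %d\n", sum_on_gpu(h_data));
    return 0;
}
```

Only thread 0 writes to global memory, so nothing competes for Result_array[1]. For arrays larger than one block, each block would produce its own partial sum and those would need a second pass; the SDK example covers that case.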

An alternative to reduction is atomic integer addition (atomicAdd), but atomic updates to the same address are serialized, so that wastes your compute power if you have to add many values.
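For comparison, a sketch of the atomic version under the same assumptions (256 values of 1, result stored in Result_array[1]). The names sum_atomic and atomic_sum_on_gpu are invented for this example. atomicAdd on an int in global memory needs a device of compute capability 1.1 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 256  // number of elements and threads, as in the question

// Every thread adds its element to result[1]. atomicAdd makes each
// read-modify-write indivisible, so no update is lost, but updates to
// the same address are carried out one after another.
__global__ void sum_atomic(const int *data, int *result)
{
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < N)
        atomicAdd(&result[1], data[tid]);
}

// Hypothetical host helper: copies N ints to the GPU and returns their sum.
int atomic_sum_on_gpu(const int *h_data)
{
    int h_result[2] = {0, 0};
    int *d_data, *d_result;

    cudaMalloc(&d_data, N * sizeof(int));
    cudaMalloc(&d_result, 2 * sizeof(int));
    cudaMemcpy(d_data, h_data, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_result, 0, 2 * sizeof(int));  // the sum must start at 0

    sum_atomic<<<1, N>>>(d_data, d_result);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(h_result, d_result, 2 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_result);
    return h_result[1];
}

int main()
{
    int h_data[N];
    for (int i = 0; i < N; ++i)
        h_data[i] = 1;  // every value is 1, as in the question

    printf("sum = %d\n", atomic_sum_on_gpu(h_data));
    return 0;
}
```

This is shorter than the reduction, and for 256 values the cost hardly matters. With many values, all those additions queue up on one address, which is why the reduction is the better general approach.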