Yes, well, sometimes you have to do this; I have to do it 300 times in my solver. Since the alternative is launching 300 kernels for a total compute time of less than 8 ms, any alternative that costs less would be welcome, even if it removes the thread interleaving at that specific point.