How fast is the `__umul64hi` intrinsic on the GPU? I am using it to divide integers, and it takes an extremely long time on some inputs and much less time on others. When I try to build a very simplified test case to isolate this behavior, it goes away. Has anyone else had any trouble with this intrinsic?
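For anyone unfamiliar with the pattern, here is a minimal sketch of the kind of division-by-multiplication that `__umul64hi` is typically used for. It is not the code from my kernel. The helper names `mulhi64` and `div10` are made up for illustration, the divisor 10 is an arbitrary example, and the host-side fallback assumes a compiler with the GCC/Clang `unsigned __int128` extension.

```cpp
#include <cstdint>

// Allow the same source to build with nvcc (device + host) or a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// High 64 bits of the full 128-bit product a * b.
// Device code uses the CUDA intrinsic; host code emulates it
// (assumption: unsigned __int128 is available, as on GCC/Clang).
HD inline uint64_t mulhi64(uint64_t a, uint64_t b) {
#ifdef __CUDA_ARCH__
    return __umul64hi(a, b);
#else
    return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
#endif
}

// Unsigned division by the constant 10 without a divide instruction.
// 0xCCCCCCCCCCCCCCCD is ceil(2^67 / 10), so n / 10 == (n * magic) >> 67,
// i.e. the high 64 bits of the product shifted right by 3.
HD inline uint64_t div10(uint64_t n) {
    return mulhi64(n, 0xCCCCCCCCCCCCCCCDULL) >> 3;
}
```

In a kernel this would be called as `q = div10(n);`. The sequence is a fixed multiply and shift with no data-dependent branches, so a sketch like this would not by itself explain timings that vary with the input.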