The documentation indicates that if two warps are executing, say, a floating-point instruction and a global memory instruction, both can be issued in parallel. Does this also hold when one warp is executing a floating-point instruction and another warp is executing a shared memory load?
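For concreteness, here is a minimal sketch of the scenario being asked about (the kernel name, block size, and iteration counts are my own, purely illustrative choices): within one block, even-numbered warps issue a chain of floating-point instructions while odd-numbered warps issue repeated shared memory loads, so the SM's warp schedulers can have both instruction types eligible at the same time.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: even warps do floating-point math, odd warps do
// shared memory loads, so both instruction types are in flight per SM.
__global__ void mixed_warp_work(const float* in, float* out, int n)
{
    __shared__ float tile[256];            // one element per thread (blockDim.x == 256)

    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int warp = threadIdx.x / warpSize;     // warp index within the block

    // Every thread stages one value into shared memory (n is a multiple of 256 here).
    tile[threadIdx.x] = in[tid];
    __syncthreads();

    float v = tile[threadIdx.x];
    if (warp % 2 == 0) {
        // Even warps: floating-point arithmetic.
        for (int i = 0; i < 64; ++i)
            v = v * 1.000001f + 0.5f;
    } else {
        // Odd warps: repeated shared memory loads.
        for (int i = 0; i < 64; ++i)
            v += tile[(threadIdx.x + i) % blockDim.x];
    }
    out[tid] = v;
}

int main()
{
    const int n = 1 << 20;                 // multiple of the block size
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    mixed_warp_work<<<n / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();

    printf("out[0] = %f\n", out[0]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Branching on the warp index keeps each warp internally uniform, so at any given moment one warp presents only floating-point instructions to the scheduler while another presents only shared memory loads, which is exactly the mix the question is about.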