Fermi Warp Scheduling

Hello, I am writing a GPGPU review for my master’s diploma, and I have some questions about the dual warp scheduler on Fermi.

[list=1]

[*]Why can’t double-precision instructions be dual-issued with other instructions, such as transcendental functions or load/store instructions?

[*]Are double-precision instructions executed by combining two cores (as on AMD GPUs)?

If possible, please answer for sm2.0 and sm2.1 separately.

Thanks, and sorry for my poor English.

Hi all, I’ve found answers to those questions in the article Inside Fermi: Nvidia's HPC Push.