I am considering trying CUDA-x86 for a project, and I would like some more information about it. Could anybody answer these questions?
- What warp size is used? Can I set it somehow?
- Is the warp size dependent on the CPU architecture?
- What compute capability is shown for CPUs? Can I change it somehow?
- How is the machine code vectorised? Are you using tricks similar to those described here for Intel OpenCL?