An open-source cuBLAS-XT alternative

This open-source cuBLAS-XT alternative offers 25% higher performance and 200% less communication. More importantly, it works on heterogeneous GPUs.

For more information, please check github@