Hello, are there any plans to add geam and QR in half precision? If not, is there a good resource on optimizing my own kernels so that they get as close as possible to the cuBLAS version in terms of wall-clock time?
This is part of my master's thesis, in case that matters.
Thank you!