I need to develop some complex BLAS2 codes using CUDA. I recall seeing a post that pointed to where the underlying source code for the BLAS1 routines can be read (and studied!), but I can’t remember where that is.
If anyone knows, would you please advise me where to find them?
Thanks, Malcolm