Hi, my application needs to compute the singular values of several hundred 30x30 matrices concurrently, in periodic batches. I am prioritizing speed/throughput over accuracy, and I was wondering which one I should explore.
Does anyone have any resources/experience for comparing the three? Am I missing any options?
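
For concreteness, the workload looks roughly like the sketch below. This is only a plain NumPy CPU baseline for comparison, not one of the options I'm asking about. The batch size of 300, the float32 dtype, the random input data, and the helper name `batched_singular_values` are placeholders I made up for illustration.

```python
import numpy as np


def batched_singular_values(batch):
    """Singular values for a stack of matrices: shape (n, m, m) -> (n, m)."""
    # compute_uv=False skips U and V^T, since only the singular values are needed.
    # np.linalg.svd broadcasts over the leading axis, so no Python loop is required.
    return np.linalg.svd(batch, compute_uv=False)


# Placeholder batch: 300 random 30x30 matrices in single precision (assumed values).
rng = np.random.default_rng(0)
batch = rng.standard_normal((300, 30, 30)).astype(np.float32)

sigma = batched_singular_values(batch)
print(sigma.shape)
```

Each row of `sigma` holds the 30 singular values of one matrix, sorted in descending order.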