Help: Choosing the right GPU for Large-Scale Matrix Operations

We are building a workstation for statistical learning. The algorithms require iterative calls (training via cross-validation) to large-scale matrix operations. We also rely heavily on the KNN algorithm, which essentially means computing pairwise distances between tens of millions of vectors (also invoked iteratively). We think the right choice will be one of the following Quadros: the GV100 or the RTX 8000.
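To make the KNN workload concrete: the distance step reduces to one large matrix multiply plus vector norms, which is why memory bandwidth matters so much to us. A minimal NumPy sketch (the function name, batch size, and sizes are our own illustration, not any library's API):

```python
import numpy as np

def pairwise_sq_dists(X, Q, batch=1024):
    """Squared Euclidean distances between query rows Q and reference rows X.
    Uses the expansion ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2, which turns
    the distance step into one large matrix multiply (GEMM).  Queries are
    processed batch-wise, the same tiling a GPU kernel would use."""
    x_sq = (X * X).sum(axis=1)                     # ||x||^2 per reference row
    out = np.empty((Q.shape[0], X.shape[0]))
    for i in range(0, Q.shape[0], batch):
        q = Q[i:i + batch]
        q_sq = (q * q).sum(axis=1, keepdims=True)  # ||q||^2 per query row
        out[i:i + batch] = q_sq - 2.0 * q @ X.T + x_sq
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 64))   # illustrative reference set
Q = rng.standard_normal((300, 64))    # illustrative query set
D = pairwise_sq_dists(X, Q)
nn = D.argmin(axis=1)                 # index of each query's nearest neighbour
```

The GEMM inside the loop is exactly the operation whose throughput the GPU choice will decide.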

Clear advantages for each one:
The GV100 has higher memory bandwidth (870 GB/s vs 672 GB/s), a wider memory interface (4096-bit vs 384-bit), and more CUDA/Tensor cores (5,120/640 vs 4,608/576).
The RTX 8000 has higher clock speeds (roughly 10% faster), and the Turing architecture is getting all the buzz nowadays. Some claim Turing is the successor to Volta, which makes us think the GV100 is built on older technology.

We think the GV100 is the better choice, given that we need frequent iterative launching of GPU kernels and high-speed data transfer, plus the higher core count.
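As a sanity check on the bandwidth argument, here is a rough roofline-style estimate for the KNN distance kernel. All numbers are our assumptions: fp32 data, ideal caching, illustrative problem sizes, and the published peak fp32 and bandwidth specs for each card:

```python
# Distance kernel D = Q @ X.T with n_q queries, n_x references, d features:
#   FLOPs ~ 2 * n_q * n_x * d     (one multiply-add per element triple)
#   bytes ~ 4 * (n_q*d + n_x*d + n_q*n_x)   (read Q and X, write D; fp32)
n_q, n_x, d = 10_000, 10_000_000, 16        # illustrative sizes
flops = 2 * n_q * n_x * d
bytes_moved = 4 * (n_q * d + n_x * d + n_q * n_x)
intensity = flops / bytes_moved             # FLOPs per byte, ~8 here

# Ridge point = peak fp32 FLOP/s / bandwidth; kernels below it are
# bandwidth-bound.  Peaks taken from the published spec sheets.
for name, tflops, gbps in [("GV100", 14.8, 870.0), ("RTX 8000", 16.3, 672.0)]:
    ridge = tflops * 1e12 / (gbps * 1e9)
    bound = "bandwidth" if intensity < ridge else "compute"
    print(f"{name}: ridge ~{ridge:.0f} FLOP/byte, kernel ~{intensity:.0f} -> {bound}-bound")
```

Under these assumptions the kernel sits below both ridge points, i.e. it is bandwidth-bound on either card, which is what makes the GV100's 870 GB/s attractive for our case.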

Please let us know if you have other GPU suggestions (a Titan, maybe?).

Your help is highly appreciated.