Hello,
I have been using MATLAB (latest) together with CUDA (latest) through MEX functions for a long time to make use of my multiple GPUs. The workflow is simply to write CUDA C code, compile it, and call it directly from MATLAB.
I installed the cuDSS (0.3) library and got it working perfectly with the setup described above on a single GPU. The results and speed are very satisfactory.
I also tried to factorize a large matrix that does not fit into GPU memory, so I had to switch to the CPU-GPU hybrid mode. I found that when the factors cannot fit into GPU memory, the hybrid mode returns a factorization error. I have tried many things, including limiting the amount of memory the GPU is allowed to use, but I could not find a fix. It could be a bug in the library or a problem with the environment I am using it in. Either way, I’d like to report this.
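One check that may help narrow this down is confirming that the memory cap being set is not below what the solver reports it needs, and not above what the device actually has free. The sketch below is only an illustration under stated assumptions: `check_hybrid_limit` and `LimitCheck` are hypothetical names, the minimum is assumed to come from the cuDSS `CUDSS_DATA_HYBRID_DEVICE_MEMORY_MIN` query after the analysis phase, the cap is assumed to be the value passed as `CUDSS_CONFIG_HYBRID_DEVICE_MEMORY_LIMIT`, and the free memory is assumed to come from `cudaMemGetInfo`.

```cpp
#include <cstdint>

// Hypothetical helper, not part of cuDSS: classifies a requested
// device-memory cap for hybrid mode before the factorization is run.
enum class LimitCheck { Ok, BelowMinimum, AboveFree };

// min_required_bytes: smallest device budget the solver says it needs
//   (assumption: read via CUDSS_DATA_HYBRID_DEVICE_MEMORY_MIN after analysis).
// free_bytes: free device memory (assumption: from cudaMemGetInfo).
// requested_bytes: the cap the caller intends to set
//   (assumption: passed as CUDSS_CONFIG_HYBRID_DEVICE_MEMORY_LIMIT).
inline LimitCheck check_hybrid_limit(std::int64_t min_required_bytes,
                                     std::int64_t free_bytes,
                                     std::int64_t requested_bytes) {
    // A cap below the reported minimum cannot work, regardless of free memory.
    if (requested_bytes < min_required_bytes) return LimitCheck::BelowMinimum;
    // A cap above the free memory asks for more than the device can give.
    if (requested_bytes > free_bytes) return LimitCheck::AboveFree;
    return LimitCheck::Ok;
}
```

If the result is `BelowMinimum` or `AboveFree`, the failure is at least consistent with a memory budget problem; if it is `Ok` and the factorization still fails, that points more toward the library or the environment.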
The second issue is with the single-thread multi-GPU setup on a single computer. I managed to find the NCCL1 library for Windows and was able to run basic collective operations successfully. After that, I tried to use multiple GPUs to solve a single system of equations. If the factors fit into one GPU’s memory, the system is solved successfully but the other GPUs sit idle. If the factors do not fit into the first GPU’s memory, the library fails again. There could be several reasons for this, the old NCCL library for example, but I’d like to report this issue too.
These problems could also be related to mixing multiple languages, to the old NCCL library, or to running on Windows. On the other hand, I have had no issues with other CUDA libraries to this day.
Regards
Deniz