We are going to buy two NVIDIA RTX 4090 (Ada Lovelace) graphics cards for machine learning and deep learning workloads (computer vision), but these cards do not have physical NVLink support.
Is it possible to use both cards simultaneously to distribute the computations? If so, how does this work at the hardware and software levels? Do parallelization methods such as data parallelism or model parallelism apply in this case? Can PCI Express provide sufficiently efficient communication between the cards, or are there other ways to optimize task distribution? Finally, which libraries, such as PyTorch or Keras, support this configuration for managing computations across multiple GPUs?
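For context, here is a minimal sketch of the data-parallel setup we have in mind, using PyTorch's `nn.DataParallel` (the model and batch sizes are placeholders; with no GPU present it simply runs on the CPU):

```python
import torch
import torch.nn as nn

# Placeholder model: a small classifier standing in for our vision network.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# If more than one GPU is visible, replicate the model across them;
# each forward pass then splits the batch across the cards over PCIe.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()
elif torch.cuda.is_available():
    model = model.cuda()

x = torch.randn(32, 128)  # one batch of dummy inputs
if torch.cuda.is_available():
    x = x.cuda()

out = model(x)  # batch is scattered across GPUs, outputs gathered back
print(out.shape)
```

Is this the right approach, or would `torch.nn.parallel.DistributedDataParallel` (one process per GPU) be preferable for two 4090s without NVLink?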
Thank you