Hi NVIDIA, I’m working on a VR ray tracing title and wanted to add support for NVLink to boost performance in remote cloud rendering situations, but it appears that even your professional GPU lineup no longer has the NVLink connector.
Is NVLink truly a dead technology? I mostly wanted it to double the path tracing performance in VR, but I’m wondering to what degree NVLink is even necessary or beneficial for such workloads, since the two eyes can be rendered completely independently. In that case I may be better off simply buying two 4090s. Any caveats there?
NVLink is far from dead. If you check out the information on our pages you will see that the technology is still in use and actively being developed.
The fact that it is no longer supported on desktop or workstation GPUs does not mean the technology is discontinued.
And basically you are correct: NVLink is most beneficial if you need to transfer huge amounts of data between GPUs, as with deep learning tasks in HPC settings. For VR applications, on the other hand, even if there were a need to transfer whole frames between GPUs, PCIe 4.0 offers enough bandwidth even for 4K at 120 Hz and then some.
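As a quick back-of-envelope check (assuming uncompressed 32-bit RGBA frames): 3840 × 2160 pixels × 4 bytes is roughly 33 MB per frame, so 120 Hz needs about 4 GB/s, well within the roughly 32 GB/s per direction that a PCIe 4.0 x16 link provides.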
So if your engine supports multi-GPU rendering such as alternate frame rendering, you should be fine with two 4090s. But you should take into account that desktop GPUs are not designed for any kind of server setup.
This is a very, very unfortunate decision by NVIDIA. We are scientists and fully utilize NVLink in our dual RTX A6000 setup for machine learning.
So, basically, there is no possibility for us to move to the RTX 6000 Ada Generation and pool the combined 96 GB of memory of two cards to train GANs, transformers, or just big ANNs of other types?
What would you recommend then? With limited resources, how can we build a similar ML PC with enough video RAM on board, but most importantly, pooled together?
Welcome @alex1988d to the NVIDIA developer forums.
I cannot say much about any decisions or their possible impact. But of course there are alternatives.
The simplest is to just make use of PCIe. While not as fast as NVLink, it still allows you to utilize CUDA across multiple devices the same way that NVLink does.
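As a minimal sketch of what that looks like (assuming two visible GPUs in the system, with error checking omitted for brevity), the peer-to-peer path over PCIe uses exactly the same CUDA calls as over NVLink:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Assumes at least two CUDA devices (IDs 0 and 1).
    // Check whether the two GPUs can access each other's memory directly.
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) {
        printf("Peer-to-peer access is not supported between these devices.\n");
        return 1;
    }

    // Enable peer access in both directions.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    // Allocate a buffer on each GPU and copy directly between them.
    // Over PCIe this is the same API you would use over NVLink,
    // just with lower bandwidth.
    const size_t bytes = 256ull << 20;  // 256 MB
    void *buf0 = nullptr, *buf1 = nullptr;
    cudaSetDevice(0);
    cudaMalloc(&buf0, bytes);
    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);

    cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(buf0);
    cudaFree(buf1);
    return 0;
}
```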
I can imagine that buying a full-fledged DGX H100 is beyond the budget of smaller companies or educational institutions. For that there are alternatives such as cloud service providers, for example the NVIDIA partner Cyxtera. Check the DGX page for DGX as a service.
Thanks for your reply, Markus.
We have never tried to pool both GPUs’ VRAM together using PCIe x16 Gen 4, but my guess is that even if it works without much tweaking, it would slow down training significantly.
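If we ever want to quantify that guess, a micro-benchmark along these lines (just a sketch: it assumes devices 0 and 1 and omits error checks) would measure the actual peer copy bandwidth over PCIe, which we could compare against the roughly 112 GB/s an RTX A6000 NVLink bridge is rated for:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Assumes two CUDA devices (IDs 0 and 1); no error checking.
    const size_t bytes = 512ull << 20;  // 512 MB test buffer
    const int reps = 10;

    // One buffer per GPU.
    void *src = nullptr, *dst = nullptr;
    cudaSetDevice(0);
    cudaMalloc(&src, bytes);
    cudaSetDevice(1);
    cudaMalloc(&dst, bytes);

    // Enable direct peer access in both directions.
    cudaDeviceEnablePeerAccess(0, 0);  // device 1 -> device 0
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);  // device 0 -> device 1

    // Time repeated GPU0 -> GPU1 copies with CUDA events.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    for (int i = 0; i < reps; ++i)
        cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Peer copy bandwidth: %.1f GB/s\n",
           reps * bytes / (ms * 1e-3) / 1e9);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```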
We can’t use cloud services for two reasons: 1. data leaks, and 2. constant modification of the ANNs during iterative development. In other words, we need a machine that sits right here, so we are not limited by training time and/or the number of tasks, as we would be with AWS or the alternatives on offer.
Yes, DGX is too expensive, but of course it would be the ideal option for us.
We need something like we had before: two GPUs at $5k each plus $5k for the rest of the machine, and we have a nice $15,000 rig with 96 GB of VRAM.
I guess we will just stick with the RTX A6000 for now, as there are no alternatives so far.
The next step of the upgrade, despite the price, would be dual Hopper H100 PCIe cards, as it was promised that they do support NVLink.
I agree with @alex1988d; a cloud GPU service can never be an alternative to NVLink. I think NVIDIA removed that feature with the general consumer market in mind. But there are also specific applications and GPU setups where fast data transfer between GPUs is crucial. NVIDIA could still have kept the NVLink feature on the Pro versions.
@alex1988d and @_Bi2022 I had the same concerns, as I wanted multiple GPUs NVLinked together for working with large models. Although NVIDIA’s DGX systems are ridiculously overpriced, you should check out Lambda Labs and Bizon-tech; they build custom multi-GPU desktop workstations and also servers. I have a workstation from Bizon-tech with multiple NVLinked GPUs; it works really well and is a fraction of the cost of NVIDIA’s DGX systems.
Yeah, but those use older generation Ampere or Turing RTX cards, right? Lovelace is too good to give up, at least for my purposes, which is ray tracing. For ML, those older cards with NVLink are probably a good value, since 2x 3090s give you 48 GB of VRAM total (addressable as a contiguous block, if I understand it correctly).
I am hoping that the ability to have more than 48 GB of VRAM available on an Ada GPU (or the ability to pool it from several cards) will arrive for local workstations. Additionally, it does seem that a PCIe 5 bus, not just the GPU, will be needed going forward.
I just don’t understand it. If removing NVLink from consumer RTX cards is meant to force pro users to upgrade to the higher-end Ada cards, then why was it removed from the pro Ada cards too? It makes no sense to me at all.
NVLinked dual 4090s would have gotten a lot of use: there is an Unreal plugin that uses NVLinked dual GPUs for gaming, and of course there are the ML training and other VRAM-heavy scenarios many people here have mentioned already.
I have completely lost all interest in SLI now, and I implemented the SLI support in GTA V and wanted to redo that for my path tracer based on NVIDIA’s NVLinked path tracing samples. What a shame. I hope NVIDIA changes course for the 5000 series, but I guess this is the end of the road for SLI, or rather for multi-GPU in the consumer space within a single workstation.
Agree with @alex1988d on this. In my field, 3D and VFX, NVLink has been a critical way to pool VRAM for large scenes. I currently run multiple 4090s, but I often have to pay for an expensive render farm service that still has 3090s when I need more VRAM. Despite the exorbitant cost of the RTX 6000 Ada, it only has 48 GB of VRAM and no NVLink, so there is no practical solution for VRAM pooling. This will push us back to using CPUs, which are much slower but have no VRAM limitation on local systems. A great deal of 3D/VFX work is iterating on designs, simulations, and animations, which is best done (or only possible) on a local machine. Given that NVLink is still actively developed and used, dropping it from consumer and workstation GPUs was a huge letdown.