Memory management across different hardware variants

Hello Nvidians,

When generating synthetic data, my RTX4090 often runs out of VRAM and Isaac Sim crashes, either because of the current memory leak or because the scenes have grown too big.
When I try to work out how to solve this, I can’t find good information on what requirements a hardware configuration must fulfill in order to work and effectively provide a bigger VRAM pool.
Since SLI/NV-Link was removed by the end of 2021, I’m not really sure whether plugging in a second RTX4090 will solve the issue.
The product page of the RTX4090 states that linking is achieved through PCI-E, but there must be a reason why the H-Series still uses next-gen NV-Link to share VRAM and Ada doesn’t.
Unfortunately, the H-Series only has regular CUDA and Tensor Cores and no RT Cores, so it isn’t very helpful when it comes to ray tracing and generating synthetic data.
Of course, there are options like the RTX6000 Ada or any other Ada A-Series card, but the question remains whether multiple cards provide a bigger VRAM pool and therefore support bigger scenes, or whether they all top out at the same time regardless of how many cards the system has, similar to the RTX4090. In that case, the only way to upgrade a system when requirements change is to put a new card in and throw the old card in the trash.
So I wonder whether the best call would be to just go with the RTX3090(Ti), because it still has SLI and ray tracing?
If I remember correctly, you can link up to four cards, so those should in theory be able to supply 96 GB of VRAM (4 × 24 GB), which is more than any Ada A-Series GPU.
The Ada A-Series would of course be faster, but I don’t really care about speed at that point.
When it comes to performance, the L40S would probably be the best call, but it also has no NV-Link/SLI because it is L-Series and not H-Series, so it is the same struggle again.
So, questions to answer:

  1. Is VRAM linkage achievable through PCI-E, and which cards support this feature for Omniverse?
  2. Will the linkage be sufficient for big scenes in Omniverse and prevent crashes caused by VRAM being topped off?
  3. If not, is the 30-Series a solution to the problem because it still has an NV-Link option? The software stack could be problematic in this case, though, because I assume nobody has ever tested this before.
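
To make the questions above easier to check on a real system, here is a minimal sketch of how per-card VRAM usage could be logged while a scene loads. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV output; the function names (`parse_gpu_memory`, `query_gpu_memory`, `summarize`) are made up for illustration. It only reports what each card holds; whether the summed number is actually usable as one pool by Omniverse is exactly the open question.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; memory values come back in MiB with "nounits".
QUERY_FIELDS = "index,name,memory.total,memory.used"


def parse_gpu_memory(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into a list of dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or unexpected lines
        index, name, total, used = (field.strip() for field in row)
        gpus.append({
            "index": int(index),
            "name": name,
            "total_mib": int(total),
            "used_mib": int(used),
        })
    return gpus


def query_gpu_memory():
    """Run nvidia-smi once and return the parsed per-GPU memory figures."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def summarize(gpus):
    """Return (sum of VRAM over all cards, VRAM of the largest single card) in MiB.

    The sum is only an upper bound: it is reachable only if the application
    can actually spread one scene across cards instead of mirroring it.
    """
    total = sum(gpu["total_mib"] for gpu in gpus)
    largest = max((gpu["total_mib"] for gpu in gpus), default=0)
    return total, largest
```

Calling `query_gpu_memory()` in a loop during scene loading would show whether all cards fill up at the same rate (scene mirrored on every card) or whether the load is distributed. `nvidia-smi topo -m` additionally prints how the cards are connected (PCI-E or NV-Link).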