I have been reviewing the link below, which shows the maximum number of GRID boards per server.
I am looking for more information on these limits. Are there technical issues with exceeding them? If so, what are they? Or are these simply numbers that NVIDIA has not yet certified beyond? Would NVIDIA deny technical support if these limits are exceeded?
Thanks so much!
The numbers posted are based on tested limits. These tested limits are definitely related to the technical limitations of each hardware platform.
Each GRID card draws a large amount of power relative to other components in a server and also generates a significant amount of heat. The two considerations at play are whether the power supplies can provide enough power for these cards at full load, and whether the chassis they are installed in can properly dissipate the heat they generate.
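To make the power side of that concrete, here is a rough sketch of the arithmetic. Every number in it (PSU capacity, base load, headroom, per-card wattage) is a made-up placeholder, not a figure from NVIDIA or any OEM, and the function name is hypothetical. It only covers the power budget; heat dissipation depends on chassis airflow and cannot be estimated this way.

```python
from dataclasses import dataclass


@dataclass
class PowerBudget:
    psu_watts: float        # usable PSU capacity (placeholder value)
    base_load_watts: float  # CPUs, memory, disks, fans (placeholder value)
    headroom: float = 0.2   # fraction of PSU capacity held in reserve (assumed)


def max_cards_by_power(budget: PowerBudget, card_watts: float) -> int:
    """Return how many cards fit in the power budget at full load."""
    usable = budget.psu_watts * (1 - budget.headroom) - budget.base_load_watts
    if usable <= 0:
        return 0
    return int(usable // card_watts)


if __name__ == "__main__":
    # Hypothetical server: 1600 W PSU, 500 W for everything except the GPUs.
    budget = PowerBudget(psu_watts=1600, base_load_watts=500)
    # Hypothetical card drawing 225 W at full load.
    print(max_cards_by_power(budget, card_watts=225))
```

Even if a calculation like this says one more card would fit, the certified limit from the OEM is the number to follow, since it also accounts for cooling.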
I would NOT recommend exceeding the posted guidelines. It’s possible that you could get the cards to work, but you risk causing significant damage to the server or the cards if something fails, especially heat dissipation.
Good answer, Mike! Thank you!
In addition to Mike’s observations, you’ll also find limits in the SBIOS on the server mainboard which, in addition to cooling, space and power, also restrict the card count. This is one reason why you’ll sometimes see a chassis that can take 3 x K2s but only 2 x K1s: the limit comes from what the SBIOS can address.
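A quick sketch of how that K2 versus K1 difference can fall out. A K1 board carries four GPUs and a K2 board carries two, so the same number of boards presents a different number of devices to the SBIOS. The slot count and the cap of eight addressable GPUs below are hypothetical, and treating the SBIOS limit as a simple GPU count is a simplifying assumption for illustration only.

```python
# GPUs per board: a GRID K1 has four GPUs, a GRID K2 has two.
GPUS_PER_BOARD = {"K1": 4, "K2": 2}


def max_boards(board: str, free_slots: int, addressable_gpus: int) -> int:
    """Boards that fit given physical slots and a cap on addressable GPUs."""
    by_addressing = addressable_gpus // GPUS_PER_BOARD[board]
    return min(free_slots, by_addressing)


if __name__ == "__main__":
    # Hypothetical chassis: 3 suitable slots, SBIOS can address 8 GPUs.
    for board in ("K2", "K1"):
        print(board, max_boards(board, free_slots=3, addressable_gpus=8))
```

With these placeholder values the K2 count is limited by slots and the K1 count by addressing, which matches the 3 x K2 versus 2 x K1 pattern described above.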
FYI, the certification process is done jointly with the OEMs, who select which servers to certify and then work through the process.