Dear Nvidia, for your next Tesla or GeForce card: Blink LED for ID New hardware feature idea

Dear Nvidia,

This is a cross post of a message I just left over on EVGA.com, but it’s a good idea for either company to use.

I Fold, I SLI, I overclock. I have different cards able to reach different top ends, and they do so differently in different PCIe slots. To help me figure out which slot corresponds to a specific enumeration and listed card, I would like to be able to do the same thing that I already do for drives in my RAID array.

Namely, I want to be able to cause a specific card’s LED to blink. I would like to do so from Precision, and/or from the Nvidia Control Panel interface if this feature migrates upward to the Nvidia reference designs.

For the most part we already have power LEDs on the cards we buy. How about a feature to be able to use the power LED and make it blink? This feature would be nice to have when mapping out my high performance video cards. The feature would also be another item on the checklist to keep me welded to EVGA/Nvidia as a long-term customer and enthusiast.

This feature would also be useful for large machines with multiple Tesla boards, to help ID a degraded card during replacement operations.

Thanks for reading.

Regards,

Vectrexer

Such a useful feature already exists.
In Linux, you call “nvidia-smi --toggle-led=NUM” where NUM is the ID of the GPU.

I don’t know how this is exposed in Windows or Mac though.

A bigger problem is the fact that there are many different enumerations of “GPU ID” and it’s hard to match them.
The CUDA device number has no relation to the device numbers listed in nvidia-smi, which have no relation to the display device numbers in nvidia-settings, which have no relation to the physical ordering of cards on the motherboard, which has no relation to the PCI assignments you see in lspci.

So if you have a CUDA computation which failed and you suspect a hardware fault, you may want to blink that failed GPU’s LED. You could do it with nvidia-smi, but you don’t know how to find the number to give nvidia-smi based on the failed CUDA device number.
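
One possible way to bridge the two numberings, offered as a rough sketch rather than a confirmed recipe: the PCI bus ID is the one identifier both sides can report. It assumes a CUDA runtime new enough to provide cudaDeviceGetPCIBusId() and a driver whose nvidia-smi accepts --query-gpu=index,pci.bus_id --format=csv,noheader. The helper names normalize_bus_id and smi_index_for_bus_id are made up for illustration, and SAMPLE is invented text standing in for real nvidia-smi output. The normalization step is there because nvidia-smi tends to print an 8-digit PCI domain while CUDA reports a 4-digit one.

```python
import csv
import io


def normalize_bus_id(bus_id):
    """Reduce a PCI bus ID to lowercase 'domain:bus:device.function' with a 4-digit domain."""
    parts = bus_id.strip().lower().split(":")
    if len(parts) == 2:  # no domain given, e.g. "03:00.0"
        parts = ["0000"] + parts
    domain, bus, devfn = parts
    return f"{int(domain, 16):04x}:{bus}:{devfn}"


def smi_index_for_bus_id(smi_csv, bus_id):
    """Return the nvidia-smi index whose pci.bus_id matches bus_id, or None if absent."""
    wanted = normalize_bus_id(bus_id)
    for row in csv.reader(io.StringIO(smi_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        index, smi_bus = row[0].strip(), row[1].strip()
        if normalize_bus_id(smi_bus) == wanted:
            return int(index)
    return None


# Invented sample (not real output) standing in for:
#   nvidia-smi --query-gpu=index,pci.bus_id --format=csv,noheader
SAMPLE = """0, 00000000:03:00.0
1, 00000000:04:00.0
2, 00000000:82:00.0
"""

# Bus ID as the failing CUDA device might report it via cudaDeviceGetPCIBusId().
print(smi_index_for_bus_id(SAMPLE, "0000:82:00.0"))
```

On a real machine you would feed in the actual nvidia-smi output instead of SAMPLE, and use the returned index as the NUM for nvidia-smi.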

If I remember right, Tim Murray may have said something about improving this… I haven’t checked the toolkits or drivers for a while.


Back to LEDs, a lot of GPUs don’t have them any longer… my GTX295s have LEDs but not my GTX480s, for example.
I admit those dual-color LEDs are useful for coarsely but instantly diagnosing power problems when building systems.
