Cannot enable vGPUs on Tesla T4 machine in AWS

Hi folks.

I have an AWS machine with Tesla T4 card and I want to enable vGPUs on this machine. After following the steps as recommended by Amazon to install nvidia drivers and disabling nouveau, when I run diplaymodeselector, it fails.

sudo ./display_mode_change_tool/linux/x64/displaymodeselector --gpumode physical_display_disabled
...
Specified GPU mode not supported on this device 0x1EB8.

The output of

nvidia-smi -q

tool is as follows:

==============NVSMI LOG==============

Timestamp                                 : Wed Mar  6 06:50:55 2024
Driver Version                            : 550.54.14
CUDA Version                              : 12.4

Attached GPUs                             : 1
GPU 00000000:00:1E.0
    Product Name                          : Tesla T4
    Product Brand                         : NVIDIA
    Product Architecture                  : Turing
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : None
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : 1561620020480
    GPU UUID                              : GPU-35a3ab88-ce3f-5092-6676-10992ae344b8
    Minor Number                          : 0
    VBIOS Version                         : 90.04.96.00.02
    MultiGPU Board                        : No
    Board ID                              : 0x1e
    Board Part Number                     : 900-2G183-0000-001
    GPU Part Number                       : 1EB8-895-A1
    FRU Part Number                       : N/A
    Module ID                             : 1
    Inforom Version
        Image Version                     : G183.0200.00.02
        OEM Object                        : 1.1
        ECC Object                        : 5.0
        Power Management Object           : N/A
    Inforom BBX Object Flush
        Latest Timestamp                  : N/A
        Latest Duration                   : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU C2C Mode                          : N/A
    GPU Virtualization Mode
        Virtualization Mode               : Pass-Through
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    vGPU Software Licensed Product
        Product Name                      : NVIDIA Virtual Applications
        License Status                    : Licensed (Expiry: N/A)
    GPU Reset Status
        Reset Required                    : No
        Drain and Reset Recommended       : N/A
    GSP Firmware Version                  : 550.54.14
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x00
        Device                            : 0x1E
        Domain                            : 0x0000
        Base Classcode                    : 0x3
        Sub Classcode                     : 0x2
        Device Id                         : 0x1EB810DE
        Bus Id                            : 00000000:00:1E.0
        Sub System Id                     : 0x12A210DE
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 3
                Device Current            : 3
                Device Max                : 3
                Host Max                  : N/A
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 0 KB/s
        Atomic Caps Inbound               : N/A
        Atomic Caps Outbound              : N/A
    Fan Speed                             : N/A
    Performance State                     : P0
    Clocks Event Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    Sparse Operation Mode                 : N/A
    FB Memory Usage
        Total                             : 15360 MiB
        Reserved                          : 442 MiB
        Used                              : 0 MiB
        Free                              : 14917 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 2 MiB
        Free                              : 254 MiB
    Conf Compute Protected Memory Usage
        Total                             : 0 MiB
        Used                              : 0 MiB
        Free                              : 0 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
        JPEG                              : 0 %
        OFA                               : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    ECC Mode
        Current                           : Enabled
        Pending                           : Enabled
    ECC Errors
        Volatile
            SRAM Correctable              : 0
            SRAM Uncorrectable            : 0
            DRAM Correctable              : 0
            DRAM Uncorrectable            : 0
        Aggregate
            SRAM Correctable              : 0
            SRAM Uncorrectable            : 0
            DRAM Correctable              : 0
            DRAM Uncorrectable            : 0
    Retired Pages
        Single Bit ECC                    : 0
        Double Bit ECC                    : 0
        Pending Page Blacklist            : No
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 31 C
        GPU T.Limit Temp                  : N/A
        GPU Shutdown Temp                 : 96 C
        GPU Slowdown Temp                 : 93 C
        GPU Max Operating Temp            : 85 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    GPU Power Readings
        Power Draw                        : 27.33 W
        Current Power Limit               : 70.00 W
        Requested Power Limit             : 70.00 W
        Default Power Limit               : 70.00 W
        Min Power Limit                   : 60.00 W
        Max Power Limit                   : 70.00 W
    GPU Memory Power Readings 
        Power Draw                        : N/A
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
    Clocks
        Graphics                          : 585 MHz
        SM                                : 585 MHz
        Memory                            : 5000 MHz
        Video                             : 840 MHz
    Applications Clocks
        Graphics                          : 585 MHz
        Memory                            : 5001 MHz
    Default Applications Clocks
        Graphics                          : 585 MHz
        Memory                            : 5001 MHz
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 1590 MHz
        SM                                : 1590 MHz
        Memory                            : 5001 MHz
        Video                             : 1470 MHz
    Max Customer Boost Clocks
        Graphics                          : 1590 MHz
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Fabric
        State                             : N/A
        Status                            : N/A
        CliqueId                          : N/A
        ClusterUUID                       : N/A
        Health
            Bandwidth                     : N/A
    Processes                             : None

Please suggest how to proceed.

Edit: FYI, the mode is available on the GPU.

$ sudo ./display_mode_change_tool/linux/x64/displaymodeselector --gpumode

NVIDIA Display Mode Selector Utility (Version 1.61.0)

Copyright (C) 2015-2021, NVIDIA Corporation. All Rights Reserved.

WARNING: This operation updates the firmware on the board and could make

the device unusable if your host system lacks the necessary support.

Are you sure you want to continue?

Press 'y' to confirm (any other key to abort):

y

Select a number:

<0> physical_display_enabled_256MB_bar1

<1> physical_display_disabled

<2> physical_display_enabled_8GB_bar1

Select a number (ESC to quit):

^[

ERROR: User aborted

Hello @pawankhatri and welcome to the NVIDIA developer forums.

I took the liberty of moving this to the vGPU forums, there are the experts who will surely be able to help with further issues.

But I do not think that the tool is even supported on the T4. From the official pages:

Q: Which GPUs are supported by the Display Mode Selector tool?

A: The Display Mode Selector tool is a special tool for NVIDIA L40S, NVIDIA L40, NVIDIA RTX 6000 Ada, NVIDIA A40, NVIDIA RTX A5000, NVIDIA RTX A5500, and NVIDIA RTX A6000 only. It should not be used with any other GPU.

1 Like