Problem with T4-2Q

Hi guys.

I have four T4-2Q vGPUs and want to host them in a Ruby Cloud.
I thought I had it all set up, but my virtual GPU can’t fetch the license.
I am using Ubuntu 20.04, CUDA 11.2, and cuDNN 8.1. I set the server address to the correct server in gridd.conf; however, when I run

sudo service nvidia-gridd restart

I get no license.

If I go to /var/log/ and run

sudo grep gridd syslog

I get the following message:

Jul 5 15:31:31 ds-lab-gpu-ubuntu-2004 nvidia-gridd: Calling load_byte_array(tra)
Jul 5 15:31:33 ds-lab-gpu-ubuntu-2004 nvidia-gridd: Failed to acquire/renew license from license server. (Info: http://10.38.224.110:7070/request; NVIDIA RTX Virtual Workstation - Error: [1,7E2,2,0[7000000B,0,702C7]]#012Requested feature was not found.)

From my research, I found out that I need a vWS license, yet on my license server I have an NVIDIA-vComputeServer license.

I would appreciate any help.

Kind regards,
Robert
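The log line above already names the product the client asked for. As a small sketch (the function name requested_feature is made up for illustration, and the pattern assumes the message layout shown in this thread), the requested feature can be pulled out of the gridd log like this:

```shell
# Print the license feature named in nvidia-gridd "Failed to acquire/renew"
# messages. Reads syslog-style lines on stdin; other lines are ignored.
# Assumption: the message has the form "(Info: <url>; <feature> - Error: ...)".
requested_feature() {
    sed -n 's|.*Failed to acquire/renew license.*(Info: [^;]*; \(.*\) - Error:.*|\1|p'
}
```

For example, piping the output of "sudo grep gridd /var/log/syslog" into requested_feature should print the feature name the VM is requesting, which can then be compared with the feature names listed on the license server.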

Hi,

Did you set the right feature type in gridd.conf? I assume not. That would explain why a vWS license is being requested from the license server.

regards
Simon
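To check which feature type a VM is actually configured with, a sketch along these lines may help. The function names are hypothetical, and the meaning of each number is taken from the comments in NVIDIA's gridd.conf template as best recalled, so verify it against the template shipped with your driver:

```shell
# Print the active FeatureType value from a gridd.conf-style file.
# Commented-out lines are skipped; if the key appears twice, the last wins.
gridd_feature_type() {
    sed -n 's/^[[:space:]]*FeatureType[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Translate the numeric value into the product it selects.
# Assumption: mapping as documented in the gridd.conf template comments.
describe_feature_type() {
    case "$1" in
        0) echo "unlicensed" ;;
        1) echo "NVIDIA vGPU" ;;
        2) echo "NVIDIA RTX Virtual Workstation" ;;
        4) echo "NVIDIA Virtual Compute Server" ;;
        *) echo "unknown" ;;
    esac
}
```

On a typical Linux guest the file is /etc/nvidia/gridd.conf, so describe_feature_type "$(gridd_feature_type /etc/nvidia/gridd.conf)" would report the configured product.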

Hi Simon,

thanks for answering.

I tried all possible values in gridd.conf (1, 2, 4), and in every case I could not acquire a license.

Right now, FeatureType is set to 2.

When I type

sudo service nvidia-gridd restart
nvidia-smi -q

I get the following:

ubuntu@ds-lab-gpu-ubuntu-2004:~$ nvidia-smi -q

==============NVSMI LOG==============

Timestamp : Tue Jul 6 06:29:49 2021
Driver Version : 460.32.03
CUDA Version : 11.2

Attached GPUs : 1
GPU 00000000:00:05.0
Product Name : GRID T4-2Q
Product Brand : NVIDIA RTX Virtual Workstation
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : NoneOfYourBusiness
Minor Number : 0
VBIOS Version : 00.00.00.00.00
MultiGPU Board : No
Board ID : 0x5
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization Mode : VGPU
Host VGPU Mode : N/A
vGPU Software Licensed Product
Product Name : NVIDIA RTX Virtual Workstation
License Status : Unlicensed (Unrestricted)
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x00
Device : 0x05
Domain : 0x0000
Device Id : AlsoNoneOfYourBusiness
Bus Id : 00000000:00:05.0
Sub System Id : AgainNoneOfYourBusiness
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : N/A
Replay Number Rollovers : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : N/A
Performance State : P8
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 2048 MiB
Used : 288 MiB
Free : 1760 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 0 MiB
Free : 256 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Retired Pages
Single Bit ECC : 0
Double Bit ECC : 0
Pending Page Blacklist : No
Remapped Rows : N/A
Temperature
GPU Current Temp : N/A
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 300 MHz
SM : 300 MHz
Memory : 405 MHz
Video : 540 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : None

I pseudonymized certain data to protect my company. Is it possible that the GPU is not set up correctly? The number of N/A values worries me.

Also, if you need any other information, please feel free to tell me what you need (and, sadly, how I can acquire this information, since I am quite new to Linux and vGPUs).
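Most of the long nvidia-smi -q output is not needed to judge licensing; the relevant field is "License Status". A minimal sketch for extracting just that value (license_status is a made-up helper name, and the pattern assumes the field label shown in the output above):

```shell
# Print the value of the "License Status" field from `nvidia-smi -q` output
# read on stdin. Leading indentation and padding around the colon are ignored.
license_status() {
    sed -n 's/^[[:space:]]*License Status[[:space:]]*:[[:space:]]*//p'
}
```

Running nvidia-smi -q and piping it into license_status after each restart of nvidia-gridd gives a quick way to see whether a change to gridd.conf or to the vGPU profile had any effect.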

Hi,
I forgot to mention that you need to choose the right profile :) You can only use the C profile with your license type!
License assignment is based on the profile.

regards
Simon

Hi Simon,

I am amazed at how fast you react, that’s wonderful!

In my license server, I have the following info:

Feature Name: NVIDIA-vComputeServer
Version: 9.0
Total count: 4
Available: 4
Current Usage: 0
Reserved Count: 0
Vendor String: NVIDIA-vComputeServer-1
Feature Expiry: Sadly,NoneOfYourBusiness

Do you still think I should set up a C profile?

You need to. License acquisition is based on the profile type. Only vWS is a “superset” license type and can license different profile types.

Thanks man, I am gonna try it.

Hi Simon,

I read a bit of the documentation, and if I understand you correctly, we set up the GPUs incorrectly.

According to

nvidia-smi -q

I have a T4-2Q set up, which needs a vWS license that I do not have.

However, as I showed you before, I do have a vCS license, and I can use this license with T4-4C, T4-8C and T4-16C, right?

Does this mean we have to remove the vGPUs and add them again with suitable options?

Kind regards,
Robert

Correct. You can only use the C profile, which means starting with T4-4C, as you stated above. Simply remove the Q profile from the VM and add the C profile.
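The rule worked out in this thread can be summarized as a small lookup on the last letter of the profile name. This is only a sketch: license_for_profile is a hypothetical helper, and it covers just the two series discussed here (Q and C), reporting anything else as unknown.

```shell
# Map a vGPU profile name (e.g. T4-2Q) to the license it can be licensed
# with, based on the series letter at the end of the name.
# Per this thread: Q profiles need vWS; C profiles work with vCS, and vWS
# as the "superset" license can cover them as well.
license_for_profile() {
    case "$1" in
        *Q) echo "vWS" ;;
        *C) echo "vCS or vWS" ;;
        *)  echo "unknown" ;;
    esac
}
```

With this, license_for_profile T4-2Q reports vWS, which matches the failed request in the log, while license_for_profile T4-4C reports a license type that the vComputeServer entries on the license server can satisfy.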