I just set up a TITAN X machine to replace an AWS EC2 instance I’ve been using with TensorFlow. I noticed that training is actually dramatically slower on the TITAN X than on the CPU (32 cores/threads) in the same box. Something is obviously wrong.
The only anomaly I could find is in the output of nvidia-smi -q, which shows:
HW Slowdown : Active
I can’t find much information on “HW Slowdown” beyond what the docs say.
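For anyone who wants to check the same flag from a script rather than by eye, here is a minimal sketch. It assumes the "Clocks Throttle Reasons" layout shown in the log in this post (a header line followed by "name : value" lines), and the function names `active_throttle_reasons` and `query_gpu` are made up for illustration. It only reads the first such block, so it is meant for single-GPU output.

```python
import subprocess


def active_throttle_reasons(smi_text):
    """Return the names listed under 'Clocks Throttle Reasons' whose value is 'Active'.

    Assumes the section is a header line followed by 'name : value' lines,
    as in the log pasted in this post. Only the first section is read.
    """
    reasons = []
    in_section = False
    for raw in smi_text.splitlines():
        line = raw.strip()
        if line == "Clocks Throttle Reasons":
            in_section = True
            continue
        if in_section:
            if " : " not in line:
                break  # next section header (e.g. "FB Memory Usage") ends the block
            name, value = (part.strip() for part in line.split(" : ", 1))
            if value == "Active":  # "Not Active" is deliberately not matched
                reasons.append(name)
    return reasons


def query_gpu(index=0):
    """Run `nvidia-smi -q -i <index>` and return its text output."""
    return subprocess.check_output(
        ["nvidia-smi", "-q", "-i", str(index)], universal_newlines=True
    )
```

On a machine with the driver installed, `print(active_throttle_reasons(query_gpu()))` would list whichever reasons are currently active.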
The GPU was 100% utilized. I manually raised the fan speed to rule out a temperature issue, but even down near 42 C it was still slow.
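To see whether the slowdown flag tracks temperature, power, or clocks over time, the readings can be logged once per second. This is only a sketch: it assumes the installed driver supports the `--query-gpu` field names below (they can be checked with `nvidia-smi --help-query-gpu`), and `build_query_command` and `parse_sample` are hypothetical helper names.

```python
import csv

# Field names as listed by `nvidia-smi --help-query-gpu`; availability on a
# given driver version is an assumption and should be checked locally.
FIELDS = [
    "timestamp",
    "temperature.gpu",
    "power.draw",
    "clocks.sm",
    "clocks.mem",
    "clocks_throttle_reasons.hw_slowdown",
]


def build_query_command(fields=FIELDS, interval_s=1):
    """Build an nvidia-smi command that prints one CSV row every interval_s seconds."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
        "-l",
        str(interval_s),
    ]


def parse_sample(line, fields=FIELDS):
    """Turn one CSV row of nvidia-smi output into a dict keyed by field name."""
    values = [value.strip() for value in next(csv.reader([line]))]
    return dict(zip(fields, values))
```

Each line printed by the command can be passed to `parse_sample` while the TensorFlow job is running, so the flag can be compared against temperature as the fan speed is changed.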
==============NVSMI LOG==============
Timestamp : Tue Apr 12 00:32:40 2016
Driver Version : 361.42
Attached GPUs : 1
GPU 0000:84:00.0
    Product Name : GeForce GTX TITAN X
    Product Brand : GeForce
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : Enabled
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 1920
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 0420116032145
    GPU UUID : GPU-cb55b961-e76a-15fc-2abc-412687da3242
    Minor Number : 0
    VBIOS Version : 84.00.45.00.90
    MultiGPU Board : No
    Board ID : 0x8400
    GPU Part Number : N/A
    Inforom Version
        Image Version : G001.0000.01.03
        OEM Object : 1.1
        ECC Object : N/A
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    PCI
        Bus : 0x84
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x17C210DE
        Bus Id : 0000:84:00.0
        Sub System Id : 0x29923842
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays since reset : 0
        Tx Throughput : 22000 KB/s
        Rx Throughput : 61000 KB/s
    Fan Speed : 100 %
    Performance State : P2
    Clocks Throttle Reasons
        Idle : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Active
        Sync Boost : Not Active
        Unknown : Not Active
    FB Memory Usage
        Total : 12287 MiB
        Used : 11763 MiB
        Free : 524 MiB
    BAR1 Memory Usage
        Total : 256 MiB
        Used : 4 MiB
        Free : 252 MiB
    Compute Mode : Default
    Utilization
        Gpu : 100 %
        Memory : 0 %
        Encoder : 0 %
        Decoder : 0 %
    Ecc Mode
        Current : N/A
        Pending : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : N/A
            Double Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : N/A
        Aggregate
            Single Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : N/A
            Double Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Texture Memory : N/A
                Total : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending : N/A
    Temperature
        GPU Current Temp : 56 C
        GPU Shutdown Temp : 97 C
        GPU Slowdown Temp : 92 C
    Power Readings
        Power Management : Supported
        Power Draw : 241.50 W
        Power Limit : 250.00 W
        Default Power Limit : 250.00 W
        Enforced Power Limit : 250.00 W
        Min Power Limit : 150.00 W
        Max Power Limit : 275.00 W
    Clocks
        Graphics : 1328 MHz
        SM : 1328 MHz
        Memory : 3304 MHz
        Video : 1227 MHz
    Applications Clocks
        Graphics : 1126 MHz
        Memory : 3505 MHz
    Default Applications Clocks
        Graphics : 1126 MHz
        Memory : 3505 MHz
    Max Clocks
        Graphics : 1519 MHz
        SM : 1519 MHz
        Memory : 3505 MHz
        Video : 1397 MHz
    Clock Policy
        Auto Boost : On
        Auto Boost Default : On
    Processes
        Process ID : 23088
            Type : C
            Name : python
            Used GPU Memory : 11736 MiB
From this output it doesn’t look like the clock is actually being throttled, and I’m not sure what to look for from here. Any help would be appreciated.
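To make the "doesn’t look throttled" reading concrete, here is a small sketch that compares the numbers from the log above against their limits. The function name `throttle_headroom` and the dictionary keys are made up for illustration, and the arithmetic only summarizes the log; it does not explain why HW Slowdown is reported.

```python
def throttle_headroom(temp_c, slowdown_temp_c, power_w, power_limit_w,
                      sm_mhz, app_sm_mhz, mem_mhz, app_mem_mhz):
    """Summarize how far the current readings sit from their limits."""
    return {
        # Degrees C remaining before the slowdown temperature is reached.
        "thermal_margin_c": slowdown_temp_c - temp_c,
        # Fraction of the enforced power limit currently being drawn.
        "power_fraction": power_w / power_limit_w,
        # Positive means the clock is above the applications clock setting.
        "sm_vs_app_mhz": sm_mhz - app_sm_mhz,
        "mem_vs_app_mhz": mem_mhz - app_mem_mhz,
    }


# Values copied from the nvidia-smi log in this post.
report = throttle_headroom(
    temp_c=56, slowdown_temp_c=92,
    power_w=241.50, power_limit_w=250.00,
    sm_mhz=1328, app_sm_mhz=1126,
    mem_mhz=3304, app_mem_mhz=3505,
)
for name, value in sorted(report.items()):
    print(name, value)
```

With these inputs the GPU is 36 C below its slowdown temperature, the SM clock is 202 MHz above the applications clock, the memory clock is 201 MHz below it, and power draw is at roughly 97% of the 250 W limit.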