After upgrading to 460.15_gameready_win10-dch_64bit_international.exe, the graph shows little or no usage, even though I’m running a large GPU-based workload that previously always maxed it out. The GPU memory usage plot still seems to work as before, though.
Is this a known issue? If so, is there a convenient way to monitor the utilization in the meantime?
For unrelated reasons, I’ve since changed the GPU, CUDA, driver and PyTorch versions and still see the same problem.
Driver: 460.20
import torch
from tqdm import tqdm
print(torch.cuda.get_device_name())
print(f'Cuda version: {torch._C._cuda_getCompiledVersion()}')
print(f'Torch version: {torch.__version__}')
size = 30000
for n in tqdm(range(1000000)):
    a = torch.rand((size, size), device='cuda')
    b = torch.rand((size, size), device='cuda')
    ab = a * b
GeForce RTX 3090
Cuda version: 11000
Torch version: 1.7.0.dev20200926+cu110
It must be doing something, because the GPU temperature goes up and the PSU starts drawing 600 W!
It’s very difficult to understand what’s happening with my model training without utilisation stats, so if anyone knows a fix/workaround it would be appreciated!
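As a possible stopgap while the graph is unreliable, here is a minimal sketch that reads the numbers from nvidia-smi instead. It assumes nvidia-smi is on the PATH, and the names QUERY_FIELDS, parse_gpu_stats and query_gpu_stats are made up for illustration. I can't say whether the utilization counter it reports is affected by the same issue, so treat it as something to try rather than a confirmed fix.

```python
import subprocess

# Fields requested from nvidia-smi via --query-gpu.
QUERY_FIELDS = ["utilization.gpu", "memory.used", "temperature.gpu", "power.draw"]


def parse_gpu_stats(line):
    """Parse one CSV line such as '97, 23541, 74, 348.12' into a dict.

    Values that are not numeric (for example '[N/A]') are returned as None.
    """
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    stats = {}
    for field, value in zip(QUERY_FIELDS, values):
        try:
            stats[field] = float(value)
        except ValueError:
            stats[field] = None
    return stats


def query_gpu_stats(gpu_index=0):
    """Run nvidia-smi once and return the parsed stats for one GPU.

    Assumes nvidia-smi is installed and on the PATH.
    """
    cmd = [
        "nvidia-smi",
        f"--id={gpu_index}",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout.strip().splitlines()[0])
```

Calling query_gpu_stats() every second or so from a second terminal while the training script runs should give a rough utilization, memory, temperature and power trace to compare against the graph.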