Dear Team,
I have set up Docker and created a container by following the steps below:
$ sudo git clone https://github.com/pytorch/TensorRT.git Torch-TensorRT
$ cd Torch-TensorRT
$ sudo docker build -t torch_tensorrt -f ./docker/Dockerfile .
$ sudo docker run --gpus=all --rm -it -v $PWD:/Torch-TensorRT --net=host --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 torch_tensorrt:latest bash
$ cd /Torch-TensorRT/notebooks
$ jupyter notebook --allow-root --ip 0.0.0.0 --port 8888
I tried to run the model using the Jupyter notebook (Resnet50-example.ipynb) but am facing several issues, as listed below:
The errors I encountered are shown in the output below.
Model Description:
This ResNet-50 model is based on the paper Deep Residual Learning for Image Recognition, which introduced residual (skip) connections to make very deep networks easier to train. The pretrained torchvision model expects 224x224 input images.
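As a quick check of the expected input size (a sketch of my own, not part of the notebook), the pretrained model maps a single 3-channel 224x224 image to 1000 ImageNet logits:

import torch
import torchvision

# My own sketch, not from the notebook: verify the expected input resolution.
model = torchvision.models.resnet50(pretrained=True).eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # batch of one 3x224x224 image
print(logits.shape)  # expected: torch.Size([1, 1000])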
- Running the model without optimizations
PyTorch has a model repository called PyTorch Hub, which is a source for high quality implementations of common models. We can get our ResNet-50 model from there, pretrained on ImageNet.
import torch
import torchvision
torch.hub._validate_not_a_forked_repo=lambda a,b,c: True
resnet50_model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)
resnet50_model.eval()
Using cache found in /root/.cache/torch/hub/pytorch_vision_v0.10.0
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer2): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (3): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer3): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (3): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (4): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (5): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer4): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=2048, out_features=1000, bias=True)
)
With our model loaded, let’s proceed to downloading some images!
!mkdir -p ./data
!wget -O ./data/img0.JPG "https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630"
!wget -O ./data/img1.JPG "https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg"
!wget -O ./data/img2.JPG "https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg"
!wget -O ./data/img3.JPG "https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg"
!wget -O ./data/imagenet_class_index.json "https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json"
--2022-05-27 04:43:58--  https://d17fnq9dkz9hgj.cloudfront.net/breed-uploads/2018/08/siberian-husky-detail.jpg?bust=1535566590&width=630
Resolving d17fnq9dkz9hgj.cloudfront.net (d17fnq9dkz9hgj.cloudfront.net)… 18.66.40.187, 18.66.40.52, 18.66.40.216, …
Connecting to d17fnq9dkz9hgj.cloudfront.net (d17fnq9dkz9hgj.cloudfront.net)|18.66.40.187|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 24112 (24K) [image/jpeg]
Saving to: ‘./data/img0.JPG’
./data/img0.JPG 100%[===================>] 23.55K --.-KB/s in 0.006s
2022-05-27 04:43:59 (3.98 MB/s) - ‘./data/img0.JPG’ saved [24112/24112]
--2022-05-27 04:43:59--  https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg
Resolving www.hakaimagazine.com (www.hakaimagazine.com)… 164.92.73.117, 64:ff9b::a45c:4975
Connecting to www.hakaimagazine.com (www.hakaimagazine.com)|164.92.73.117|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 452718 (442K) [image/jpeg]
Saving to: ‘./data/img1.JPG’
./data/img1.JPG 100%[===================>] 442.11K 425KB/s in 1.0s
2022-05-27 04:44:02 (425 KB/s) - ‘./data/img1.JPG’ saved [452718/452718]
--2022-05-27 04:44:02--  https://www.artis.nl/media/filer_public_thumbnails/filer_public/00/f1/00f1b6db-fbed-4fef-9ab0-84e944ff11f8/chimpansee_amber_r_1920x1080.jpg__1920x1080_q85_subject_location-923%2C365_subsampling-2.jpg
Resolving www.artis.nl (www.artis.nl)… 94.75.225.20
Connecting to www.artis.nl (www.artis.nl)|94.75.225.20|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 361413 (353K) [image/jpeg]
Saving to: ‘./data/img2.JPG’
./data/img2.JPG 100%[===================>] 352.94K 334KB/s in 1.1s
2022-05-27 04:44:05 (334 KB/s) - ‘./data/img2.JPG’ saved [361413/361413]
--2022-05-27 04:44:05--  https://www.familyhandyman.com/wp-content/uploads/2018/09/How-to-Avoid-Snakes-Slithering-Up-Your-Toilet-shutterstock_780480850.jpg
Resolving www.familyhandyman.com (www.familyhandyman.com)… 104.18.202.107, 104.18.201.107, 2606:4700::6812:ca6b, …
Connecting to www.familyhandyman.com (www.familyhandyman.com)|104.18.202.107|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 90994 (89K) [image/jpeg]
Saving to: ‘./data/img3.JPG’
./data/img3.JPG 100%[===================>] 88.86K --.-KB/s in 0.03s
2022-05-27 04:44:05 (2.71 MB/s) - ‘./data/img3.JPG’ saved [90994/90994]
--2022-05-27 04:44:06--  https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json
Resolving s3.amazonaws.com (s3.amazonaws.com)… 52.217.172.24, 64:ff9b::34d8:1c86
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.172.24|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 35363 (35K) [application/octet-stream]
Saving to: ‘./data/imagenet_class_index.json’
./data/imagenet_cla 100%[===================>] 34.53K 102KB/s in 0.3s
2022-05-27 04:44:07 (102 KB/s) - ‘./data/imagenet_class_index.json’ saved [35363/35363]
All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
Here’s a sample execution.
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
import json

fig, axes = plt.subplots(nrows=2, ncols=2)

for i in range(4):
    img_path = './data/img%d.JPG' % i
    img = Image.open(img_path)

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    input_tensor = preprocess(img)
    plt.subplot(2, 2, i + 1)
    plt.imshow(img)
    plt.axis('off')
# loading labels
with open("./data/imagenet_class_index.json") as json_file:
    d = json.load(json_file)
Throughout this tutorial, we will be making use of some utility functions: rn50_preprocess for preprocessing input images, predict for running the model on an image, and benchmark for timing inference. You do not need to go through these utilities to make use of Torch-TensorRT, but you are welcome to do so if you choose.
import numpy as np
import time
import torch.backends.cudnn as cudnn
cudnn.benchmark = True

def rn50_preprocess():
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    return preprocess
# decode the results into ([predicted class, description], probability)
def predict(img_path, model):
    img = Image.open(img_path)
    preprocess = rn50_preprocess()
    input_tensor = preprocess(img)
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model

    # move the input and model to GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        model.to('cuda')

    with torch.no_grad():
        output = model(input_batch)
        # Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
        sm_output = torch.nn.functional.softmax(output[0], dim=0)

    ind = torch.argmax(sm_output)
    return d[str(ind.item())], sm_output[ind]  # ([predicted class, description], probability)
def benchmark(model, input_shape=(1024, 1, 224, 224), dtype='fp32', nwarmup=50, nruns=10000):
    input_data = torch.randn(input_shape)
    input_data = input_data.to("cuda")
    if dtype == 'fp16':
        input_data = input_data.half()

    print("Warm up ...")
    with torch.no_grad():
        for _ in range(nwarmup):
            features = model(input_data)
    torch.cuda.synchronize()

    print("Start timing ...")
    timings = []
    with torch.no_grad():
        for i in range(1, nruns + 1):
            start_time = time.time()
            features = model(input_data)
            torch.cuda.synchronize()
            end_time = time.time()
            timings.append(end_time - start_time)
            if i % 10 == 0:
                print('Iteration %d/%d, ave batch time %.2f ms' % (i, nruns, np.mean(timings) * 1000))

    print("Input shape:", input_data.size())
    print("Output features size:", features.size())
    print('Average batch time: %.2f ms' % (np.mean(timings) * 1000))
With the model downloaded and the utility functions written, let's quickly run some predictions and benchmark the model in its current, un-optimized state.
for i in range(4):
    img_path = './data/img%d.JPG' % i
    img = Image.open(img_path)

    pred, prob = predict(img_path, resnet50_model)
    print('{} - Predicted: {}, Probability: {}'.format(img_path, pred, prob))

    plt.subplot(2, 2, i + 1)
    plt.imshow(img)
    plt.axis('off')
    plt.title(pred[1])
/usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:80: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:112.)
  return torch._C._cuda_getDeviceCount() > 0
./data/img0.JPG - Predicted: ['n02110185', 'Siberian_husky'], Probability: 0.49787387251853943
./data/img1.JPG - Predicted: ['n01820546', 'lorikeet'], Probability: 0.6446995735168457
./data/img2.JPG - Predicted: ['n02481823', 'chimpanzee'], Probability: 0.9899842739105225
./data/img3.JPG - Predicted: ['n01749939', 'green_mamba'], Probability: 0.4564127027988434
# Model benchmark without Torch-TensorRT
model = resnet50_model.eval().to("cuda")
benchmark(model, input_shape=(128, 3, 224, 224), nruns=100)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [8], in <cell line: 2>()
      1 # Model benchmark without Torch-TensorRT
----> 2 model = resnet50_model.eval().to("cuda")
      3 benchmark(model, input_shape=(128, 3, 224, 224), nruns=100)

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:899, in Module.to(self, *args, **kwargs)
    895         return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    896                     non_blocking, memory_format=convert_to_format)
    897     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
--> 899 return self._apply(convert)

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:570, in Module._apply(self, fn)
    568 def _apply(self, fn):
    569     for module in self.children():
--> 570         module._apply(fn)
    572 def compute_should_use_set_data(tensor, tensor_applied):
    573     if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
    574         # If the new tensor has compatible tensor type as the existing tensor,
    575         # the current behavior is to change the tensor in-place using `.data =`,
        (...)
    580         # global flag to let the user control whether they want the future
    581         # behavior of overwriting the existing tensor or not.

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:593, in Module._apply(self, fn)
    589 # Tensors stored in modules are graph leaves, and we don't want to
    590 # track autograd history of `param_applied`, so we have to use
    591 # `with torch.no_grad():`
    592 with torch.no_grad():
--> 593     param_applied = fn(param)
    594 should_use_set_data = compute_should_use_set_data(param, param_applied)
    595 if should_use_set_data:

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:897, in Module.to.<locals>.convert(t)
    894 if convert_to_format is not None and t.dim() in (4, 5):
    895     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    896                 non_blocking, memory_format=convert_to_format)
--> 897 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

File /usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:214, in _lazy_init()
    210     raise AssertionError(
    211         "libcudart functions unavailable. It looks like you have a broken build?")
    212 # This function throws if there's a driver initialization error, no GPUs
    213 # are found or any other error occurs
--> 214 torch._C._cuda_init()
    215 # Some of the queued calls may reentrantly call _lazy_init();
    216 # we need to just return without initializing in that case.
    217 # However, we must not let any other threads in!
    218 _tls.is_initializing = True

RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
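For reference, here is a minimal check I can run inside the container (my own diagnostic sketch, not part of the notebook) to confirm whether PyTorch can see the GPU at all:

import torch

# My own diagnostic, not from the notebook: report what PyTorch can see.
print(torch.__version__)          # PyTorch build inside the container
print(torch.version.cuda)         # CUDA version this build was compiled against
print(torch.cuda.is_available())  # returns False here, consistent with the warning above
print(torch.cuda.device_count())  # 0 here, i.e. no visible CUDA devices

Running nvidia-smi inside the container would also show whether the NVIDIA driver and devices are exposed to it; I can attach that output if it helps.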
Kindly help me resolve this issue.
Thanks and Regards,
Vyom Mishra