Description
I am facing the following error:
"RuntimeError: make_default_context() wasn't able to create a context on any of the 1 detected devices"
The process runs inside a pod on Kubernetes. Sometimes restarting the pod resolves the issue, and sometimes we need to move the process to a different GPU.
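For reference, here is a minimal diagnostic sketch (assuming the same pycuda install that fails in the pod) that initializes the driver directly and tries to create a context on each visible device, so the underlying CUDA error is printed instead of the generic make_default_context() message:

import pycuda.driver as cuda

cuda.init()
print("Compiled against CUDA:", cuda.get_version())
print("Driver version:", cuda.get_driver_version())
print("Detected devices:", cuda.Device.count())

for i in range(cuda.Device.count()):
    dev = cuda.Device(i)
    print(i, dev.name(), "compute capability", dev.compute_capability())
    try:
        ctx = dev.make_context()           # the same call make_default_context() ends up making
        free, total = cuda.mem_get_info()  # needs an active context
        print("  context OK, free/total memory:", free, "/", total)
        ctx.pop()
        ctx.detach()
    except cuda.Error as exc:
        print("  context creation failed:", exc)

If context creation fails here as well, the printed pycuda error usually carries the underlying CUDA error code, which helps narrow down whether the device is in a bad state or configured with an exclusive compute mode.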
Traceback
from FR.src.RetinaFace_trt import Retinaface_trt
File "/data/JF/FR/src/RetinaFace_trt.py", line 14, in <module>
import pycuda.autoinit
File "/usr/local/lib/python3.6/dist-packages/pycuda/autoinit.py", line 9, in <module>
context = make_default_context()
File "/usr/local/lib/python3.6/dist-packages/pycuda/tools.py", line 204, in make_default_context
"on any of the %d detected devices" % ndevices)
RuntimeError: make_default_context() wasn't able to create a context on any of the 1 detected devices
Environment
TensorRT Version: tensorrt==7.1.3.4
GPU Type: T4
Nvidia Driver Version: 460.32.03
CUDA Version: 11.2
CUDNN Version: 8.0.4
Operating System + Version: Ubuntu 18.04.6 LTS
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4.0
Baremetal or Container (if container which image + tag): Container (Kubernetes pod)
import ctypes
import os
import random
import sys
import threading
import time
import dlib
import cv2
import numpy as np
import pycuda.autoinit  # RuntimeError is raised on this import
import pycuda.driver as cuda
import tensorrt as trt
import torch
import torchvision
The error is raised on the import of pycuda.autoinit itself.
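As a workaround sketch (an assumption on my side, not a confirmed fix), the import of pycuda.autoinit in RetinaFace_trt.py could be replaced with explicit context creation, so the process can catch the failure and try the next visible device instead of crashing at import time. The helper name _cleanup is hypothetical:

import atexit
import pycuda.driver as cuda

cuda.init()

_ctx = None
for i in range(cuda.Device.count()):
    try:
        _ctx = cuda.Device(i).make_context()
        break
    except cuda.Error as exc:
        print("could not create a context on device %d: %s" % (i, exc))

if _ctx is None:
    raise RuntimeError("no usable CUDA device found among the visible devices")

def _cleanup():
    # mirror what pycuda.autoinit does at interpreter exit: pop and release the context
    _ctx.pop()
    _ctx.detach()

atexit.register(_cleanup)

This mirrors what pycuda.autoinit does internally (create a context and register an atexit cleanup), but keeps the device choice and the error handling in the application's hands.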