PyCUDA autoinit and TensorRT `create_execution_context` hang indefinitely

Continuing the discussion from Import pycuda.autoinit() works infinitely:

I’m using an AGX Xavier with JetPack 4.4.1. This issue is blocking any further work for us.

import pycuda.driver as cuda

works, but the following hangs indefinitely:

import pycuda.autoinit

PyCUDA is used to set up the TensorRT model, so removing pycuda.autoinit does not help: without it, the last line of the following code block (the create_execution_context call) doesn’t work either.

EDIT: the code hangs at the trt.Runtime call.

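import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(model_name, "rb") as f:
    model_trt = f.read()  # assumed: read the serialized engine bytes
# This runs inside a class method, hence `self`.
self.model_trt = runtime.deserialize_cuda_engine(model_trt)
ctx = self.model_trt.create_execution_context()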
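
To narrow down which step actually hangs, one option is to run each step in a fresh interpreter with a timeout, so a stuck call cannot block the whole session. The following is only a rough sketch using the standard library. The names `probe`, `probe_all` and `STEPS` are made up for illustration, the 60 second timeout is an arbitrary guess, and the snippets in `STEPS` just mirror the statements from this post. It avoids `capture_output` so that it should also run on Python 3.6.

```python
import subprocess
import sys


def probe(snippet, timeout_s=60.0):
    """Run `snippet` in a fresh interpreter and classify the outcome.

    Returns "ok" (exit code 0), "error" (non-zero exit code),
    or "hang" (no exit within `timeout_s` seconds).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "hang"
    return "ok" if result.returncode == 0 else "error"


# Hypothetical step list mirroring the statements in this post.
STEPS = {
    "pycuda.driver": "import pycuda.driver as cuda",
    "pycuda.autoinit": "import pycuda.autoinit",
    "trt.Runtime": (
        "import tensorrt as trt; "
        "trt.Runtime(trt.Logger(trt.Logger.WARNING))"
    ),
}


def probe_all(steps=STEPS, timeout_s=60.0):
    """Probe every step independently and return {name: outcome}."""
    return {name: probe(code, timeout_s) for name, code in steps.items()}
```

Calling `probe_all()` on the Xavier would then show which steps time out. For example, "ok" for `pycuda.driver` combined with "hang" for both `pycuda.autoinit` and `trt.Runtime` would suggest that the problem is in CUDA initialization or context creation rather than in either Python package. If a child process is stuck inside a driver call it may not die immediately when killed, so the probe itself could take longer than the timeout to return.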