Jetson Xavier NX
JetPack = 4.4
TensorRT = 7.1.3-1 + CUDA 10.2
We had a working system, and we needed to separate the code base and add a queue so the two modules could run asynchronously. We achieved this with Celery (Flask and Redis); a rough sketch of the wiring is below. However, we then hit a PyCUDA driver error.
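This is not our exact code, just a minimal illustration of the setup described above (module names, the route, and the task body are placeholders):

```python
# Minimal sketch of the Flask + Celery + Redis wiring (illustrative only).
from celery import Celery
from flask import Flask, request

flask_app = Flask(__name__)
celery_app = Celery("inference_tasks", broker="redis://localhost:6379/0")

@celery_app.task
def change_model_task(crop, variety):
    # In the real code this loads a TensorRT model (TrtModel) and runs inference.
    pass

@flask_app.route("/start_controller", methods=["POST"])
def start_controller():
    data = request.get_json()
    # Hand the work to the queue instead of doing it in the request thread.
    change_model_task.delay(data["crop"], data["variety"])
    return {"status": "queued"}
```

With this in place, loading the TensorRT model from the Flask request handler fails with the following error: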
Traceback (most recent call last):
  File "/home/nvidia/.local/lib/python3.6/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/nvidia/.local/lib/python3.6/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/nvidia/.local/lib/python3.6/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/nvidia/.local/lib/python3.6/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "ai_app.py", line 111, in start_controller
    controller.change_current_model(crop.lower(), variety.lower(), offline=offline)
  File "/media/nvidia/Agdhi-128/Seedvision-Hardware-AI/trt_inference/controller.py", line 35, in change_current_model
    self.load_model(PaddyTRTModel, crop, variety)
  File "/media/nvidia/Agdhi-128/Seedvision-Hardware-AI/trt_inference/controller.py", line 23, in load_model
    current_model = model_class(self.config)
  File "/media/nvidia/Agdhi-128/Seedvision-Hardware-AI/trt_inference/paddy_trt.py", line 53, in __init__
    self.smc = TrtModel(self.smc_model_path)
  File "/media/nvidia/Agdhi-128/Seedvision-Hardware-AI/trt_inference/model.py", line 28, in __init__
    self.inputs, self.outputs, self.bindings, self.stream = self.allocate_buffers()
  File "/media/nvidia/Agdhi-128/Seedvision-Hardware-AI/trt_inference/model.py", line 47, in allocate_buffers
    stream = cuda.Stream()
pycuda._driver.LogicError: explicit_context_dependent failed: invalid device context - no currently active context?
We solved that by following this:
python - pyCUDA with Flask gives pycuda._driver.LogicError: cuModuleLoadDataEx - Stack Overflow
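Roughly, the change we made based on that answer looks like this (a simplified sketch, not our exact code; variable names are illustrative): instead of relying on pycuda.autoinit, we create and push a CUDA context explicitly in the thread that allocates the buffers and runs the engine.

```python
# Sketch of the Stack Overflow-style fix (illustrative only):
# create the context explicitly in the worker thread instead of using pycuda.autoinit.
import pycuda.driver as cuda

cuda.init()
device = cuda.Device(0)        # GPU 0 on the Xavier NX
ctx = device.make_context()    # makes this context current on the calling thread
try:
    stream = cuda.Stream()     # this is the call that previously raised "no currently active context"
    # ... allocate buffers and run the TensorRT engine ...
finally:
    ctx.pop()                  # release the context when done
```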
But we got another error:
[TensorRT] ERROR: …/rtSafe/safeContext.cpp (133) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
Solution for this:
But that solution tells us to change back to the code that gave us the previous error, so now we are stuck in between. Please help us solve this; it may be a simple bug or a nightmare.
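For reference, our inference path has roughly this shape (heavily simplified, names are illustrative and not our exact code); the CUDNN_STATUS_MAPPING_ERROR appears when the execution context runs:

```python
# Rough shape of the inference call that hits the cuDNN mapping error (illustrative only).
import pycuda.driver as cuda

def do_inference(context, bindings, inputs, outputs, stream):
    # context: tensorrt.IExecutionContext; inputs/outputs hold host/device buffer pairs.
    for inp in inputs:
        cuda.memcpy_htod_async(inp.device, inp.host, stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    for out in outputs:
        cuda.memcpy_dtoh_async(out.host, out.device, stream)
    stream.synchronize()
    return [out.host for out in outputs]
```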
Let me know what details you need and I will reply.