Still the same error. There is actually a check in the code to make sure the .engine
file exists:
if not os.path.exists(trt_engine_path):
    print("the engine file does not exists, quit!")
    exit()
This check was never hit in the experiments above, so the file is being found.
This is what I did this time to export the engine file under a new name, abc.engine:
tao-converter /tao_models/electric_bicycle_net_tao/final_model.etlt -k nvidia_tlt -d 3,224,224 -o predictions/Softmax -m 16 -e /opt/tritonserver/abc.engine
[INFO] [MemUsageChange] Init CUDA: CPU +534, GPU +0, now: CPU 540, GPU 1827 (MiB)
[INFO] [MemUsageSnapshot] Builder begin: CPU 629 MiB, GPU 1827 MiB
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +791, GPU +340, now: CPU 1464, GPU 2167 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +195, GPU +342, now: CPU 1659, GPU 2509 (MiB)
[WARNING] Detected invalid timing cache, setup a local cache instead
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[INFO] Detected 1 inputs and 1 output network tensors.
[INFO] Total Host Persistent Memory: 94352
[INFO] Total Device Persistent Memory: 46283264
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 53 MiB, GPU 32 MiB
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2634, GPU 3035 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2634, GPU 3043 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 2634, GPU 3027 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 2634, GPU 3009 (MiB)
[INFO] [MemUsageSnapshot] Builder end: CPU 2634 MiB, GPU 3009 MiB
root@9207ab950ed0:/opt/tritonserver/mytest# mv ../abc.engine ./
root@9207ab950ed0:/opt/tritonserver/mytest# python3 infer_cls.py
[03/24/2022-04:10:36] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::35] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 43)
[03/24/2022-04:10:36] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
Traceback (most recent call last):
  File "infer_cls.py", line 86, in <module>
    h_input, d_input, h_output, d_output, stream = allocate_buffers(trt_engine)
  File "infer_cls.py", line 34, in allocate_buffers
    h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)), dtype=trt.nptype(trt.float32))
AttributeError: 'NoneType' object has no attribute 'get_binding_shape'
root@9207ab950ed0:/opt/tritonserver/mytest#
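
The AttributeError at the end is only a follow-on symptom: the file exists, but deserialization failed and returned None, and the script carried on with that None engine. Below is a minimal sketch of a loader that stops at the real failure instead. The helper name load_engine_or_fail is hypothetical (it is not part of infer_cls.py), and it assumes the runtime object is a trt.Runtime, whose deserialize_cuda_engine() returns None on failure, as the traceback above suggests.

```python
import os


def load_engine_or_fail(runtime, engine_path):
    """Deserialize a TensorRT engine and fail loudly if that does not work.

    `runtime` is assumed to be a tensorrt.Runtime (or any object exposing
    deserialize_cuda_engine(bytes)). This helper is a hypothetical
    illustration, not code taken from infer_cls.py.
    """
    # Same existence check as in the original script, but raising instead
    # of printing and exiting.
    if not os.path.exists(engine_path):
        raise FileNotFoundError(f"engine file not found: {engine_path}")

    with open(engine_path, "rb") as f:
        serialized = f.read()

    engine = runtime.deserialize_cuda_engine(serialized)

    # The file existing says nothing about whether it can be deserialized.
    # A None result means TensorRT rejected the serialized engine.
    if engine is None:
        raise RuntimeError(
            f"failed to deserialize {engine_path}; "
            "see the [TRT] [E] messages logged just before this error"
        )
    return engine
```

In the script this would be called roughly as load_engine_or_fail(trt.Runtime(trt.Logger(trt.Logger.WARNING)), "abc.engine"). A "Version tag does not match" message like the one above usually points to the engine having been built with a different TensorRT version than the one doing the loading, so it may also be worth printing trt.__version__ from the Python side and comparing it with the TensorRT version that tao-converter was built against.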