Jarvis-asr: jarvis_start.sh times out [TensorRT version error]

Hi, we have converted a NeMo model [b4 branch] to a Jarvis ASR speech skills model for audio streaming.
We used these steps for the NeMo-to-Jarvis conversion:

On running jarvis_start.sh [Jarvis API version: 1.0.0-b3], it times out and throws a TensorRT version error and a "Model not loaded" error inside the Docker container.

TensorRT error in the docker logs:

E0628 04:06:02.545757 70 logging.cc:43] INVALID_CONFIG: The engine plan file is not compatible with this version of TensorRT, expecting library version 7.2.2 got 7.2.1, please rebuild.
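The two versions in that line can be pulled out mechanically. A small sed sketch (the log line is pasted in verbatim, so nothing Jarvis-specific is assumed; in practice you would pipe `docker logs jarvis-speech` into the same sed expressions):

```shell
#!/bin/sh
# Extract the expected vs. found TensorRT versions from the Triton error
# line above. "expected" is the library the server ships; "found" is the
# version the engine plan file was built with.
line='E0628 04:06:02.545757 70 logging.cc:43] INVALID_CONFIG: The engine plan file is not compatible with this version of TensorRT, expecting library version 7.2.2 got 7.2.1, please rebuild.'
expected=$(printf '%s\n' "$line" | sed -n 's/.*expecting library version \([0-9.]*\) got.*/\1/p')
found=$(printf '%s\n' "$line" | sed -n 's/.*got \([0-9.]*\),.*/\1/p')
echo "engine built with TensorRT $found, server ships $expected"
```

Since serialized TensorRT engine plans only load under the exact library version that built them, any difference between the two values means the engine has to be rebuilt.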

Here is the entire output of docker logs jarvis-speech:

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release 21.03 (build 21236204)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0628 04:05:47.190740 70 metrics.cc:221] Collecting metrics for GPU 0: Tesla T4
I0628 04:05:47.240207 70 onnxruntime.cc:1728] TRITONBACKEND_Initialize: onnxruntime
I0628 04:05:47.240817 70 onnxruntime.cc:1738] Triton TRITONBACKEND API version: 1.0
I0628 04:05:47.240830 70 onnxruntime.cc:1744] 'onnxruntime' TRITONBACKEND API version: 1.0
I0628 04:05:47.453731 70 pinned_memory_manager.cc:205] Pinned memory pool is created at '0x7f2098000000' with size 268435456
I0628 04:05:47.454794 70 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
I0628 04:05:47.472853 70 model_repository_manager.cc:787] loading: jarvis-asr-ctc-decoder-cpu-streaming:1
I0628 04:05:47.573070 70 model_repository_manager.cc:787] loading: jarvis-asr-feature-extractor-streaming:1
I0628 04:05:47.573322 70 custom_backend.cc:198] Creating instance jarvis-asr-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0628 04:05:47.624053 70 model_repository_manager.cc:960] successfully loaded 'jarvis-asr-ctc-decoder-cpu-streaming' version 1
I0628 04:05:47.673287 70 model_repository_manager.cc:787] loading: jarvis-asr-voice-activity-detector-ctc-streaming:1
I0628 04:05:47.673631 70 custom_backend.cc:201] Creating instance jarvis-asr-feature-extractor-streaming_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_asr_features.so
I0628 04:05:47.773488 70 model_repository_manager.cc:787] loading: jarvis-trt-jarvis-asr-am-streaming:1
I0628 04:05:47.773715 70 custom_backend.cc:198] Creating instance jarvis-asr-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0628 04:05:47.852341 70 model_repository_manager.cc:960] successfully loaded 'jarvis-asr-voice-activity-detector-ctc-streaming' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
E0628 04:06:02.545757 70 logging.cc:43] INVALID_CONFIG: The engine plan file is not compatible with this version of TensorRT, expecting library version 7.2.2 got 7.2.1, please rebuild.
E0628 04:06:02.545803 70 logging.cc:43] engine.cpp (1646) - Serialization Error in deserialize: 0 (Core engine deserialization failure)
E0628 04:06:02.553355 70 logging.cc:43] INVALID_STATE: std::exception
E0628 04:06:02.553397 70 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0628 04:06:02.554710 70 model_repository_manager.cc:963] failed to load 'jarvis-trt-jarvis-asr-am-streaming' version 1: Internal: unable to create TensorRT engine
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0628 04:06:03.232984 70 model_repository_manager.cc:960] successfully loaded 'jarvis-asr-feature-extractor-streaming' version 1
E0628 04:06:03.233056 70 model_repository_manager.cc:1160] Invalid argument: ensemble 'jarvis-asr' depends on 'jarvis-trt-jarvis-asr-am-streaming' which has no loaded version
I0628 04:06:03.233142 70 server.cc:495] 
+-------------+-----------------------------------------------------------------+------+
| Backend     | Config                                                          | Path |
+-------------+-----------------------------------------------------------------+------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}   |
+-------------+-----------------------------------------------------------------+------+

I0628 04:06:03.233204 70 server.cc:538] 
+--------------------------------------------------+---------+---------------------------------------------------------+
| Model                                            | Version | Status                                                  |
+--------------------------------------------------+---------+---------------------------------------------------------+
| jarvis-asr                                       | -       | Not loaded: No model version was found                  |
| jarvis-asr-ctc-decoder-cpu-streaming             | 1       | READY                                                   |
| jarvis-asr-feature-extractor-streaming           | 1       | READY                                                   |
| jarvis-asr-voice-activity-detector-ctc-streaming | 1       | READY                                                   |
| jarvis-trt-jarvis-asr-am-streaming               | 1       | UNAVAILABLE: Internal: unable to create TensorRT engine |
+--------------------------------------------------+---------+---------------------------------------------------------+

I0628 04:06:03.233297 70 tritonserver.cc:1642] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                              |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                             |
| server_version                   | 2.7.0                                                                                                                                              |
| server_extensions                | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /data/models                                                                                                                                       |
| model_control_mode               | MODE_NONE                                                                                                                                          |
| strict_model_config              | 1                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                          |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                         |
| min_supported_compute_capability | 6.0                                                                                                                                                |
| strict_readiness                 | 1                                                                                                                                                  |
| exit_timeout                     | 30                                                                                                                                                 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0628 04:06:03.233308 70 server.cc:220] Waiting for in-flight requests to complete.
I0628 04:06:03.233315 70 model_repository_manager.cc:820] unloading: jarvis-asr-voice-activity-detector-ctc-streaming:1
I0628 04:06:03.233368 70 model_repository_manager.cc:820] unloading: jarvis-asr-feature-extractor-streaming:1
I0628 04:06:03.233556 70 model_repository_manager.cc:820] unloading: jarvis-asr-ctc-decoder-cpu-streaming:1
I0628 04:06:03.233722 70 server.cc:235] Timeout 30: Found 3 live models and 0 in-flight non-inference requests
I0628 04:06:03.439320 70 model_repository_manager.cc:943] successfully unloaded 'jarvis-asr-voice-activity-detector-ctc-streaming' version 1
I0628 04:06:03.439458 70 model_repository_manager.cc:943] successfully unloaded 'jarvis-asr-ctc-decoder-cpu-streaming' version 1
I0628 04:06:03.445044 70 model_repository_manager.cc:943] successfully unloaded 'jarvis-asr-feature-extractor-streaming' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0628 04:06:04.233848 70 server.cc:235] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Triton server died before reaching ready state. Terminating Jarvis startup.
Check Triton logs with: docker logs 
/opt/jarvis/bin/start-jarvis: line 1: kill: (70) - No such process

Hi,
We recommend raising this query in the Triton forum for better assistance.

Thanks!

Hi @namanveer2000

Could you please try running jarvis_clean.sh and then jarvis_init.sh?
It looks to me like you may have downloaded a new version and are trying to install it on top of an old one, because of this error:
[E] [TRT] INVALID_CONFIG: The engine plan file is not compatible with this version of TensorRT, expecting library version 7.2.2 got 7.2.1, please rebuild.
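The suggested reset can be sketched as a script. The three script names come from the Jarvis quickstart mentioned in this thread; QUICKSTART_DIR and the existence guard are our additions, so adjust them to your layout:

```shell
#!/bin/sh
# Re-deploy from scratch: jarvis_clean.sh removes the previously generated
# model repository (built against the old TensorRT), jarvis_init.sh
# rebuilds the engines with the TensorRT shipped in the current release,
# and jarvis_start.sh brings the server back up.
QUICKSTART_DIR=${QUICKSTART_DIR:-.}
for script in jarvis_clean.sh jarvis_init.sh jarvis_start.sh; do
  if [ -f "$QUICKSTART_DIR/$script" ]; then
    bash "$QUICKSTART_DIR/$script" || break   # stop and inspect output on failure
  else
    echo "$script not found in $QUICKSTART_DIR"
  fi
done
```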

Thanks

Hi @SunilJB. We tried this approach initially, but the models created by jarvis_init.sh gave us improper, gibberish output. Because of this, we used a different method to convert from NeMo to Jarvis.

We converted the model using the 1.0.0-b.3 servicemaker instead of the 1.0.0-b.2 servicemaker, which resolved the TensorRT version error.
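The underlying rule here is that a serialized engine plan only deserializes under the exact TensorRT version that built it, so the servicemaker that builds the model must ship the same TensorRT as the server that loads it. A trivial sketch of that check (the version values are inferred from this thread, not read from the images; verify them against your own containers):

```shell
#!/bin/sh
# Compare the TensorRT version used to build the engine against the one
# in the server. Values below are inferred from this thread: the server
# expects 7.2.2, and the 1.0.0-b.3 servicemaker builds with 7.2.2.
builder_trt="7.2.2"   # assumption: TensorRT in the 1.0.0-b.3 servicemaker
server_trt="7.2.2"    # from the log: "expecting library version 7.2.2"
if [ "$builder_trt" = "$server_trt" ]; then
  echo "versions match: engines built here will deserialize in the server"
else
  echo "mismatch: rebuild with the servicemaker that matches the server"
fi
```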