- Hardware: Jetson AGX Orin Developer Kit 64GB
- Operating System: Ubuntu 22.04 with JetPack 6.0
- Riva Version: v2.16.0
Hey team, I’m trying to run a text classification NLP task, but when I deploy a sample model downloaded from NGC and send a request, the Triton server crashes.
This is how to reproduce the issue:

- I’m able to run `riva_init.sh` and `riva_start.sh` with the default `config.sh` successfully.
- Download the Riva Intent Slot model from NGC. Apparently, the quick start for embedded does not include NLP models, so I have to download this one separately. (If there is an easier way to try text classification, please advise.)
- Then I follow these instructions to deploy the `.riva` model. First, I launch the ServiceMaker image:
```sh
docker run --gpus all -it --rm -v /ssd/code/riva_models:/servicemaker-dev \
  -v /ssd/code/riva_quickstart_arm64_v2.16.0/model_repository:/data \
  --entrypoint="/bin/bash" nvcr.io/nvidia/riva/riva-speech:2.16.0-servicemaker-l4t-aarch64
```
- Then I run `riva-build`:

```sh
riva-build intent_slot \
  /servicemaker-dev/domain_model_misty.rmir:tlt_encode \
  /servicemaker-dev/domain_model_misty.riva:tlt_encode
```
- Then I run `riva-deploy`:

```sh
riva-deploy /servicemaker-dev/domain_model_misty.rmir:tlt_encode /data/models
```
- At this point, I can exit the ServiceMaker container, and I can relaunch Riva. I can see the model being loaded successfully.
```
I0829 17:16:16.099950 20 http_server.cc:282] Started Metrics Service at 0.0.0.0:8002
I0829 17:16:23.944136 22 model_registry.cc:143] Successfully registered: conformer-en-US-asr-streaming-asr-bls-ensemble for ASR Triton URI: localhost:8001
I0829 17:16:23.996660 22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.008618 22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.038719 22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.050558 22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.068651 22 model_registry.cc:143] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS Triton URI: localhost:8001
I0829 17:16:24.138864 22 riva_server.cc:171] Riva Conversational AI Server listening on 0.0.0.0:50051
```
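As a sanity check that `riva_intent_default` really did register, the model names can be pulled out of the startup log programmatically. This is only an illustrative sketch; the parsing assumes the `Successfully registered: <name> for <service> Triton URI` pattern seen in the output above, which is inferred from the pasted log rather than from Riva documentation:

```python
# Sketch: extract registered model names from the Riva startup log above.
# The line format is an assumption based on the pasted output.
log = """\
I0829 17:16:23.944136 22 model_registry.cc:143] Successfully registered: conformer-en-US-asr-streaming-asr-bls-ensemble for ASR Triton URI: localhost:8001
I0829 17:16:24.008618 22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
"""

registered = [
    line.split("Successfully registered: ")[1].split(" for ")[0]
    for line in log.splitlines()
    if "Successfully registered: " in line
]
print(registered)  # riva_intent_default should appear in this list
```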
- Then I use the Python client to send the following text classification request:

```python
import riva.client

# Connect to the locally running Riva server on its default gRPC port
auth = riva.client.Auth(uri="localhost:50051")
nlp_service = riva.client.NLPService(auth)

result = nlp_service.classify_text(
    input_strings=["Do I need an umbrella today?", "Tell me a joke."],
    model_name="riva_intent_default",
    language_code="en-US",
)
```
- The server crashes with the following logs:

```
I0829 17:16:44.314632 187 grpc_riva_nlp.cc:52] NLPService.ClassifyText called for riva_intent_default.
Signal (11) received.
 0# 0x0000AAAAE986BCCC in tritonserver
 1# __kernel_rt_sigreturn in linux-vdso.so.1
E0829 17:16:44.801805 187 client_object.cc:116] error: failed to do inference: Socket closed
/opt/riva/bin/start-riva: line 59:    20 Segmentation fault   (core dumped) ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --disable-auto-complete-config $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
W0829 17:16:49.018988 22 riva_server.cc:195] Signal: 15
```
How do I fix this crash? Alternatively, is there another intent classification model I can use to test basic functionality? Thanks, team, for your support.