Jarvis Support for GPU RTX 2060

Output of nvidia-smi:

Thu Jul 8 12:02:16 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:01:00.0 Off |                  N/A |
| N/A   70C    P0    64W /  N/A |    753MiB /  5934MiB |      42%     Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1196      G   /usr/lib/xorg/Xorg                416MiB |
|    0   N/A  N/A      2276      G   /usr/bin/gnome-shell              198MiB |
|    0   N/A  N/A      8689      G   …/debug.log --shared-files         32MiB |
|    0   N/A  N/A     20080      G   …_18131.log --shared-files          2MiB |
|    0   N/A  N/A     24661      G   glmark2                             4MiB |
|    0   N/A  N/A     30884      G   …AAAAAAAAA= --shared-files         61MiB |
|    0   N/A  N/A     30964      G   …AAAAAAAAA= --shared-files         34MiB |
+-----------------------------------------------------------------------------+

I have been trying to set up Jarvis. While running jarvis_init.sh, I am getting a conversion error:

Logging into NGC docker registry if necessary…
Pulling required docker images if necessary…
Note: This may take some time, depending on the speed of your Internet connection.

Pulling Jarvis Speech Server images.
Image nvcr.io/nvidia/jarvis/jarvis-speech:1.1.0-beta-server exists. Skipping.
Image nvcr.io/nvidia/jarvis/jarvis-speech-client:1.1.0-beta exists. Skipping.
Image nvcr.io/nvidia/jarvis/jarvis-speech:1.1.0-beta-servicemaker exists. Skipping.

Downloading models (JMIRs) from NGC…
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing JMIRs set the location and corresponding flag in config.sh.

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release (build 21060478)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

/data/artifacts /opt/jarvis

Downloading nvidia/jarvis/jmir_punctuation:1.0.0-b.1…
Downloaded 418.11 MB in 4m 11s, Download speed: 1.66 MB/s


Transfer id: jmir_punctuation_v1.0.0-b.1 Download status: Completed.
Downloaded local path: /data/artifacts/jmir_punctuation_v1.0.0-b.1
Total files downloaded: 1
Total downloaded size: 418.11 MB
Started at: 2021-07-08 06:18:40.041490
Completed at: 2021-07-08 06:22:51.382292
Duration taken: 4m 11s

/opt/jarvis

Converting JMIRs at jarvis-model-repo/jmir to Jarvis Model repository.

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release (build 21060478)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

2021-07-08 06:22:56,101 [INFO] Writing Jarvis model repository to '/data/models'…
2021-07-08 06:22:56,101 [INFO] The jarvis model repo target directory is /data/models
2021-07-08 06:22:57,073 [INFO] Extract_binaries for tokenizer -> /data/models/jarvis_tokenizer/1
2021-07-08 06:22:58,068 [INFO] Extract_binaries for language_model -> /data/models/jarvis-trt-jarvis_punctuation-nn-bert-base-uncased/1
2021-07-08 06:23:01,484 [INFO] Building TRT engine from PyTorch Checkpoint
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 1200, in <module>
pytorch_to_trt()
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 1159, in pytorch_to_trt
return convert_pytorch_to_trt(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 963, in convert_pytorch_to_trt
with build_engine(
AttributeError: __enter__
2021-07-08 06:23:15,549 [ERROR] Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/servicemaker/cli/deploy.py", line 88, in deploy_from_jmir
generator.serialize_to_disk(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 341, in serialize_to_disk
module.serialize_to_disk(repo_dir, jmir, config_only, verbose, overwrite)
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 232, in serialize_to_disk
self.update_binary(version_dir, jmir, verbose)
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 489, in update_binary
bindings = self.build_trt_engine_from_pytorch_bert(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 455, in build_trt_engine_from_pytorch_bert
raise Exception("convert_pytorch_to_trt failed.")
Exception: convert_pytorch_to_trt failed.

+ echo

+ echo 'Jarvis initialization complete. Run ./jarvis_start.sh to launch services.'
Jarvis initialization complete. Run ./jarvis_start.sh to launch services.

Please help me solve this issue.
Also, could you confirm whether Jarvis is compatible with my system?

Thank you

Gentle reminder: I am still waiting for a solution.

Hi @jyoti.khetan
Please refer to the support matrix below:
https://docs.nvidia.com/deeplearning/jarvis/user-guide/docs/support-matrix.html

Care must be taken not to exceed the available GPU memory when selecting models to deploy.

Regarding the OOM issue, could you please try commenting out all of the NLP models except one and check whether that deploys successfully on your setup?
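For reference, a minimal edit to the quickstart config.sh would look something like the sketch below (variable and model names follow the 1.1.0-beta quickstart; exact entries may differ per release):

```shell
# config.sh (excerpt): keep only the punctuation model enabled in models_nlp.
# Entries that are commented out are skipped by jarvis_init.sh, so only one
# TRT engine is built, which lowers peak GPU memory use during deployment.
models_nlp=(
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_named_entity_recognition:${jarvis_ngc_model_version}"
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_intent_slot:${jarvis_ngc_model_version}"
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_question_answering:${jarvis_ngc_model_version}"
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_text_classification:${jarvis_ngc_model_version}"
)
```

After editing, re-run jarvis_init.sh so the model repository is rebuilt with only the remaining model.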

Thanks

Thanks for getting back to me.

I have already done that and checked. Attaching my modified config.sh:

# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Enable or Disable Jarvis Services
service_enabled_asr=false
service_enabled_nlp=true
service_enabled_tts=false

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# jarvis_init.sh will create a jmir and models directory in the volume or
# path specified.
#
# JMIR ($jarvis_model_loc/jmir)
# Jarvis uses an intermediate representation (JMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $jarvis_model_loc/jmir by jarvis_init.sh
# Custom models produced by NeMo or TLT and prepared using jarvis-build
# may also be copied manually to this location $(jarvis_model_loc/jmir).
#
# Models ($jarvis_model_loc/models)
# During the jarvis_init process, the JMIR files in $jarvis_model_loc/jmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $jarvis_model_loc/models. The jarvis server exclusively uses these
# optimized versions.
jarvis_model_loc="jarvis-model-repo"

# The default JMIRs are downloaded from NGC by default in the above $jarvis_jmir_loc directory
# If you'd like to skip the download from NGC and use the existing JMIRs in the $jarvis_jmir_loc
# then set the below $use_existing_jmirs flag to true. You can also deploy your set of custom
# JMIRs by keeping them in the jarvis_jmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_jmirs=false

# Ports to expose for Jarvis services
jarvis_speech_api_port="50051"
jarvis_vision_api_port="60051"

# NGC orgs
jarvis_ngc_org="nvidia"
jarvis_ngc_team="jarvis"
jarvis_ngc_image_version="1.1.0-beta"
jarvis_ngc_model_version="1.0.0-b.1"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $jarvis-jmir
# then models can be commented out to skip download from NGC

#models_asr=(
#    # Punctuation model
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"
#
#    # Jasper Streaming w/ CPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming:${jarvis_ngc_model_version}"
#
#    # Jasper Streaming w/ CPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput:${jarvis_ngc_model_version}"
#
#    # Jasper Offline w/ CPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline:${jarvis_ngc_model_version}"
#
#    # Quartznet Streaming w/ CPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming:${jarvis_ngc_model_version}"
#
#    # Quartznet Streaming w/ CPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming_throughput:${jarvis_ngc_model_version}"
#
#    # Quartznet Offline w/ CPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_offline:${jarvis_ngc_model_version}"
#
#    # Jasper Streaming w/ GPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_gpu_decoder:${jarvis_ngc_model_version}"
#
#    # Jasper Streaming w/ GPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput_gpu_decoder:${jarvis_ngc_model_version}"
#
#    # Jasper Offline w/ GPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline_gpu_decoder:${jarvis_ngc_model_version}"
#)

models_nlp=(
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_named_entity_recognition:${jarvis_ngc_model_version}"
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_intent_slot:${jarvis_ngc_model_version}"
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_question_answering:${jarvis_ngc_model_version}"
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_text_classification:${jarvis_ngc_model_version}"
)

#models_tts=(
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_tts_ljspeech:${jarvis_ngc_model_version}"
#)

NGC_TARGET=${jarvis_ngc_org}
if [[ ! -z ${jarvis_ngc_team} ]]; then
  NGC_TARGET="${NGC_TARGET}/${jarvis_ngc_team}"
else
  team=""
fi

# define docker images required to run Jarvis
image_client="nvcr.io/${NGC_TARGET}/jarvis-speech-client:${jarvis_ngc_image_version}"
image_speech_api="nvcr.io/${NGC_TARGET}/jarvis-speech:${jarvis_ngc_image_version}-server"

# define docker images required to setup Jarvis
image_init_speech="nvcr.io/${NGC_TARGET}/jarvis-speech:${jarvis_ngc_image_version}-servicemaker"

# daemon names
jarvis_daemon_speech="jarvis-speech"
jarvis_daemon_client="jarvis-client"

Also, regarding the support matrix: I have checked the requirements, and my system matches what is mentioned.

Please let me know what needs to be done here.

Thank you

I also wanted to mention that I have used only one NLP model (jmir_punctuation), and I am still getting the issue:

Logging into NGC docker registry if necessary…
Pulling required docker images if necessary…
Note: This may take some time, depending on the speed of your Internet connection.

Pulling Jarvis Speech Server images.
Image nvcr.io/nvidia/jarvis/jarvis-speech:1.1.0-beta-server exists. Skipping.
Image nvcr.io/nvidia/jarvis/jarvis-speech-client:1.1.0-beta exists. Skipping.
Image nvcr.io/nvidia/jarvis/jarvis-speech:1.1.0-beta-servicemaker exists. Skipping.

Downloading models (JMIRs) from NGC…
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing JMIRs set the location and corresponding flag in config.sh.

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release (build 21060478)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

/data/artifacts /opt/jarvis

Downloading nvidia/jarvis/jmir_punctuation:1.0.0-b.1…
Downloaded 418.11 MB in 3m 6s, Download speed: 2.24 MB/s


Transfer id: jmir_punctuation_v1.0.0-b.1 Download status: Completed.
Downloaded local path: /data/artifacts/jmir_punctuation_v1.0.0-b.1
Total files downloaded: 1
Total downloaded size: 418.11 MB
Started at: 2021-07-08 07:22:01.388678
Completed at: 2021-07-08 07:25:07.646512
Duration taken: 3m 6s

/opt/jarvis

Converting JMIRs at jarvis-model-repo/jmir to Jarvis Model repository.

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release (build 21060478)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

2021-07-08 07:25:21,784 [INFO] Writing Jarvis model repository to '/data/models'…
2021-07-08 07:25:21,784 [INFO] The jarvis model repo target directory is /data/models
2021-07-08 07:25:22,780 [INFO] Extract_binaries for tokenizer -> /data/models/jarvis_tokenizer/1
2021-07-08 07:25:23,762 [INFO] Extract_binaries for language_model -> /data/models/jarvis-trt-jarvis_punctuation-nn-bert-base-uncased/1
2021-07-08 07:25:27,322 [INFO] Building TRT engine from PyTorch Checkpoint
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 1200, in <module>
pytorch_to_trt()
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 1159, in pytorch_to_trt
return convert_pytorch_to_trt(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 963, in convert_pytorch_to_trt
with build_engine(
AttributeError: __enter__
2021-07-08 07:25:40,656 [ERROR] Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/servicemaker/cli/deploy.py", line 88, in deploy_from_jmir
generator.serialize_to_disk(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 341, in serialize_to_disk
module.serialize_to_disk(repo_dir, jmir, config_only, verbose, overwrite)
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 232, in serialize_to_disk
self.update_binary(version_dir, jmir, verbose)
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 489, in update_binary
bindings = self.build_trt_engine_from_pytorch_bert(
File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 455, in build_trt_engine_from_pytorch_bert
raise Exception("convert_pytorch_to_trt failed.")
Exception: convert_pytorch_to_trt failed.

+ echo

+ echo 'Jarvis initialization complete. Run ./jarvis_start.sh to launch services.'
Jarvis initialization complete. Run ./jarvis_start.sh to launch services.

Hi @jyoti.khetan

Sorry for the delayed response. I think the issue might be that the GPU memory requirement is not met. Please see the hardware section of the support matrix:
https://docs.nvidia.com/deeplearning/jarvis/user-guide/docs/support-matrix.html#hardware

Thanks
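Before re-running jarvis_init.sh, it may help to confirm how much GPU memory is actually free; in your nvidia-smi output, Xorg and gnome-shell alone hold over 600 MiB of the 6 GB card. A diagnostic sketch using standard nvidia-smi query flags (the 4096 MiB threshold and the gdm service name are assumptions, not official requirements):

```shell
# Report free GPU memory in MiB for the first GPU, without header or units.
free_mib=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n1)
echo "Free GPU memory: ${free_mib} MiB"

# Building the BERT-base TRT engine needs a large chunk of free memory.
# If little is free, stop desktop processes before running jarvis_init.sh,
# e.g. switch to a text console and stop the display manager:
if [ "${free_mib:-0}" -lt 4096 ]; then
    echo "Low free memory; consider: sudo systemctl stop gdm  # or lightdm/sddm"
fi
```

Freeing the desktop's GPU memory gives TensorRT more headroom during the engine build, which is exactly where the `Cuda Error in allocate: 2 (out of memory)` occurs.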