Hi!
I am trying to deploy Jarvis 1.2.1-beta on a DGX Station using the Helm chart.
The command is as follows:
helm -n jarvis-latest install jarvis \
  --set ngcCredentials.password=$(echo -n $NGC_API_KEY | base64 -w0) \
  --set modelRepoGenerator.modelDeployKey=$(echo -n model_key_string | base64 -w0) \
  .
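For reference, here is a small Python sketch of what each `--set` value above is expected to evaluate to: the base64 encoding of the key's raw bytes, with no trailing newline and no line wrapping. The helper names `encode_for_helm` and `decode_from_helm` are made up for illustration and are not part of Helm or Jarvis.

```python
import base64


def encode_for_helm(secret: str) -> str:
    """Mirror `echo -n "$secret" | base64 -w0`.

    Encodes the raw bytes only (no trailing newline) and returns a
    single unwrapped line of base64 text.
    """
    return base64.b64encode(secret.encode("utf-8")).decode("ascii")


def decode_from_helm(value: str) -> str:
    """Reverse the encoding, useful for checking what was actually passed."""
    return base64.b64decode(value).decode("utf-8")
```

Decoding the value that ends up in the Helm release is a quick way to confirm that the key string arrived unchanged.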
However, the installation fails while unpickling the file
/data/models/jarvis-trt-jarvis_punctuation-nn-bert-base-uncased/1/model_weights.ckpt
...
[TensorRT] VERBOSE: Plugin creator already registered - ::Split version 1
[TensorRT] INFO: Using configuration file: /data/models/jarvis-trt-jarvis_punctuation-nn-bert-base-uncased/1/bert-base-uncased_encoder_config.json
[TensorRT] INFO: Hi!!!
[TensorRT] ERROR: invalid load key, '\xfa'.
<__main__.BertConfig object at 0x7fde58516b80>
/data/models/jarvis-trt-jarvis_punctuation-nn-bert-base-uncased/1/model_weights.ckpt
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 980, in <module>
    pytorch_to_trt()
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 939, in pytorch_to_trt
    return convert_pytorch_bert_to_trt(
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 772, in convert_pytorch_bert_to_trt
    weights_dict = load_weights(bert_weight, config)
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/export_bert_pytorch_to_trt.py", line 515, in load_weights
    weights_dict.update(additional_dict)
UnboundLocalError: local variable 'additional_dict' referenced before assignment
2021-07-02 19:20:20,384 [ERROR] Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/cli/deploy.py", line 87, in deploy_from_jmir
    generator.serialize_to_disk(
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 340, in serialize_to_disk
    module.serialize_to_disk(repo_dir, jmir, config_only, verbose, overwrite)
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 231, in serialize_to_disk
    self.update_binary(version_dir, jmir, verbose)
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 569, in update_binary
    bindings = self.build_trt_engine_from_pytorch_bert(
  File "/opt/conda/lib/python3.8/site-packages/servicemaker/triton/triton.py", line 532, in build_trt_engine_from_pytorch_bert
    raise Exception("convert_pytorch_to_trt failed.")
Exception: convert_pytorch_to_trt failed.
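A minimal sketch of how the two errors in this log could be related. It assumes the weights loader catches the unpickling failure, logs it, and then carries on. The function `load_weights_sketch` is hypothetical and is not the servicemaker source. In CPython, `invalid load key, '\xfa'` is the message pickle reports when the first byte of the input is not a valid pickle opcode, which would suggest that `model_weights.ckpt` does not contain valid pickle data at that point.

```python
import pickle


def load_weights_sketch(raw: bytes) -> dict:
    """Hypothetical stand-in for a weights loader (NOT the servicemaker code)."""
    weights_dict = {}
    try:
        # additional_dict is bound only if unpickling succeeds.
        additional_dict = pickle.loads(raw)
    except pickle.UnpicklingError as exc:
        # Assumption: the failure is logged and swallowed, so execution continues.
        print(f"ERROR: {exc}")
    # If the try block failed, additional_dict was never assigned,
    # so this line raises UnboundLocalError.
    weights_dict.update(additional_dict)
    return weights_dict


def demo() -> str:
    """Feed bytes that are not a pickle stream and report the resulting error type."""
    try:
        load_weights_sketch(b"\xfa\x00\x01")
    except UnboundLocalError as exc:
        return type(exc).__name__
    return "ok"
```

If this reading is right, the `UnboundLocalError` is only a follow-on symptom, and the real problem is that the checkpoint bytes are not what the loader expects.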
I would appreciate any help. Thank you!
Best,
Alex