Please provide the following information when requesting support.
• Hardware (T4)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
• TLT Version (Please run "tlt info --verbose" and share "docker_tag" here)
tao info --verbose
Configuration of the TAO Toolkit Instance
dockers:
nvidia/tao/tao-toolkit-tf:
docker_registry: nvcr.io
docker_tag: v3.21.08-py3
tasks:
- augment
- bpnet
- classification
- detectnet_v2
- dssd
- emotionnet
- faster_rcnn
- fpenet
- gazenet
- gesturenet
- heartratenet
- lprnet
- mask_rcnn
- multitask_classification
- retinanet
- ssd
- unet
- yolo_v3
- yolo_v4
- converter
nvidia/tao/tao-toolkit-pyt:
docker_registry: nvcr.io
docker_tag: v3.21.08-py3
tasks:
- speech_to_text
- speech_to_text_citrinet
- text_classification
- question_answering
- token_classification
- intent_slot_classification
- punctuation_and_capitalization
nvidia/tao/tao-toolkit-lm:
docker_registry: nvcr.io
docker_tag: v3.21.08-py3
tasks:
- n_gram
format_version: 1.0
toolkit_version: 3.21.08
published_date: 08/17/2021
nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-9a3d7360-595d-cb85-a728-26f7058bc5c7)
nvidia-smi
Fri Aug 27 08:58:00 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   32C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
docker version
Client: Docker Engine - Community
Version: 20.10.7
API version: 1.41
Go version: go1.13.15
Git commit: f0df350
Built: Wed Jun 2 11:56:40 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.7
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: b0f5bc3
Built: Wed Jun 2 11:54:48 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.6
GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc:
Version: 1.0.0-rc95
GitCommit: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
docker-init:
Version: 0.19.0
GitCommit: de40ad0
• Training spec file(If have, please share here)
infer.yaml
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
# TLT spec file for inference using a previously pretrained BERT model for a text classification task.
# "Simulate" user input: batch with four samples.
input_batch:
- "by the end of no such thing the audience , like beatrice , has a watchful affection for the monster ."
- "director rob marshall went out gunning to make a great one ."
- "uneasy mishmash of styles and genres ."
- "I love exotic science fiction / fantasy movies but this one was very unpleasant to watch . Suggestions and images of child abuse , mutilated bodies (live or dead) , other gruesome scenes , plot holes , boring acting made this a regretable experience , The basic idea of entering another person's mind is not even new to the movies or TV (An Outer Limits episode was better at exploring this idea) . i gave it 4 / 10 since some special effects were nice ."
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
ain,download_specs}
text_classification: error: the following arguments are required: -r/--results_dir
2021-09-01 10:25:43,076 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
(taoenv) ubuntu@ip-172-31-14-240:~/tao$ tao text_classification infer -e /specs/nlp/text_classification/infer.yaml -r /results/nlp/text_classification/infer -m /results/nlp/text_classification/train/checkpoints/trained-model.tlt -g 1 -k $KEY
2021-09-01 10:27:38,950 [INFO] root: Registry: ['nvcr.io']
2021-09-01 10:27:39,055 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/ubuntu/.tao_mounts.json" file. You can obtain your
user's UID and GID by using the "id -u" and "id -g" commands on the
terminal.
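As an aside, the "user" entry that warning refers to would sit in the DockerOptions section of ~/.tao_mounts.json. A minimal sketch (the mount paths and the 1000:1000 value are placeholders; substitute the output of `id -u` and `id -g`):

```json
{
    "Mounts": [
        {"source": "/home/ubuntu/tao", "destination": "/workspace/tao"}
    ],
    "DockerOptions": {
        "user": "1000:1000"
    }
}
```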
[NeMo W 2021-09-01 10:27:42 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-09-01 10:27:45 experimental:27] Module <class 'nemo.collections.nlp.modules.common.megatron.megatron_bert.MegatronBertEncoder'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo I 2021-09-01 10:27:46 tlt_logging:20] Experiment configuration:
restore_from: /results/nlp/text_classification/train/checkpoints/trained-model.tlt
exp_manager:
task_name: infer
explicit_log_dir: /results/nlp/text_classification/infer
input_batch:
- by the end of no such thing the audience , like beatrice , has a watchful affection
for the monster . - director rob marshall went out gunning to make a great one .
- uneasy mishmash of styles and genres .
- I love exotic science fiction / fantasy movies but this one was very unpleasant
to watch . Suggestions and images of child abuse , mutilated bodies (live or dead)
, other gruesome scenes , plot holes , boring acting made this a regretable experience
, The basic idea of entering another person's mind is not even new to the movies
or TV (An Outer Limits episode was better at exploring this idea) . i gave it 4
/ 10 since some special effects were nice .
encryption_key: '*****'
[NeMo W 2021-09-01 10:27:46 exp_manager:26] Exp_manager is logging to `/results/nlp/text_classification/infer`, but it already exists.
[NeMo W 2021-09-01 10:27:48 modelPT:193] Using /tmp/tmptv7d8fn6/tokenizer.vocab_file instead of tokenizer.vocab_file.
Using bos_token, but it is not set yet.
Using eos_token, but it is not set yet.
[NeMo W 2021-09-01 10:27:48 modelPT:1202] World size can only be set by PyTorch Lightning Trainer.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
return func()
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
lambda: hydra.run(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
return run_job(
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
ret.return_value = task_function(task_cfg)
File "/tlt-nemo/nlp/text_classification/scripts/infer.py", line 83, in main
File "/opt/conda/lib/python3.8/posixpath.py", line 142, in basename
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tlt-nemo/nlp/text_classification/scripts/infer.py", line 113, in <module>
File "/opt/conda/lib/python3.8/site-packages/nemo/core/config/hydra_runner.py", line 98, in wrapper
_run_hydra(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
run_and_report(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 237, in run_and_report
assert mdl is not None
AssertionError
2021-09-01 10:27:58,400 [INFO] tlt.components.docker_handler.docker_handler: Stopping container
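For reference, the first TypeError in the traceback is exactly what `os.path.basename` raises when handed None, which suggests some path in the resolved config (not the spec shown above) ended up as None before `infer.py` line 83 called `basename`. A minimal repro of just that Python behavior, independent of TAO:

```python
import os

# os.path.basename() calls os.fspath() on its argument, so passing None
# produces the same TypeError seen in the log above.
try:
    os.path.basename(None)
except TypeError as e:
    print(e)  # expected str, bytes or os.PathLike object, not NoneType
```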