TAO Lightning log width in Jupyter output cell

Please provide the following information when requesting support.

• Hardware (8 x V100)
• Network Type (action_recognition)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)

tao info --verbose
dockers:
        nvidia/tao/tao-toolkit:
                4.0.0-tf2.9.1:
                        docker_registry: nvcr.io
                        tasks:
                                1. classification_tf2
                                2. efficientdet_tf2
                4.0.0-tf1.15.5:
                        docker_registry: nvcr.io
                        tasks:
                                1. augment
                                2. bpnet
                                3. classification_tf1
                                4. detectnet_v2
                                5. dssd
                                6. emotionnet
                                7. efficientdet_tf1
                                8. faster_rcnn
                                9. fpenet
                                10. gazenet
                                11. gesturenet
                                12. heartratenet
                                13. lprnet
                                14. mask_rcnn
                                15. multitask_classification
                                16. retinanet
                                17. ssd
                                18. unet
                                19. yolo_v3
                                20. yolo_v4
                                21. yolo_v4_tiny
                                22. converter
...
Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit']
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023

• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

When I start training, some of the log lines are truncated in the Jupyter output cell: each progress line appears to be cut off at 80 characters, so I can't see val_loss in the cell output, for example.

Epoch 0:  98%|██████████▊| 787/800 [06:41<00:06,  1.96it/s, loss=0.664, v_num=0]Adjusting learning rate of group 0 to 1.0000e-02.
Epoch 0:  98%|██████████▊| 788/800 [06:52<00:06,  1.91it/s, loss=0.674, v_num=0]
Validation: 0it [00:00, ?it/s]
Validation DataLoader 0:   0%|                           | 0/12 [00:03<?, ?it/s]
Epoch 0:  99%|██████████▊| 789/800 [06:55<00:05,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▊| 790/800 [06:56<00:05,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▉| 791/800 [06:56<00:04,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▉| 792/800 [06:56<00:04,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▉| 793/800 [06:56<00:03,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▉| 794/800 [06:57<00:03,  1.90it/s, loss=0.674, v_num=0]
Epoch 0:  99%|██████████▉| 795/800 [06:57<00:02,  1.90it/s, loss=0.674, v_num=0]
Epoch 0: 100%|██████████▉| 796/800 [06:57<00:02,  1.91it/s, loss=0.674, v_num=0]
Epoch 0: 100%|██████████▉| 797/800 [06:58<00:01,  1.91it/s, loss=0.674, v_num=0]
Epoch 0: 100%|██████████▉| 798/800 [06:58<00:01,  1.91it/s, loss=0.674, v_num=0]
Epoch 0: 100%|██████████▉| 799/800 [06:58<00:00,  1.91it/s, loss=0.674, v_num=0]
Epoch 0: 100%|█| 800/800 [07:00<00:00,  1.90it/s, loss=0.674, v_num=0, val_loss=
Epoch 1:  98%|▉| 787/800 [13:23<00:13,  1.02s/it, loss=0.0738, v_num=0, val_lossAdjusting learning rate of group 0 to 1.0000e-02.
Epoch 1:  98%|▉| 788/800 [13:24<00:12,  1.02s/it, loss=0.0712, v_num=0, val_loss
Validation: 0it [00:00, ?it/s]
Validation DataLoader 0:   0%|                           | 0/12 [00:03<?, ?it/s]
Epoch 1:  99%|▉| 789/800 [13:28<00:11,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 790/800 [13:28<00:10,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 791/800 [13:28<00:09,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 792/800 [13:29<00:08,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 793/800 [13:29<00:07,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 794/800 [13:29<00:06,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1:  99%|▉| 795/800 [13:29<00:05,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1: 100%|▉| 796/800 [13:30<00:04,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1: 100%|▉| 797/800 [13:30<00:03,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1: 100%|▉| 798/800 [13:31<00:02,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1: 100%|▉| 799/800 [13:31<00:01,  1.02s/it, loss=0.0712, v_num=0, val_loss
Epoch 1: 100%|█| 800/800 [13:31<00:00,  1.01s/it, loss=0.0712, v_num=0, val_loss
Epoch 2:  98%|▉| 787/800 [19:53<00:19,  1.52s/it, loss=0.000615, v_num=0, val_loAdjusting learning rate of group 0 to 1.0000e-02.
Epoch 2:  98%|▉| 788/800 [19:53<00:18,  1.52s/it, loss=0.000603, v_num=0, val_lo
Validation: 0it [00:00, ?it/s]
Validation DataLoader 0:   0%|                           | 0/12 [00:03<?, ?it/s]
Epoch 2:  99%|▉| 789/800 [19:57<00:16,  1.52s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 790/800 [19:57<00:15,  1.52s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 791/800 [19:58<00:13,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 792/800 [19:58<00:12,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 793/800 [19:58<00:10,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 794/800 [19:58<00:09,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2:  99%|▉| 795/800 [19:59<00:07,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2: 100%|▉| 796/800 [19:59<00:06,  1.51s/it, loss=0.000603, v_num=0, val_lo
Epoch 2: 100%|▉| 797/800 [20:00<00:04,  1.51s/it, loss=0.000603, v_num=0, val_lo

As mentioned in the notebook, to see the full log in stdout, please run the command in a terminal.
You can open a terminal and run the command there instead of running it inside the notebook.
For example,
$ tao action_recognition run /bin/bash

Then, inside the container, run your commands without “tao”.
# action_recognition train xxx
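
For reference, the in-container train command typically takes an experiment spec file, a results directory, and an encryption key; the paths and key below are illustrative placeholders only, not values from this thread, so adjust them to your own setup:

# action_recognition train -e /workspace/specs/train_rgb_3d_finetune.yaml \
                           -r /workspace/results/rgb_3d \
                           -k <your_encryption_key>

When run from a terminal this way, the progress lines are no longer clipped at the narrow 80-column console width seen in the notebook cell, so the full postfix (including val_loss) should be visible, provided the terminal window is wide enough.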
