TAO Toolkit training of an EfficientDet-D0 is stuck!

Hi all, I’m having some issues with running the retrain step in the Train, Prune, Retrain tutorial. I’m training the model on the COCO dataset.

I had some issues with creating the container when I tried to run the tao efficientdet train command.

So I used the following command to access the docker container and execute commands inside of it:

docker run --runtime=nvidia -it --rm --entrypoint "" -v ~/cv_samples_v1.4.0:/home/araujo/cv_samples_v1.4.0/ --device /dev/nvidia0 --device /dev/nvidia1 --device /dev/nvidiactl --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools --device /dev/nvidia-modeset -p 25104:8888 nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3 /bin/bash

The cv_samples folder is the one available at this link.

When I tried to run the command

efficientdet train -e $SPECS_DIR/efficientdet_d0_train.txt \
                        -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned\
                        -k $KEY \
                        --gpu_index 1

inside the Jupyter Notebook, I was able to train the model in around 12 hours.

The prune step gave some trouble, as it did not recognize the first layer and threw a NotImplementedError: Unknown layer type ... InputLayer. Long story short, I had to reinstall everything. After that, I was able to execute the prune command:

efficientdet prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.tlt \
                        -o $USER_EXPERIMENT_DIR/experiment_dir_pruned \
                        -pth 0.7 \
                        -k $KEY --gpu_index 1

with the following output:

Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2022-06-24 18:19:18,104 [INFO] root: Starting EfficientDet pruning.
2022-06-24 18:19:22,889 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-85500.tlt
2022-06-24 18:19:27,990 [INFO] __main__: Pruning process will take some time. Please wait...
2022-06-24 18:19:28,051 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2022-06-24 18:19:42,597 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph
2022-06-24 18:27:13,008 [INFO] __main__: Pruning ratio (pruned model / original model): 0.990706219378439
2022-06-24 18:29:00,186 [INFO] root: Pruning finished successfully.

Awesome, so now I had the trained model and its pruned version. The next step in the tutorial was the retrain step, to recover the accuracy lost due to pruning.

So, in the Jupyter Notebook I ran the command:

efficientdet train -e $SPECS_DIR/efficientdet_d0_retrain.txt \
                        -d $USER_EXPERIMENT_DIR/experiment_dir_retrain\
                        -k $KEY \
                        --gpu_index 1

And got this output (the […] indicates that the output was longer but similar to the surrounding lines):

Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
Loading experiment spec at %s. /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_retrain.txt
2022-06-24 18:32:22,642 [INFO] iva.efficientdet.utils.spec_loader: Merging specification from /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_retrain.txt
2022-06-24 18:32:23,059 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/efficientdet/experiment_dir_retrain/status.json
2022-06-24 18:32:23,059 [INFO] root: Starting EfficientDet training.
2022-06-24 18:32:23,059 [INFO] root: [train] AMP is activated - Experiment Feature
2022-06-24 18:32:23,060 [INFO] root: Create EncryptCheckpointSaverHook.
2022-06-24 18:32:23,060 [INFO] root: Loading pretrained model...
2022-06-24 18:32:35,662 [INFO] root: Starting training cycle: 1, epoch: 0.
Model: "model"

[...] 
       
==================================================================================================
Total params: 3,891,284
Trainable params: 3,845,428
Non-trainable params: 45,856
__________________________________________________________________________________________________
Pruned graph is loaded succesfully.
2022-06-24 18:32:41,479 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-06-24 18:32:41,762 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000
2022-06-24 18:33:04,756 [WARNING] root: Checkpoint is missing variable [current_loss_scale]
2022-06-24 18:33:04,756 [WARNING] root: Checkpoint is missing variable [good_steps]
2022-06-24 18:33:04,756 [WARNING] root: Checkpoint is missing variable [block3a_expand_bn/beta/ExponentialMovingAverage]
2022-06-24 18:33:04,756 [WARNING] root: Checkpoint is missing variable [block3a_dwconv/depthwise_kernel/ExponentialMovingAverage]

[...]

2022-06-24 18:33:04,776 [WARNING] root: Checkpoint is missing variable [class-1-bn-7/moving_variance/ExponentialMovingAverage]
2022-06-24 18:33:49,523 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt.
[GPU 00] Restoring pretrained weights (676 Tensors)
2022-06-24 18:33:52,616 [INFO] root: Pretrained weights loaded with success...

It stayed there and never continued the training. Then, using the nvtop command, I checked the process to understand whether it was in fact running. That’s when I realized there was a pyc file that would allow me to run the training again without invoking the tao efficientdet train command. So I stopped the execution and the Jupyter Notebook.
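For reference, a minimal sketch of that check from a shell inside the container (the train.pyc path is the one shipped in the 3.22.05 tao-toolkit-tf image, as used further down) would be:

# Confirm whether the training process is actually busy on the GPUs
nvidia-smi

# Locate the EfficientDet training entry point shipped in the container
find /usr/local/lib/python3.6/dist-packages/iva/efficientdet -name "train.pyc"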

Inside the Docker container I ran the following command (it makes the execution occur only on the second GPU):


CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 python3.6 /usr/local/lib/python3.6/dist-packages/iva/efficientdet/scripts/train.pyc -e specs/efficientdet_d0_retrain.txt -d /workspace/tao-experiments/efficientdet/experiment_dir_retrain/ -k nvidia_tlt --gpu_index 1

This way I was able to get a more complete output from the model, but it ended up stuck at the same point:


2022-06-24 20:04:29.284805: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.

Using TensorFlow backend.

2022-06-24 20:04:31.560552: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

Loading experiment spec at %s. specs/efficientdet_d0_retrain.txt

2022-06-24 20:04:31,808 [INFO] iva.efficientdet.utils.spec_loader: Merging specification from specs/efficientdet_d0_retrain.txt

2022-06-24 20:04:32,245 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt

2022-06-24 20:04:32,969 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt

2022-06-24 20:04:33,753 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/efficientdet/experiment_dir_retrain/status.json

2022-06-24 20:04:33,753 [INFO] root: Starting EfficientDet training.

2022-06-24 20:04:33,753 [INFO] root: [train] AMP is activated - Experiment Feature

2022-06-24 20:04:33,754 [INFO] root: Create EncryptCheckpointSaverHook.

2022-06-24 20:04:33,754 [INFO] root: Starting training cycle: 1, epoch: 0.

2022-06-24 20:04:33.856369: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1

2022-06-24 20:04:33.896565: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:04:33.896815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1666] Found device 0 with properties:

name: NVIDIA GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65

pciBusID: 0000:41:00.0

2022-06-24 20:04:33.896838: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

2022-06-24 20:04:33.899929: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11

2022-06-24 20:04:33.925619: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10

2022-06-24 20:04:33.926198: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10

2022-06-24 20:04:33.927310: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11

2022-06-24 20:04:33.929052: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11

2022-06-24 20:04:33.929213: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8

2022-06-24 20:04:33.929361: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:04:33.929724: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:04:33.929958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0

Model: "model"

[...]

==================================================================================================

Total params: 3,891,284

Trainable params: 3,845,428

Non-trainable params: 45,856

__________________________________________________________________________________________________

Pruned graph is loaded succesfully.

2022-06-24 20:04:39,310 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine

2022-06-24 20:04:39,597 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000

2022-06-24 20:05:05.844375: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3492930000 Hz

2022-06-24 20:05:05.844547: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6266fa0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:

2022-06-24 20:05:05.844591: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

2022-06-24 20:05:05.956348: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:05.956693: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a4fa0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

2022-06-24 20:05:05.956774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5

2022-06-24 20:05:05.957323: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:05.957915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1666] Found device 0 with properties:

name: NVIDIA GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65

pciBusID: 0000:41:00.0

2022-06-24 20:05:05.957979: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

2022-06-24 20:05:05.958104: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11

2022-06-24 20:05:05.958163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10

2022-06-24 20:05:05.958209: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10

2022-06-24 20:05:05.958260: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11

2022-06-24 20:05:05.958312: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11

2022-06-24 20:05:05.958364: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8

2022-06-24 20:05:05.958572: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:05.959056: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:05.959448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0

2022-06-24 20:05:06.222101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:

2022-06-24 20:05:06.222164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] 0

2022-06-24 20:05:06.222188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0: N

2022-06-24 20:05:06.222562: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:06.222989: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2022-06-24 20:05:06.223221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9666 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:41:00.0, compute capability: 7.5)

2022-06-24 20:05:09.682955: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:09.745272: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:13.171474: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:13.192779: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:15.859414: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:15.859651: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:17.327803: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:17.346898: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:19.977485: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:19.977868: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.019213: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.035546: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.038062: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.046337: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.052663: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.056733: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.058823: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.066910: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.253775: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.350226: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.352461: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.355949: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.358261: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.359805: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.360993: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.361850: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.363896: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.365662: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.367112: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:20.370644: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:21.733968: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:21.734185: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:42.195256: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:42.215183: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

2022-06-24 20:05:47,682 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt.

2022-06-24 20:05:50.430353: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:05:50.447254: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1343] No whitelist ops found, nothing to do

[GPU 00] Restoring pretrained weights (1354 Tensors)

2022-06-24 20:05:52,425 [INFO] root: Pretrained weights loaded with success...

2022-06-24 20:06:03.543855: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1987] Running auto_mixed_precision graph optimizer

2022-06-24 20:06:03.747367: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1035] Automatic Mixed Precision Grappler Pass Summary:

Total processable nodes: 28684

Recognized nodes available for conversion: 15214

Total nodes converted: 2174

Total FP16 Cast ops used (excluding Const and Variable casts): 122

Whitelisted nodes converted: 1068

Blacklisted nodes blocking conversion: 971

Nodes blocked from conversion by blacklisted nodes: 1235

For more information regarding mixed precision training, including how to make automatic mixed precision aware of a custom op type, please see the documentation available here:

https://docs.nvidia.com/deeplearning/frameworks/tensorflow-user-guide/index.html#tfamp

So at this point it had been running for an hour with no progress past this point. I reduced train_batch_size and num_examples_per_epoch to see if it would help, but no luck.

Can anyone help me with this situation?

I’ll put the info about the environment in the reply below.

INFOs:

• Hardware:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:08:00.0 Off |                  N/A |
| 39%   58C    P2   114W / 260W |  10779MiB / 11264MiB |     41%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:41:00.0 Off |                  N/A |
|  0%   54C    P8    10W / 260W |   1312MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |

+-------------------------------+----------------------+----------------------+

• Network Type: EfficientDet-D0
• TAO Version:

Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022

• Training spec files:
RETRAIN

training_config {
  checkpoint: "/workspace/tao-experiments/efficientdet/experiment_dir_pruned/model.tlt"
  pruned_model_path: "/workspace/tao-experiments/efficientdet/experiment_dir_pruned/model.tlt"
  train_batch_size: 6
  iterations_per_loop: 10
  checkpoint_period: 2
  num_examples_per_epoch: 70000
  num_epochs: 6
  tf_random_seed: 42
  lr_warmup_epoch: 3
  lr_warmup_init: 0.002
  learning_rate: 0.02
  amp: True
  moving_average_decay: 0.9999
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
}
dataset_config {
  num_classes: 91
  image_size: "512,512"
  training_file_pattern: "/workspace/tao-experiments/data/train*.tfrecord"
  validation_file_pattern: "/workspace/tao-experiments/data/val*.tfrecord"
  validation_json_file: "/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json"
  max_instances_per_image: 100
  skip_crowd_during_training: True
}
model_config {
  model_name: 'efficientdet-d0'
  min_level: 3
  max_level: 7
  num_scales: 3
}
augmentation_config {
  rand_hflip: True
  random_crop_min_scale: 0.1
  random_crop_max_scale: 2.0
}
eval_config {
  eval_batch_size: 6
  eval_epoch_cycle: 2
  eval_samples: 500
  min_score_thresh: 0.4
  max_detections_per_image: 100
}

TRAIN (This was the config that allowed me to initially train the model before the pruning step)

training_config {
  checkpoint: "/workspace/tao-experiments/efficientdet/pretrained_efficientdet_vefficientnet_b0/efficientnet_b0.hdf5"
  train_batch_size: 12
  iterations_per_loop: 10
  checkpoint_period: 2
  num_examples_per_epoch: 70000
  num_epochs: 12
  tf_random_seed: 42
  lr_warmup_epoch: 3
  lr_warmup_init: 0.002
  learning_rate: 0.02
  amp: True
  moving_average_decay: 0.9999
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
}
dataset_config {
  num_classes: 91
  image_size: "512,512"
  training_file_pattern: "/workspace/tao-experiments/data/train-*"
  validation_file_pattern: "/workspace/tao-experiments/data/val-*"
  validation_json_file: "/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json"
  max_instances_per_image: 100
  skip_crowd_during_training: True
}
model_config {
  model_name: 'efficientdet-d0'
  min_level: 3
  max_level: 7
  num_scales: 3
}
augmentation_config {
  rand_hflip: True
  random_crop_min_scale: 0.1
  random_crop_max_scale: 2.0
}
eval_config {
  eval_batch_size: 16
  eval_epoch_cycle: 2
  eval_samples: 500
  min_score_thresh: 0.4
  max_detections_per_image: 100
}

• How to reproduce the issue?

Could you try to disable amp and run the retraining?
amp: False
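If it helps, one quick way to flip that flag in place (assuming the retrain spec sits at specs/efficientdet_d0_retrain.txt, as in your earlier runs) is:

# Switch AMP off in the retrain spec
sed -i 's/amp: True/amp: False/' specs/efficientdet_d0_retrain.txt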

Hi @Morganh! Thanks for your reply!

I made this change in the config and got the same result. =/

Pruned graph is loaded succesfully.
2022-06-27 14:12:44,350 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-06-27 14:12:44,640 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000
2022-06-27 14:13:08.311798: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3492930000 Hz
2022-06-27 14:13:08.312124: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5aeaad0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-06-27 14:13:08.312190: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-06-27 14:13:08.436311: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:08.436722: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5b9a5d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-06-27 14:13:08.436786: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5
2022-06-27 14:13:08.437195: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:08.437524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1666] Found device 0 with properties: 
name: NVIDIA GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65
pciBusID: 0000:41:00.0
2022-06-27 14:13:08.437569: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-06-27 14:13:08.437664: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-06-27 14:13:08.437701: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-06-27 14:13:08.437734: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-06-27 14:13:08.437770: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2022-06-27 14:13:08.437802: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-06-27 14:13:08.437833: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-06-27 14:13:08.437948: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:08.438304: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:08.438561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0
2022-06-27 14:13:10.731618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-06-27 14:13:10.731676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0 
2022-06-27 14:13:10.731694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N 
2022-06-27 14:13:10.732719: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:10.733092: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 14:13:10.733350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9666 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:41:00.0, compute capability: 7.5)
2022-06-27 14:13:48,056 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt.
[GPU 00] Restoring pretrained weights (1352 Tensors)
2022-06-27 14:13:52,509 [INFO] root: Pretrained weights loaded with success...

Stuck there for an hour, unfortunately.

Thanks for the info. I need to check whether I can reproduce it.
On your side, please monitor the CPU memory and GPU memory when the hang happens.
Also, you can try setting a lower train_batch_size and retest.
In addition, delete this line:
checkpoint: "/workspace/tao-experiments/efficientdet/experiment_dir_pruned/model.tlt"
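For the monitoring, something as simple as this (run on the host or inside the container while the job appears stuck) should be enough:

# GPU memory and utilization, refreshed every 2 seconds
watch -n 2 nvidia-smi

# CPU and system memory at the moment of the hang
free -h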

No problem.

OK, so @Morganh, I have screenshots of both (running on device 1):

NVTOP output:

HTOP output (filtered to show only the process):

Can you spot anything?

Hi,
Could you please double-check? I actually cannot reproduce the issue; the retraining runs smoothly on my side.

Below are my spec files and scripts.

root@3065041:/workspace/tlt-experiments/efficientdet# cat train.sh
efficientdet train -e /workspace/tlt-experiments/efficientdet/efficientdet_d0_train.txt -d /workspace/tlt-experiments/efficientdet/experiment_dir_unpruned -k nvidia_tlt --gpus 1
root@3065041:/workspace/tlt-experiments/efficientdet# cat prune.sh
efficientdet prune -m /workspace/tlt-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt -o /workspace/tlt-experiments/efficientdet/experiment_dir_pruned -pth 0.7 -k nvidia_tlt --gpu_index 1
root@3065041:/workspace/tlt-experiments/efficientdet# cat retrain.sh
efficientdet train -e /workspace/tlt-experiments/efficientdet/efficientdet_d0_retrain.txt -d /workspace/tlt-experiments/efficientdet/experiment_dir_retrain -k nvidia_tlt --gpus 1 --gpu_index 1
root@3065041:/workspace/tlt-experiments/efficientdet# cat efficientdet_d0_train.txt
training_config {
  checkpoint: "/workspace/tlt-experiments/efficientdet/pretrained_efficientdet_vefficientnet_b0/efficientnet_b0.hdf5"
  train_batch_size: 16
  iterations_per_loop: 10
  checkpoint_period: 2
  num_examples_per_epoch: 118288
  num_epochs: 6
  tf_random_seed: 42
  lr_warmup_epoch: 3
  lr_warmup_init: 0.002
  learning_rate: 0.02
  amp: True
  moving_average_decay: 0.9999
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
}
dataset_config {
  num_classes: 91
  image_size: "512,512"
  training_file_pattern: "/workspace/tlt-experiments/efficientdet/tfrecords/train-*"
  validation_file_pattern: "/workspace/tlt-experiments/efficientdet/tfrecords/val-*"
  validation_json_file: "/workspace/tlt-experiments/mask_rcnn/coco/raw-data/annotations/instances_val2017.json"
  max_instances_per_image: 100
  skip_crowd_during_training: True
}
model_config {
  model_name: 'efficientdet-d0'
  min_level: 3
  max_level: 7
  num_scales: 3
}
augmentation_config {
  rand_hflip: True
  random_crop_min_scale: 0.1
  random_crop_max_scale: 2.0
}
eval_config {
  eval_batch_size: 16
  eval_epoch_cycle: 2
  eval_samples: 500
  min_score_thresh: 0.4
  max_detections_per_image: 100
}
root@3065041:/workspace/tlt-experiments/efficientdet# cat efficientdet_d0_retrain.txt
training_config {
  pruned_model_path: "/workspace/tlt-experiments/efficientdet/experiment_dir_pruned/model.tlt"
  train_batch_size: 16
  iterations_per_loop: 10
  checkpoint_period: 2
  num_examples_per_epoch: 118288
  num_epochs: 6
  tf_random_seed: 42
  lr_warmup_epoch: 3
  lr_warmup_init: 0.002
  learning_rate: 0.02
  amp: True
  moving_average_decay: 0.9999
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
}
dataset_config {
  num_classes: 91
  image_size: "512,512"
  training_file_pattern: "/workspace/tlt-experiments/efficientdet/tfrecords/train-*"
  validation_file_pattern: "/workspace/tlt-experiments/efficientdet/tfrecords/val-*"
  validation_json_file: "/workspace/tlt-experiments/mask_rcnn/coco/raw-data/annotations/instances_val2017.json"
  max_instances_per_image: 100
  skip_crowd_during_training: True
}
model_config {
  model_name: 'efficientdet-d0'
  min_level: 3
  max_level: 7
  num_scales: 3
}
augmentation_config {
  rand_hflip: True
  random_crop_min_scale: 0.1
  random_crop_max_scale: 2.0
}
eval_config {
  eval_batch_size: 16
  eval_epoch_cycle: 2
  eval_samples: 500
  min_score_thresh: 0.4
  max_detections_per_image: 100
}

Interesting. I will re-instantiate the Docker container and retry the steps you mentioned.

Hi @Morganh. After re-running everything, I got to the same point. Still stuck =/

root@a4416cb5896d:/workspace/tao-experiments/efficientdet# efficientdet train -e /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_retrain.txt -d /workspace/tao-experiments/efficientdet/experiment_dir_retrain/ -k nvidia_tlt --gpus 1 --gpu_index 1
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
Loading experiment spec at %s. /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_retrain.txt
2022-06-29 15:57:31,548 [INFO] iva.efficientdet.utils.spec_loader: Merging specification from /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_retrain.txt
2022-06-29 15:57:31,984 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/efficientdet/experiment_dir_retrain/status.json
2022-06-29 15:57:31,984 [INFO] root: Starting EfficientDet training.
2022-06-29 15:57:31,985 [INFO] root: [train] AMP is activated - Experiment Feature
2022-06-29 15:57:31,986 [INFO] root: Create EncryptCheckpointSaverHook.
2022-06-29 15:57:31,986 [INFO] root: Loading pretrained model...
2022-06-29 15:57:45,497 [INFO] root: Starting training cycle: 1, epoch: 0.
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
Input (InputLayer)              [(16, 512, 512, 3)]  0                                            
__________________________________________________________________________________________________
stem_conv_pad (ZeroPadding2D)   (16, 513, 513, 3)    0           Input[0][0]      

[...]          
__________________________________________________________________________________________________
box-predict (SeparableConv2D)   multiple             2916        activation_41[0][0]              
                                                                 activation_44[0][0]              
                                                                 activation_47[0][0]              
                                                                 activation_50[0][0]              
                                                                 activation_53[0][0]              
==================================================================================================
Total params: 3,891,284
Trainable params: 3,845,428
Non-trainable params: 45,856
__________________________________________________________________________________________________
Pruned graph is loaded succesfully.
2022-06-29 15:57:51,949 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-06-29 15:57:52,264 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [current_loss_scale]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [good_steps]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block4b_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block4c_se_expand/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block6a_project_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block6c_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block6d_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [weighted_fusion_0_0/weighted_fusion_0_0/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_0/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [class-1-bn-6/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [bifpn_bn_5_2/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block2b_project_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block3a_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block4c_expand_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block5b_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_2_2/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [box-0/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [class-0-bn-7/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [bifpn_bn_3_0/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,016 [WARNING] root: Checkpoint is missing variable [block2b_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block3a_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block4b_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block6b_project_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_0/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_6_2/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block1a_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_2_1/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_1/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [box-0-bn-5/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block6d_project_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn0_1_9_0_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn0_2_10_0_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [class-0-bn-3/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [box-0-bn-3/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_3_1/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_1/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block3b_dwconv/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block5a_expand_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block5c_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block6b_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block6c_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [block7a_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_0/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,017 [WARNING] root: Checkpoint is missing variable [bifpn_bn_6_2/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block6d_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [box-2-bn-4/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block5b_expand_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn_bn_0_0/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [class-0-bn-4/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [class-2-bn-6/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block3b_expand_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block4a_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn0_2_6_0_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn0_1_7_0_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn_bn_0_1/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn_bn_1_2/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [bifpn_bn_0_1/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [weighted_fusion_1_1/weighted_fusion_1_1/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block2a_project_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block2b_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block4c_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block5a_se_expand/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block6c_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block6d_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_0_0/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [box-0-bn-5/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block4b_expand_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block5a_expand_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [block6b_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_5_1/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,018 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_3_2/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [class-predict/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block4c_project_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block6a_expand_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block7a_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [class-2-bn-3/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block4b_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block6c_project_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_0/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_5_0/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_0_2/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block5a_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block6c_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block6d_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block7a_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_1_2/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_1/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block2b_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block3a_project_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block4c_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [p6_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [box-0-bn-5/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block5c_dwconv/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_3_0/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_1/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [class-2/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block4a_expand_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,019 [WARNING] root: Checkpoint is missing variable [block5a_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [bifpn_bn_1_1/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_2/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [box-2-bn-5/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block1a_se_expand/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block4c_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block5a_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block6d_project_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_2_1/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [stem_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block5b_project_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [bifpn_bn_2_0/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [class-1-bn-6/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [weighted_fusion_3_2/weighted_fusion_3_2/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block6d_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_0/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [bifpn0_0_8_0_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [box-0/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [box-0-bn-3/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block5a_project_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block5b_expand_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block6b_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [class-0-bn-5/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [class-2-bn-7/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block4a_se_expand/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,020 [WARNING] root: Checkpoint is missing variable [block7a_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_0_0/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_3_0/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_3_2/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn0_2_6_0_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_6_2/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_6_1/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block3a_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block6d_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block7a_expand_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_1_0/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_5_2/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [class-1-bn-6/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_0/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [class-2-bn-6/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block3b_expand_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block3b_project_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [box-0/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [class-1-bn-5/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block5c_expand_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block6c_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_0_0/moving_mean/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [bifpn_bn_7_1/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block2a_project_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,021 [WARNING] root: Checkpoint is missing variable [block6d_expand_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [bifpn0_2_6_0_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [weighted_fusion_1_2/weighted_fusion_1_2/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block1a_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block2a_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block4b_dwconv/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block5a_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block5b_bn/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [block7a_project_conv/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,022 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_7_2/depthwise_kernel/ExponentialMovingAverage]
[…]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block6d_dwconv/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block7a_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_0_0/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [box-1-bn-5/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block7a_bn/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [bifpn0_2_10_0/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [after_combine_dw_conv_6_0/pointwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [weighted_fusion_1_0/weighted_fusion_1_0/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block5b_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block6a_dwconv/depthwise_kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [block6c_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_2/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,045 [WARNING] root: Checkpoint is missing variable [class-0-bn-3/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [box-0-bn-7/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [class-2-bn-5/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [class-1-bn-5/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [box-1-bn-5/moving_variance/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block2b_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block4a_project_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block4b_expand_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block5b_se_expand/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block5c_se_reduce/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block6d_se_reduce/bias/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block7a_se_expand/kernel/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [p6_bn/beta/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [bifpn_bn_4_2/gamma/ExponentialMovingAverage]
2022-06-29 15:58:17,046 [WARNING] root: Checkpoint is missing variable [block6c_bn/moving_mean/ExponentialMovingAverage]
2022-06-29 15:59:05,512 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_retrain/model.step-0.tlt.
[GPU 00] Restoring pretrained weights (676 Tensors)
2022-06-29 15:59:09,033 [INFO] root: Pretrained weights loaded with success...

Very strange. So you can run training well but get stuck at retraining?

That was the case.

So what I’m doing right now is running the training again, which should take a while, and then trying everything after that (prune/retrain).

Also, I downloaded cv_samples_v1.4.1 instead of v1.4.0.

The thing is, it now gets stuck even in the train step.

So, the training is also getting stuck, right? Could you share the log as well?

To narrow it down, could you please use the 510 driver instead of the 515 driver?

I sure can!


2022-06-30 18:48:58.447553: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2022-06-30 18:49:00.964795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Loading experiment spec at %s. /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_train.txt
2022-06-30 18:49:01,647 [INFO] iva.efficientdet.utils.spec_loader: Merging specification from /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_train.txt
2022-06-30 18:49:01,650 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt
2022-06-30 18:49:02,581 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt
2022-06-30 18:49:03,587 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/status.json
2022-06-30 18:49:03,587 [INFO] root: Starting EfficientDet training.
2022-06-30 18:49:03,587 [INFO] root: [train] AMP is activated - Experiment Feature
2022-06-30 18:49:03,588 [INFO] root: Create EncryptCheckpointSaverHook.
2022-06-30 18:49:03,588 [INFO] root: Starting training cycle: 1, epoch: 0.
2022-06-30 18:49:03.711334: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-06-30 18:49:03.741249: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-30 18:49:03.741543: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1666] Found device 0 with properties: 
name: NVIDIA GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65
pciBusID: 0000:41:00.0
2022-06-30 18:49:03.741572: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-06-30 18:49:03.746545: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-06-30 18:49:03.778876: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-06-30 18:49:03.779437: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-06-30 18:49:03.780409: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2022-06-30 18:49:03.782294: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-06-30 18:49:03.782559: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-06-30 18:49:03.782720: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-30 18:49:03.783055: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-30 18:49:03.783237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0
2022-06-30 18:49:09,997 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-06-30 18:49:10,320 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000

I will try it and reply with the result.

Thanks again for the help!

@Morganh, to use the 510 driver instead of the 515 driver, I would have to downgrade CUDA from 11.7 to 11.6.

Since I’m running this inside the docker, is there a safe way to do this without breaking the server it is running on? (We use DeepStream 6.1 and other things on that server, so changing the driver would be a problem if done on the system outside the docker.)
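
For reference, a safe way to check which driver and CUDA versions each side actually sees, without changing anything on the host (nvcc is only available if the CUDA toolkit is installed in that environment):

# On the host: driver version reported by the kernel module
cat /proc/driver/nvidia/version

# Inside the TAO container: driver/CUDA versions the container sees
nvidia-smi
nvcc --version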

Got it. So, let us postpone the 510 driver experiment.

Regarding your latest “training” log above:

2022-06-30 18:49:03.783055: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-30 18:49:03.783237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1794] Adding visible gpu devices: 0
2022-06-30 18:49:09,997 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-06-30 18:49:10,320 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000

Is there any further log? Do you mean it gets stuck at 2022-06-30 18:49:10,320 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000?

Actually, what I pasted was not from the latest run. I noticed it just now, but the last run inside the docker ended up stuck after this:

2022-07-05 00:06:36,901 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt.
[GPU 00] Restoring pretrained weights (1364 Tensors)
2022-07-05 00:06:40,877 [INFO] root: Pretrained weights loaded with success...
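
One way to tell whether a run like this is truly hung, rather than still building the graph, is to watch GPU utilization and memory from another shell on the host while it sits there; a minimal sketch (the polling interval is arbitrary):

# Live view of GPU utilization / memory while the training process appears stuck
watch -n 2 nvidia-smi

# Or a compact, loggable query refreshed every 5 seconds
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total --format=csv -l 5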

So, I got tired and decided to try on another server. I got the same results when running this docker command:

 docker run --runtime=nvidia -it --rm --entrypoint "" \
-v ~/cv_samples_v1.4.1:/cv_samples_v1.4.1 \
-v /workspace/tao-experiments:/workspace/tao-experiments -p 25104:8888 \ 
nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3 \
/bin/bash

The other machine has this configuration:

root@6a4a7c71c42d:/workspace/tao-experiments/data# nvidia-smi
Tue Jul  5 01:51:28 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA TITAN X ...  On   | 00000000:01:00.0 Off |                  N/A |
| 51%   84C    P2   248W / 250W |   8898MiB / 12288MiB |     90%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

It has the 510.47.03 driver, and I ran the same docker run command as above. The training got stuck again.

So now I have ditched the idea of running inside the docker container directly to have more control. Instead, I’m running everything outside.
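
Running everything outside the container means going through the TAO launcher on the host. As a rough sketch of that setup (assuming a Python virtual environment; this is the launcher package and command I’d expect for the 3.22.05 release):

pip3 install nvidia-tao        # installs the tao launcher CLI
tao info --verbose             # shows which container each task (e.g. efficientdet) maps to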

I deleted all the data and started over. Downloading everything from scratch.

I generated the TFRecords for the Training and Val sets.

Now, instead of using the following command

!efficientdet train -e $SPECS_DIR/efficientdet_d0_train.txt \
                        -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned\
                        -k $KEY \
                        --gpus $NUM_GPUS

Now I’m using this command instead:

!tao efficientdet train -e $SPECS_DIR/efficientdet_d0_train.txt \
                        -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned\
                        -k $KEY \
                        --gpu_index 0
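
One practical difference from the bare efficientdet call inside the container is that the tao launcher mounts local directories into the container according to ~/.tao_mounts.json on the host. A minimal sketch of that file, with the source paths as placeholders for my local setup and the destinations matching $USER_EXPERIMENT_DIR and $SPECS_DIR used above:

{
    "Mounts": [
        {
            "source": "/home/<user>/tao-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/<user>/cv_samples_v1.4.1/efficientdet/specs",
            "destination": "/workspace/tao-experiments/efficientdet/specs"
        }
    ]
}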

By doing these steps, I finally got the training working!

For multi-GPU, change --gpus based on your machine.
2022-07-04 21:05:26,307 [INFO] root: Registry: ['nvcr.io']
2022-07-04 21:05:26,358 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-b0mgue4q because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
Loading experiment spec at %s. /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_train.txt
2022-07-05 00:05:32,589 [INFO] iva.efficientdet.utils.spec_loader: Merging specification from /workspace/tao-experiments/efficientdet/specs/efficientdet_d0_train.txt
2022-07-05 00:05:32,591 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt
2022-07-05 00:05:33,495 [INFO] root: Loading weights from /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt
2022-07-05 00:05:34,375 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/status.json
2022-07-05 00:05:34,375 [INFO] root: Starting EfficientDet training.
2022-07-05 00:05:34,375 [INFO] root: [train] AMP is activated - Experiment Feature
2022-07-05 00:05:34,376 [INFO] root: Create EncryptCheckpointSaverHook.
2022-07-05 00:05:34,376 [INFO] root: Starting training cycle: 1, epoch: 0.
2022-07-05 00:05:39,294 [INFO] iva.efficientdet.models.det_model_fn: LR schedule method: cosine
2022-07-05 00:05:39,580 [INFO] iva.efficientdet.models.det_model_fn: clip gradients norm by 5.000000
2022-07-05 00:06:36,901 [INFO] root: Saving checkpoints for 0 into /workspace/tao-experiments/efficientdet/experiment_dir_unpruned/model.step-0.tlt.
[GPU 00] Restoring pretrained weights (1364 Tensors)
2022-07-05 00:06:40,877 [INFO] root: Pretrained weights loaded with success...

2022-07-05 00:07:18,122 [INFO] iva.efficientdet.hooks.logging_hook: Global step 10 (epoch 1/100): loss: 3.25445 learning rate: 0.00201
2022-07-05 00:07:20,923 [INFO] iva.efficientdet.hooks.logging_hook: Global step 20 (epoch 1/100): loss: 3.96144 learning rate: 0.00201
2022-07-05 00:07:23,717 [INFO] iva.efficientdet.hooks.logging_hook: Global step 30 (epoch 1/100): loss: 3.16883 learning rate: 0.00202
2022-07-05 00:07:26,504 [INFO] iva.efficientdet.hooks.logging_hook: Global step 40 (epoch 1/100): loss: 3.40883 learning rate: 0.00202
2022-07-05 00:07:29,293 [INFO] iva.efficientdet.hooks.logging_hook: Global step 50 (epoch 1/100): loss: 3.43683 learning rate: 0.00203
2022-07-05 00:07:32,091 [INFO] iva.efficientdet.hooks.logging_hook: Global step 60 (epoch 1/100): loss: 3.79120 learning rate: 0.00204
2022-07-05 00:07:34,887 [INFO] iva.efficientdet.hooks.logging_hook: Global step 70 (epoch 1/100): loss: 3.35940 learning rate: 0.00204

**[...]** The training continues after this.

So, what I’m thinking is that there is something wrong with either the generated data or the docker setup.

I will have to wait for the training to finish and then try the Prune and Retrain steps to see if everything is working properly.

OK, thanks for the info! Please go ahead with your experiments.

After that, we can come back to check if it can work inside the docker.

Hello @viniciusarasantos, can you let us know if this is still an issue, or can we close it? Thanks.