Please provide the following information when requesting support.
• Hardware (P2000)
• Network Type (PointPillars)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): I don’t have TLT installed
• Training spec file(If have, please share here)
CLASS_NAMES: ['Object']
DATA_CONFIG:
    DATASET: 'GeneralPCDataset'
    DATA_PATH: '~/tao_point_pillars/data/pointpillars'
    DATA_SPLIT: {
        'train': train,
        'test': val
    }
    INFO_PATH: {
        'train': [infos_train.pkl],
        'test': [infos_val.pkl],
    }
    BALANCED_RESAMPLING: False
    POINT_FEATURE_ENCODING: {
        encoding_type: absolute_coordinates_encoding,
        used_feature_list: ['x', 'y', 'z', 'intensity'],
        src_feature_list: ['x', 'y', 'z', 'intensity'],
    }
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
    # DATA_AUGMENTOR:
    #     DISABLE_AUG_LIST: ['placeholder']
    #     AUG_CONFIG_LIST:
    #         - NAME: gt_sampling
    #           DB_INFO_PATH:
    #               - dbinfos_train.pkl
    #           PREPARE: {
    #               filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
    #           }
    #           SAMPLE_GROUPS: ['Car:15','Pedestrian:15', 'Cyclist:15']
    #           NUM_POINT_FEATURES: 4
    #           DATABASE_WITH_FAKELIDAR: False
    #           REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
    #           LIMIT_WHOLE_SCENE: False
    #         - NAME: random_world_flip
    #           ALONG_AXIS_LIST: ['x']
    #         - NAME: random_world_rotation
    #           WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
    #         - NAME: random_world_scaling
    #           WORLD_SCALE_RANGE: [0.95, 1.05]
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          #REMOVE_OUTSIDE_BOXES: True
        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
              'train': True,
              'test': False
          }
        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.16, 0.16, 6]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
              'train': 16000,
              'test': 10000
          }
    NUM_WORKERS: 4
MODEL:
    NAME: PointPillar
    VFE:
        NAME: PillarVFE
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [64]
    MAP_TO_BEV:
        NAME: PointPillarScatter
        NUM_BEV_FEATURES: 64
    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [3, 5, 5]
        LAYER_STRIDES: [2, 2, 2]
        NUM_FILTERS: [64, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]
    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False
        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2
        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Object',
                'anchor_sizes': [[1.90, 1.88, 1.56]],
                'anchor_rotations': [-3.14, 3.14],
                'anchor_bottom_heights': [1.24],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            }
        ]
        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder
        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }
    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False
        EVAL_METRIC: kitti
        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500
    SYNC_BN: False
OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 150
    OPTIMIZER: adam_onecycle
    LR: 0.0001
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9
    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001
    LR_WARMUP: False
    WARMUP_EPOCH: 1
    GRAD_NORM_CLIP: 10
    RESUME_MODEL_PATH: null
    PRETRAINED_MODEL_PATH: null
    PRUNED_MODEL_PATH: null
    TCP_PORT: 18888
    RANDOM_SEED: null
    CKPT_INTERVAL: 1
    MAX_CKPT_SAVE_NUM: 30
    MERGE_ALL_ITERS_TO_ONE_EPOCH: False
EVALUATION:
    BATCH_SIZE: 1
    CKPT: "/workspace/tao-experiments/pointpillars/ckpt/checkpoint_epoch_150.tlt"
INFERENCE:
    MAX_POINTS_NUM: 25000
    BATCH_SIZE: 1
    CKPT: "/workspace/tao-experiments/pointpillars/ckpt/checkpoint_epoch_150.tlt"
    VIS_CONF_THRESH: 0.1
train:
    batch_size: 1
    num_epochs: 150
    tcp_port: 18888
    checkpoint_interval: 10
    max_checkpoint_save_num: 150
    merge_all_iters_to_one_epoch: False
model:
    sync_bn: False
dataset:
    class_names: ['Bale']
    num_workers: 4
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
tao model pointpillars train -e $SPECS_DIR/pointpillars.yaml -r $USER_EXPERIMENT_DIR -k $KEY
2024-05-21 10:19:11,348 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2024-05-21 10:19:11,411 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.3.0-pyt
2024-05-21 10:19:11,429 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/home/david/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal.
2024-05-21 10:19:11,429 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
2024-05-21 15:19:16,069 [INFO] matplotlib.font_manager: generated new fontManager
python /usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/scripts/train.py --cfg_file /workspace/tao-experiments/pointpillars/specs/pointpillars.yaml --output_dir /workspace/tao-experiments/pointpillars --key tlt_encode
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/scripts/train.py", line 203, in <module>
    raise e
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/scripts/train.py", line 187, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/scripts/train.py", line 57, in main
    args, cfg = parse_config()
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/scripts/train.py", line 51, in parse_config
    cfg_from_yaml_file(expand_path(args.cfg_file), cfg)
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/pointcloud/pointpillars/pcdet/config.py", line 94, in cfg_from_yaml_file
    if not hasattr(config.train, "resume_training_checkpoint_path"):
AttributeError: 'EasyDict' object has no attribute 'train'
2024-05-21 10:19:21,614 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
I am trying to train a model in TAO using the TAO getting-started Jupyter notebooks.
When I try to train a PointPillars model, I get an error saying the EasyDict config object has no attribute "train". If I add a lowercase "train" section to the YAML file, training gets past this error but then fails in a similar way on the next item. It seems the spec file that ships with the getting-started notebooks does not match the format the code expects.
I had to modify the output slightly because I am not allowed to post too many links
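Here is a minimal sketch of why I think the lookup in the traceback fails. It uses only the standard library: `AttrDict` is a hypothetical stand-in for EasyDict (not the real class), `uppercase_top_level_keys` is a helper made up for illustration, and the keys are the uppercase top-level sections of my spec file before I added the lowercase ones. The only things taken from the traceback are the lowercase name "train" and the fact that the code reads it as an attribute.

```python
class AttrDict(dict):
    """Hypothetical stand-in for EasyDict: dict keys readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(
                f"'{type(self).__name__}' object has no attribute '{name}'"
            ) from None


def uppercase_top_level_keys(spec):
    """Return the top-level keys that are not all lowercase, in file order."""
    return [key for key in spec if key != key.lower()]


# Top-level sections of the spec as posted, values elided for brevity.
spec = AttrDict(
    CLASS_NAMES=["Object"],
    DATA_CONFIG={},
    MODEL={},
    OPTIMIZATION={},
    EVALUATION={},
    INFERENCE={},
)

# The traceback shows hasattr(config.train, "..."). The expression
# config.train is evaluated before hasattr runs, so a missing "train"
# section raises instead of returning False.
try:
    spec.train
    message = ""
except AttributeError as err:
    message = str(err)

# Checking the outer object instead is safe: hasattr swallows the error.
has_train = hasattr(spec, "train")

print(message)
print(has_train)
print(uppercase_top_level_keys(spec))
```

If this reading is right, every uppercase section the helper lists would need a lowercase counterpart in whatever layout the 5.3.0 container expects, which would explain why adding only "train" just moves the failure to the next item.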