Sampling for Classification Task

According to the FAQ, sampling can be enabled to balance the dataset for a classification task:
https://docs.nvidia.com/clara/clara-train-sdk/pt/clara_faq.html?highlight=sample%20weights%20classes#how-to-enable-sampling-for-classification-task-to-balance-the-dataset
However, when I apply it, I get the following error:
MMAR_ROOT set to /mmar/commands/…
2021-08-26 07:01:17,340 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmppcyelmuh
2021-08-26 07:01:17,341 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmppcyelmuh/_remote_module_non_sriptable.py
Error processing config /mmar/commands/…/config/config_train_pe.json: object of type 'int' has no len()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "apps/train.py", line 35, in
  File "apps/train.py", line 27, in main
  File "apps/mmar_conf.py", line 21, in train_mmar
  File "<nvflare-0.1.4>/dlmed/utils/wfconf.py", line 172, in configure
  File "<nvflare-0.1.4>/dlmed/utils/wfconf.py", line 167, in configure
  File "<nvflare-0.1.4>/dlmed/utils/wfconf.py", line 163, in _do_configure
  File "apps/train_configer.py", line 471, in finalize_config
  File "apps/utils.py", line 120, in sample_weights_by_classes
  File "apps/utils.py", line 120, in
  File "apps/utils.py", line 117, in _convert_label
TypeError: object of type 'int' has no len()

The dataset part of my config JSON looks like this, which is exactly the same as in the example:
"dataset": {
  "name": "Dataset",
  "data_list_file_path": "{DATASET_JSON}",
  "data_file_base_dir": "{DATA_ROOT}",
  "data_list_key": "training",
  "sampling": {
    "mode": "auto"
  }
}
Any idea on what could be wrong?
thanks

Bump?

Hi Srikishnan. What do your labels look like? To perform the sampling, we read all the labels and build the list of unique classes from them. This error is being thrown during that conversion.
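To illustrate the kind of failure, here is a minimal sketch. It is not the actual Clara code (apps/utils.py is not shown in this thread), and the names to_class_index and sample_weights are hypothetical. It assumes, based only on the traceback, that the conversion calls len() on each label, so a label stored as a bare integer would fail while a list would not. The weighting shown (inverse class frequency) is one common way to balance classes and may differ from what "mode": "auto" does.

```python
from collections import Counter

# Calling len() on a bare int reproduces the message from the traceback.
try:
    len(1)
except TypeError as exc:
    message = str(exc)


def to_class_index(label):
    """Hypothetical helper: accept an int, a one-element list, or a one-hot list."""
    if isinstance(label, int):
        return label  # bare integer label, no len() call needed
    if len(label) == 1:
        return int(label[0])  # e.g. [1] -> 1
    # One-hot vector: the class is the position of the largest entry.
    return max(range(len(label)), key=lambda i: label[i])


def sample_weights(labels):
    """Inverse-frequency weight per sample, so each class gets equal total weight."""
    classes = [to_class_index(label) for label in labels]
    counts = Counter(classes)
    return [1.0 / counts[c] for c in classes]


weights_int = sample_weights([0, 0, 0, 1])               # bare integer labels
weights_onehot = sample_weights([[1, 0], [0, 1], [1, 0]])  # one-hot labels
```

If the labels in your dataset JSON are bare integers, checking whether the expected format is a list (for example a one-hot vector) would be a reasonable first step.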

Hello,
Attached is the dataset JSON I am using. The training configuration JSON is exactly the same as that of
clara_pt_covid19_3d_ct_classification_1.

Please let me know if you need some more information or where I am going wrong.

Regards,
Krishnan

strat_classification_0.json (9.5 KB)