Clara Train FAQ

Q: How can I debug my data preparation/augmentation output?

A: You can use the SaveAsNifti transformation at the end of your pre-transforms to save samples of the output. Also make sure the interrupt flag is set to true so that training stops and waits for a key press before generating new samples:

{   "name": "SaveAsNifti",
   "args": {
       "fields": ["image","label"],
       "out_dir": "/data/_debugPatches/",
       "interrupt": true
   }
},

Q: For a multi-label segmentation problem, how can I make the network focus on the labels rather than the background?

A: Set skip_background to true in the Dice loss as shown below, or write your own loss function as described in the documentation:

"loss": {
  "name": "Dice",
  "args": {
"skip_background": true 
  }
},

Q: How can I show the train/validation Dice per label in TensorBoard?

A: Change the metrics in the train.json file. Below is an example for a total of 4 labels: 3 organs plus the background. Also note that the stopping_metric flag in the validation section selects the 3rd label rather than the average.

In the training section, add:

"aux_ops": [
  {
	"name": "DiceMaskedOutput",
	"args": {
  	"is_onehot_targets": false,
  	"skip_background": false,
  	"is_independent_predictions": false,
  	"tags": [
    	"dice_ALL",
    	"dice_d00",
    	"dice_d01",
    	"dice_d02",
    	"dice_d03"
  	]
	},
   "do_summary": true,
   "do_print": false
  }
],

Similarly, in the validation section:

"validate": {
  "metrics": [
	{"name": "MetricAverageFromArrayDice",
  	"args": {"name": "mean_dice","applied_key": "model"}},
	{"name": "MetricAverage", "args": {"name": "val_dice", "field": "dice_ALL"}},
	{"name": "MetricAverage", "args": {"name": "val_dice_00", "field": "dice_d00"}},
	{"name": "MetricAverage", "args": {"name": "val_dice_01", "field": "dice_d01"}},
	{"name": "MetricAverage", "args": {"name": "val_dice_02", "field": "dice_d02"}},
	{"name": "MetricAverage", "args": {"stopping_metric": true,"name": "val_dice_03", "field": "dice_d03"}
	}
  ],

Q: What are the supported 3D data transforms?

A:

  1. TransformVolumeCropROI is a 3D elastic transform with parameters for:
    • 3D crop
    • 3D rotation
    • 3D scale
    • sampling ratio
  2. NPResize3D
  3. NPRandomFlip3D
  4. NPRandomZoom3D
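
Each of these is configured like any other pre-transform. A minimal sketch using NPRandomFlip3D (only the common fields arg is shown; check the transform reference for the full argument list of each transform):

{
  "name": "NPRandomFlip3D",
  "args": {
    "fields": ["image", "label"]
  }
},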

Q: How can I have a metric based just on the loss?

A: For now, you can use custom code such as the following:

# Note: import paths may vary with your Clara Train version.
from ai4med.components.aux_ops import AuxiliaryOperation
from ai4med.common.build_ctx import BuildContext

class NegLoss(AuxiliaryOperation):

    def __init__(self, tag: str, do_summary=True, do_print=True):
        AuxiliaryOperation.__init__(self, do_summary, do_print)
        self.tag = tag

    def get_output_tensors(self, predictions, label, build_ctx: BuildContext):
        # Fetch the loss tensor from the build context and negate it,
        # so the metric follows the usual "higher is better" convention.
        loss = build_ctx.must_get(BuildContext.KEY_LOSS)
        neg_loss = -loss
        return {self.tag: neg_loss}

Then in the metric section of “validate”:

"metrics":
[
	{
    	"name": "ComputeAverage",
    	"args": {
      	"name": "negloss",
      	"is_key_metric": true,
      	"field": "negloss"
               }
        }
]

And add the following to the “train” section:

"aux_ops": [
  {
	"path": "your.path.to.NegLoss",
	"args": {
	      "tag”: “negloss”   
   }
  }
],

Q: Train config section: what is the difference between the train and val transforms?

A: The pre-transforms in the val section should be a subset of the train pre-transforms: simply remove the augmentation transforms from the train pre-transforms, as sketched below.
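
For illustration, assuming a NIfTI dataset, the train pre-transforms might look like:

"pre_transforms": [
  {"name": "LoadNifti", "args": {"fields": ["image", "label"]}},
  {"name": "NPRandomFlip3D", "args": {"fields": ["image", "label"]}}
]

while the val pre-transforms keep only the deterministic loading step:

"pre_transforms": [
  {"name": "LoadNifti", "args": {"fields": ["image", "label"]}}
]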

Q: Should the train and val pipelines be the same?

A: Most of the time, yes, except when using a caching pipeline. Caching should NOT be used in the val section: it would give the wrong values, since you would be validating against cached samples instead of running over the whole validation dataset.
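
For illustration, the train section can use a caching pipeline while validate uses the plain one (a sketch modeled on the standard segmentation MMAR configs; argument lists are trimmed and the values are placeholders):

"image_pipeline": {
  "name": "SegmentationImagePipelineWithCache",
  "args": {
    "data_list_file_path": "{DATASET_JSON}",
    "data_file_base_dir": "{DATA_ROOT}",
    "data_list_key": "training",
    "output_batch_size": 4
  }
}

and in the validate section:

"image_pipeline": {
  "name": "SegmentationImagePipeline",
  "args": {
    "data_list_file_path": "{DATASET_JSON}",
    "data_file_base_dir": "{DATA_ROOT}",
    "data_list_key": "validation",
    "output_batch_size": 1
  }
}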

Q: How can I fix the error below?

Exception: <class 'ValueError'>: Cannot feed value of shape (1, 128, 128, 128) for Tensor 'NV_MODEL_INPUT:0', which has shape '(?, 1, 128, 128, 128)'

A: You are missing batching. Batching should be done either by a transform or by the pipeline.
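
The error says the model input expects a batched 5D tensor (N, C, D, H, W) but received a 4D one. As a generic NumPy illustration of what the missing step does:

import numpy as np

img = np.zeros((1, 128, 128, 128), dtype=np.float32)  # (C, D, H, W): no batch dimension
batched = np.expand_dims(img, axis=0)                 # (1, 1, 128, 128, 128) matches (?, 1, 128, 128, 128)

In Clara Train, this dimension is normally added for you by a batching transform or by the pipeline's batch settings.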

Q: How can I write a custom loader to read h5 files?

A:

import logging

import h5py
import numpy as np

# Note: import paths may vary with your Clara Train version.
from ai4med.common.constants import ImageProperty
from ai4med.common.medical_image import MedicalImage
from ai4med.common.shape_format import ShapeFormat
from ai4med.common.transform_ctx import TransformContext
from ai4med.components.transforms.multi_field_transformer import MultiFieldTransformer

class Myh5pyLoader(MultiFieldTransformer):
    """Loads the dataset stored under h5Key from an HDF5 file into a MedicalImage."""

    def __init__(self, fields, shape="DHW", dtype="float32", h5Key="data"):
        MultiFieldTransformer.__init__(self, fields=fields)
        self._logger = logging.getLogger(self.__class__.__name__)
        self._dtype = dtype
        self._shape = ShapeFormat(shape)
        self._h5Key = h5Key

    def transform(self, transform_ctx: TransformContext):
        for field in self.fields:
            file_name = transform_ctx[field]
            img = self.load_h5py(file_name)
            transform_ctx.set_image(field, img)
        return transform_ctx

    def load_h5py(self, file_name):
        assert file_name, "Please provide a filename."

        if isinstance(file_name, (bytes, bytearray)):
            file_name = file_name.decode('UTF-8')
        with h5py.File(file_name, 'r') as hf:
            data = np.array(hf[self._h5Key])
            data = data.astype(self._dtype)

        img = MedicalImage(data, self._shape)
        img.set_property(ImageProperty.ORIGINAL_SHAPE, data.shape)
        img.set_property(ImageProperty.FILENAME, file_name)

        return img
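
To use the loader, reference it by module path in your pre-transforms; the module name my_transforms below is a placeholder for wherever you save the file:

{
  "path": "my_transforms.Myh5pyLoader",
  "args": {
    "fields": ["image", "label"],
    "shape": "DHW",
    "dtype": "float32",
    "h5Key": "data"
  }
},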

Q: How can I get the class probabilities from segmentation prior to the argmax, as well as the result after the argmax?

A: Step 1: use the fields_new arg of the ArgmaxAcrossChannels transformation to write the argmax result to a new field, leaving the probabilities in the original field:

"post_transforms": [
  {
    "name": "ArgmaxAcrossChannels",
    "args": {
      "fields": "model",
      "fields_new ": "modelArgMax"
    }
  }
],

Step 2: create two NIfTI writers, one for model and one for modelArgMax. Note that the dtype for model needs to be a float type to hold the probabilities:

"writers": [
    {
      "name": "WriteNifti",
      "args": {
        "field": "model",
        "dtype": "float32",
        "write_path": "{MMAR_EVAL_OUTPUT_PATH}"
      }
    },
    {
      "name": "WriteNifti",
      "args": {
        "field": "modelArgMax",
        "dtype": "uint8",
        "write_path": "{MMAR_EVAL_OUTPUT_PATH}"
      }
    }
  ]

Q: I don’t use Docker; we use Singularity. How can I convert the Clara Docker container to Singularity?

A: Please consult https://devblogs.nvidia.com/how-to-run-ngc-deep-learning-containers-with-singularity/