Segmentation Training Configuration

Hello, I have some questions regarding segmentation training. My goal is to use this SDK to perform segmentation with different models in 3D, 2D and potentially even 1D. So far I have looked at the online webinar, the SDK documentation and of course the showcase video. If there are any further resources I missed, let me know.

  1. The webinar slide about Bring Your Own Model explains how the “args” key in the model definition (in config_train.json) becomes the arguments to the init function of the Model subclass. I assume something similar happens under the hood with the “args” key when we use one of NVIDIA’s built-in models such as SegmAhnet3D, SegResnet or DenseNet121. Is there any documentation of the exact arguments for each of these models, and of any other built-in models and their arguments?

  2. Are there built-in models for 2D and 1D segmentation (not necessarily pre-trained)? If so, what are their names and expected arguments?

  3. The 3D Anisotropic Hybrid Network paper on arXiv mentions in Section 3.1 that the method should be able to use any network as the backbone, not necessarily just Resnet50. Does the SDK support using a different model as the backbone? Simply replacing the PRETRAIN_WEIGHTS_FILE with the weights of a smaller model does not work, since the system appears to expect Resnet50 regardless of what it is given. If this cannot be done by replacing the pre-trained weights of the backbone model, is it possible to supply the backbone as a custom Bring Your Own Model network?

  4. Final question, just a sanity check for myself: running train.sh on a 6GB GPU fails with an out-of-memory error. Is this expected? Another thread on this forum lists the GPUs that Clara is supported on, and they are all more powerful than the one I am using, yet I can run the AIAA server without any issues, even for multiple models at once. Does training with SegmAhnet3D take significantly more memory than inference?
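To make question 1 concrete, here is a minimal sketch of the general pattern I have in mind: a config entry with a name and an “args” dictionary that is unpacked into a class constructor. This is not the SDK's actual loader. MyModel, REGISTRY, build_model and the argument names num_classes and dropout are all made up for illustration.

```python
import json


class MyModel:
    """Hypothetical stand-in for a user-defined Model subclass."""

    def __init__(self, num_classes, dropout=0.0):
        self.num_classes = num_classes
        self.dropout = dropout


# Hypothetical fragment in the spirit of the model definition in
# config_train.json; the real schema may differ.
config_text = """
{
  "model": {
    "name": "MyModel",
    "args": {"num_classes": 2, "dropout": 0.1}
  }
}
"""

# Illustrative lookup table from config names to classes (an assumption,
# not how the SDK resolves names).
REGISTRY = {"MyModel": MyModel}


def build_model(config):
    """Instantiate the configured class, passing "args" as keyword arguments."""
    spec = config["model"]
    cls = REGISTRY[spec["name"]]
    return cls(**spec.get("args", {}))


model = build_model(json.loads(config_text))
print(model.num_classes, model.dropout)
```

With this pattern, a key in “args” that does not match a parameter of the init function raises a TypeError, which is why I would like to know the exact argument names for the built-in models.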

Hi there,

  1. You can find documentation of the built-in models here: https://docs.nvidia.com/clara/tlt-mi/clara-train-sdk-v2.0/ai4med/apidocs/ai4med.libs.models.html

  2. Currently no.

  3. Not at the moment. What you can do is implement your own network end to end.

  4. Yes, this is a large network. We recommend a GPU with at least 16 GB of memory.
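As a rough illustration of why training generally needs more memory than inference, the sketch below adds up the main contributors: weights, gradients, optimizer state and stored activations. Every number in it is a hypothetical placeholder, not a measurement of SegmAhnet3D, and it assumes 32-bit floats and an Adam-style optimizer that keeps two extra values per parameter.

```python
BYTES_FP32 = 4  # assumption: all tensors stored as 32-bit floats


def estimate_bytes(num_params, stored_activations, training):
    """Very rough memory estimate; ignores workspace, fragmentation and framework overhead."""
    weights = num_params * BYTES_FP32
    activations = stored_activations * BYTES_FP32
    if not training:
        # Inference only needs the weights plus the activations live at one time.
        return weights + activations
    grads = num_params * BYTES_FP32           # one gradient per parameter
    adam_state = 2 * num_params * BYTES_FP32  # two moment estimates per parameter
    # Training keeps activations of every layer for the backward pass.
    return weights + grads + adam_state + activations


# Hypothetical sizes, chosen only to show the shape of the calculation.
num_params = 50_000_000
peak_activations = 100_000_000   # largest set of activations live during inference
all_activations = 1_500_000_000  # all activations retained for backpropagation

inference_gb = estimate_bytes(num_params, peak_activations, training=False) / 1e9
training_gb = estimate_bytes(num_params, all_activations, training=True) / 1e9
print(f"inference ~{inference_gb:.1f} GB, training ~{training_gb:.1f} GB")
```

For 3D inputs the activation term usually dominates, so reducing the crop size or batch size is typically the first thing to try on a smaller GPU.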