Building Medical 3D Image Segmentation Using Jupyter Notebooks from the NGC Catalog

Originally published at: Building Medical 3D Image Segmentation Using Jupyter Notebooks from the NGC Catalog | NVIDIA Developer Blog

The NVIDIA NGC team is hosting a webinar with live Q&A to dive into this Jupyter notebook available from the NGC catalog. Learn how to use these resources to kickstart your AI journey. Register now: NVIDIA NGC Jupyter Notebook Day: Medical Imaging Segmentation. Image segmentation partitions a digital image into multiple segments by changing the…

I tried to run the demo from the link below. I think the --data_dir argument in the “Prediction” section should be “/data/preprocessed” instead of “/data/preprocessed_test”. Can you please confirm? The demo was trained on the Brain Tumor 2019 data. Can I simply point --data_dir at the Brain Tumor 2020 data after converting it with dataset/preprocess_data.py, or would that cause a problem?

command: python main.py --model_dir /results --exec_mode predict --data_dir /data/preprocessed_test

URL: Building Medical 3D Image Segmentation Using Jupyter Notebooks from the NGC Catalog | NVIDIA Developer Blog

I am trying to download the dataset. Does anyone have the exact URL? I went to the URL given, Cancer Imaging Phenomics Toolkit (CaPTk) | CBICA | Perelman School of Medicine at the University of Pennsylvania, but cannot find the data itself.

Hi @thironaka, to get the dataset you need to register on that website and request it. The format of the 2020 dataset is different from the one we used (2019). If you want to use 2020, you need to either convert the dataset to the 2019 layout or adjust the preprocessing code to handle the new format; the folder names inside the 2020 dataset folder do not match those in 2019.
In the demo and in the notebook I mentioned that after preprocessing we need to set aside part of the data as test data. So I created a folder called preprocess-test, put the test data in it, and I use that folder for testing the model, not the data in preprocessed that I used for training.
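The hold-out step described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the helper name `hold_out_test_files`, the 10% default fraction, and the `*.tfrecord` file pattern are all assumptions made for the example.

```python
import random
import shutil
from pathlib import Path


def hold_out_test_files(train_dir, test_dir, fraction=0.1, seed=0, pattern="*.tfrecord"):
    """Move a random fraction of preprocessed files from train_dir into test_dir.

    Hypothetical helper: the file pattern and the fraction are assumptions,
    not values taken from the notebook.
    """
    train_dir, test_dir = Path(train_dir), Path(test_dir)
    test_dir.mkdir(parents=True, exist_ok=True)

    # Sort first so the selection is reproducible for a given seed.
    files = sorted(train_dir.glob(pattern))
    n_test = max(1, int(len(files) * fraction)) if files else 0
    chosen = random.Random(seed).sample(files, n_test)

    # Moving (not copying) keeps the test files out of the training folder.
    for f in chosen:
        shutil.move(str(f), str(test_dir / f.name))
    return sorted(p.name for p in chosen)
```

For instance, `hold_out_test_files("/data/preprocessed", "/data/preprocessed_test")` would leave the training files in the first folder and the held-out files in the second, which is the folder the prediction command reads.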

The dataset is not available for direct download. You need to register on the website and ask them to send you the dataset.

@skouchak I have already downloaded both the 2019 and 2020 datasets. As you mentioned, the 2020 data is in a different format. However, I was able to convert the 2020 data to .tfrecord format up to data number 49 by running “preprocess_data.py” with the “--single_data_dir” option; it failed at data number 50. See the error message below.
The rest of the data, from number 50 onward, still needs to be converted, but I could test with the 2020 data up to number 49. I segmented the first case and the result looked different from the last one, so the segmentation of the first 2020 case seems to have worked. But I still need to confirm whether the segmentation is correct by overlaying the segmentation mask on the original data.

50/93 tfrecord files created
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/nibabel/loadsave.py", line 42, in load
    stat_result = os.stat(filename)
NotADirectoryError: [Errno 20] Not a directory: '/data/MICCAI_BraTS2020_TrainingData/name_mapping.csv/name_mapping.csv_t1.nii.gz'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dataset/preprocess_data.py", line 160, in <module>
    main()
  File "dataset/preprocess_data.py", line 124, in main
    features = load_features(folder)
  File "dataset/preprocess_data.py", line 44, in load_features
    vol = load_single_nifti(os.path.join(path, name+modality)).astype(np.float32)
  File "dataset/preprocess_data.py", line 58, in load_single_nifti
    data = nib.load(path).get_fdata().astype(np.int16)
  File "/usr/local/lib/python3.6/dist-packages/nibabel/loadsave.py", line 44, in load
    raise FileNotFoundError(f"No such file or no access: '{filename}'")
FileNotFoundError: No such file or no access: '/data/MICCAI_BraTS2020_TrainingData/name_mapping.csv/name_mapping.csv_t1.nii.gz'
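For the mask check mentioned in this post, one possible approach is to tint the mask voxels on top of a grayscale slice. The sketch below is only an illustration with NumPy: `overlay_mask` is a hypothetical helper, and it assumes you have already extracted a 2D slice of the original volume and the matching 2D slice of the predicted mask (for example from arrays loaded with nibabel).

```python
import numpy as np


def overlay_mask(slice_2d, mask_2d, alpha=0.5, color=(1.0, 0.0, 0.0)):
    """Return an RGB image (values in [0, 1]) with mask voxels tinted by `color`.

    Hypothetical helper for visual inspection; any non-zero mask value
    is treated as foreground.
    """
    img = slice_2d.astype(np.float32)

    # Rescale intensities to [0, 1]; a constant slice becomes all zeros.
    span = img.max() - img.min()
    img = (img - img.min()) / span if span > 0 else np.zeros_like(img)

    # Grayscale to RGB by repeating the channel.
    rgb = np.stack([img, img, img], axis=-1)

    # Alpha-blend the tint colour into the foreground voxels only.
    fg = mask_2d > 0
    rgb[fg] = (1.0 - alpha) * rgb[fg] + alpha * np.asarray(color, dtype=np.float32)
    return rgb
```

The returned array can be passed to any image viewer that accepts float RGB data. If the tinted region does not line up with the visible tumor, the mask and the volume may be misaligned or taken from different cases.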

@Peita the model was built and trained using the 2019 dataset. I didn’t try the 2020 dataset for training or testing, but as long as you have the same format and the same directory names, the scripts should work fine. The errors I can see here are about a specific file not being found, which is related to the names of the directories in the dataset.

Removing name_mapping.csv and survival_info.csv from MICCAI_BraTS2020_TrainingData solved the “NotADirectoryError: [Errno 20] Not a directory” error above. I could then convert all of the 2020 data to .tfrecord.
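An alternative to deleting the CSV files would be to skip anything that is not a directory when enumerating the case folders. The snippet below is a minimal sketch of that idea; `list_case_dirs` is a hypothetical helper, not a function from dataset/preprocess_data.py, and wiring it into that script is left as an assumption.

```python
from pathlib import Path


def list_case_dirs(root):
    """Return the sorted subdirectories of `root`, ignoring plain files.

    Hypothetical helper: files such as name_mapping.csv and
    survival_info.csv are skipped because they are not directories.
    """
    return sorted(p for p in Path(root).iterdir() if p.is_dir())
```

Called on MICCAI_BraTS2020_TrainingData, this would return only the per-case folders, so the two CSV files could stay in place.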

I got the error below while running: !bash examples/unet3d_train_single.sh 10 /data/preprocessed /results 2

[1,0]:[ce2491b123f8:13974] Read -1, expected 16521, errno = 1
[1,0]:[ce2491b123f8:13974] Read -1, expected 13185, errno = 1
[1,0]:[ce2491b123f8:13974] Read -1, expected 9553, errno = 1
[1,0]:[ce2491b123f8:13974] Read -1, expected 19801, errno = 1

@ythacker
I used the URL below. You need to scroll down to the bottom of the page and register. CBICA takes a couple of days to approve you; I think I received my login approval after 2 days. Then you log in, browse the datasets, and click to request a download. After that, you receive an email from CBICA with a zip file. Unzip it and open the PDF file, which contains the download links.

URL: https://ipp.cbica.upenn.edu/



Awesome thank you!

@ythacker no problem.