Building Image Segmentation Faster Using Jupyter Notebooks from NGC

Originally published at: https://developer.nvidia.com/blog/building-image-segmentation-faster-using-jupyter-notebooks-from-ngc/

The NVIDIA NGC team is hosting a webinar with live Q&A to dive into this Jupyter notebook available from the NGC catalog. Learn how to use these resources to kickstart your AI journey. Register now: NVIDIA NGC Jupyter Notebook Day: Image Segmentation. Image segmentation is the process of partitioning a digital image into multiple segments…

Thank you for today's webinar!

When attempting to train with the suggested command

./UNet_1GPU.sh /results /data 1

the script crashes due to the following error:

Not found: /data/raw_images/private/Class1/train_list.csv; No such file or directory
	 [[{{node IteratorGetNext}}]]

I downloaded the dataset from two different sources (*) and neither includes the CSV file.
Could you elaborate on how to generate the CSV file from the provided dataset?

Thanks!

(*) data sources

I just found that the order of some commands should be different from what is suggested. Rather than executing

chmod +x ./download_and_preprocess_dagm2007.sh

./download_and_preprocess_dagm2007.sh /data

# download dataset from https://hci.iwr.uni-heidelberg.de/content/weakly-supervised-learning-industrial-optical-inspection

docker cp /home/<your file path> <container-id>:/data/raw_images/private

unzip /folder/path/<file>.zip

one should first download the dataset from the Weakly Supervised Learning for Industrial Optical Inspection page (Heidelberg Collaboratory for Image Processing, HCI) and then execute:

docker cp /home/<your file path> <container-id>:/data/raw_images/private

chmod +x ./download_and_preprocess_dagm2007.sh

./download_and_preprocess_dagm2007.sh /data

Thank you for your comment. These two lines download the public part of the data:

chmod +x ./download_and_preprocess_dagm2007.sh

./download_and_preprocess_dagm2007.sh /data

After that, we can download the private part of the dataset manually from the link provided by the script and put it inside the container using these two commands:

docker cp /home/<your file path> <container-id>:/data/raw_images/private

unzip /folder/path/<file>.zip

The order is correct.
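For reference, here is a minimal end-to-end sketch of those steps in one place. The container ID, the archive name Class1.zip, and the assumption that unzip is available in the container are placeholders/assumptions, not values from the original posts:

# inside the container: download and preprocess the public part
chmod +x ./download_and_preprocess_dagm2007.sh

./download_and_preprocess_dagm2007.sh /data

# on the host: find the container ID, then copy in the manually downloaded private zip
docker ps

docker cp /home/<your file path>/Class1.zip <container-id>:/data/raw_images/private

# inside the container: extract the private archive into the expected folder
unzip /data/raw_images/private/Class1.zip -d /data/raw_images/private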

For the demo, I used the link the script provided to create the account. Once you have the account, you can send a request for the private data; they will send you a link to the correct dataset, which you can access for a limited time to download it. I didn't generate the CSV file; the download I received after requesting the private data includes everything.

Two questions:

  • ./UNet_1GPU.sh /results /workspace/unet_industrial/data 1

Does it run the training for just one class? How should I understand the guide?

  • In addition, the script fails because it cannot find train_list.csv.

@joepareti54 Each script trains on the dataset of a single class; the class ID at the end of the command selects the class used for training. Run the command like this: ./UNet_1GPU.sh /results /data 1 . You don't need to change anything except the class ID if you want to train the model on another class.
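For example, to train on Class 3 instead of Class 1, only the trailing class ID changes (a sketch following the description above; per this thread the dataset has 10 classes, so IDs 1 through 10):

./UNet_1GPU.sh /results /data 3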

Yes, but what is the meaning of training a neural network on just one class? What is the best option for performance? I suppose you may want to use the entire training set? And why is the train_list.csv file not there?

@joepareti54 The dataset we used consists of 10 smaller datasets; the images within each one share a similar background but differ in their defect regions. These are 10 independent datasets, not one dataset divided into 10 parts.
For training, you need to download the private part of the dataset you want to use. For example, in the demo (and here) we downloaded the private part of the Class1 dataset and put it in the private folder inside the container, at /data/raw_images/private. If you download the private part of a dataset and put it in the correct folder, the CSV file is included in it.
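Based on the error message earlier in the thread, the layout inside the container should look roughly like this (only the train_list.csv path is confirmed by the error; the rest of the archive contents are an assumption):

/data/raw_images/private/
    Class1/
        train_list.csv   # shipped with the private download
        ...              # images from the private archive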

Is this syntax correct:

docker cp /home/skouchak/Class1.zip 1877b7cc7625:/data/raw_images/private

meaning that one must create the directories /data/raw_images/private inside the container?
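If the directories are missing, one way to create them before copying is with docker exec (a sketch reusing the container ID from the command above; whether this step is actually needed depends on the container image):

docker exec 1877b7cc7625 mkdir -p /data/raw_images/private

docker cp /home/skouchak/Class1.zip 1877b7cc7625:/data/raw_images/private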

There is still no way to find train_list.csv. Where is it supposed to come from?
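One way to verify whether the file arrived with the extracted private archive is to search for it inside the container (a sketch; the container ID is a placeholder):

docker exec <container-id> find /data/raw_images -name train_list.csv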