Downloading ILSVRC12 dataset

jhuds65 · November 12, 2017, 4:35pm

Hello, I just wanted to see if there was any way I could find a faster alternative to downloading the ILSVRC12 dataset; in the tutorial (two days to a demo) it mentions that it should take overnight on a decent Internet connection. I am guessing the author must have had a T-1 connection, as it is apparently going to take ~5 days to download all the images on my connection.

My system has other jobs to do, and I can’t tie it up for an uninterrupted 5-day download session. Might there at least be a way to download the dataset in chunks and not have to start from the beginning again?

John

Honey_Patouceul · November 12, 2017, 7:18pm

You may try the --continue (-c) option of wget, which resumes a partially downloaded file:

wget -c your_target_url
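For anyone who would rather script this, here is a rough Python sketch of the same resume idea, using only the standard library. It is not part of the tutorial: the function names are made up for illustration, and it assumes the server supports HTTP Range requests (if it doesn't, the sketch simply starts the file over).

```python
import os
import urllib.request


def resume_request(url, dest):
    """Build a request that asks only for the bytes missing from dest."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url)
    if offset:
        # Ask the server to start sending at the first byte we don't have yet.
        req.add_header("Range", f"bytes={offset}-")
    return req, offset


def download(url, dest, chunk=1 << 20):
    """Download url to dest, appending to a partial file when possible."""
    req, offset = resume_request(url, dest)
    with urllib.request.urlopen(req) as resp:
        # 206 means the server honoured the Range header; otherwise start over.
        mode = "ab" if offset and resp.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                block = resp.read(chunk)
                if not block:
                    break
                out.write(block)
```

Calling `download(your_target_url, "some_file.tar")` again after an interruption should pick up where the partial file ends. If the file is already complete, many servers answer with an error status, which `urlopen` raises as an exception.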

dusty_nv · November 12, 2017, 7:43pm

Hi, you could join image-net.org if applicable (see download-faq).

The crawler script reads a file of image URLs, so you could remove the URLs that you already have or that you aren’t interested in (either by editing the file or by modifying the script). There is also a recent project that crawls Google Images using Python.
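As a rough illustration of trimming the URL file, the Python sketch below drops entries that are already on disk and, optionally, keeps only selected classes. It assumes each line of the URL file looks like `<synset>_<index><TAB><url>` and that downloaded images are saved under that same ID; both are assumptions about the file layout, so check them against your copy. The function names are hypothetical and not part of the crawler script.

```python
from pathlib import Path


def downloaded_ids(image_dir):
    """Collect the file stems (names without extension) already on disk."""
    return {p.stem for p in Path(image_dir).rglob("*") if p.is_file()}


def filter_urls(lines, have_ids, wanted_synsets=None):
    """Keep only the URL lines that still need to be downloaded.

    Assumes each line is '<synset>_<index><TAB><url>' (an assumed format).
    """
    kept = []
    for line in lines:
        image_id, sep, url = line.strip().partition("\t")
        if not sep or not url:
            continue  # skip blank or malformed lines
        synset = image_id.split("_")[0]
        if image_id in have_ids:
            continue  # already downloaded
        if wanted_synsets is not None and synset not in wanted_synsets:
            continue  # class we don't care about
        kept.append(f"{image_id}\t{url}")
    return kept
```

You would read the original URL file, pass its lines through `filter_urls` together with `downloaded_ids("path/to/images")`, and write the result to a new file for the crawler to read.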

If you would prefer to use your own images, it is pretty easy to collect them for the imageNet example from 2 Days to a Demo, which uses the DIGITS image directory format: save a bunch of images of what you want to recognize into a folder named after each class.

Finally, you can skip ahead to the detectNet or segNet portion of the tutorial while the crawler script runs, if you prefer.

jhuds65 · November 12, 2017, 8:58pm

Thanks for the suggestions. I also found an academic torrent site that might speed things up.

jhuds65 · November 12, 2017, 11:21pm

That’s a great point; I was looking at the datasets and wasn’t really interested in identifying birds, etc. Our application is much more specific. As far as collecting our own images, is it fair to say that it really doesn’t matter what kind of crawler you use to get them, as long as they are arranged in the expected file/folder hierarchy?

Thanks again!

dusty_nv · November 13, 2017, 2:32pm

Hi jhuds65, yes you are correct, as long as you organize your images in a directory structure like so, you can use any domain-specific images you like for your application:

+ cat/
   - cat_0.jpg
   - other-cat.jpg
   - any_file-name_OK.png
   ...
   - cat_N.jpg
+ dog/
   + chihuahua/
       - woof_woof.png
       - subdirectories_are_ok.jpg
   + labrador/
       - they_get_flattened_into_Dog_class.jpg
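
If you want to sanity-check a folder before importing it, a small Python sketch like the one below lists each image with the class it would end up in under the layout above, i.e. the name of its top-level folder. This is only an illustration of the flattening rule, not how DIGITS itself does it, and the set of accepted extensions is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your tools accept.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_labeled_images(root):
    """Return (path, label) pairs, labelling each image by its top-level folder."""
    root = Path(root)
    pairs = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTS:
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            continue  # image sits directly in root, so it has no class folder
        # Nested folders (dog/chihuahua/...) collapse into the top-level class.
        pairs.append((path, parts[0]))
    return pairs
```

Counting the labels it returns is a quick way to spot empty or misnamed class folders.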

jhuds65 · November 13, 2017, 6:29pm

That’s great, and should be enough to get me off to a good start.

Again, thanks to everyone for their help!

Now, off to build something cool (I hope).
