This project uses multiprocessing to download images in batches with Python.
It saved me a lot of time when downloading large sets of images.
git clone https://github.com/nOOBIE-nOOBIE/image_downloader_multiprocessing_python
pip install -r requirements.txt
python3 image_downloader.py <filename_with_urls_separated_by_newline.txt> <num_of_process>
This reads all the URLs in the text file and downloads them into a folder with the same name as the file. num_of_process is optional (by default it uses 10 processes).
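As a rough illustration of the workflow described above, here is a minimal sketch. It is not the repository's actual code. It assumes the output folder is the URL file's name without its extension (cats.txt gives cats) and that each image is saved under the last path segment of its URL. The function names read_urls, output_dir_for, download_one and download_all are hypothetical.

```python
import os
import urllib.request
from multiprocessing import Pool
from pathlib import Path


def read_urls(url_file):
    """Return the non-empty lines of url_file, one URL per line."""
    with open(url_file, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def output_dir_for(url_file):
    """Assumed naming rule: cats.txt -> cats."""
    return Path(url_file).stem


def download_one(job):
    """Download one (url, folder) pair; return (url, error or None)."""
    url, folder = job
    target = os.path.join(folder, os.path.basename(url))
    try:
        urllib.request.urlretrieve(url, target)
        return url, None
    except Exception as exc:  # one bad URL should not stop the batch
        return url, str(exc)


def download_all(url_file, num_processes=10):
    """Download every URL in url_file using a pool of worker processes."""
    urls = read_urls(url_file)
    folder = output_dir_for(url_file)
    os.makedirs(folder, exist_ok=True)
    jobs = [(url, folder) for url in urls]
    failures = []
    with Pool(processes=num_processes) as pool:
        # imap_unordered yields each result as soon as its worker finishes
        for url, error in pool.imap_unordered(download_one, jobs):
            if error is not None:
                failures.append((url, error))
    return failures
```

To try it, call download_all("cats.txt") from inside an `if __name__ == "__main__":` block, which multiprocessing needs on platforms that spawn worker processes.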
╰─ make help
image_downloader_aio       download images with asynchronous version
image_downloader_mp        download images with multiprocessing version
nodejs_install             install nodejs packages
nodejs_clean               remove node_modules
nodejs_image_downloader    download images with node-js version
clean                      remove all venv, build, coverage and Python artifacts
img-export-dir             create images export directory
clean-img                  remove images files
clean-pyc                  remove Python file artifacts (*.pyc,*.pyo,*~,__pycache__)
python3 image_downloader.py cats.txt
1183 images in 121.99 seconds with 10 processes.
╰─ /usr/bin/time -v make image_downloader_mp
find cats -name '*.jpg' -exec rm -f {} +
Nb url images: 1183
MESSAGE: Running 10 process
Downloading: https://cdn.pixabay.com/photo/2017/06/12/19/02/cat-2396473__480.jpg
[...]
Download complete: https://cdn.pixabay.com/photo/2015/07/13/21/54/gray-cat-843916__480.jpg
Command being timed: "make image_downloader_mp"
User time (seconds): 39.94
System time (seconds): 1.81
Percent of CPU this job got: 113%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:36.74
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 72540
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 31988
Voluntary context switches: 55631
Involuntary context switches: 2240
Swaps: 0
File system inputs: 0
File system outputs: 132160
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
╰─ /usr/bin/time -v make image_downloader_aio
find cats -name '*.jpg' -exec rm -f {} +
Nb url images: 1183
Downloading: https://cdn.pixabay.com/photo/2017/06/12/19/02/cat-2396473__480.jpg
[...]
Download complete: https://cdn.pixabay.com/photo/2014/10/29/22/12/cat-508665__480.jpg
Command being timed: "make image_downloader_aio"
User time (seconds): 6.26
System time (seconds): 1.43
Percent of CPU this job got: 54%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.24
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 60476
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 21267
Voluntary context switches: 38882
Involuntary context switches: 185
Swaps: 0
File system inputs: 0
File system outputs: 132632
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
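The asynchronous run above uses far less CPU time than the multiprocessing one because the work happens on one event loop instead of ten processes. The sketch below shows the general idea only. It is an assumption about the approach, not the code behind image_downloader_aio. The names fetch_bytes and download_all and the limit parameter are hypothetical, and the default fetcher wraps a blocking urllib call in a thread to stay within the standard library (Python 3.9+).

```python
import asyncio
import urllib.request


async def fetch_bytes(url):
    """Hypothetical fetcher: runs a blocking urllib request in a worker thread."""
    def blocking_get():
        with urllib.request.urlopen(url) as response:
            return response.read()
    return await asyncio.to_thread(blocking_get)


async def download_all(urls, fetch=fetch_bytes, limit=10):
    """Fetch all urls concurrently with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        # The semaphore caps how many fetches run at the same time
        async with semaphore:
            return url, await fetch(url)

    # gather returns results in the same order as the input urls
    return await asyncio.gather(*(bounded(url) for url in urls))
```

A call such as asyncio.run(download_all(urls)) returns a list of (url, bytes) pairs. Writing the bytes to disk is left out to keep the sketch short.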
╰─ /usr/bin/time -v make nodejs_image_downloader
find cats -name '*.jpg' -exec rm -f {} +
0
1
2
3
[...]
52
51
50
End.
Command being timed: "make nodejs_image_downloader"
User time (seconds): 6.77
System time (seconds): 2.02
Percent of CPU this job got: 54%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.19
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 95616
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 23830
Voluntary context switches: 35553
Involuntary context switches: 192
Swaps: 0
File system inputs: 0
File system outputs: 132936
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
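To compare the three runs, the elapsed wall-clock values reported by /usr/bin/time can be converted to seconds and divided. The sketch below does that with the figures shown above. The helper names elapsed_to_seconds and speedups are hypothetical.

```python
def elapsed_to_seconds(text):
    """Convert GNU time's 'h:mm:ss' or 'm:ss' elapsed field to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed (wall clock) values copied from the three runs above
ELAPSED = {
    "image_downloader_mp": "0:36.74",
    "image_downloader_aio": "0:14.24",
    "nodejs_image_downloader": "0:16.19",
}


def speedups(elapsed, baseline="image_downloader_mp"):
    """Return how many times faster each run is than the baseline run."""
    base = elapsed_to_seconds(elapsed[baseline])
    return {name: base / elapsed_to_seconds(value) for name, value in elapsed.items()}


for name, ratio in speedups(ELAPSED).items():
    print(f"{name}: {ratio:.2f}x relative to image_downloader_mp")
```

Each figure comes from a single run over the network, so the ratios are indicative rather than a rigorous benchmark.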