How to Download All Images from a Web Page in Python?

Last Updated : 16 Oct, 2021

Web scraping is a technique for fetching data from websites. Many websites don’t offer a way to save their data for personal use. One option is to copy and paste the data manually, which is both tedious and time-consuming. Web scraping automates the process of extracting data from websites. In this article, we will discuss how to download all images from a web page using Python.

Modules Needed

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built in with Python.
  • requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built in with Python.
  • os: The os module in Python provides functions for interacting with the operating system. It is part of Python’s standard utility modules and provides a portable way of using operating-system-dependent functionality.
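As a small illustration of the portable interface that os offers, the sketch below creates a download folder and builds a file path without hard-coding a path separator. The helper name `ensure_folder` is hypothetical, and it uses `os.makedirs` with `exist_ok=True` instead of the `os.mkdir` call used later in this article, so treat it as one possible variant rather than the article's method.

```python
import os
import tempfile


def ensure_folder(path):
    """Create the folder if it is missing and return its path."""
    # exist_ok=True makes the call succeed when the folder already exists
    os.makedirs(path, exist_ok=True)
    return path


# Work inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    folder = ensure_folder(os.path.join(base, "images"))
    ensure_folder(folder)  # calling it again is harmless
    # os.path.join picks the right separator for the current platform
    print(os.path.join("images", "images1.jpg"))
    print(os.path.isdir(folder))
```

With `exist_ok=True` there is no need to ask the user for another name when the folder already exists, at the cost of possibly mixing new downloads with old files.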

Approach

  • Import the modules.
  • Get the HTML code of the page.
  • Get the list of img tags from the HTML code using the find_all method in Beautiful Soup (findAll is its older alias).
images = soup.find_all('img')
  • Create a separate folder for the downloaded images using the mkdir method in os.
os.mkdir(folder_name)
  • Iterate through all the images and get the source URL of each image.
  • After getting the source URL, the last step is to download the image.
  • Fetch the content of the image.
r = requests.get(source_url).content
  • Write the image to disk using file handling.
# Enter the file name with an extension such as jpg, png, etc.
with open("File Name", "wb+") as f:
    f.write(r)
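The "get the source URL" step hides two details: attribute values are often relative to the page, and srcset-style attributes list several candidates rather than one URL. The following is a minimal sketch of how both could be handled with the standard library only. The helper names `first_srcset_url` and `absolute_image_url` are hypothetical, the example.com URLs are placeholders, and splitting on commas is a simplification that would break for URLs that themselves contain commas.

```python
from urllib.parse import urljoin


def first_srcset_url(srcset):
    """Return the first URL from a srcset-style value, or None if empty."""
    # Candidates are comma-separated; each one is "URL [descriptor]".
    for candidate in srcset.split(","):
        parts = candidate.split()
        if parts:
            return parts[0]
    return None


def absolute_image_url(page_url, raw_value):
    """Resolve an img attribute value against the URL of the page."""
    return urljoin(page_url, raw_value)


page = "https://example.com/blog/post.html"  # placeholder page URL
print(absolute_image_url(page, "/static/logo.png"))
print(absolute_image_url(page, "img/photo.jpg"))
print(first_srcset_url("small.jpg 480w, large.jpg 1080w"))
```

The resolved URL, not the raw attribute value, is what would be passed to requests.get.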

Program:

Python3




from bs4 import BeautifulSoup
import requests
import os
 
# CREATE FOLDER
def folder_create(images):
    # keep asking until a folder with a new name can be created
    while True:
        folder_name = input("Enter Folder Name:- ")
        try:
            # folder creation
            os.mkdir(folder_name)
            break

        # if a folder with that name exists, ask for another name
        except FileExistsError:
            print("A folder with that name already exists!")

    # image downloading start
    download_images(images, folder_name)
 
 
# DOWNLOAD ALL IMAGES FROM THAT URL
def download_images(images, folder_name):
   
    # initial count is zero
    count = 0
 
    # print total images found in URL
    print(f"Total {len(images)} Image Found!")
 
    # checking if images is not zero
    if len(images) != 0:
        for i, image in enumerate(images):
            # From the img tag, fetch the image source URL.
            # The attributes are checked in this order:
            #   1. data-srcset
            #   2. data-src
            #   3. data-fallback-src
            #   4. src
            image_link = None
            for attr in ("data-srcset", "data-src", "data-fallback-src", "src"):
                # use the first attribute that is present and not empty
                if image.get(attr):
                    image_link = image[attr]
                    break

            # if no source URL was found, skip this img tag
            if image_link is None:
                continue
 
            # After getting the image source URL,
            # try to get the content of the image
            try:
                r = requests.get(image_link).content
                try:
                    # if the content decodes as UTF-8 it is text
                    # (for example an HTML error page), not image
                    # data, so it is skipped
                    r.decode('utf-8')

                except UnicodeDecodeError:
                    # binary content: write the image to disk
                    with open(f"{folder_name}/images{i+1}.jpg", "wb+") as f:
                        f.write(r)

                    # count the number of images downloaded
                    count += 1

            # skip images that cannot be fetched or saved
            except (requests.RequestException, OSError):
                pass
 
        # It is possible that not all
        # images were downloaded
        # if all images were downloaded
        if count == len(images):
            print("All Images Downloaded!")
             
        # if some images were not downloaded
        else:
            print(f"Total {count} Images Downloaded Out of {len(images)}")
 
# MAIN FUNCTION START
def main(url):
   
    # content of URL
    r = requests.get(url)
 
    # Parse HTML Code
    soup = BeautifulSoup(r.text, 'html.parser')
 
    # find all images in URL
    images = soup.find_all('img')
 
    # Call folder create function
    folder_create(images)
 
 
# take url
url = input("Enter URL:- ")
 
# CALL MAIN FUNCTION
main(url)
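The program above saves every file with a .jpg extension and treats any response that does not decode as UTF-8 as an image. One possible alternative, sketched here under the assumption that only PNG, JPEG and GIF matter, is to look at the leading bytes of the downloaded content. The names `SIGNATURES` and `sniff_extension` are hypothetical and are not part of the original program.

```python
# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
)


def sniff_extension(data):
    """Return a file extension for recognised image bytes, else None."""
    for magic, ext in SIGNATURES:
        if data.startswith(magic):
            return ext
    return None


# A PNG header followed by filler bytes is recognised as .png
print(sniff_extension(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
# An HTML page is not recognised, so the result is None
print(sniff_extension(b"<html>not an image</html>"))
```

In the download loop, the returned extension could replace the fixed ".jpg" in the file name, and a result of None could be used to skip the file. Formats outside this short table, such as WebP or SVG, would need additional handling.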

