
How to scrape multiple pages using Selenium in Python?

Last Updated : 03 Apr, 2023

As we know, Selenium is a web-based automation tool that helps us automate browsers. Selenium is an open-source testing tool, which means we can download it freely from the internet and use it. With the help of Selenium, we can also scrape data from webpages. In this article, we are going to discuss how to scrape multiple pages using Selenium.

There are many ways to scrape data from webpages; we will discuss one of them. Looping over the page number is the simplest way to scrape paginated data. We can use an incrementing counter to move from one page to the next, and the program will scrape as many pages as the loop runs.

First Page URL:

https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page=1

From one page to the next, only the page number at the end of the URL increments: page=1, page=2, and so on. Now, let's see the second page's URL.

Second Page URL:

https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page=2
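Since only the page number in the query string changes, each page's URL can be built directly from the loop counter. A minimal sketch, using the same page_url variable that appears in the code below:

Python3

# build the URL for a given page number
page = 2
page_url = f"https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page={page}"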

Now, let's discuss the approach.

Installation:

Before writing a single line of code, we have to install Selenium so that we can use its webdriver class, through which we can instantiate a browser and load the webpage from the targeted URL.

pip install selenium
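To confirm the installation worked, you can print the installed version. This is just a quick sanity check, not part of the scraper itself:

Python3

import selenium
print(selenium.__version__)  # e.g. 4.x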

Once Selenium is installed successfully, we can move on to installing the next package.

The next package is webdriver_manager, which downloads and manages the matching ChromeDriver binary for us. Let's install it:

pip install webdriver_manager

And we are done with the installation of the necessary packages.

Now, let's see the implementation below:

  • In this program, the for loop runs twice, so we scrape two webpages; to scrape more pages, increase the loop count.
  • Instantiate the Chrome web browser once, before the loop, and reuse it for every page.
  • Build the page URL in a string variable page_url, incrementing its page number with the for loop counter.
  • Open the page URL in the Chrome browser using the driver object.
  • Scrape data from the webpage using element locators such as the find_elements method, which returns a list of matching elements. We collect all the necessary data: title, price, description, and rating.
  • Store each product's details as a list of its fields, and append that list to the resultant list element_list.
  • Finally, print element_list, then close the driver object.

Python3
# importing necessary packages
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# for holding the resultant list
element_list = []

# instantiate the Chrome web browser once and reuse it for every page
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

for page in range(1, 3):

    # only the page number changes from one URL to the next
    page_url = f"https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page={page}"
    driver.get(page_url)

    # locate every product's fields by their CSS class names
    title = driver.find_elements(By.CLASS_NAME, "title")
    price = driver.find_elements(By.CLASS_NAME, "price")
    description = driver.find_elements(By.CLASS_NAME, "description")
    rating = driver.find_elements(By.CLASS_NAME, "ratings")

    # store each product's details as one row of the resultant list
    for i in range(len(title)):
        element_list.append([title[i].text, price[i].text,
                             description[i].text, rating[i].text])

print(element_list)

# closing the driver
driver.close()


Output:
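As written, the script opens a visible Chrome window for every run. If you prefer to scrape without one, Chrome can run headless; here is a minimal sketch of the change (the --headless=new flag assumes a reasonably recent Chrome; older versions use --headless instead):

Python3

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# configure Chrome to run without opening a visible window
options = Options()
options.add_argument("--headless=new")  # on older Chrome versions use "--headless"

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),
                          options=options)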

Storing data in Excel File:

Now, we will store the data from element_list in an Excel file using the xlsxwriter package. First, we have to install the xlsxwriter package:

pip install xlsxwriter

Once the installation is done, let's see the simple code through which we can convert the list of elements into an Excel file.

Python3
import xlsxwriter

# create the workbook; the with block saves and closes it automatically
with xlsxwriter.Workbook('result.xlsx') as workbook:
    worksheet = workbook.add_worksheet()

    # write each product's list as one row of the worksheet
    for row_num, data in enumerate(element_list):
        worksheet.write_row(row_num, 0, data)


First, we create a workbook named result.xlsx. Then we treat each product's list as a single row: enumerate supplies the row number, and write_row writes that row's data across the columns, starting at row 0, column 0.
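If you also want column headings in the sheet, you can write a header row first and start the data rows one row lower. The heading names below are only illustrative:

Python3

import xlsxwriter

with xlsxwriter.Workbook('result.xlsx') as workbook:
    worksheet = workbook.add_worksheet()

    # illustrative header row in row 0
    worksheet.write_row(0, 0, ["Title", "Price", "Description", "Rating"])

    # data rows start at row 1, below the header
    for row_num, data in enumerate(element_list, start=1):
        worksheet.write_row(row_num, 0, data)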

Now, let's see the complete implementation:

Python3
# importing necessary packages
import xlsxwriter
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

element_list = []

# instantiate the Chrome web browser once and reuse it for every page
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

for page in range(1, 3):

    # only the page number changes from one URL to the next
    page_url = f"https://webscraper.io/test-sites/e-commerce/static/computers/laptops?page={page}"
    driver.get(page_url)

    title = driver.find_elements(By.CLASS_NAME, "title")
    price = driver.find_elements(By.CLASS_NAME, "price")
    description = driver.find_elements(By.CLASS_NAME, "description")
    rating = driver.find_elements(By.CLASS_NAME, "ratings")

    for i in range(len(title)):
        element_list.append([title[i].text, price[i].text,
                             description[i].text, rating[i].text])

# write one product per row into result.xlsx
with xlsxwriter.Workbook('result.xlsx') as workbook:
    worksheet = workbook.add_worksheet()

    for row_num, data in enumerate(element_list):
        worksheet.write_row(row_num, 0, data)

driver.close()


Output:

Output file.

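If Excel isn't required, the same element_list can be written to a CSV file with Python's built-in csv module instead of xlsxwriter; a minimal sketch:

Python3

import csv

# write one product per row into result.csv
with open('result.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(element_list)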



