
Controlling the Web Browser with Python

Last Updated : 23 Sep, 2021

This article shows how to control a web browser from Python using Selenium. Selenium is an open-source tool that automates web browsers. It provides a single interface for writing test scripts in languages such as Ruby, Java, NodeJS, PHP, Perl, Python, and C#.

To install the module, run this command in your terminal:

pip install selenium

For automation, download the latest Google Chrome along with the chromedriver release that matches its version.

Here we will automate the login at “https://auth.geeksforgeeks.org” and extract the name, email, and institute name from the logged-in profile.

Initialization and Authorization

First, initialize the web driver with Selenium and send a GET request to the URL. Then inspect the HTML document to find the input tags that accept the username/email and password, and the button tag used to sign in.

To send the user-supplied email and password to their respective input tags (By is imported from selenium.webdriver.common.by):

driver.find_element(By.NAME, 'user').send_keys(email)
driver.find_element(By.NAME, 'pass').send_keys(password)
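
The email and password values have to come from somewhere, and hard-coding them in the script makes them easy to leak. A minimal sketch, assuming the credentials are stored in two environment variables whose names (GFG_EMAIL, GFG_PASSWORD) are made up for this illustration:

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from environment variables.

    GFG_EMAIL and GFG_PASSWORD are hypothetical names chosen for this
    sketch; any mapping can be passed in place of os.environ.
    """
    env = os.environ if env is None else env
    missing = [key for key in ("GFG_EMAIL", "GFG_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GFG_EMAIL"], env["GFG_PASSWORD"]


if __name__ == "__main__":
    # Demo with an explicit mapping so the sketch runs without any setup.
    email, password = load_credentials(
        {"GFG_EMAIL": "example@example.com", "GFG_PASSWORD": "password"})
    print(email)
```

Calling load_credentials() with no argument reads the real environment, and the returned pair can be passed straight to send_keys.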

Identify the button tag with a CSS selector and click it:

driver.find_element(By.CSS_SELECTOR, 'button.btn.btn-green.signin-button').click()

Scraping Data

Scraping Basic Information from GFG Profile

After clicking Sign in, a new page should load containing the name, institute name, and email ID. Identify the tags containing this data and select them:

container = driver.find_elements(By.CSS_SELECTOR, 'div.mdl-cell.mdl-cell--9-col.mdl-cell--12-col-phone.textBold')

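Get the text from each element in the returned list. The institution is read from a nested a tag when one exists, otherwise from the cell itself (NoSuchElementException is imported from selenium.common.exceptions):

name = container[0].text
try:
    institution = container[1].find_element(By.CSS_SELECTOR, 'a').text
except NoSuchElementException:
    institution = container[1].text
email_id = container[2].text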

Finally, print the output:

print({"Name": name, "Institution": institution, "Email ID": email_id})
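
The lookup with a fallback can also be written without exception handling, because find_elements returns an empty list when nothing matches. A sketch of that variant: FakeCell is a hypothetical stand-in for a Selenium element so the logic runs without a browser, read_profile is an illustrative name, and "css selector" is the string value held by Selenium's By.CSS_SELECTOR constant.

```python
class FakeCell:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text, link_text=None):
        self.text = text
        self._link_text = link_text

    def find_elements(self, by, value):
        # Mimic Selenium: return a list, empty when nothing matches.
        if by == "css selector" and value == "a" and self._link_text is not None:
            return [FakeCell(self._link_text)]
        return []


def read_profile(cells):
    """Build the profile dict from the three cells selected on the page."""
    links = cells[1].find_elements("css selector", "a")
    institution = links[0].text if links else cells[1].text
    return {"Name": cells[0].text,
            "Institution": institution,
            "Email ID": cells[2].text}


if __name__ == "__main__":
    cells = [FakeCell("Jane Doe"),
             FakeCell("Example University", link_text="Example University"),
             FakeCell("example@example.com")]
    print(read_profile(cells))
```

With a real driver, the list returned by the CSS-selector lookup can be passed to read_profile unchanged, since real elements expose the same text attribute and find_elements method.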

Scraping Information from Practice tab

Click on the Practice tab and wait a few seconds for the page to load:

driver.find_elements(By.CSS_SELECTOR, 'a.mdl-navigation__link')[1].click()
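
A fixed pause either wastes time or turns out too short on a slow connection. Selenium ships WebDriverWait for this purpose; the sketch below only illustrates the underlying polling idea in plain Python, and wait_until is a hypothetical helper name, not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


if __name__ == "__main__":
    # Demo: the condition becomes true on the third call.
    calls = []

    def page_ready():
        calls.append(1)
        return len(calls) >= 3

    wait_until(page_ready, timeout=2.0, poll=0.01)
    print(len(calls))
```

With a driver in hand, the condition could be lambda: driver.find_elements('css selector', 'div.mdl-grid'), which stays falsy (an empty list) until the grids appear.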

Find the container that holds all the information, then select the grids inside it with a CSS selector:

container = driver.find_element(By.CSS_SELECTOR, 'div.mdl-cell.mdl-cell--7-col.mdl-cell--12-col-phone.whiteBgColor.mdl-shadow--2dp.userMainDiv')

grids = container.find_elements(By.CSS_SELECTOR, 'div.mdl-grid')

Iterate over the selected grids, extract the text from each one, and add it to a set (or list) for output:

res = set()
for grid in grids:
    res.add(grid.text.replace('\n',':'))
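
A set of "label:value" strings loses the page order and is awkward to query. If each grid's text really is a label line followed by a value line (an assumption; the sample strings below are invented, not taken from a real profile), the pairs can be kept in a dict instead. The function names are illustrative.

```python
def parse_grid_text(text):
    """Split one grid's text into (label, value) at the first newline."""
    label, _, value = text.partition("\n")
    # Any further newlines inside the value are flattened to spaces.
    return label.strip(), value.replace("\n", " ").strip()


def grids_to_dict(texts):
    """Turn an iterable of grid texts into a {label: value} dict."""
    return dict(parse_grid_text(t) for t in texts if t.strip())


if __name__ == "__main__":
    # Invented sample texts standing in for [grid.text for grid in grids].
    sample = ["Overall Coding Score\n120", "Problems Solved\n45", ""]
    print(grids_to_dict(sample))
```

In the script, the input would be [grid.text for grid in grids]; a dict preserves insertion order, so the output follows the order of the grids on the page.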

Below is the full implementation:

# Import the required modules
import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Main Function
if __name__ == '__main__':

    # Provide the email and password
    email = 'example@example.com'
    password = 'password'

    options = webdriver.ChromeOptions()
    options.add_argument("--start-maximized")
    options.add_argument('--log-level=3')

    # Provide the path of chromedriver present on your system.
    service = Service("C:/chromedriver/chromedriver.exe")
    driver = webdriver.Chrome(service=service, options=options)
    driver.set_window_size(1920, 1080)

    # Send a get request to the url
    driver.get('https://auth.geeksforgeeks.org/')
    time.sleep(5)

    # Find the input boxes by name in the DOM tree and
    # type the provided email and password into them.
    driver.find_element(By.NAME, 'user').send_keys(email)
    driver.find_element(By.NAME, 'pass').send_keys(password)

    # Find the signin button and click on it.
    driver.find_element(
        By.CSS_SELECTOR, 'button.btn.btn-green.signin-button').click()
    time.sleep(5)

    # Returns the list of elements
    # having the following css selector.
    container = driver.find_elements(
        By.CSS_SELECTOR,
        'div.mdl-cell.mdl-cell--9-col.mdl-cell--12-col-phone.textBold')

    # Extract the text for name, institution, and email_id.
    # The institution may be wrapped in a link; fall back to the cell text.
    name = container[0].text
    try:
        institution = container[1].find_element(By.CSS_SELECTOR, 'a').text
    except NoSuchElementException:
        institution = container[1].text
    email_id = container[2].text

    # Output Example 1
    print("Basic Info")
    print({"Name": name,
           "Institution": institution,
           "Email ID": email_id})

    # Clicks on Practice Tab
    driver.find_elements(
        By.CSS_SELECTOR, 'a.mdl-navigation__link')[1].click()
    time.sleep(5)

    # Select the container holding the information. Adjacent string
    # literals are joined, so the selector contains no stray whitespace.
    container = driver.find_element(
        By.CSS_SELECTOR,
        'div.mdl-cell.mdl-cell--7-col.mdl-cell--12-col-phone.'
        'whiteBgColor.mdl-shadow--2dp.userMainDiv')

    # Select the grids inside the container
    grids = container.find_elements(By.CSS_SELECTOR, 'div.mdl-grid')

    # Iterate over each grid and collect the text extracted from it.
    res = set()
    for grid in grids:
        res.add(grid.text.replace('\n', ':'))

    # Output Example 2
    print("Practice Info")
    print(res)

    # Quit the driver; this closes every window and ends the session.
    driver.quit()

