How to get the Daily News using Python

Last Updated : 14 Sep, 2021

In this article, we will see how to get the day's news headlines using Python. We will use the requests module to download the BBC News page and Beautiful Soup to extract the headlines from its HTML.

Modules needed

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library. To install it, run the command below in the terminal.
pip install bs4
  • requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module is also not part of the Python standard library. To install it, run the command below in the terminal.
pip install requests

Stepwise Implementation:

Step 1: First of all, make sure to import these libraries.

Python3




import requests
from bs4 import BeautifulSoup


Step 2: Then, to get the HTML contents of https://www.bbc.com/news, add these 2 lines of code:

Python3

url = 'https://www.bbc.com/news'
response = requests.get(url)
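
A slightly more defensive version of this request is sketched below. It is an illustration rather than part of the original script: the helper name fetch_html and the 10-second timeout are assumptions, while the timeout argument and raise_for_status() are standard requests features that keep the script from hanging or from quietly parsing an error page.

```python
import requests

BBC_NEWS_URL = "https://www.bbc.com/news"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text (hypothetical helper)."""
    # The timeout (in seconds) is an assumed value; without one, requests can wait indefinitely
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    return response.text
```

Calling fetch_html(BBC_NEWS_URL) would then return the same text that response.text holds in the step above.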


Step 3: Get specific HTML tag

In order to find the HTML tags within which news headlines are contained, head over to https://www.bbc.com/news and inspect a news headline by right-clicking it and clicking “inspect”:

You will see that, at the time of writing, all headlines are contained within “<h3>” tags, so the script needs to collect every “<h3>” tag on the page.

First, we define “soup” as the parsed HTML content of the BBC News webpage. Next, we define “headlines” as a list of all “<h3>” tags found within the page body. Finally, the script loops through the “headlines” list and prints only the text of each element, using “text.strip()” to drop the surrounding markup and any leading or trailing whitespace. Add these lines of code to your script:

Python3




soup = BeautifulSoup(response.text, 'html.parser')
headlines = soup.find('body').find_all('h3')
for x in headlines:
    print(x.text.strip())
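
To see what these lines do without contacting the BBC site, the same calls can be run on a small hand-written HTML string. The markup and the helper name extract_headlines below are made up for illustration; only find, find_all, and text.strip() come from the script above.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page (not real BBC markup)
SAMPLE_HTML = """
<html><body>
  <h3> First headline </h3>
  <p>Some summary text</p>
  <h3><a href="/news/2">Second headline</a></h3>
</body></html>
"""


def extract_headlines(html):
    """Return the stripped text of every <h3> tag inside <body>."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.text.strip() for tag in soup.find("body").find_all("h3")]


print(extract_headlines(SAMPLE_HTML))
```

The paragraph tag is ignored, and the text of the nested link is returned without its markup, which is the same behaviour the loop above relies on.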


Below is the implementation:

Python3




import requests
from bs4 import BeautifulSoup
  
url = 'https://www.bbc.com/news'
response = requests.get(url)
  
soup = BeautifulSoup(response.text, 'html.parser')
headlines = soup.find('body').find_all('h3')
for x in headlines:
    print(x.text.strip())


Output:

Cleaning the data

You might have noticed that your output contains duplicate news headlines and text contents that aren’t news headlines.

Create a list of all the text elements you want to get rid of:

unwanted = ['BBC World News TV', 'BBC World Service Radio', 'News daily newsletter', 'Mobile app', 'Get in touch']

Then print text elements only if they are not in this list by wrapping the print call in a condition:

if x.text.strip() not in unwanted:
    print(x.text.strip())

Below is the implementation:

Python3




import requests
from bs4 import BeautifulSoup
  
url = 'https://www.bbc.com/news'
response = requests.get(url)
  
soup = BeautifulSoup(response.text, 'html.parser')
headlines = soup.find('body').find_all('h3')
unwanted = ['BBC World News TV', 'BBC World Service Radio',
            'News daily newsletter', 'Mobile app', 'Get in touch']
  
# Deduplicate on the headline text, keeping the original order
texts = [x.text.strip() for x in headlines]
for text in dict.fromkeys(texts):
    if text not in unwanted:
        print(text)
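
The filtering and de-duplication can also be done on the headline text itself, which avoids relying on how Beautiful Soup compares tag objects. The sketch below is one way to do it; the function name clean_headlines and the sample strings are hypothetical.

```python
def clean_headlines(texts, unwanted):
    """Drop empty strings, unwanted entries and duplicates, keeping first-seen order."""
    blocked = set(unwanted)
    # dict.fromkeys keeps one key per distinct string, in insertion order
    return [t for t in dict.fromkeys(texts) if t and t not in blocked]


# Hypothetical scraped strings, including a duplicate and a non-headline
sample = ["Storm hits coast", "Mobile app", "Storm hits coast", "Election results"]
print(clean_headlines(sample, ["Mobile app", "Get in touch"]))
```

In the script above, the list passed in would be built from the scraped tags, for example [x.text.strip() for x in headlines].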


Output:


