Open In App

How to extract youtube data in Python?

Last Updated : 17 Feb, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Prerequisites: Beautifulsoup

YouTube statistics of a YouTube channel can be used for analysis and it can also be extracted using python code. A lot of data like viewCount, subscriberCount, and videoCount can be retrieved. This article discusses 2 ways in which this can be done.

Method 1: Using YouTube API

First we need to generate an API key. You need a Google Account to access the Google API Console, request an API key, and register your application. You can use Google APIs page to do so.

To extract data we need the channel id of the YouTube channel whose stats we want to view. To get the channel id visit that particular YouTube channel and copy the last part of the URL (In the examples given below channel id of GeeksForGeeks channel are used). 

Approach

  • First create youtube_statistics.py 
  • In this file extract data using YTstats class and generate a json file will all the data extracted.
  • Now create main.py
  • In main import youtube_statistics.py
  • Add API key and channel id
  • Now using the first file data corresponding to the key given will be retrieved and saved to json file.

Example :

Code for main.py file :

Python3




from youtube_statistics import YTstats
 
# paste the API key generated by you here
API_KEY = "AIzaSyA-0KfpLK04NpQN1XghxhSlzG-WkC3DHLs"
 
 # paste the channel id here
channel_id = "UC0RhatS1pyxInC00YKjjBqQ"
 
yt = YTstats(API_KEY, channel_id)
yt.get_channel_statistics()
yt.dump()


Code for youtube_statistics.py file :

Python3




import requests
import json
 
 
class YTstats:
 
    def __init__(self, api_key, channel_id):
        self.api_key = api_key
        self.channel_id = channel_id
        self.channel_statistics = None
 
    def get_channel_statistics(self):
        url = f'https://www.googleapis.com/youtube/v3/channels?part=statistics&id={self.channel_id}&key={self.api_key}'
 
        json_url = requests.get(url)
        data = json.loads(json_url.text)
 
        try:
            data = data["items"][0]["statistics"]
        except:
            data = None
 
        self.channel_statistics = data
        return data
 
    def dump(self):
        if self.channel_statistics is None:
            return
 
        channel_title = "GeeksForGeeks"
        channel_title = channel_title.replace(" ", "_").lower()
 
        # generate a json file with all the statistics data of the youtube channel
        file_name = channel_title + '.json'
        with open(file_name, 'w') as f:
            json.dump(self.channel_statistics, f, indent=4)
        print('file dumped')


Output:

Method 2: Using BeautifulSoup

Beautiful Soup is a Python library for pulling data out of HTML and XML files. In this approach we will use BeautifulSoup and Selenium to scrape data from YouTube channels. This program will tell the views, time since posted, title and urls of the videos and print them using Python’s formatting.

Approach

  • Import module
  • Provide url of the channel whose data is to be fetched
  • Extract data
  • Display data fetched.

Example:

Python3




# import required packages
from selenium import webdriver
from bs4 import BeautifulSoup
 
# provide the url of the channel whose data you want to fetch
urls = [
]
 
 
def main():
    driver = webdriver.Chrome()
    for url in urls:
        driver.get('{}/videos?view=0&sort=p&flow=grid'.format(url))
        content = driver.page_source.encode('utf-8').strip()
        soup = BeautifulSoup(content, 'lxml')
        titles = soup.findAll('a', id='video-title')
        views = soup.findAll(
            'span', class_='style-scope ytd-grid-video-renderer')
        video_urls = soup.findAll('a', id='video-title')
        print('Channel: {}'.format(url))
        i = 0  # views and time
        j = 0  # urls
        for title in titles[:10]:
            print('\n{}\t{}\t{}\thttps://www.youtube.com{}'.format(title.text,
                                                                   views[i].text, views[i+1].text, video_urls[j].get('href')))
            i += 2
            j += 1
 
 
main()


Output



Previous Article
Next Article

Similar Reads

How to Extract YouTube Comments Using Youtube API - Python
Prerequisite: YouTube API Google provides a large set of API’s for the developer to choose from. Each and every service provided by Google has an associated API. Being one of them, YouTube Data API is very simple to use provides features like – Search for videosHandle videos like retrieve information about a video, insert a video, delete a video et
2 min read
GUI to get views, likes, and title of a YouTube video using YouTube API in Python
Prerequisite: YouTube API, Tkinter Python offers multiple options for developing GUI (Graphical User Interface). Out of all the GUI methods, Tkinter is the most commonly used method. It is a standard Python interface to the Tk GUI toolkit shipped with Python. Python with Tkinter is the fastest and easiest way to create GUI applications. In this art
2 min read
How to extract image information from YouTube Playlist using Python?
Prerequisite: YouTube API Google provides a large set of APIs for the developer to choose from. Each and every service provided by Google has an associated API. Being one of them, YouTube Data API is very simple to use provides features like – Search for videosHandle videos like retrieve information about a video, insert a video, delete a video, et
2 min read
Youtube Data API for handling videos | Set-1
Before proceeding further let's have a look at what we have in store for the videos section. Youtube Data API allows to perform following operations on the video: list insert update rate getRating reportAbuse delete Let's discuss how to use Youtube Data API for handling videos. Please follow the steps below to enable the API and start using it. Cre
4 min read
Youtube Data API | Set-1
Google provides a large set of API's for the developer to choose from. Each and every service provided by Google has an associated API. Being one of them, Youtube Data API is very simple to use provides features like - Search for videosHandle videos like retrieve information about a video, insert a video, delete a video etc.Handle Subscriptions lik
5 min read
Youtube Data API | Set-2
Prerequisite: Youtube Data API | Set-1 In the previous article we have discussed first two variants of search method. Now let's discuss the remaining three- Search Live Events, Search Related Videos and Search My Videos. Search by Live Events: Given example retrieves top 5 live broadcasts associated with the query string Python Programming. type pa
6 min read
Youtube Data API for handling videos | Set-2
Prerequisite: Youtube Data API for handling videos | Set-1 We have recently discussed first two variants of video list method, Video List by Video Id and Video List by Multiple Video Ids. Now, let's discuss the other two variants - Video List Most Popular Videos Video List My Liked Videos List most popular videos: The regionCode parameter states th
6 min read
Youtube Data API for handling videos | Set-3
Prerequisite: Youtube Data API for handling videos | Set-1, Set-2Before proceeding further, first see how to get a list of valid Video Categories, to determine in which category to put a video. For the example we have used the "IN" value for the regionCode parameter. You can use any other value. Only categoryId's that have assignable parameter as v
7 min read
Youtube Data API for handling videos | Set-4
Prerequisite: Youtube Data API for handling videos | Set-1, Set-2, Set-3 In this article we will discuss two methods related to video: Update a Video, Delete a Video. Updating a video and deleting an uploaded video requires user's authorization, so we will be creating OAuth type of credential for this example. Follow the steps below to generate a C
7 min read
Youtube Data API for handling videos | Set-5
Prerequisite: Youtube Data API for handling videos | Set-1, Set-2, Set-3 In this article, we will be discussing Rate a Video and Get Rating of a Video. Examples in this article will be requiring user authorization. So, we will be first creating OAuth Credential and install additional libraries. Follow the steps below to generate a Client Id and a S
5 min read