
This model simplifies the YouTube trending list based on the given data and identifies which industry appears in the trending list on the YouTube platform.


piyush033/Trending-Youtube-Video-Statistics


Trending-Youtube-Video-Statistics

Problem Statement

Simplify the trending list based on the given data and identify which industry appears in the trending list.

Table of Contents

Dataset Used

The original data comes from the Kaggle dataset Trending YouTube Video Statistics.

The data is also available in my repository: My Repository

Frameworks Used

Trending YouTube Video Statistics Data Dictionary

Context

YouTube (the world-famous video sharing website) maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”. Top performers on the YouTube trending list are music videos (such as the famously viral “Gangnam Style”), celebrity and/or reality TV performances, and the random dude-with-a-camera viral videos that YouTube is well known for.

This dataset is a daily record of the top trending YouTube videos.

Note that this dataset is a structurally improved version of this dataset.

Content

This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the US, GB, DE, CA, and FR regions (USA, Great Britain, Germany, Canada, and France, respectively), with up to 200 listed trending videos per day.

EDIT: Now includes data from RU, MX, KR, JP and IN regions (Russia, Mexico, South Korea, Japan and India respectively) over the same time period.

Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.

The data also includes a category_id field, which varies between regions. To retrieve the category for a specific video, look up its category_id in the associated JSON file. One such file is included for each region in the dataset.
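The category lookup described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes the category JSON has a top-level `items` list whose entries carry an `id` and a `snippet.title`, and it uses a small inline excerpt instead of reading a real file. The helper names `build_category_lookup` and `category_name` are hypothetical.

```python
import json

# Hypothetical excerpt standing in for a region's category JSON file.
# The "items" / "id" / "snippet.title" layout is an assumption.
CATEGORY_JSON = """
{
  "items": [
    {"id": "10", "snippet": {"title": "Music"}},
    {"id": "24", "snippet": {"title": "Entertainment"}}
  ]
}
"""


def build_category_lookup(raw_json):
    """Map each numeric category id to its human-readable title."""
    items = json.loads(raw_json)["items"]
    return {int(item["id"]): item["snippet"]["title"] for item in items}


def category_name(category_id, lookup):
    """Return the category title, or a placeholder for ids not in the file."""
    return lookup.get(category_id, "Unknown")


lookup = build_category_lookup(CATEGORY_JSON)
print(category_name(10, lookup))
print(category_name(99, lookup))
```

Because category_id values vary between regions, a lookup like this would presumably be built once per region from that region's own JSON file rather than shared across regions.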

For more information on specific columns in the dataset refer to the column metadata.

Acknowledgements

This dataset was collected using the YouTube API.

Inspiration

Possible uses for this dataset could include:

  • Sentiment analysis in a variety of forms
  • Categorising YouTube videos based on their comments and statistics.
  • Training ML algorithms like RNNs to generate their own YouTube comments.
  • Analysing what factors affect how popular a YouTube video will be.
  • Statistical analysis over time.

For further inspiration, see the kernels on this dataset!

Conclusion

  • The music category ranks first in the YouTube trending list.
  • The entertainment category ranks second, after music.
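One plausible way to arrive at a ranking like this is to count how many trending rows fall into each category. The sketch below is an assumption about the approach, not the repository's actual analysis: the CSV rows are invented for illustration, the `CATEGORY_NAMES` mapping is a hand-written subset, and `rank_categories` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Invented rows for illustration only; the real files contain more columns
# (title, channel title, views, likes, and so on).
SAMPLE_CSV = """video_id,category_id
a1,10
b2,24
c3,10
d4,10
e5,24
f6,17
"""

# Hand-written subset of category titles (assumed, not read from the dataset).
CATEGORY_NAMES = {10: "Music", 24: "Entertainment", 17: "Sports"}


def rank_categories(csv_text, names):
    """Count trending rows per category; return (name, count), most frequent first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(int(row["category_id"]) for row in reader)
    return [(names.get(cid, "Unknown"), n) for cid, n in counts.most_common()]


ranking = rank_categories(SAMPLE_CSV, CATEGORY_NAMES)
for name, count in ranking:
    print(f"{name}: {count}")
```

The counts here come from the invented sample and say nothing about the real dataset; running the same counting step on a region's actual CSV file would be needed to check the conclusion for that region.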

Model Visualizations

Data:

(Six screenshots of the data, dated 2022-07-23.)

Heatmap updates:

image

New dataframe updates:

image
