YouTube trending video analysis package — merges Kaggle data with YouTube API for EDA and predictive modeling
# YouTube Performance Analysis
**STAT 386 Final Project — Summer Price & Jane Gustafson**
This package analyzes YouTube trending video data by merging the Kaggle YouTube Trending Videos dataset with live data from the YouTube Data API. The result is a custom longitudinal dataset that tracks how videos that trended between November 2017 and June 2018 have grown since then, enabling exploratory analysis and predictive modeling of video performance.
## What This Package Does
- Downloads the Kaggle YouTube Trending Videos dataset (US, Nov 2017 - Jun 2018)
- Fetches current view, like, and comment counts for each video via the YouTube Data API
- Merges both sources into a single cleaned dataset
- Runs exploratory data analysis across 5 dimensions (growth, trending patterns, categories, engagement, time to trend)
- Trains 3 Random Forest models to predict current views, time to trend, and view growth
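
The merge and growth steps listed above can be sketched in plain Python. This is only an illustration, not the package's implementation: the function name `merge_and_compute_growth`, the input keys `video_id` and `views`, and the output keys `current_views`, `view_growth`, and `growth_ratio` are all assumptions made for the example.

```python
from typing import Any, Dict, List


def merge_and_compute_growth(
    trending_rows: List[Dict[str, Any]],
    current_stats: Dict[str, Dict[str, int]],
) -> List[Dict[str, Any]]:
    """Inner-join trending rows with current stats and add growth fields.

    All field names here are hypothetical and chosen for illustration.
    """
    merged = []
    for row in trending_rows:
        stats = current_stats.get(row["video_id"])
        if stats is None:
            # No current data for this video (for example, it was removed),
            # so it is dropped, as an inner join would do.
            continue
        views_then = row["views"]
        views_now = stats["views"]
        merged.append({
            **row,
            "current_views": views_now,
            "view_growth": views_now - views_then,
            # Avoid dividing by zero when no views were recorded at trending time.
            "growth_ratio": views_now / views_then if views_then else None,
        })
    return merged


# Toy data standing in for the Kaggle snapshot and the API response.
trending = [
    {"video_id": "a1", "views": 1000},
    {"video_id": "b2", "views": 500},
    {"video_id": "c3", "views": 0},
]
current = {"a1": {"views": 4000}, "c3": {"views": 10}}
result = merge_and_compute_growth(trending, current)
```

Videos missing from the second source are dropped here; a real pipeline might instead keep them and mark the current counts as missing, depending on what the analysis needs.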
## Quick Start
```bash
git clone https://github.com/summeraskey/final_project386.git
cd final_project386
uv venv
source .venv/bin/activate
uv sync
```

Create a `.env` file in the project root:

```
YOUTUBE_API_KEY=your_youtube_api_key
KAGGLE_USERNAME=your_kaggle_username
KAGGLE_KEY=your_kaggle_api_key
```
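
Before running the pipeline, it can help to confirm that all three credentials are present. The sketch below uses only the standard library; `parse_env_text` and `missing_keys` are hypothetical helpers written for this example, and the README does not say how the package itself reads the `.env` file.

```python
import os
from typing import Dict, List, Mapping

# The three variables the .env file is expected to define.
REQUIRED_KEYS = ("YOUTUBE_API_KEY", "KAGGLE_USERNAME", "KAGGLE_KEY")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example: a .env file that forgot the Kaggle API key.
sample = "YOUTUBE_API_KEY=abc123\n# Kaggle credentials\nKAGGLE_USERNAME=jane\n"
parsed = parse_env_text(sample)
missing = missing_keys(parsed)

# The same check works against the live process environment.
missing_in_process = missing_keys(os.environ)
```

This parser handles only the plain `KEY=VALUE` form shown above; it is a stand-in for a full `.env` loader, not a replacement for one.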
## Usage

```python
from final_project_demo import run_cleaning_pipeline, run_analysis_pipeline

# Data loading and cleaning pipeline
df = run_cleaning_pipeline()

# EDA and predictive modeling
run_analysis_pipeline(df)
```
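
The cleaning step has to fetch current statistics for many videos from the YouTube Data API, and the `videos.list` endpoint accepts a comma-separated `id` parameter of up to 50 IDs per request. The sketch below shows one way to de-duplicate and batch IDs before making requests. The helper names are hypothetical, and whether the package batches its requests this way is an assumption.

```python
from typing import Iterable, Iterator, List

# Upper bound on IDs per videos.list request.
MAX_IDS_PER_REQUEST = 50


def unique_in_order(video_ids: Iterable[str]) -> List[str]:
    """Drop repeated IDs while keeping first-seen order."""
    return list(dict.fromkeys(video_ids))


def batched_ids(
    video_ids: Iterable[str], size: int = MAX_IDS_PER_REQUEST
) -> Iterator[str]:
    """Yield comma-separated ID strings, each holding at most `size` IDs."""
    ids = unique_in_order(video_ids)
    for start in range(0, len(ids), size):
        yield ",".join(ids[start:start + size])


# 300 rows that refer to only 120 distinct videos, as happens when a
# video appears in the trending data on several days.
ids = [f"vid{i % 120}" for i in range(300)]
batches = list(batched_ids(ids))
```

Each string in `batches` could then be passed as the `id` parameter of a single request, which keeps the number of API calls, and so the quota used, well below one call per row.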
## Streamlit App
An interactive model predictor is hosted at: https://finalproject386-qpuktjzfa562fbmkaycd9v.streamlit.app/

To run locally:

```bash
streamlit run src/final_project_demo/streamlit_app.py
```
## GitHub Pages Site
The full documentation, tutorial, and technical report are hosted at: https://summeraskey.github.io/final_project386/
## Project Structure

```
final_project386/
├── src/final_project_demo/
│   ├── cleaning.py          # Data loading and cleaning pipeline
│   ├── analysis.py          # EDA and predictive modeling
│   └── streamlit_app.py     # Interactive Streamlit app
├── docs/                    # Generated Quarto site
├── index.qmd                # Home page
├── Documentation.qmd        # Function reference
├── Tutorial.qmd             # Usage tutorial
├── TechnicalReport.qmd      # Full technical report
└── _quarto.yml              # Quarto configuration
```
## Rebuild the Site

```bash
quarto render
```

Serve locally with:

```bash
quarto preview
```
File details
Details for the file final_project_demo-0.1.0.tar.gz.
File metadata
- Download URL: final_project_demo-0.1.0.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ae3badc1adc92c8db0d22b97614249762e890184523ecc9fdf528d5b8aef06e |
| MD5 | 69fae38360d8e8be0d92739b47c2fcf3 |
| BLAKE2b-256 | 93a0b7767075c16564ce3697837c6bbc5c53ec07235c41ba1413aeca7991664d |
File details
Details for the file final_project_demo-0.1.0-py3-none-any.whl.
File metadata
- Download URL: final_project_demo-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d7b8ab27c33ce1e583527b4bc0ae50363e26f75a09a2659b2dc0f73277be4bd4 |
| MD5 | a055fb07d8380f9d72758a61430bcd05 |
| BLAKE2b-256 | c31b23b3860b8b4954487d32d9a18ef1d4b52b8846af0d25463693f3ca2b8f90 |