Data pipeline for TikTok analytics. Your exports in, real metrics out.
tokpipe
Data pipeline for TikTok analytics. Import your exported data, clean it, classify content, compute real metrics, and visualize what actually works.
No APIs, no scraping, no third-party tokens. Just your TikTok export files (CSV/XLSX) and Python.
Why tokpipe?
TikTok gives you a spreadsheet with raw numbers. That's it. No insights, no trends, no "why did this video work?".
tokpipe takes that file and runs it through a full analytics pipeline: it cleans the data, classifies your content by topic, and computes real metrics (engagement rate, best posting hour, growth trends). It then generates an interactive dashboard, an Excel report with formulas, and static charts. One command, all outputs.
It's built for creators who want to understand their data without depending on third-party tools that ask for your credentials.
What you get
tokpipe analyze TikTok_Analytics.xlsx --followers 8728
| Output | What it is |
|---|---|
| report.csv | Your data cleaned + engagement rate, completion rate, category per video |
| analytics.xlsx | Excel with native formulas (open in Excel or Google Sheets) |
| dashboard.html | Interactive Plotly dashboard (open in any browser, hover for details) |
| engagement.png | Engagement rate distribution across all your videos |
| best_hours.png | Which hours get the best engagement |
| growth.png | 7-day rolling average of your views |
Architecture
tokpipe follows a classic ETL pipeline structure:
Export (TikTok XLSX/CSV)
      |
      v
+-----------+
|  ingest   | --> Load and validate raw export files
+-----------+
      |
      v
+-----------+
|   clean   | --> Normalize columns, fix types, handle nulls
+-----------+
      |
      v
+-----------+
| classify  | --> Tag each video with a topic/category
+-----------+
      |
      v
+-----------+
|  metrics  | --> Compute engagement rate, retention, trends
+-----------+
      |
      v
+-----------+   +-----------+   +-----------+
|  output   |   |   excel   |   | dashboard |
| (CSV/PNG) |   |  (.xlsx)  |   |  (.html)  |
+-----------+   +-----------+   +-----------+
Modules
| Module | What it does |
|---|---|
| tokpipe.ingest | Reads TikTok export files (XLSX, CSV). Detects format, validates columns, returns a raw DataFrame. |
| tokpipe.clean | Normalizes column names, converts date/number types, drops corrupted rows, fills missing values. |
| tokpipe.classify | Assigns a topic/category to each video. Configurable via YAML rules or custom function. |
| tokpipe.metrics | Computes derived metrics: engagement rate, average watch time, best posting hour, growth trends. |
| tokpipe.output | Exports results to CSV/JSON. Generates matplotlib/seaborn PNG charts. |
| tokpipe.excel | Generates Excel report with native formulas, formatting, and embedded charts. |
| tokpipe.dashboard | Generates interactive Plotly HTML dashboard with all visualizations. |
| tokpipe.cli | Command-line interface. Entry point for tokpipe analyze. |
Prerequisites
You need two things before installing tokpipe:
Python 3.10+
Check your version:
python --version
# or
python3 --version
If you don't have it:
# macOS (Homebrew)
brew install python
# Ubuntu/Debian
sudo apt install python3 python3-venv python3-pip
# Windows (winget)
winget install Python.Python.3.12
Or download directly from python.org.
git (optional)
Only needed to clone the repo. You can also download the ZIP from GitHub.
git --version
Get your TikTok data
tokpipe works with the analytics files that TikTok lets you export. No API keys, no scraping — just the file TikTok gives you.
How to export:
- Open tiktok.com in a desktop browser (the export is not available from the mobile app)
- Go to your profile > Creator tools > Analytics
- Select the date range you want to analyze
- Click Export data (top right)
- Download the XLSX or CSV file
What the file should contain:
| Required columns | Optional columns |
|---|---|
| Views | Watch time |
| Likes | Video duration |
| Comments | Post date/time |
| Shares | Caption/description |
tokpipe auto-detects column names in both English and Spanish. If your export uses different names, the pipeline still tries to match them. If it can't find a views column, it stops and tells you so.
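Column detection of this kind can be sketched as a lookup against a table of known aliases. The alias lists below, including the Spanish names, are illustrative guesses and not the list tokpipe ships with; `ALIASES` and `match_columns` are hypothetical names.

```python
# Hypothetical alias table: these names are illustrative guesses,
# not the list tokpipe actually uses.
ALIASES = {
    "views": {"views", "video views", "visualizaciones", "vistas"},
    "likes": {"likes", "me gusta"},
    "comments": {"comments", "comentarios"},
    "shares": {"shares", "compartidos"},
}


def match_columns(headers: list[str]) -> dict[str, str]:
    """Map canonical metric names to the header found in the export."""
    found = {}
    for header in headers:
        key = header.strip().lower()
        for canonical, names in ALIASES.items():
            if key in names and canonical not in found:
                found[canonical] = header
    if "views" not in found:
        # Same message as the error listed under Troubleshooting.
        raise ValueError("Could not find a 'views' column")
    return found


print(match_columns(["Visualizaciones", "Me gusta", "Comentarios", "Compartidos"]))
```

Matching on a lowercased, stripped header keeps the lookup tolerant of capitalization and stray whitespace while leaving the original header available for reading the data.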
Installation
# Clone the repo
git clone https://github.com/aroaxinping/tokpipe.git
cd tokpipe
# Create a virtual environment
python3 -m venv .venv
# Activate it
source .venv/bin/activate # macOS / Linux
# .venv\Scripts\activate # Windows (cmd)
# .venv\Scripts\Activate.ps1 # Windows (PowerShell)
# Install tokpipe and all its dependencies
pip install -e .
This installs: pandas, openpyxl, matplotlib, seaborn, plotly, and pyyaml.
Verify it worked:
tokpipe --version
# tokpipe 0.1.0
Usage
Try it with sample data
Don't have a TikTok export yet? Use the included sample:
tokpipe analyze examples/sample_data.csv --output sample_results/
Quick start
# Make sure your venv is active
source .venv/bin/activate
# Run the pipeline on your export file
tokpipe analyze ~/Downloads/TikTok_Analytics.xlsx
That's it. It will create a results/ folder with everything.
Output files
results/
    report.csv       # Your data cleaned + engagement rate, completion rate, category
    analytics.xlsx   # Excel with native formulas (open in Excel/Google Sheets)
    dashboard.html   # Interactive dashboard (open in any browser)
    engagement.png   # Engagement rate distribution chart
    best_hours.png   # Which hours get the best engagement
    growth.png       # How your views are trending over time
All CLI options
tokpipe analyze <file> [options]
| Option | What it does | Example |
|---|---|---|
| --output, -o | Output directory (default: results/) | --output my_report/ |
| --followers | Your follower count (shown in reports) | --followers 8728 |
| --period | Label for the date range you're analyzing | --period "24 Feb - 23 Mar 2026" |
| --rules | Path to YAML file with custom classification rules | --rules my_rules.yaml |
| --no-charts | Skip PNG chart generation | |
| --no-dashboard | Skip HTML dashboard generation | |
| --no-excel | Skip Excel report generation | |
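For readers who want to see how options like these fit together, here is one way they could be declared with argparse. This is a sketch under assumptions, not tokpipe's actual cli.py: the `build_parser` name, the help strings, and treating `--followers` as an integer are all illustrative choices.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare an 'analyze' subcommand with the options from the table above."""
    parser = argparse.ArgumentParser(prog="tokpipe")
    sub = parser.add_subparsers(dest="command", required=True)
    analyze = sub.add_parser("analyze", help="Run the pipeline on an export file")
    analyze.add_argument("file", help="Path to the TikTok export (XLSX or CSV)")
    analyze.add_argument("--output", "-o", default="results/", help="Output directory")
    analyze.add_argument("--followers", type=int, help="Follower count shown in reports")
    analyze.add_argument("--period", help="Label for the analyzed date range")
    analyze.add_argument("--rules", help="YAML file with classification rules")
    # Each --no-* flag defaults to False, so every output is produced unless skipped.
    analyze.add_argument("--no-charts", action="store_true", help="Skip PNG charts")
    analyze.add_argument("--no-dashboard", action="store_true", help="Skip HTML dashboard")
    analyze.add_argument("--no-excel", action="store_true", help="Skip Excel report")
    return parser


args = build_parser().parse_args(
    ["analyze", "data.xlsx", "--followers", "8728", "--no-excel"]
)
print(args.output, args.followers, args.no_excel, args.no_charts)
```

Modelling the skip options as opt-out flags matches the documented behavior that one command produces all outputs by default.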
Full example
tokpipe analyze TikTok_Analytics.xlsx \
--output results/ \
--followers 8728 \
--period "24 Feb - 23 Mar 2026" \
--rules rules.yaml
Only want the CSV?
tokpipe analyze data.xlsx --no-dashboard --no-excel --no-charts
Python API
from tokpipe import ingest, clean, classify, metrics, output, excel, dashboard
# Load and clean
raw = ingest.load("TikTok_Analytics.xlsx")
df = clean.normalize(raw)
# Classify content
df["category"] = classify.classify(df)
# Compute metrics
report = metrics.compute(df)
print(report.summary())
# Export
output.to_csv(report, "report.csv")
excel.to_excel(report, "analytics.xlsx", followers=8728)
dashboard.generate(report, "dashboard.html")
Content classification
By default, tokpipe classifies videos into: setup, coding, data, study, tech, other.
Custom rules via YAML
Create a rules.yaml:
setup:
- keyboard
- monitor
- desk
- compra
coding:
- python
- debug
- script
data:
- dataset
- pandas
- sql
study:
- exam
- uni
- homework
tokpipe analyze data.xlsx --rules rules.yaml
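Keyword rules like these can be applied with a simple substring check. The sketch below assumes case-insensitive matching where the first matching category wins and unmatched captions fall back to "other"; tokpipe's real matching logic may differ, and `classify_caption` is a hypothetical name.

```python
# Same structure as the rules.yaml above, written as a dict so the example
# needs no YAML parser. With PyYAML installed, yaml.safe_load() on the file
# contents yields an equivalent dict.
RULES = {
    "setup": ["keyboard", "monitor", "desk", "compra"],
    "coding": ["python", "debug", "script"],
    "data": ["dataset", "pandas", "sql"],
    "study": ["exam", "uni", "homework"],
}


def classify_caption(caption: str, rules: dict[str, list[str]] = RULES) -> str:
    """Return the first category with a keyword in the caption, else 'other'."""
    text = caption.lower()
    for category, keywords in rules.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


for caption in ["New mechanical KEYBOARD unboxing", "Debugging my Python script", "Morning routine"]:
    print(caption, "->", classify_caption(caption))
```

Plain substring matching is easy to reason about but broad: under this sketch a short keyword such as "uni" also matches "community", so longer or more specific keywords give cleaner categories.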
Custom function (Python API)
def my_classifier(text: str) -> str:
    if "python" in text:
        return "coding"
    if "setup" in text:
        return "setup"
    return "other"
df["category"] = classify.classify(df, classifier_fn=my_classifier)
SQL queries
The sql/ directory contains reference queries for analyzing your exported CSV with DuckDB, SQLite, or any SQL engine:
# Example with DuckDB
duckdb -c "
CREATE TABLE videos AS SELECT * FROM read_csv_auto('results/report.csv');
SELECT * FROM videos ORDER BY engagement_rate DESC LIMIT 10;
"
See sql/queries.sql for the full set.
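The same kind of query also works from Python with the built-in sqlite3 module. The sketch below loads CSV rows into an in-memory database. Only `engagement_rate` is a column name taken from the query above; the other columns, the sample values, and the `load_report` helper are made up for the demo.

```python
import csv
import io
import sqlite3

# Stand-in for results/report.csv. Only engagement_rate comes from the
# README; the other columns and all values are invented for this example.
SAMPLE_CSV = """caption,views,engagement_rate
desk tour,1200,0.081
python tips,5400,0.124
exam week,800,0.052
"""


def load_report(conn: sqlite3.Connection, handle) -> None:
    """Create a videos table from CSV rows read from an open file handle."""
    conn.execute(
        "CREATE TABLE videos (caption TEXT, views INTEGER, engagement_rate REAL)"
    )
    rows = [
        (r["caption"], int(r["views"]), float(r["engagement_rate"]))
        for r in csv.DictReader(handle)
    ]
    conn.executemany("INSERT INTO videos VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load_report(conn, io.StringIO(SAMPLE_CSV))
top = conn.execute(
    "SELECT caption, engagement_rate FROM videos "
    "ORDER BY engagement_rate DESC LIMIT 2"
).fetchall()
print(top)
```

To run it against a real report, pass an open file instead of the StringIO object and adjust the table definition to the columns in your CSV.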
Available metrics
| Metric | Formula / Description |
|---|---|
| Engagement rate | (likes + comments + shares) / views |
| Average watch time | Total watch time / views |
| Completion rate | Average watch time / video duration |
| Best posting hour | Hour with highest median engagement |
| Growth trend | Rolling 7-day average of views |
| Top performers | Videos above 90th percentile engagement |
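The formulas in the table can be written out in a few lines of plain Python. This is a sketch under assumptions, not tokpipe's metrics module: the field names and sample rows are invented, watch time and duration are assumed to share one time unit, and the 90th percentile uses the "inclusive" method of `statistics.quantiles`.

```python
from collections import defaultdict
from statistics import mean, median, quantiles

# Made-up sample rows; the field names are assumptions for this sketch.
VIDEOS = [
    {"hour": 9, "views": 1000, "likes": 80, "comments": 10, "shares": 10,
     "watch_time": 12000.0, "duration": 30.0},
    {"hour": 9, "views": 2000, "likes": 100, "comments": 10, "shares": 10,
     "watch_time": 30000.0, "duration": 20.0},
    {"hour": 18, "views": 500, "likes": 60, "comments": 10, "shares": 5,
     "watch_time": 5000.0, "duration": 40.0},
    {"hour": 18, "views": 4000, "likes": 400, "comments": 40, "shares": 40,
     "watch_time": 40000.0, "duration": 25.0},
]


def engagement_rate(v: dict) -> float:
    """(likes + comments + shares) / views, or 0.0 for videos with no views."""
    if not v["views"]:
        return 0.0
    return (v["likes"] + v["comments"] + v["shares"]) / v["views"]


def completion_rate(v: dict) -> float:
    """Average watch time per view divided by the video duration."""
    average_watch_time = v["watch_time"] / v["views"]
    return average_watch_time / v["duration"]


def best_posting_hour(videos: list[dict]) -> int:
    """Hour of day whose videos have the highest median engagement rate."""
    by_hour = defaultdict(list)
    for v in videos:
        by_hour[v["hour"]].append(engagement_rate(v))
    return max(by_hour, key=lambda hour: median(by_hour[hour]))


def top_performers(videos: list[dict]) -> list[dict]:
    """Videos whose engagement rate is above the 90th percentile."""
    rates = [engagement_rate(v) for v in videos]
    p90 = quantiles(rates, n=10, method="inclusive")[-1]
    return [v for v, rate in zip(videos, rates) if rate > p90]


def rolling_mean(values: list[float], window: int = 7) -> list[float]:
    """Mean of each full window of consecutive values (e.g. daily views)."""
    return [mean(values[i - window:i]) for i in range(window, len(values) + 1)]


print("best hour:", best_posting_hour(VIDEOS))
print("top performers:", [v["views"] for v in top_performers(VIDEOS)])
```

Using the median per hour rather than the mean keeps one viral video from deciding the best posting hour on its own.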
Project structure
tokpipe/
    .github/
        workflows/
            ci.yml                  # GitHub Actions CI (tests on Python 3.10-3.13)
        ISSUE_TEMPLATE/
            bug_report.md           # Bug report template
            feature_request.md      # Feature request template
    src/
        tokpipe/
            __init__.py             # Package init, version
            cli.py                  # Command-line interface
            ingest.py               # Load TikTok exports
            clean.py                # Normalize and clean data
            classify.py             # Content classifier (YAML / custom function)
            metrics.py              # Compute derived metrics
            output.py               # CSV/JSON export + matplotlib charts
            excel.py                # Excel report with formulas
            dashboard.py            # Interactive Plotly HTML dashboard
    tests/
        test_ingest.py
        test_clean.py
        test_metrics.py
    sql/
        queries.sql                 # Reference SQL queries
    examples/
        basic_analysis.py           # Minimal working example
        sample_data.csv             # Fake data to test without a TikTok account
    pyproject.toml
    LICENSE
    CONTRIBUTING.md
    README.md
Troubleshooting
ModuleNotFoundError: No module named 'tokpipe'
Your virtual environment is not activated. Run:
source .venv/bin/activate # macOS / Linux
.venv\Scripts\activate # Windows
ModuleNotFoundError: No module named 'pandas'
Dependencies are not installed. Run:
pip install -e .
ValueError: Could not find a 'views' column
tokpipe couldn't match any column in your export to "views". This happens when the export uses a language tokpipe doesn't recognize yet. Open your file, check the column name for views, and open an issue with the column names so we can add support.
FileNotFoundError: File not found
Check that the path to your export file is correct. Use the full path:
tokpipe analyze /Users/you/Downloads/TikTok_Analytics.xlsx
Charts are not generated
If you see -- Skipping best hours or -- Skipping growth trend, your export file doesn't have a date/time column. tokpipe needs a column with post dates to generate time-based charts. The engagement distribution chart will still work.
pip install -e . fails
If you're on Python 3.14+, try installing without editable mode:
pip install .
Or install dependencies manually and run with PYTHONPATH:
pip install pandas openpyxl matplotlib seaborn plotly pyyaml
PYTHONPATH=src tokpipe analyze data.xlsx
Contributing
See CONTRIBUTING.md.
License
MIT. See LICENSE.