TikTok Metadata Kit
A Python library for TikTok metadata collection, with support for web scraping and the TikTok Research API.
Quickstart
Install
The base install gives you the Research API client:
pip install tiktok-metadata-kit
To also use the web scraper (adds beautifulsoup4 and selenium, the latter
of which needs a browser driver), install the scraper extra:
pip install "tiktok-metadata-kit[scraper]"
Use the Research API client
Get your client_key and client_secret from the TikTok Research API portal and pass them to the client as api_key and api_secret.
from tiktok_metadata_kit.research_api import (
    ResearchAPIClient,
    PageOptions,
    QueryOptions,
)

# Single page: useful for ad-hoc lookups.
client = ResearchAPIClient(api_key="...", api_secret="...")
page = client.query_videos(["7123456789", "7987654321"])
for video in page["data"]["videos"]:
    print(video["id"], video["view_count"])

# Stream across all pages; cursor-based pagination is handled for you.
with ResearchAPIClient("...", "...") as client:
    for video in client.iter_user_videos(["alice", "bob"]):
        print(video["id"], video["username"])

# Reuse filters across calls; cap iteration for safety.
# (d0, d1, ids, and process_page are placeholders you define yourself.)
filters = QueryOptions(start_date=d0, end_date=d1, max_count=100)
for page in client.iter_video_pages(ids, filters, max_pages=10):
    process_page(page)
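To picture what the iterators take off your hands, here is a minimal sketch of cursor-based pagination in general. It is not the library's implementation: `iter_pages`, `fake_fetch`, and the `cursor` / `has_more` field names are assumptions made for illustration.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def iter_pages(
    fetch_page: Callable[[Optional[int]], Page],
    max_pages: Optional[int] = None,
) -> Iterator[Page]:
    """Yield pages until the server reports no more data or max_pages is hit."""
    cursor: Optional[int] = None  # None means "start from the first page"
    fetched = 0
    while max_pages is None or fetched < max_pages:
        page = fetch_page(cursor)
        yield page
        fetched += 1
        data = page.get("data", {})
        if not data.get("has_more"):
            break
        cursor = data.get("cursor")  # assumed field name for the next cursor


# Hypothetical in-memory stand-in for the remote API: three pages of IDs.
_FAKE_PAGES = {
    None: {"data": {"videos": [{"id": "1"}], "has_more": True, "cursor": 1}},
    1: {"data": {"videos": [{"id": "2"}], "has_more": True, "cursor": 2}},
    2: {"data": {"videos": [{"id": "3"}], "has_more": False}},
}


def fake_fetch(cursor: Optional[int]) -> Page:
    return _FAKE_PAGES[cursor]


all_ids = [v["id"] for p in iter_pages(fake_fetch) for v in p["data"]["videos"]]
capped_ids = [
    v["id"] for p in iter_pages(fake_fetch, max_pages=2) for v in p["data"]["videos"]
]
```

The `max_pages` cap plays the same role as in the example above: it bounds how many requests a single loop can issue, which matters when every page counts against API quota.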
The client handles token retrieval and proactive refresh, retries transient
failures (429, 5xx, network errors) with exponential backoff, and honors
Retry-After. See the docstrings on
ResearchAPIClient for the
full API surface.
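The retry policy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the client's actual code: `retry_delay`, `is_retryable`, and the `base` / `cap` defaults are hypothetical, and only the delta-seconds form of Retry-After is handled.

```python
import random
from typing import Optional


def is_retryable(status_code: int) -> bool:
    """Treat 429 and any 5xx status as transient."""
    return status_code == 429 or 500 <= status_code < 600


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: float = 0.0,
) -> float:
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is given as delta-seconds.
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # HTTP-date form is not parsed here; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... limited by cap.
    delay = min(base * (2 ** attempt), cap)
    # Optional random jitter spreads out retries from concurrent clients.
    return delay + random.uniform(0.0, jitter)
```

With the defaults, successive attempts wait 0.5 s, 1 s, 2 s, and so on up to 30 s, unless the server supplies a Retry-After value, which takes precedence.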
Use the scraper
TODO: see the
tiktok_metadata_kit.scraper subpackage.
Development
Prerequisites
| Tool | Version | Notes |
|---|---|---|
| Python | 3.12 | Required; use pyenv or a system package |
| Git | any | Pre-commit hooks are used |
Initial Setup
1. Clone the repository
git clone https://github.com/Nico-AP/tiktok-metadata-kit
cd tiktok-metadata-kit
2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # macOS / Linux
venv\Scripts\activate # Windows
3. Install Python dependencies
pip install -r requirements/base.txt
pip install -r requirements/dev.txt
4. Install pre-commit hooks
pre-commit install
Testing
Tests live at the project root in tests/, mirroring the package layout
(e.g. tests/research_api/ covers src/tiktok_metadata_kit/research_api/).
They are not packaged into the distribution.
Run the full suite:
pytest
Run a single subpackage or file:
pytest tests/research_api/
pytest tests/research_api/test_retry.py -v
Coverage:
coverage run -m pytest
coverage report
How HTTP is mocked
The Research API client takes a transport= constructor argument so tests
can inject an httpx.MockTransport and drive the client without network
access or real credentials. The MockHandler helper in
tests/research_api/conftest.py records each request and returns programmed
responses in FIFO order — see existing tests for usage patterns.
No live API credentials are needed to run the suite.
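As a rough picture of the record-and-replay idea, the sketch below shows a FIFO handler built only from the standard library. `QueuedHandler` and its methods are hypothetical names; the real MockHandler in tests/research_api/conftest.py may look quite different.

```python
from collections import deque
from typing import Any, Deque, List


class QueuedHandler:
    """Record each request and replay programmed responses in FIFO order."""

    def __init__(self) -> None:
        self.requests: List[Any] = []
        self._responses: Deque[Any] = deque()

    def enqueue(self, response: Any) -> None:
        """Program the next response to hand out."""
        self._responses.append(response)

    def __call__(self, request: Any) -> Any:
        self.requests.append(request)
        if not self._responses:
            raise AssertionError("no programmed response left for this request")
        return self._responses.popleft()


# Plain strings and dicts stand in for real request/response objects here.
handler = QueuedHandler()
handler.enqueue({"status": 429})
handler.enqueue({"status": 200})
first = handler("POST /token")
second = handler("POST /videos")
```

Because `httpx.MockTransport` accepts any callable that takes a request and returns a response, a handler shaped like this could be passed to it with `httpx.Response` objects queued instead of dicts.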
Integration tests
Integration tests live in tests/integration/ and hit the real TikTok
Research API. They are marked with @pytest.mark.integration and
excluded from the default pytest run so day-to-day development stays
fast and credential-free.
Run them explicitly:
pytest -m integration
They will skip (not fail) unless these env vars are set:
| Variable | Purpose |
|---|---|
| TIKTOK_RESEARCH_API_KEY | Client key for the Research API. |
| TIKTOK_RESEARCH_API_SECRET | Client secret for the Research API. |
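The skip-instead-of-fail behavior can be approximated with a small helper like the one below. This is a sketch, assuming the check is a simple "both variables are non-empty" test; `missing_credentials` and `REQUIRED_VARS` are hypothetical names, not part of the project.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("TIKTOK_RESEARCH_API_KEY", "TIKTOK_RESEARCH_API_SECRET")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# In a test module this could feed a skip marker, for example:
#   pytestmark = pytest.mark.skipif(
#       bool(missing_credentials()), reason="Research API credentials not set"
#   )
```

Passing the environment as a parameter keeps the helper itself testable without touching the real process environment.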
Setting the env vars
Inline, for a single invocation (bash / zsh):
TIKTOK_RESEARCH_API_KEY=your-key \
TIKTOK_RESEARCH_API_SECRET=your-secret \
pytest -m integration
For the current session (Windows PowerShell; $env: assignments last until the shell is closed):
$env:TIKTOK_RESEARCH_API_KEY = "your-key"
$env:TIKTOK_RESEARCH_API_SECRET = "your-secret"
pytest -m integration
Persistent for the current shell session (bash / zsh):
export TIKTOK_RESEARCH_API_KEY=your-key
export TIKTOK_RESEARCH_API_SECRET=your-secret
Persistent for the user (Windows, PowerShell):
[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_KEY", "your-key", "User")
[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_SECRET", "your-secret", "User")
Restart the shell after running these so the new values are picked up.
From a .env file — keep credentials out of shell history and out of the
repo. Create a project-local .env (already covered by .gitignore):
TIKTOK_RESEARCH_API_KEY=your-key
TIKTOK_RESEARCH_API_SECRET=your-secret
Load it before running pytest:
set -a; source .env; set +a # bash / zsh
# PowerShell
Get-Content .env | ForEach-Object {
    if ($_ -match '^\s*([^#=]+)=(.*)$') {
        Set-Item -Path "Env:$($matches[1].Trim())" -Value $matches[2].Trim()
    }
}
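If you would rather load the file from Python (for example in a local helper script), the same KEY=value parsing can be sketched with the standard library. `parse_env` and `load_env_file` are hypothetical helpers, not part of the package, and the regex mirrors the one in the PowerShell snippet above.

```python
import os
import re
from typing import Dict

# Same pattern as the PowerShell loader: skip comments, split on the first "=".
_LINE = re.compile(r"^\s*([^#=]+)=(.*)$")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, ignoring blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            values[match.group(1).strip()] = match.group(2).strip()
    return values


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Read a .env file and copy its entries into os.environ."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    os.environ.update(values)
    return values
```

This sketch does not handle quoting or `export` prefixes; it only covers the plain two-line file shown above.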
Note that integration tests consume API quota — keep them minimal and
favor max_count=1 style probes for shape assertions.
CI / Release
Branching
| Branch | Purpose |
|---|---|
| main | Stable. Only ever advances when cutting a release. Each commit on main corresponds to a tagged version on PyPI. |
| dev | Integration branch. Day-to-day work lands here via feature-branch PRs. |
Feature branches → PR → merge to dev. When ready to release: PR dev → main,
tag, release.
Workflows
Three GitHub Actions workflows live in .github/workflows/:
| Workflow | Trigger | What it does |
|---|---|---|
| test.yml | push to main/dev; all PRs | Runs ruff check and pytest with coverage (80% gate). |
| release-testpypi.yml | push of a tag matching v* | Runs tests, builds sdist + wheel (version derived from the tag via hatch-vcs), publishes to TestPyPI. |
| publish-pypi.yml | a GitHub Release is published | Same checks and build, publishes to production PyPI. |
Both release workflows use trusted publishing (OIDC) — no PyPI API tokens are stored as GitHub Secrets.
Versioning
The package version is derived from git tags by
hatch-vcs — there is no version
field in pyproject.toml to maintain. The v* tag is the version:
| Git state | Built version |
|---|---|
| HEAD is exactly on tag v1.2.0 | 1.2.0 |
| HEAD is exactly on tag v1.2.0rc1 | 1.2.0rc1 (prerelease) |
| 3 commits past v1.2.0 | 1.2.1.dev3+g<sha> |
| 3 commits past + dirty tree | 1.2.1.dev3+g<sha>.d... |
The +g<sha> suffix is a PEP 440 local version identifier. It is valid in
package metadata and lets editable installs report a distinguishable dev
version, but PyPI rejects uploads of versions that carry a +local segment.
tiktok_metadata_kit.__version__ reads this resolved version via
importlib.metadata at import time, so it always reflects the installed
wheel/sdist.
Cutting a release
The two-step flow lets you smoke-test the wheel on TestPyPI before promoting it to production PyPI.
1. Merge dev → main via PR.
2. Tag the release commit on main and push the tag:
   git tag v1.2.0rc1 # or v1.2.0 for a final release
   git push --tags
   This fires release-testpypi.yml, which uploads to TestPyPI. The tag becomes the version; no pyproject.toml bump is required.
3. Smoke-test the TestPyPI wheel:
   pip install -i https://test.pypi.org/simple/ \
     --extra-index-url https://pypi.org/simple/ \
     --pre tiktok-metadata-kit==1.2.0rc1
   python -c "from tiktok_metadata_kit.research_api import ResearchAPIClient"
   (The --extra-index-url lets pip resolve regular dependencies from real PyPI; otherwise it only sees TestPyPI's smaller index.)
4. Create a GitHub Release on the tag (GitHub UI → Releases → "Draft a new release" → pick the tag → check "Set as a pre-release" for aN/bN/rcN versions). Publishing the Release fires publish-pypi.yml, which uploads to PyPI.
For a final (non-prerelease) release, repeat with a clean version
(v1.2.0) and leave the "pre-release" checkbox unchecked.
Prereleases
PyPI accepts prerelease versions and pip ignores them by default. Users
who want them opt in with --pre:
pip install tiktok-metadata-kit --pre
Tag naming follows PEP 440: v1.2.0a1 (alpha), v1.2.0b1 (beta),
v1.2.0rc1 (release candidate).
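When scripting a release, it can help to check whether a tag is a prerelease before deciding how to publish it. The sketch below covers only the vX.Y.Z, aN, bN, and rcN shapes named above, not all of PEP 440; `prerelease_kind` is a hypothetical helper.

```python
import re
from typing import Optional

# vMAJOR.MINOR.PATCH with an optional aN / bN / rcN suffix.
_TAG = re.compile(r"^v(\d+\.\d+\.\d+)(?:(a|b|rc)(\d+))?$")

_KINDS = {"a": "alpha", "b": "beta", "rc": "rc", None: None}


def prerelease_kind(tag: str) -> Optional[str]:
    """Return 'alpha', 'beta', 'rc', or None for a final release tag."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognised release tag: {tag!r}")
    return _KINDS[match.group(2)]
```

A non-None result corresponds to the cases where the "Set as a pre-release" box should be checked on the GitHub Release.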
First-time PyPI setup (one-off)
Trusted publishing requires a "pending publisher" to be registered on each service before the first release:
- TestPyPI: https://test.pypi.org/manage/account/publishing/
- PyPI: https://pypi.org/manage/account/publishing/
For each, register:
| Field | Value |
|---|---|
| Project name | tiktok-metadata-kit |
| Owner | Nico-AP |
| Repository | tiktok-metadata-kit |
| Workflow filename | release-testpypi.yml (for TestPyPI) / publish-pypi.yml (for PyPI) |
| Environment name | testpypi / pypi |
Then create matching environments in the GitHub repo
(Settings → Environments). Optionally add required reviewers on the pypi
environment to gate production publishes behind manual approval.