Skip to main content

Python library for TikTok metadata collection, with support for web scraping and the TikTok Research API.

Project description

TikTok Metadata Kit

PyPI version Python versions License: MIT Tests Ruff

A Python library for TikTok metadata collection, with support for web scraping and the TikTok Research API.

Quickstart

Install

The base install gives you the Research API client:

pip install tiktok-metadata-kit

To also use the web scraper (adds beautifulsoup4 and selenium, the latter of which needs a browser driver), install the scraper extra:

pip install "tiktok-metadata-kit[scraper]"

Use the Research API client

Get your client_key and client_secret from the TikTok Research API portal.

from tiktok_metadata_kit.research_api import (
    ResearchAPIClient,
    PageOptions,
    QueryOptions,
)

# Single page — useful for ad-hoc lookups.
client = ResearchAPIClient(api_key="...", api_secret="...")
page = client.query_videos(["7123456789", "7987654321"])
for video in page["data"]["videos"]:
    print(video["id"], video["view_count"])

# Stream across all pages — cursor-based pagination handled for you.
with ResearchAPIClient("...", "...") as client:
    for video in client.iter_user_videos(["alice", "bob"]):
        print(video["id"], video["username"])

# Reuse filters across calls; cap iteration for safety.
filters = QueryOptions(start_date=d0, end_date=d1, max_count=100)
for page in client.iter_video_pages(ids, filters, max_pages=10):
    process_page(page)

The client handles token retrieval and proactive refresh, retries transient failures (429, 5xx, network errors) with exponential backoff, and honors Retry-After. See the docstrings on ResearchAPIClient for the full API surface.

Use the scraper

TODO — see the tiktok_metadata_kit.scraper subpackage.

Development

Prerequisites

Tool Version Notes
Python 3.12 Required; use pyenv or a system package
Git any Pre-commit hooks are used

Initial Setup

1. Clone the repository

git clone https://github.com/Nico-AP/tiktok-metadata-kit
cd tiktok-metadata-kit

2. Create and activate a virtual environment

python -m venv venv
source venv/bin/activate        # macOS / Linux
venv\Scripts\activate           # Windows

3. Install Python dependencies

pip install -r requirements/base.txt
pip install -r requirements/dev.txt

4. Install pre-commit hooks

pre-commit install

Testing

Tests live at the project root in tests/, mirroring the package layout (e.g. tests/research_api/ covers src/tiktok_metadata_kit/research_api/). They are not packaged into the distribution.

Run the full suite:

pytest

Run a single subpackage or file:

pytest tests/research_api/
pytest tests/research_api/test_retry.py -v

Coverage:

coverage run -m pytest
coverage report

How HTTP is mocked

The Research API client takes a transport= constructor argument so tests can inject an httpx.MockTransport and drive the client without network access or real credentials. The MockHandler helper in tests/research_api/conftest.py records each request and returns programmed responses in FIFO order — see existing tests for usage patterns.

No live API credentials are needed to run the suite.

Integration tests

Integration tests live in tests/integration/ and hit the real TikTok Research API. They are marked with @pytest.mark.integration and excluded from the default pytest run so day-to-day development stays fast and credential-free.

Run them explicitly:

pytest -m integration

They will skip (not fail) unless these env vars are set:

Variable Purpose
TIKTOK_RESEARCH_API_KEY Client key for the Research API.
TIKTOK_RESEARCH_API_SECRET Client secret for the Research API.
Setting the env vars

Inline, for a single invocation (bash / zsh):

TIKTOK_RESEARCH_API_KEY=your-key \
TIKTOK_RESEARCH_API_SECRET=your-secret \
pytest -m integration

Inline, for a single invocation (Windows PowerShell):

$env:TIKTOK_RESEARCH_API_KEY = "your-key"
$env:TIKTOK_RESEARCH_API_SECRET = "your-secret"
pytest -m integration

Persistent for the current shell session (bash / zsh):

export TIKTOK_RESEARCH_API_KEY=your-key
export TIKTOK_RESEARCH_API_SECRET=your-secret

Persistent for the user (Windows, PowerShell):

[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_KEY", "your-key", "User")
[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_SECRET", "your-secret", "User")

Restart the shell after running these so the new values are picked up.

From a .env file — keep credentials out of shell history and out of the repo. Create a project-local .env (already covered by .gitignore):

TIKTOK_RESEARCH_API_KEY=your-key
TIKTOK_RESEARCH_API_SECRET=your-secret

Load it before running pytest:

set -a; source .env; set +a    # bash / zsh
Get-Content .env | ForEach-Object {
    if ($_ -match '^\s*([^#=]+)=(.*)$') { $env:($matches[1].Trim()) = $matches[2].Trim() }
}

Note that integration tests consume API quota — keep them minimal and favor max_count=1 style probes for shape assertions.

CI / Release

Branching

Branch Purpose
main Stable. Only ever advances when cutting a release. Each commit on main corresponds to a tagged version on PyPI.
dev Integration branch. Day-to-day work lands here via feature-branch PRs.

Feature branches → PR → merge to dev. When ready to release: PR devmain, tag, release.

Workflows

Three GitHub Actions workflows live in .github/workflows/:

Workflow Trigger What it does
test.yml push to main/dev; all PRs Runs ruff check and pytest with coverage (80% gate).
release-testpypi.yml push of a tag matching v* Runs tests, builds sdist + wheel (version derived from the tag via hatch-vcs), publishes to TestPyPI.
publish-pypi.yml a GitHub Release is published Same checks and build, publishes to production PyPI.

Both release workflows use trusted publishing (OIDC) — no PyPI API tokens are stored as GitHub Secrets.

Versioning

The package version is derived from git tags by hatch-vcs — there is no version field in pyproject.toml to maintain. The v* tag is the version:

Git state Built version
HEAD is exactly on tag v1.2.0 1.2.0
v1.2.0rc1 1.2.0rc1 (prerelease)
3 commits past v1.2.0 1.2.1.dev3+g<sha>
3 commits past + dirty tree 1.2.1.dev3+g<sha>.d...

The +g<sha> suffix is a PEP 440 local version identifier — valid in metadata, ignored by PyPI publishing (you can't upload +local versions), and lets editable installs report a distinguishable dev version.

tiktok_metadata_kit.__version__ reads this resolved version via importlib.metadata at import time, so it always reflects the installed wheel/sdist.

Cutting a release

The two-step flow lets you smoke-test the wheel on TestPyPI before promoting it to production PyPI.

  1. Merge devmain via PR.

  2. Tag the release commit on main and push the tag:

    git tag v1.2.0rc1   # or v1.2.0 for a final release
    git push --tags
    

    This fires release-testpypi.yml, which uploads to TestPyPI. The tag becomes the version — no pyproject.toml bump required.

  3. Smoke-test the TestPyPI wheel:

    pip install -i https://test.pypi.org/simple/ \
        --extra-index-url https://pypi.org/simple/ \
        --pre tiktok-metadata-kit==1.2.0rc1
    python -c "from tiktok_metadata_kit.research_api import ResearchAPIClient"
    

    (The --extra-index-url lets pip resolve regular dependencies from real PyPI; otherwise it only sees TestPyPI's smaller index.)

  4. Create a GitHub Release on the tag (GitHub UI → Releases → "Draft a new release" → pick the tag → check "Set as a pre-release" for aN/bN/rcN versions). Publishing the Release fires publish-pypi.yml, which uploads to PyPI.

For a final (non-prerelease) release, repeat with a clean version (v1.2.0) and leave the "pre-release" checkbox unchecked.

Prereleases

PyPI accepts prerelease versions and pip ignores them by default. Users who want them opt in with --pre:

pip install tiktok-metadata-kit --pre

Tag naming follows PEP 440: v1.2.0a1 (alpha), v1.2.0b1 (beta), v1.2.0rc1 (release candidate).

First-time PyPI setup (one-off)

Trusted publishing requires a "pending publisher" to be registered on each service before the first release:

For each, register:

Field Value
Project name tiktok-metadata-kit
Owner Nico-AP
Repository tiktok-metadata-kit
Workflow filename release-testpypi.yml (for TestPyPI) / publish-pypi.yml (for PyPI)
Environment name testpypi / pypi

Then create matching environments in the GitHub repo (Settings → Environments). Optionally add required reviewers on the pypi environment to gate production publishes behind manual approval.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tiktok_metadata_kit-0.1.0.tar.gz (28.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tiktok_metadata_kit-0.1.0-py3-none-any.whl (23.0 kB view details)

Uploaded Python 3

File details

Details for the file tiktok_metadata_kit-0.1.0.tar.gz.

File metadata

  • Download URL: tiktok_metadata_kit-0.1.0.tar.gz
  • Upload date:
  • Size: 28.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tiktok_metadata_kit-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c694901ffc69234108e09807fa131f4f77f534936fdd00ab1b1be0985e740a4d
MD5 e098a5e144ab5bdfbe11dab01c483f78
BLAKE2b-256 a102860722674659a0edf74a26f438d7f9d55ebe09448872720cdd6c7c4b6676

See more details on using hashes here.

Provenance

The following attestation bundles were made for tiktok_metadata_kit-0.1.0.tar.gz:

Publisher: publish-pypi.yml on Nico-AP/tiktok-metadata-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tiktok_metadata_kit-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for tiktok_metadata_kit-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cc7db7859bde8484f370fa5046754cc366f704b4ad02cb71598e10a9fff7efb8
MD5 a2b0ea6761ce74d26c05304d5f1c66f0
BLAKE2b-256 7a6aeed2d3bb79f0531654723fdb5f22a97f9bff77726e9b69dd4ee6be7d37d6

See more details on using hashes here.

Provenance

The following attestation bundles were made for tiktok_metadata_kit-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on Nico-AP/tiktok-metadata-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page