TikTok Metadata Kit

PyPI version | Python versions | License: MIT | Tests | Ruff

A Python library for TikTok metadata collection, with support for web scraping and the TikTok Research API.

Quickstart

Install

The base install gives you the Research API client:

pip install tiktok-metadata-kit

Use the Research API client

Get your client_key and client_secret from the TikTok Research API portal.

from tiktok_metadata_kit.research_api import (
    ResearchAPIClient,
    QueryVideosOptions,
)

# Run all calls inside the `with` block, while the client is open.
with ResearchAPIClient("client_key", "client_secret") as client:
    # Stream videos for a set of users; cursor-based pagination is handled for you.
    for video in client.query_videos_by_username(["alice", "bob"]):
        print(video["id"], video["username"])

    # Stream videos by ID with a smaller field set and a safety cap on pages.
    opts = QueryVideosOptions(
        fields=["id", "view_count"],
        max_count=50,
        max_pages=10,
    )
    for video in client.query_videos_by_id(["7123456789", "7987654321"], options=opts):
        print(video["id"], video["view_count"])

    # Page-level iteration. Each page carries the cursor and search_id
    # that a later run needs in order to resume.
    for page in client.query_videos_pages({"and": [...]}):
        cursor = page["data"].get("cursor")
        search_id = page["data"].get("search_id")

        for video in page["data"]["videos"]:
            print(video["id"])

    # Single-shot user metadata lookup (no pagination).
    info = client.query_user_info("alice")
    print(info["data"]["follower_count"])
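
Page-level iteration exposes the cursor and search_id of each page, which is the state a resumable collection job has to persist. The following is a minimal sketch that assumes a JSON file as the checkpoint store. The helpers save_checkpoint and load_checkpoint are hypothetical and not part of the package.

```python
import json
from pathlib import Path


def save_checkpoint(path, cursor, search_id):
    """Persist pagination state (hypothetical helper, not a package API).

    The state is written to a temporary file first and then renamed, so an
    interrupted run does not leave a half-written checkpoint behind.
    """
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps({"cursor": cursor, "search_id": search_id}))
    tmp.replace(path)


def load_checkpoint(path):
    """Return the saved state as a dict, or None if no checkpoint exists."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text())
```

Inside the page loop, save_checkpoint would be called with page["data"].get("cursor") and page["data"].get("search_id") once a page has been processed. On restart, the loaded values are the ones to hand to QueryVideosOptions(cursor=..., search_id=...).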

The client handles token retrieval and proactive refresh, retries transient failures (429, 5xx, network errors) with exponential backoff, and honors Retry-After. See the docstrings on ResearchAPIClient for the full API surface, including raw queries and resume-from-checkpoint via QueryVideosOptions(cursor=..., search_id=...).
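
To make the retry policy concrete, here is a sketch of how a wait time could be chosen for each attempt. It is not the package's implementation: the function names, the base delay, the cap, and the use of random jitter are all assumptions made for illustration, and only the numeric (seconds) form of Retry-After is handled.

```python
import random

# Status codes treated as transient in this sketch: 429 and any 5xx.
RETRYABLE_STATUS = {429} | set(range(500, 600))


def is_retryable(status_code):
    """True for responses that are worth retrying."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    A server-provided Retry-After value wins over the computed delay.
    Otherwise the ceiling doubles per attempt up to `cap`, and the actual
    delay is drawn uniformly below that ceiling (jitter).
    """
    if retry_after is not None:
        return max(0.0, float(retry_after))
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()
```

Passing rng as a parameter keeps the function deterministic under test; production code would leave the default in place.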

Use the scraper

This is currently work in progress and not yet included in the package.

Development

Prerequisites

Tool | Version | Notes
Python | 3.12 | Required; use pyenv or a system package
Git | any | Pre-commit hooks are used

Initial Setup

1. Clone the repository

git clone https://github.com/Nico-AP/tiktok-metadata-kit
cd tiktok-metadata-kit

2. Create and activate a virtual environment

python -m venv venv
source venv/bin/activate        # macOS / Linux
venv\Scripts\activate           # Windows

3. Install Python dependencies

pip install -r requirements/base.txt
pip install -r requirements/dev.txt

4. Install pre-commit hooks

pre-commit install

Testing

Tests live at the project root in tests/, mirroring the package layout (e.g. tests/research_api/ covers src/tiktok_metadata_kit/research_api/). They are not packaged into the distribution.

Run the full suite:

pytest

Run a single subpackage or file:

pytest tests/research_api/
pytest tests/research_api/test_retry.py -v

Coverage:

coverage run -m pytest
coverage report

How HTTP is mocked

The Research API client takes a transport= constructor argument so tests can inject an httpx.MockTransport and drive the client without network access or real credentials. The MockHandler helper in tests/research_api/conftest.py records each request and returns programmed responses in FIFO order — see existing tests for usage patterns.
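
The record-and-replay idea can be sketched without any HTTP library. The class below is an illustration only; the real MockHandler in tests/research_api/conftest.py may be shaped differently, and the name FifoHandler is hypothetical.

```python
from collections import deque


class FifoHandler:
    """Records each request and replies with queued responses in order."""

    def __init__(self, responses):
        self._queue = deque(responses)
        self.requests = []  # every request seen, in arrival order

    def __call__(self, request):
        self.requests.append(request)
        if not self._queue:
            # Fail loudly when the code under test sends more requests
            # than the test programmed responses for.
            raise AssertionError("no programmed response left for request")
        return self._queue.popleft()
```

With httpx, the queued items would be httpx.Response objects, and the handler would be wrapped as httpx.MockTransport(handler) before being passed to the client as transport=.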

No live API credentials are needed to run the suite.

Integration tests

Integration tests live in tests/integration/ and hit the real TikTok Research API. They are marked with @pytest.mark.integration and excluded from the default pytest run so day-to-day development stays fast and credential-free.

Run them explicitly:

pytest -m integration

They will skip (not fail) unless these env vars are set:

Variable | Purpose
TIKTOK_RESEARCH_API_KEY | Client key for the Research API.
TIKTOK_RESEARCH_API_SECRET | Client secret for the Research API.
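
One way to get skip-instead-of-fail behavior is to check for the two variables before any test runs. How the project's own fixtures do this is not shown here, so the helper below is an assumption, and missing_credentials is a hypothetical name.

```python
import os

REQUIRED_VARS = ("TIKTOK_RESEARCH_API_KEY", "TIKTOK_RESEARCH_API_SECRET")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to test.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

In a test module, the result could feed pytest.mark.skipif(bool(missing_credentials()), reason="Research API credentials not set"), so a run without credentials reports skips rather than failures.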

Setting the env vars

Inline, for a single invocation (bash / zsh):

TIKTOK_RESEARCH_API_KEY=your-key \
TIKTOK_RESEARCH_API_SECRET=your-secret \
pytest -m integration

For the current session (Windows PowerShell); values set this way persist until the shell is closed:

$env:TIKTOK_RESEARCH_API_KEY = "your-key"
$env:TIKTOK_RESEARCH_API_SECRET = "your-secret"
pytest -m integration

Persistent for the current shell session (bash / zsh):

export TIKTOK_RESEARCH_API_KEY=your-key
export TIKTOK_RESEARCH_API_SECRET=your-secret

Persistent for the user (Windows, PowerShell):

[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_KEY", "your-key", "User")
[Environment]::SetEnvironmentVariable("TIKTOK_RESEARCH_API_SECRET", "your-secret", "User")

Restart the shell after running these so the new values are picked up.

From a .env file — keep credentials out of shell history and out of the repo. Create a project-local .env (already covered by .gitignore):

TIKTOK_RESEARCH_API_KEY=your-key
TIKTOK_RESEARCH_API_SECRET=your-secret

Load it before running pytest:

set -a; source .env; set +a    # bash / zsh
Get-Content .env | ForEach-Object {    # PowerShell
    if ($_ -match '^\s*([^#=]+)=(.*)$') {
        Set-Item -Path "Env:$($matches[1].Trim())" -Value $matches[2].Trim()
    }
}

Note that integration tests consume API quota — keep them minimal and favor max_count=1 style probes for shape assertions.

CI / Release

Branching

Branch | Purpose
main | Stable. Only ever advances when cutting a release. Each commit on main corresponds to a tagged version on PyPI.
dev | Integration branch. Day-to-day work lands here via feature-branch PRs.

Feature branches → PR → merge to dev. When ready to release: PR dev → main, tag, release.

Workflows

Three GitHub Actions workflows live in .github/workflows/:

Workflow | Trigger | What it does
test.yml | push to main/dev; all PRs | Runs ruff check and pytest with coverage (80% gate).
release-testpypi.yml | push of a tag matching v* | Runs tests, builds sdist + wheel (version derived from the tag via hatch-vcs), publishes to TestPyPI.
publish-pypi.yml | a GitHub Release is published | Same checks and build, publishes to production PyPI.

Both release workflows use trusted publishing (OIDC) — no PyPI API tokens are stored as GitHub Secrets.

Versioning

The package version is derived from git tags by hatch-vcs — there is no version field in pyproject.toml to maintain. The v* tag is the version:

Git state | Built version
HEAD is exactly on tag v1.2.0 | 1.2.0
v1.2.0rc1 | 1.2.0rc1 (prerelease)
3 commits past v1.2.0 | 1.2.1.dev3+g<sha>
3 commits past + dirty tree | 1.2.1.dev3+g<sha>.d...

The +g<sha> suffix is a PEP 440 local version identifier. It is valid in package metadata, but PyPI rejects uploads whose version carries one, so such builds never reach the index. Its purpose is to let editable installs report a distinguishable dev version.
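
A quick way to tell whether a built version could be uploaded is to look for the local segment, which PEP 440 separates from the public version with a plus sign. The helper names below are hypothetical, and the hash in the usage note is a made-up placeholder.

```python
def split_local(version):
    """Split a version string into (public, local) at the first '+'.

    `local` is None when the version has no local segment.
    """
    public, _, local = version.partition("+")
    return public, (local or None)


def is_uploadable(version):
    """True when the version has no local segment, which PyPI would reject."""
    return split_local(version)[1] is None
```

For example, split_local("1.2.1.dev3+gabc1234") separates the public part 1.2.1.dev3 from the local part gabc1234, and is_uploadable returns False for that string.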

tiktok_metadata_kit.__version__ reads this resolved version via importlib.metadata at import time, so it always reflects the installed wheel/sdist.
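
The same lookup is available to any script through the standard library. The sketch below shows the general pattern; whether the package wraps the call in exactly this way is an assumption, and installed_version is a hypothetical name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed distribution's version, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

Calling installed_version("tiktok-metadata-kit") in an environment where the package is installed returns the version recorded in its metadata, which is the value derived from the git tag at build time.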

Cutting a release

The two-step flow lets you smoke-test the wheel on TestPyPI before promoting it to production PyPI.

  1. Merge dev → main via PR.

  2. Tag the release commit on main and push the tag:

    git tag v1.2.0rc1   # or v1.2.0 for a final release
    git push --tags
    

    This fires release-testpypi.yml, which uploads to TestPyPI. The tag becomes the version — no pyproject.toml bump required.

  3. Smoke-test the TestPyPI wheel:

    pip install -i https://test.pypi.org/simple/ \
        --extra-index-url https://pypi.org/simple/ \
        --pre tiktok-metadata-kit==1.2.0rc1
    python -c "from tiktok_metadata_kit.research_api import ResearchAPIClient"
    

    (The --extra-index-url lets pip resolve regular dependencies from real PyPI; otherwise it only sees TestPyPI's smaller index.)

  4. Create a GitHub Release on the tag (GitHub UI → Releases → "Draft a new release" → pick the tag → check "Set as a pre-release" for aN/bN/rcN versions). Publishing the Release fires publish-pypi.yml, which uploads to PyPI.

For a final (non-prerelease) release, repeat with a clean version (v1.2.0) and leave the "pre-release" checkbox unchecked.

Prereleases

PyPI accepts prerelease versions and pip ignores them by default. Users who want them opt in with --pre:

pip install tiktok-metadata-kit --pre

Tag naming follows PEP 440: v1.2.0a1 (alpha), v1.2.0b1 (beta), v1.2.0rc1 (release candidate).
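
The tag convention is regular enough to check mechanically, for instance to decide whether the "pre-release" box applies. This sketch assumes three-part release numbers, which is stricter than the v* pattern the workflows trigger on; parse_release_tag is a hypothetical helper.

```python
import re

# v<major>.<minor>.<patch> with an optional aN / bN / rcN suffix.
_TAG = re.compile(r"^v(?P<release>\d+\.\d+\.\d+)(?P<pre>(?:a|b|rc)\d+)?$")


def parse_release_tag(tag):
    """Return (version, is_prerelease) for tags like v1.2.0 or v1.2.0rc1."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    pre = match.group("pre") or ""
    return match.group("release") + pre, bool(pre)
```

A tag such as v1.2.0rc1 yields the version 1.2.0rc1 with the prerelease flag set, while v1.2.0 yields 1.2.0 with the flag cleared.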
