detrack

Strip tracking parameters from URLs. Deterministically. Zero dependencies.

Install

pip install detrack

Quick start

import detrack

url = "https://example.com/post?utm_source=twitter&q=python&fbclid=123"
result = detrack.clean(url)

print(result.url)
# "https://example.com/post?q=python"

print(result.removed_params)
# {"utm_source": "twitter", "fbclid": "123"}

print(result.cleaned_params)
# {"q": "python"}

Why detrack?

Other URL cleaners often do more than strip parameters: host remapping, site-specific rules, semantic rewriting. detrack does one thing: it removes tracking parameters.

This makes detrack predictable, testable, and trivial to integrate.

Ecosystem

detrack is the shared cleaning layer for the seoslug (SEO metadata) and tagurl (semantic tagging) libraries.

Examples

Basic

>>> detrack.clean("https://example.com?utm_source=twitter&q=python")
DetrackResult(url="https://example.com?q=python", cleaned_params={"q": "python"},
              removed_params={"utm_source": "twitter"})

Multiple trackers stripped

>>> detrack.clean("https://example.com?a=1&utm_source=x&b=2&fbclid=y&c=3")
DetrackResult(url="https://example.com?a=1&b=2&c=3",
              cleaned_params={"a": "1", "b": "2", "c": "3"},
              removed_params={"utm_source": "x", "fbclid": "y"})

All params stripped (query removed entirely)

>>> detrack.clean("https://example.com?utm_source=x&fbclid=y")
DetrackResult(url="https://example.com",
              cleaned_params={},
              removed_params={"utm_source": "x", "fbclid": "y"})

Custom patterns

>>> detrack.clean("https://example.com?session=abc123&page=1", patterns=["session"])
DetrackResult(url="https://example.com?page=1",
              cleaned_params={"page": "1"},
              removed_params={"session": "abc123"})

Query string only

>>> detrack.clean_query("a=1&utm_source=x&b=2")
"a=1&b=2"

>>> detrack.clean_query("utm_source=x&fbclid=y")
""

API

detrack.clean(url, patterns=None)

Strip tracking parameters from a full URL.

Parameter   Type               Description
url         str                Any URL string
patterns    list[str] | None   Optional parameter names to strip (defaults to DEFAULT_PATTERNS)

Returns: DetrackResult, a dataclass holding the cleaned URL and metadata.

Raises: nothing. clean() is a pure function and does not raise exceptions: malformed URLs pass through unchanged, and invalid patterns are ignored.
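
The behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not detrack's actual implementation: `sketch_clean` and `SKETCH_PATTERNS` are hypothetical names, the pattern set is a tiny stand-in for the real defaults, and the function returns a plain tuple rather than a DetrackResult. It assumes a `str` input, collapses duplicate parameter names into one entry, and re-encodes the remaining query, which the real library may handle differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical stand-in for detrack.DEFAULT_PATTERNS (the real set has 60+ names).
SKETCH_PATTERNS = frozenset({"utm_source", "utm_medium", "fbclid", "gclid"})


def sketch_clean(url, patterns=None):
    """Return (cleaned_url, kept, removed); malformed URLs pass through."""
    names = SKETCH_PATTERNS if patterns is None else patterns
    # Lowercase the string patterns; silently skip anything that is not a str.
    wanted = {p.lower() for p in names if isinstance(p, str)}
    try:
        parts = urlsplit(url)
        pairs = parse_qsl(parts.query, keep_blank_values=True)
    except ValueError:
        # urlsplit rejects some inputs (e.g. an unclosed "[" in the host);
        # return the input unchanged instead of raising.
        return url, {}, {}
    kept, removed = {}, {}
    for name, value in pairs:
        # Case-insensitive match against the pattern names.
        target = removed if name.lower() in wanted else kept
        target[name] = value
    # An empty query makes urlunsplit drop the "?" entirely.
    cleaned = urlunsplit(parts._replace(query=urlencode(kept)))
    return cleaned, kept, removed
```

Called with the Quick start URL, this sketch yields the same cleaned URL and the same two dictionaries shown there; an input such as "http://[invalid?utm_source=x" comes back untouched with empty dictionaries.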


detrack.clean_query(query, patterns=None)

Strip tracking parameters from a query string only.

Parameter   Type               Description
query       str                URL query string, e.g. "a=1&utm_source=x&b=2"
patterns    list[str] | None   Optional parameter names to strip

Returns: str, the cleaned query string (an empty string if all parameters were stripped).


detrack.DEFAULT_PATTERNS

frozenset[str]  # 60+ common tracking parameters

Covers UTM parameters, social tracking (fbclid, ref, si), marketing IDs (gclid, msclkid, wbraid), analytics (_ga, _gl), cache busters (cb, rand, timestamp), session IDs (sid, phpsessid), and redirect params. Pass a custom patterns list to clean() or clean_query() to override the defaults.
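
If a custom list replaces the defaults rather than adding to them, as the wording above suggests, then stripping an extra parameter while keeping the built-in ones means passing the union of both. A small sketch of that idea follows. The `DEFAULT_PATTERNS` defined here is an illustrative four-name subset, not the real set, and `extend_patterns` is a hypothetical helper, not part of detrack.

```python
# Illustrative subset standing in for detrack.DEFAULT_PATTERNS; with the
# package installed, detrack.DEFAULT_PATTERNS would be used instead.
DEFAULT_PATTERNS = frozenset({"utm_source", "utm_medium", "fbclid", "gclid"})


def extend_patterns(extra, base=DEFAULT_PATTERNS):
    """Return the base names plus the extra names as a sorted list."""
    # Sorting keeps the result stable across runs, since set order is not.
    return sorted(set(base) | set(extra))


patterns = extend_patterns(["session", "ref_id"])
print(patterns)
# ['fbclid', 'gclid', 'ref_id', 'session', 'utm_medium', 'utm_source']
```

The resulting list could then be passed as the patterns argument.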


DetrackResult

from dataclasses import dataclass
from urllib.parse import SplitResult

@dataclass
class DetrackResult:
    url: str                       # Cleaned URL
    parsed_url: SplitResult        # urllib.parse result (for further processing)
    cleaned_params: dict[str, str] # Parameters that remain
    removed_params: dict[str, str] # Stripped parameters + their original values

removed_params preserves the original values so you can log what was stripped for analytics, debugging, or compliance.
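
One way to use that metadata is to turn it into a structured log line. The sketch below is an assumption about how a caller might do this, not something detrack provides: `format_audit_record` is a hypothetical helper that takes the original URL, the cleaned URL, and a removed_params-style dictionary, and returns a JSON string with stable key order.

```python
import json


def format_audit_record(original_url, cleaned_url, removed_params):
    """Build a JSON log line describing what was stripped from a URL."""
    record = {
        "original": original_url,
        "cleaned": cleaned_url,
        "removed": removed_params,
    }
    # sort_keys makes the line identical for identical inputs, nested keys included.
    return json.dumps(record, sort_keys=True)


line = format_audit_record(
    "https://example.com?utm_source=x&fbclid=y",
    "https://example.com",
    {"utm_source": "x", "fbclid": "y"},
)
print(line)
```

In real use the second and third arguments would come from the url and removed_params fields of a DetrackResult.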

Features

  • 60+ default patterns: UTM, social, marketing, analytics, cache busters, session, redirect
  • Case-insensitive matching: UTM_SOURCE, Utm_Source, and utm_source are all stripped
  • Zero dependencies: uses only urllib.parse from the Python standard library
  • Deterministic: same input always yields the same output, across all systems
  • Pure functions: no state, no I/O, no random numbers, no exceptions
  • Metadata returned: removed_params tells you exactly what was stripped and its original value

License: MIT. See LICENSE.
