Pluggable pipeline framework for data quality workflows

GoldenPipe

Golden Suite orchestrator -- Check quality, fix issues, deduplicate records. One command. Built by Ben Severn.

Requires Python 3.11+. License: MIT.

What It Does

Raw Data
  | GoldenCheck   -- profile & discover quality issues
  | GoldenFlow    -- fix issues, standardize, reshape
  | GoldenMatch   -- deduplicate, match, create golden records
  v
Golden Records

GoldenPipe orchestrates the full pipeline with adaptive logic:

  • Skips transformation if no quality issues found
  • Routes to privacy-preserving matching if sensitive fields detected
  • Reports reasoning for every decision
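
The adaptive behaviour above can be pictured as a small decision function. The sketch below is illustrative only: `Plan`, `plan_pipeline`, the `SENSITIVE_HINTS` set, and the `"standard"` strategy label are hypothetical names, not GoldenPipe's API, and the real detection rules are not documented here. Only the `pprl` strategy name is taken from the CLI section. It shows how a planner could skip the transform stage, choose a matching strategy, and record a reason for each choice.

```python
from dataclasses import dataclass, field

# Hypothetical column-name hints; GoldenPipe's actual detection rules may differ.
SENSITIVE_HINTS = {"ssn", "dob", "passport", "medical_id"}


@dataclass
class Plan:
    run_transform: bool
    match_strategy: str
    reasoning: list[str] = field(default_factory=list)


def plan_pipeline(issue_count: int, columns: list[str]) -> Plan:
    """Decide which stages to run and record why (illustrative only)."""
    reasoning = []

    # Skip the transform stage when the check stage found nothing to fix.
    run_transform = issue_count > 0
    if run_transform:
        reasoning.append(f"transform: {issue_count} quality issue(s) found, running fixes")
    else:
        reasoning.append("transform: skipped, no quality issues found")

    # Route to privacy-preserving matching when a sensitive-looking column is present.
    sensitive = sorted(c for c in columns if c.lower() in SENSITIVE_HINTS)
    if sensitive:
        strategy = "pprl"
        reasoning.append(f"match: privacy-preserving, sensitive fields {sensitive}")
    else:
        strategy = "standard"
        reasoning.append("match: standard, no sensitive fields detected")

    return Plan(run_transform, strategy, reasoning)


plan = plan_pipeline(issue_count=0, columns=["name", "email", "SSN"])
for line in plan.reasoning:
    print(line)
```

Keeping the reasons as plain strings alongside each decision mirrors the "reports reasoning" bullet: every branch appends exactly one explanation.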

Install

pip install goldenpipe

Quick Start

import goldenpipe as gp

result = gp.run("customers.csv")

print(result.status)        # "success"
print(result.check)         # Quality findings
print(result.transform)     # What was fixed
print(result.match)         # Deduplicated clusters
print(result.reasoning)     # Why each decision was made
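
A caller will usually want to branch on the outcome rather than print every field. The helper below is a sketch that relies only on the attribute names shown in the Quick Start (`status`, `check`, `transform`, `match`, `reasoning`). `summarize` is a hypothetical name, and the `SimpleNamespace` stand-in with its placeholder values exists only so the example runs without goldenpipe installed; the real attribute types may differ.

```python
from types import SimpleNamespace


def summarize(result) -> str:
    """Build a short report from a pipeline result object."""
    if result.status != "success":
        return f"pipeline ended with status {result.status!r}"
    lines = [f"status: {result.status}"]
    # getattr keeps the helper tolerant of a stage attribute being absent.
    for stage in ("check", "transform", "match"):
        lines.append(f"{stage}: {getattr(result, stage, None)}")
    lines.append(f"reasoning: {result.reasoning}")
    return "\n".join(lines)


# Stand-in with placeholder values; with goldenpipe installed,
# pass the object returned by gp.run("customers.csv") instead.
fake = SimpleNamespace(
    status="success",
    check="placeholder findings",
    transform=None,
    match="placeholder clusters",
    reasoning="placeholder reasoning",
)
print(summarize(fake))
```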

CLI

goldenpipe run customers.csv                # Full pipeline
goldenpipe run customers.csv --verbose      # Show reasoning
goldenpipe run customers.csv --skip-flow    # Check + Match only
goldenpipe run customers.csv --strategy pprl  # Force privacy mode
goldenpipe run customers.csv -o golden.csv  # Save golden records
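
When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only the flags listed above; `build_command` and its parameter names are hypothetical, and it assumes the flags can be combined freely, which the listing does not confirm. The command is printed rather than executed so the example runs without goldenpipe installed.

```python
import shlex


def build_command(path, verbose=False, skip_flow=False, strategy=None, output=None):
    """Return the argv list for a `goldenpipe run` invocation."""
    cmd = ["goldenpipe", "run", path]
    if verbose:
        cmd.append("--verbose")
    if skip_flow:
        cmd.append("--skip-flow")
    if strategy is not None:
        cmd += ["--strategy", strategy]
    if output is not None:
        cmd += ["-o", output]
    return cmd


cmd = build_command("customers.csv", strategy="pprl", output="golden.csv")
print(shlex.join(cmd))
# To execute it for real: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.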

Remote MCP Server

GoldenPipe is available as a hosted MCP server on Smithery, so you can connect from any MCP client without installing anything.

Claude Desktop / Claude Code:

{
  "mcpServers": {
    "goldenpipe": {
      "url": "https://goldenpipe-mcp-production.up.railway.app/mcp/"
    }
  }
}
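
If a client configuration already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below does that on an in-memory dict. `add_goldenpipe_server` is a hypothetical helper and the `other-server` entry is made up; where a given client stores its configuration file is left out because it varies by client.

```python
import copy
import json

GOLDENPIPE_URL = "https://goldenpipe-mcp-production.up.railway.app/mcp/"


def add_goldenpipe_server(config: dict) -> dict:
    """Return a copy of config with the goldenpipe entry added under mcpServers."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    merged.setdefault("mcpServers", {})["goldenpipe"] = {"url": GOLDENPIPE_URL}
    return merged


# Made-up existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"url": "https://example.com/mcp/"}}}
print(json.dumps(add_goldenpipe_server(existing), indent=2))
```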

Local server:

pip install "goldenpipe[mcp]"   # quotes keep shells such as zsh from expanding the brackets
goldenpipe mcp-serve

Four tools are available: list pipeline stages, validate wiring, run the full check-transform-match pipeline, and explain configs.

Part of the Golden Suite

Tool         Purpose                          Install
GoldenCheck  Validate & profile data quality  pip install goldencheck
GoldenFlow   Transform & standardize data     pip install goldenflow
GoldenMatch  Deduplicate & match records      pip install goldenmatch
GoldenPipe   Orchestrate the full pipeline    pip install goldenpipe

Author

Ben Severn

License

MIT

Download files

Source distribution: goldenpipe-1.0.6.tar.gz (47.9 kB)
Built distribution: goldenpipe-1.0.6-py3-none-any.whl (32.6 kB, Python 3)
