
A Swiss Army knife for simple ETL operations

Project description

ETLPlus


ETLPlus is a Swiss Army knife for simple ETL operations, offering both a Python package and a command-line interface for data extraction, validation, transformation, and loading.

Features

  • Extract data from multiple sources:

    • Files (CSV, JSON, XML, YAML)
    • Databases (connection string support)
    • REST APIs (GET)
  • Validate data with flexible rules:

    • Type checking
    • Required fields
    • Value ranges (min/max)
    • String length constraints
    • Pattern matching
    • Enum validation
  • Transform data with powerful operations:

    • Filter records
    • Map/rename fields
    • Select specific fields
    • Sort data
    • Aggregate functions (avg, count, max, min, sum)
  • Load data to multiple targets:

    • Files (CSV, JSON, XML, YAML)
    • Databases (connection string support)
    • REST APIs (PATCH, POST, PUT)

Installation

pip install etlplus

For development:

pip install -e ".[dev]"

Quickstart

Get up and running in under a minute.

Command line interface:

# Inspect help and version
etlplus --help
etlplus --version

# One-liner: extract CSV, filter, select, and write JSON
etlplus extract file examples/data/sample.csv \
  | etlplus transform - --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
  -o temp/sample_output.json

Python API:

from etlplus import extract, transform, validate, load

data = extract("file", "input.csv")
ops = {"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}
filtered = transform(data, ops)
rules = {"name": {"type": "string", "required": True}, "email": {"type": "string", "required": True}}
assert validate(filtered, rules)["valid"]
load(filtered, "file", "temp/sample_output.json", file_format="json")

Usage

Command Line Interface

ETLPlus provides a powerful CLI for ETL operations:

# Show help
etlplus --help

# Show version
etlplus --version

Extract Data

Note: For file sources, the format is inferred from the filename extension; the --format option is ignored. To treat passing --format as an error for file sources, either set ETLPLUS_FORMAT_BEHAVIOR=error or pass the CLI flag --strict-format.

Extract from JSON file:

etlplus extract file examples/data/sample.json

Extract from CSV file:

etlplus extract file examples/data/sample.csv

Extract from XML file:

etlplus extract file examples/data/sample.xml

Extract from REST API:

etlplus extract api https://api.example.com/data

Save extracted data to file:

etlplus extract file examples/data/sample.csv -o temp/sample_output.json

Validate Data

Validate data from file or JSON string:

etlplus validate '{"name": "John", "age": 30}' --rules '{"name": {"type": "string", "required": true}, "age": {"type": "number", "min": 0, "max": 150}}'

Validate from file:

etlplus validate examples/data/sample.json --rules '{"email": {"type": "string", "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"}}'

Transform Data

Filter and select fields:

etlplus transform '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]' \
  --operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}'

Sort data:

etlplus transform examples/data/sample.json --operations '{"sort": {"field": "age", "reverse": true}}'

Aggregate data:

etlplus transform examples/data/sample.json --operations '{"aggregate": {"field": "age", "func": "sum"}}'

Map/rename fields:

etlplus transform examples/data/sample.json --operations '{"map": {"name": "new_name"}}'

Load Data

Load to JSON file:

etlplus load '{"name": "John", "age": 30}' file temp/sample_output.json

Load to CSV file:

etlplus load '[{"name": "John", "age": 30}]' file temp/sample_output.csv

Load to REST API:

etlplus load examples/data/sample.json api https://api.example.com/endpoint

Python API

Use ETLPlus as a Python library:

from etlplus import extract, validate, transform, load

# Extract data
data = extract("file", "data.json")

# Validate data
validation_rules = {
    "name": {"type": "string", "required": True},
    "age": {"type": "number", "min": 0, "max": 150}
}
result = validate(data, validation_rules)
if result["valid"]:
    print("Data is valid!")

# Transform data
operations = {
    "filter": {"field": "age", "op": "gt", "value": 18},
    "select": ["name", "email"]
}
transformed = transform(data, operations)

# Load data
load(transformed, "file", "temp/sample_output.json", file_format="json")
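
The four calls compose naturally. Here is a minimal sketch of a reusable pipeline helper built only from the functions shown above (the early exit on invalid data is our design choice, not an ETLPlus requirement):

from etlplus import extract, validate, transform, load

def run_pipeline(source_path, target_path, rules, operations):
    # Extract, transform, then gate the load on a passing validation.
    data = extract("file", source_path)
    transformed = transform(data, operations)
    result = validate(transformed, rules)
    if not result["valid"]:
        raise ValueError("validation failed; aborting load")
    load(transformed, "file", target_path, file_format="json")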

For YAML-driven pipelines executed end-to-end (extract → validate → transform → load), see:

Complete ETL Pipeline Example

# 1. Extract from CSV
etlplus extract file examples/data/sample.csv -o temp/sample_extracted.json

# 2. Transform (filter and select fields)
etlplus transform temp/sample_extracted.json \
  --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
  -o temp/sample_transformed.json

# 3. Validate transformed data
etlplus validate temp/sample_transformed.json \
  --rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}'

# 4. Load to CSV
etlplus load temp/sample_transformed.json file temp/sample_output.csv

Environment Variables

ETLPlus honors a small number of environment toggles to refine CLI behavior:

  • ETLPLUS_FORMAT_BEHAVIOR: controls what happens when --format is provided for file sources or targets (extract/load) where the format is inferred from the filename extension.
    • error|fail|strict: treat as error (non-zero exit)
    • warn (default): print a warning to stderr
    • ignore|silent: no message
  • Precedence: the CLI flag --strict-format overrides the environment.

Examples (zsh):

# Warn (default)
etlplus extract file data.csv --format csv
etlplus load data.json file out.csv --format csv

# Enforce error via environment
ETLPLUS_FORMAT_BEHAVIOR=error \
  etlplus extract file data.csv --format csv
ETLPLUS_FORMAT_BEHAVIOR=error \
  etlplus load data.json file out.csv --format csv

# Equivalent strict behavior via flag (overrides environment)
etlplus extract file data.csv --format csv --strict-format
etlplus load data.json file out.csv --format csv --strict-format

# Recommended: rely on extension, no --format needed for files
etlplus extract file data.csv
etlplus load data.json file out.csv

Transformation Operations

Filter Operations

Supported operators:

  • eq: Equal
  • ne: Not equal
  • gt: Greater than
  • gte: Greater than or equal
  • lt: Less than
  • lte: Less than or equal
  • in: Value in list
  • contains: List/string contains value

Example:

{
  "filter": {
    "field": "status",
    "op": "in",
    "value": ["active", "pending"]
  }
}
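
The difference between in and contains is easy to mix up. In plain-Python terms (illustrative, not ETLPlus internals):

# "in": the record's field value must be a member of the supplied list.
record = {"status": "active", "tags": ["etl", "python"]}
assert record["status"] in ["active", "pending"]

# "contains": the record's field (a list or string) must contain the supplied value.
assert "etl" in record["tags"]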

Aggregation Functions

Supported functions:

  • sum: Sum of values
  • avg: Average of values
  • min: Minimum value
  • max: Maximum value
  • count: Count of values

Example:

{
  "aggregate": {
    "field": "revenue",
    "func": "sum"
  }
}
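
Conceptually, the aggregation reduces one field across all records. A plain-Python equivalent (illustrative only):

records = [{"revenue": 100}, {"revenue": 250}, {"revenue": 75}]
total = sum(record["revenue"] for record in records)  # "func": "sum" over "field": "revenue"
# total == 425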

Validation Rules

Supported validation rules:

  • type: Data type (string, number, integer, boolean, array, object)
  • required: Field is required (true/false)
  • min: Minimum value for numbers
  • max: Maximum value for numbers
  • minLength: Minimum length for strings
  • maxLength: Maximum length for strings
  • pattern: Regex pattern for strings
  • enum: List of allowed values

Example:

{
  "email": {
    "type": "string",
    "required": true,
    "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"
  },
  "age": {
    "type": "number",
    "min": 0,
    "max": 150
  },
  "status": {
    "type": "string",
    "enum": ["active", "inactive", "pending"]
  }
}
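
These rules plug directly into the validate() call shown earlier. A minimal sketch (the sample record is made up for illustration):

from etlplus import validate

rules = {
    "email": {"type": "string", "required": True, "pattern": r"^[\w.-]+@[\w.-]+\.\w+$"},
    "age": {"type": "number", "min": 0, "max": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
}
record = {"email": "jane@example.com", "age": 30, "status": "active"}
assert validate(record, rules)["valid"]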

Development

API Client Docs

Looking for the HTTP client and pagination helpers? See the dedicated docs in etlplus/api/README.md for:

  • Quickstart with EndpointClient
  • Authentication via EndpointCredentialsBearer
  • Pagination with PaginationConfig (page and cursor styles)
  • Tips on records_path and cursor_path

Runner Internals and Connectors

Curious how the pipeline runner composes API requests, pagination, and load calls?

  • Runner overview and helpers: docs/run-module.md
  • Unified "connector" vocabulary (API/File/DB): etlplus/config/connector.py
    • API/file targets reuse the same shapes as sources; API targets typically set a method.

Running Tests

pytest tests/ -v

Test Layers

We split tests into two layers:

  • Unit (tests/unit/): single function or class, no real I/O, fast, uses stubs/monkeypatch (e.g. etlplus.cli.create_parser, transform + validate helpers).
  • Integration (tests/integration/): end-to-end flows (CLI main(), pipeline run(), pagination + rate-limit defaults, file/API connector interactions); may touch temp files and use fake clients.

If a test calls etlplus.cli.main() or etlplus.run.run(), it’s integration by default. Full criteria: CONTRIBUTING.md#testing.
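
As a concrete illustration of the unit layer, a test like the following touches only etlplus.cli.create_parser (named above) and no real I/O; the test name and assertion are ours, not from the suite:

from etlplus.cli import create_parser

def test_create_parser_builds():
    # Unit layer: a single function, no files, no network.
    parser = create_parser()
    assert parser is not None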

Code Coverage

pytest tests/ --cov=etlplus --cov-report=html

Linting

flake8 etlplus/
black etlplus/

Releasing to PyPI

Maintainers build releases from the repository root using the pyproject.toml configuration:

make dist          # build sdist + wheel into ./dist and run twine check

Then upload the artifacts in dist/ with twine (installed by make dist):

export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-..."  # your PyPI API token
python -m twine upload dist/*

License

This project is licensed under the MIT License.

Contributing

Code and codeless contributions are welcome! If you’d like to add a new feature, fix a bug, or improve the documentation, please feel free to submit a pull request as follows:

  1. Fork this repository.
  2. Create a new feature branch for your changes (git checkout -b feature/feature-name).
  3. Commit your changes (git commit -m "Add feature").
  4. Push to your branch (git push origin feature/feature-name).
  5. Submit a pull request with a detailed description.

If you choose to be a code contributor, please first refer to these documents:

Acknowledgments

ETLPlus is inspired by common patterns in data engineering and Python software development, aiming to increase productivity and reduce boilerplate code. Feedback and contributions are always appreciated!


Download files

Download the file for your platform.

Source Distribution

etlplus-0.3.3.tar.gz (96.9 kB)


Built Distribution


etlplus-0.3.3-py3-none-any.whl (114.8 kB)


File details

Details for the file etlplus-0.3.3.tar.gz.

File metadata

  • Download URL: etlplus-0.3.3.tar.gz
  • Upload date:
  • Size: 96.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for etlplus-0.3.3.tar.gz

  • SHA256: d32ce9a864759fbb287e0e85682a074697a4aa850473accb50a6959380af5169
  • MD5: a4d7824d4621927036c31eafa2f02f36
  • BLAKE2b-256: f548ddf93096c344b3b8f22412747d699b2c1b8fb9c2eec0c686526605edfcb1


Provenance

The following attestation bundles were made for etlplus-0.3.3.tar.gz:

Publisher: ci.yml on Dagitali/ETLPlus

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file etlplus-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: etlplus-0.3.3-py3-none-any.whl
  • Upload date:
  • Size: 114.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for etlplus-0.3.3-py3-none-any.whl

  • SHA256: 92bf7036d19cb1a8ba0a0950df85ad9e4b3a55f51141d4951340c81be80195ba
  • MD5: d8a396c722220467faf50a2228c61114
  • BLAKE2b-256: e46858827604a366d330c21645f77422bed0a22f31054b8994aa61013fa9c242


Provenance

The following attestation bundles were made for etlplus-0.3.3-py3-none-any.whl:

Publisher: ci.yml on Dagitali/ETLPlus

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
