Laygo
A lightweight Python library for building resilient, in-memory data pipelines with elegant, chainable syntax.
🎯 Overview
Laygo is a lightweight Python library for building resilient, in-memory data pipelines. It provides a fluent API to layer transformations, manage context, and handle errors with elegant, chainable syntax.
Key Features:
- Fluent API: Chainable method syntax for readable data transformations
- Performance Optimized: Uses chunked processing and list comprehensions for maximum speed
- Memory Efficient: Lazy evaluation and streaming support for large datasets
- Parallel Processing: Built-in ThreadPoolExecutor for CPU-intensive operations
- Context Management: Shared state across pipeline operations for stateful processing
- Error Handling: Recover from failures with chunk-level error handlers
- Type Safety: Full type hints support with generic types
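The chunked-processing model behind these features can be sketched in plain Python. This is a conceptual illustration, not Laygo's actual internals; `chunked` and `process_chunk` are hypothetical names:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of up to `size` items (lazy: pulls items on demand)."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

def process_chunk(chunk):
    # A list comprehension applies the whole transformation to one chunk at once.
    return [x * 2 for x in chunk if x % 2 == 0]

# Streaming: only one chunk is materialized in memory at a time.
results = [y for chunk in chunked(range(10), 4) for y in process_chunk(chunk)]
print(results)  # [0, 4, 8, 12, 16]
```

Because chunks are pulled lazily from the source iterable, this pattern keeps memory usage bounded even for very large inputs.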
📦 Installation
pip install laygo
Or for development:
git clone https://github.com/ringoldsdev/laygo-python.git
cd laygo-python
pip install -e ".[dev]"
🐳 Dev Container Setup
If you're using this project in a dev container, you'll need to configure Git to use HTTPS instead of SSH for authentication:
# Switch to HTTPS remote URL
git remote set-url origin https://github.com/ringoldsdev/laygo-python.git
# Configure Git to use HTTPS for all GitHub operations
git config --global url."https://github.com/".insteadOf "git@github.com:"
🚀 Usage
Basic Pipeline Operations
from laygo import Pipeline
# Simple data transformation
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = (
Pipeline(data)
.transform(lambda t: t.filter(lambda x: x % 2 == 0)) # Keep even numbers
.transform(lambda t: t.map(lambda x: x * 2)) # Double them
.to_list()
)
print(result) # [4, 8, 12, 16, 20]
Context-Aware Operations
from laygo import Pipeline
from laygo import PipelineContext
# Create context with shared state
context: PipelineContext = {"multiplier": 3, "threshold": 10}
result = (
Pipeline([1, 2, 3, 4, 5])
.context(context)
.transform(lambda t: t.map(lambda x, ctx: x * ctx["multiplier"]))
.transform(lambda t: t.filter(lambda x, ctx: x > ctx["threshold"]))
.to_list()
)
print(result) # [12, 15]
ETL Pipeline Example
from laygo import Pipeline
# Sample employee data processing
employees = [
{"name": "Alice", "age": 25, "salary": 50000},
{"name": "Bob", "age": 30, "salary": 60000},
{"name": "Charlie", "age": 35, "salary": 70000},
{"name": "David", "age": 28, "salary": 55000},
]
# Extract, Transform, Load pattern
high_earners = (
Pipeline(employees)
.transform(lambda t: t.filter(lambda emp: emp["age"] > 28)) # Extract
.transform(lambda t: t.map(lambda emp: { # Transform
"name": emp["name"],
"annual_salary": emp["salary"],
"monthly_salary": emp["salary"] / 12
}))
.transform(lambda t: t.filter(lambda emp: emp["annual_salary"] > 55000)) # Filter
.to_list()
)
Using Transformers Directly
from laygo import Transformer
# Create a reusable transformation pipeline
transformer = (
Transformer.init(int)
.filter(lambda x: x % 2 == 0) # Keep even numbers
.map(lambda x: x * 2) # Double them
.filter(lambda x: x > 5) # Keep > 5
)
# Apply to different datasets
result1 = list(transformer([1, 2, 3, 4, 5]))  # [8]
result2 = list(transformer(range(10)))  # [8, 12, 16]
Custom Transformer Composition
from laygo import Pipeline
from laygo import Transformer
# Create reusable transformation components
validate_data = Transformer.init(dict).filter(lambda x: x.get("id") is not None)
normalize_text = Transformer.init(dict).map(lambda x: {**x, "name": x["name"].strip().title()})
# Sample raw data to clean (illustrative)
raw_data = [
    {"id": 1, "name": "  alice  "},
    {"name": "bob"},  # missing id, filtered out
    {"id": 2, "name": " CHARLIE "},
]
# Use transformers directly with Pipeline.transform()
result = (
Pipeline(raw_data)
.transform(validate_data) # Pass transformer directly
.transform(normalize_text) # Pass transformer directly
.to_list()
)
Parallel Processing
from laygo import Pipeline
from laygo import ParallelTransformer
# Process large datasets with multiple threads
large_data = range(100_000)
# Create parallel transformer
parallel_processor = (
ParallelTransformer.init(
int,
max_workers=4,
ordered=True, # Maintain result order
chunk_size=10000 # Process in chunks
).map(lambda x: x ** 2)
)
results = (
Pipeline(large_data)
.transform(parallel_processor)
.transform(lambda t: t.filter(lambda x: x > 100))
.first(1000) # Get first 1000 results
)
Error Handling and Recovery
from laygo import Pipeline
from laygo import Transformer
def risky_operation(x):
    if x == 5:
        raise ValueError("Cannot process 5")
    return x * 2

def error_handler(chunk, error, context):
    print(f"Error in chunk {chunk}: {error}")
    return [0] * len(chunk)  # Return default values
# Pipeline with error recovery
result = (
Pipeline([1, 2, 3, 4, 5, 6])
.transform(lambda t: t.map(risky_operation).catch(
lambda sub_t: sub_t.map(lambda x: x + 1),
on_error=error_handler
))
.to_list()
)
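The chunk-level recovery that `catch` provides can be sketched in plain Python (hypothetical names, not Laygo's internals): when any element in a chunk fails, the whole chunk is replaced by the handler's fallback values.

```python
def map_with_catch(fn, chunk, on_error, context=None):
    """Apply `fn` per element; on failure, let `on_error` supply fallback values."""
    try:
        return [fn(x) for x in chunk]
    except Exception as exc:
        return on_error(chunk, exc, context)

def risky(x):
    if x == 5:
        raise ValueError("Cannot process 5")
    return x * 2

def fallback(chunk, exc, ctx):
    return [0] * len(chunk)

print(map_with_catch(risky, [1, 2, 3], fallback))  # [2, 4, 6]
print(map_with_catch(risky, [4, 5, 6], fallback))  # [0, 0, 0]
```

Handling errors per chunk rather than per element trades granularity for speed: the happy path stays a single list comprehension, and the handler only runs when a chunk actually fails.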
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🚀 Built With
- Python 3.12+ - Core language with modern type hints
- Ruff - Code formatting and linting
- Pytest - Testing framework
- ThreadPoolExecutor - Parallel processing
- Type Hints - Full type safety support
⭐ Star this repository if Laygo helps your data processing workflows!
File details
Details for the file laygo-0.1.1.tar.gz.
File metadata
- Download URL: laygo-0.1.1.tar.gz
- Size: 44.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b8a030cfc509f89e099317f5722c23ae3c3aa23ff6d6dbf38a8f857a72edb4ef |
| MD5 | 0d34e071429030ee3e529c86537b59c6 |
| BLAKE2b-256 | 06276656d1fb9d3bf5983902ed91604e2ed0be9beeeea5c0a38232b386029e45 |
Provenance
The following attestation bundles were made for laygo-0.1.1.tar.gz:
Publisher: publish.yml on ringoldsdev/laygo-python
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: laygo-0.1.1.tar.gz
- Subject digest: b8a030cfc509f89e099317f5722c23ae3c3aa23ff6d6dbf38a8f857a72edb4ef
- Sigstore transparency entry: 278175737
- Permalink: ringoldsdev/laygo-python@32998a8903d71b05c1d03b865a31590c87b4adc3
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/ringoldsdev
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@32998a8903d71b05c1d03b865a31590c87b4adc3
- Trigger Event: push
File details
Details for the file laygo-0.1.1-py3-none-any.whl.
File metadata
- Download URL: laygo-0.1.1-py3-none-any.whl
- Size: 11.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 28f804f8a873487f4b0e5e0502d0294a50a65a9285fe746560925571f941f7c3 |
| MD5 | 3b80648d7c2e854d77b4f2a0857daafb |
| BLAKE2b-256 | 740a1ef73e31b24f10a68b69c2c891b619c2c0cebaa9338d08956c317131af55 |
Provenance
The following attestation bundles were made for laygo-0.1.1-py3-none-any.whl:
Publisher: publish.yml on ringoldsdev/laygo-python
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: laygo-0.1.1-py3-none-any.whl
- Subject digest: 28f804f8a873487f4b0e5e0502d0294a50a65a9285fe746560925571f941f7c3
- Sigstore transparency entry: 278175759
- Permalink: ringoldsdev/laygo-python@32998a8903d71b05c1d03b865a31590c87b4adc3
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/ringoldsdev
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@32998a8903d71b05c1d03b865a31590c87b4adc3
- Trigger Event: push