A tiny, fast, Rust-backed transformation core for Python table data
🪶 feathertail
A high-performance Python DataFrame library powered by Rust — designed for flexibility, blazing speed, and intelligent type handling. Built for production with comprehensive features, advanced analytics, and enterprise-grade performance.
✨ Key Features
🚀 Core DataFrame Operations
- ✅ Build TinyFrame from Python dict records (from_dicts)
- ✅ Automatic type inference, including mixed-type and optional columns
- ✅ Intelligent fallback to Python objects when Rust-native types aren't possible (stored by runtime pointer identity for the lifetime of the frame; keep references alive while using TinyFrame)
- ✅ Flexible fillna to handle missing data
- ✅ Powerful cast_column to convert columns between types
- ✅ Smart edit_column: edits that automatically adjust column type if needed
- ✅ Drop or rename columns easily
- ✅ Export back to Python dicts (to_dicts)
🔗 Advanced Data Operations
- ✅ Join Operations: Inner, left, right, outer, and cross joins
- ✅ Filtering & Sorting: Advanced filtering with multiple conditions and multi-column sorting
- ✅ GroupBy Aggregations: TinyGroupBy with string key columns, supporting sum, mean, min, max, std, var, median, first, last, count, size (call each aggregation separately)
- ✅ Window Functions: Rolling and expanding window operations
- ✅ Ranking Functions: Rank calculation with multiple methods and percentage change
📊 Advanced Analytics
- ✅ Descriptive Statistics: describe(), skew(), kurtosis(), quantile(), mode(), nunique()
- ✅ Correlation & Covariance: Full correlation/covariance matrices and pairwise calculations
- ✅ Time Series Operations: DateTime parsing (strict: invalid or empty strings raise ValueError), component extraction, time differences, and shifting
- ✅ String Operations: Case conversion, whitespace removal, replacement, splitting, pattern matching, length, and concatenation
- ✅ Data Validation: Not null, range, pattern, uniqueness validation with comprehensive reporting
⚡ Performance & Optimization
- ✅ SIMD Operations: x86_64 optimized numerical operations for blazing speed
- ✅ Parallel Processing: Multi-core operations using Rayon for GroupBy, filtering, and sorting
- ✅ Memory Optimization: String interning, lazy evaluation, and copy-on-write optimizations
- ✅ Chunked Processing: Handle large datasets efficiently with streaming operations
- ✅ Rust-backed Core: Lightweight, fast, and dependency-light
- ✅ Cross-Platform Builds: Automated CI/CD with pre-built wheels for all major platforms
🛠️ Developer Experience
- ✅ Comprehensive Documentation: Sphinx-generated API docs with tutorials and guides
- ✅ Logging & Debugging: Built-in logging system with performance monitoring
- ✅ Profiling Tools: Performance profiling and optimization insights
- ✅ Development Tools: Pre-commit hooks, automated testing, and development scripts
- ✅ 250+ Comprehensive Tests: Full test coverage running in well under one second locally
📦 Installation
pip install feathertail
✅ Cross-Platform Support: Pre-built wheels are available for Python 3.8 through 3.12 on:
- Linux (x86_64)
- macOS (ARM64/aarch64)
- Windows (x86_64)
Building from Source
# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail
# Install dependencies and build
pip install maturin
maturin develop --release
# Or install in development mode
pip install -e .
🧑‍💻 Quickstart
Basic DataFrame Operations
import feathertail as ft
records = [
{"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
{"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
{"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
frame = ft.TinyFrame.from_dicts(records)
print(frame)
Output:
TinyFrame(rows=3, columns=4, cols={ 'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat' })
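The labels in this output summarise what each column holds: an Opt prefix marks a column that contains missing values. The following is a pure-Python sketch of one way such labels could be derived. It is not feathertail's implementation; the helper name `infer_label` is hypothetical, and labelling any column with more than one non-null Python type as Mixed is an assumption.

```python
def infer_label(values):
    """Return a type label such as 'Int' or 'OptFloat' for one column.

    Hypothetical helper for illustration only; not part of feathertail.
    """
    names = {bool: "Bool", int: "Int", float: "Float", str: "Str"}
    has_null = any(v is None for v in values)
    # type() is used rather than isinstance() so True/False are not counted as ints.
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) == 1 and next(iter(kinds)) in names:
        base = names[next(iter(kinds))]
    else:
        # Assumption: several types, unsupported types, or all-null columns are Mixed.
        base = "Mixed"
    return "Opt" + base if has_null else base


records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
labels = {col: infer_label([row[col] for row in records]) for col in records[0]}
print(labels)
# {'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat'}
```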
Advanced Filtering and Sorting
# Filter and sort data
filtered = frame.filter("age", ">", 25)
sorted_frame = frame.sort_values(["city", "age"], ascending=[True, False])
print(sorted_frame.to_dicts())
GroupBy Aggregations
# Group keys must be string columns. Build TinyGroupBy, then aggregate with the frame:
gb = ft.TinyGroupBy(frame, ["city"])
mean_age = gb.mean(frame, "age")
max_score = gb.max(frame, "score")
row_counts = gb.count(frame)
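To make the shape of a grouped aggregate concrete, here is a small pure-Python sketch of a per-group mean over dict records. It does not use feathertail; the function name `group_mean_sketch` is hypothetical, and skipping nulls (and returning None for a group with no values) is an assumption about null handling, not documented behaviour.

```python
from collections import defaultdict


def group_mean_sketch(rows, key, column):
    """Mean of `column` per distinct value of `key` (illustration only)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row[column])
    result = {}
    for group, values in buckets.items():
        present = [v for v in values if v is not None]
        # Assumption: nulls are skipped; an all-null group yields None.
        result[group] = sum(present) / len(present) if present else None
    return result


rows = [
    {"city": "New York", "age": 30},
    {"city": "Paris", "age": None},
    {"city": "New York", "age": 25},
]
print(group_mean_sketch(rows, "city", "age"))
# {'New York': 27.5, 'Paris': None}
```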
Join Operations
# Inner join with another DataFrame
other_data = [
{"city": "New York", "population": 8_000_000},
{"city": "Paris", "population": 2_000_000},
]
other_frame = ft.TinyFrame.from_dicts(other_data)
joined = frame.join(other_frame, "city", "city", "inner")
print(joined.to_dicts())
Join semantics
- Composite keys: A row is used for matching only if every join-key column is non-null (SQL-style). If any key component is null, that row does not appear in the join key index.
- Column names: The result keeps one name per logical column. If the same basename appears as a non-join column on both sides, or if a join-key name on one frame collides with a non-key column on the other, feathertail raises ValueError; rename on one frame first. Automatic _x/_y suffixing (pandas-style) is not implemented yet.
- cross_join: Left and right must have disjoint column names; overlaps raise ValueError.
- Python object fallback: Fallback object storage from both sides is merged on join outputs so to_dicts() can resolve references. Conflicting reuse of the same internal id for different objects raises a runtime error (very rare).
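The null-key and name-collision rules can be sketched in plain Python over lists of dicts. This models the semantics described here and is not feathertail's code: `inner_join_sketch` is a hypothetical name, and keeping all left columns plus the right side's non-key columns in the output is an assumption.

```python
def inner_join_sketch(left, right, left_on, right_on):
    """Inner join of dict rows with SQL-style null keys (illustration only)."""
    left_nonkey = {c for row in left for c in row} - set(left_on)
    right_nonkey = {c for row in right for c in row} - set(right_on)
    # A non-key name on both sides, or a key name clashing with the other side's non-key.
    clashes = (
        (left_nonkey & right_nonkey)
        | (set(left_on) & right_nonkey)
        | (set(right_on) & left_nonkey)
    )
    if clashes:
        raise ValueError(f"column name collision: {sorted(clashes)}")

    # Index right rows by key tuple; rows with any null key component are left out.
    index = {}
    for row in right:
        key = tuple(row[c] for c in right_on)
        if all(v is not None for v in key):
            index.setdefault(key, []).append(row)

    out = []
    for row in left:
        key = tuple(row[c] for c in left_on)
        if any(v is None for v in key):
            continue  # a null key never matches, not even another null
        for match in index.get(key, []):
            merged = dict(row)
            merged.update({c: v for c, v in match.items() if c not in right_on})
            out.append(merged)
    return out


people = [
    {"city": "New York", "name": "Alice"},
    {"city": None, "name": "Dana"},
    {"city": "Paris", "name": "Bob"},
]
cities = [
    {"city": "New York", "population": 8_000_000},
    {"city": None, "population": 0},
]
joined_rows = inner_join_sketch(people, cities, ["city"], ["city"])
print(joined_rows)
# [{'city': 'New York', 'name': 'Alice', 'population': 8000000}]
```

Passing a right-hand table that also has a non-key `name` column raises ValueError in this sketch, which is the point at which renaming one side becomes necessary.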
Advanced Analytics
# Descriptive statistics
description = frame.describe("score")
print(description.to_dicts())
# Correlation analysis
correlation = frame.corr("age", "score")
print(f"Age-Score correlation: {correlation}")
# Time series operations
time_data = [
{"timestamp": "2023-01-01 10:00:00", "value": 100},
{"timestamp": "2023-01-01 11:00:00", "value": 120},
]
time_frame = ft.TinyFrame.from_dicts(time_data)
time_frame = time_frame.to_timestamps("timestamp") # adds `timestamp_timestamp` (Unix seconds)
time_frame = time_frame.dt_year("timestamp") # still parses from original string column
print(time_frame.to_dicts())
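As a rough model of strict parsing to Unix seconds, the standard library can do the same conversion. This sketch is independent of feathertail: the function name `to_unix_seconds_sketch`, the format string, and the choice to interpret naive timestamps as UTC are all assumptions.

```python
from datetime import datetime, timezone


def to_unix_seconds_sketch(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse `text` strictly and return Unix seconds (illustration only)."""
    # strptime raises ValueError for empty or malformed input.
    parsed = datetime.strptime(text, fmt)
    # Assumption: strings without a zone are treated as UTC.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(to_unix_seconds_sketch("2023-01-01 10:00:00"))  # 1672567200
print(to_unix_seconds_sketch("2023-01-01 11:00:00"))  # 1672570800
```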
Window Functions
# Rolling window operations
data = [{"value": i} for i in range(1, 11)]
window_frame = ft.TinyFrame.from_dicts(data)
rolling_mean = window_frame.rolling_mean("value", 3)
print(rolling_mean.to_dicts())
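For reference, a trailing rolling mean with window 3 can be written in a few lines of plain Python. This is a sketch of the general technique and not feathertail's implementation; `rolling_mean_sketch` is a hypothetical name, and emitting None until a full window is available is an assumption about edge handling.

```python
def rolling_mean_sketch(values, window):
    """Trailing rolling mean over a list of numbers (illustration only)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            # Assumption: incomplete leading windows produce a missing value.
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


print(rolling_mean_sketch(list(range(1, 11)), 3))
# [None, None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```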
String Operations
# String manipulation
text_data = [{"text": " hello world "}, {"text": "foo bar"}]
text_frame = ft.TinyFrame.from_dicts(text_data)
processed = text_frame.str_upper("text").str_strip("text")
print(processed.to_dicts())
Data Validation
# Data quality checks
validation = frame.validate_not_null("age")
validation_summary = frame.validation_summary("age")
print(f"Validation summary: {validation_summary}")
🚀 Performance Features
SIMD-Accelerated Operations
# Automatic SIMD optimization for numerical operations
large_data = [{"category": "A" if i % 2 == 0 else "B", "value": i * 1.5} for i in range(100000)]
large_frame = ft.TinyFrame.from_dicts(large_data)
# TinyGroupBy keys must be string columns; aggregates run over numeric columns
gb = ft.TinyGroupBy(large_frame, ["category"])
sum_result = gb.sum(large_frame, "value")
Parallel Processing
# Multi-core operations for large datasets
# Automatically uses all available CPU cores
filtered = large_frame.filter("value", ">", 50000)
sorted_data = large_frame.sort_values("value")
Memory Optimization
# String interning and lazy evaluation
# Memory usage is automatically optimized
frame = ft.TinyFrame.from_dicts(records)
# Operations are optimized for memory efficiency
🛠️ Developer Tools
Logging and Debugging
# Enable comprehensive logging
ft.init_logging_with_config("info", log_memory=True, log_performance=True, log_operations=True)
# Enable debug mode
ft.enable_debug()
# Enable profiling
ft.enable_profiling()
# Your operations will be logged and profiled
frame = ft.TinyFrame.from_dicts(records)  # records from the Quickstart, which has the age column filtered below
result = frame.filter("age", ">", 25)
# View profiling report
ft.print_profiling_report()
Performance Monitoring
# Get operation statistics
stats = ft.get_operation_stats("filter")
print(f"Filter operations: {stats}")
# Get overall performance metrics
overall_stats = ft.get_overall_stats()
print(f"Total operations: {overall_stats['total_operations']}")
⚙️ Supported Types
| Type | Column variants | Description |
|---|---|---|
| int | Int, OptInt | 64-bit integers with optional null support |
| float | Float, OptFloat | 64-bit floats with optional null support |
| bool | Bool, OptBool | Boolean values with optional null support |
| str | Str, OptStr | UTF-8 strings with optional null support |
| mixed | Mixed, OptMixed | Mixed types with automatic Python object fallback |
cast_column and strings. Casting a non-optional Str column to int or float is strict: each cell must parse; otherwise ValueError is raised (values are not coerced to 0). Casting optional OptStr to numeric optional types maps unparseable strings to missing values where applicable.
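The difference between the strict and the optional cast can be modelled in plain Python. The two helpers below are hypothetical and only mirror the behaviour described above; they rely on Python's `int()` parser, which may accept or reject inputs differently from feathertail.

```python
def cast_str_to_int_sketch(values):
    """Strict cast: every cell must parse, otherwise ValueError is raised."""
    return [int(v) for v in values]


def cast_optstr_to_optint_sketch(values):
    """Optional cast: unparseable or missing cells become None."""
    out = []
    for v in values:
        try:
            out.append(None if v is None else int(v))
        except ValueError:
            out.append(None)
    return out


print(cast_str_to_int_sketch(["1", "22"]))             # [1, 22]
print(cast_optstr_to_optint_sketch(["1", None, "x"]))  # [1, None, None]
```

Calling `cast_str_to_int_sketch(["1", "x"])` raises ValueError instead of silently producing 0, which is the property the strict cast is meant to guarantee.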
📚 Documentation
- Getting Started Guide - Learn the basics
- Advanced Usage - Complex operations and patterns
- API Reference - Complete API documentation
- Tutorials - Step-by-step learning guides
- Contributing - How to contribute to the project
🏗️ Build System & CI/CD
Automated Cross-Platform Builds
feathertail uses GitHub Actions to automatically build and test wheels for all major platforms:
- 15 build configurations covering Python 3.8-3.12
- 3 operating systems: Linux (Ubuntu), macOS (ARM64), Windows
- Automated testing with wheel installation verification
- Artifact management with 30-day retention
- PyPI deployment on version tags
Build Matrix
| Platform | Python Versions | Architecture |
|---|---|---|
| Ubuntu | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64 |
| macOS | 3.8, 3.9, 3.10, 3.11, 3.12 | ARM64 (aarch64) |
| Windows | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64 |
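The figure of 15 build configurations follows from the table: three platforms times five Python versions. A short enumeration, with the labels copied from the table and no connection to the actual CI configuration files, confirms the count:

```python
from itertools import product

platforms = ["Ubuntu x86_64", "macOS ARM64", "Windows x86_64"]
pythons = ["3.8", "3.9", "3.10", "3.11", "3.12"]

# Every (platform, Python version) pair is one build configuration.
matrix = list(product(platforms, pythons))
print(len(matrix))  # 15
```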
Quality Assurance
- ✅ Rust compilation with proper target architecture
- ✅ Python wheel building with maturin
- ✅ CI test matrix: pytest on Python 3.8–3.12 on Ubuntu, macOS, and Windows (release wheels built for the same range)
- ✅ Installation testing from temp directories
- ✅ Import verification to ensure module works correctly
- ✅ Cross-platform compatibility testing
🧪 Testing
# Run all tests (Rust + Python; 250+ Python unit tests plus Rust tests)
make test
# Run specific test categories
python -m pytest tests/python/unit/test_tinyframe.py
python -m pytest tests/python/unit/test_joins.py
python -m pytest tests/python/unit/test_analytics.py
🏗️ Building from Source
# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail
# Set up development environment
make dev
# Build the package
make build
# Run tests
make test
# Build documentation
make docs
🐉 Why "feathertail"?
In Fourth Wing, a "feathertail" is a juvenile dragon — small, golden, and nonviolent, known for grace rather than brute force.
This library follows the same spirit: gentle on dependencies, elegant in design, and capable of handling complex data types with ease — but with the power and performance of a full-grown dragon when you need it.
📊 Performance Benchmarks
- 250+ Python unit tests plus Rust tests run in well under one second locally
- SIMD-accelerated numerical operations
- Parallel processing for multi-core performance
- Memory-optimized with string interning and lazy evaluation
- Production-ready with comprehensive error handling and logging
❤️ Contributing
Contributions, ideas, and feedback are always welcome! Please see our Contributing Guide for details.
📄 License
MIT
🎯 Roadmap
- Cross-platform PyPI builds - ✅ Automated builds for Linux, macOS, and Windows
- Additional time series functions
- More statistical distributions
- Enhanced plotting integration
- Database connectors
- Arrow/Parquet integration
Built with ❤️ using Rust and Python