A tiny, fast, Rust-backed transformation core for Python table data

🪶 feathertail

A high-performance Python DataFrame library powered by Rust — designed for flexibility, blazing speed, and intelligent type handling. Built for production with comprehensive features, advanced analytics, and enterprise-grade performance.


✨ Key Features

🚀 Core DataFrame Operations

  • ✅ Build TinyFrame from Python dict records (from_dicts)
  • ✅ Automatic type inference, including mixed-type and optional columns
  • ✅ Intelligent fallback to Python objects when Rust-native types aren't possible. Fallback objects are stored by runtime pointer identity for the lifetime of the frame, so keep references to them alive while using the TinyFrame
  • ✅ Flexible fillna to handle missing data
  • ✅ Powerful cast_column to convert columns between types
  • ✅ Smart edit_column: edits that automatically adjust column type if needed
  • ✅ Drop or rename columns easily
  • ✅ Export back to Python dicts (to_dicts)

🔗 Advanced Data Operations

  • Join Operations: Inner, left, right, outer, and cross joins
  • Filtering & Sorting: Advanced filtering with multiple conditions and multi-column sorting
  • GroupBy Aggregations: TinyGroupBy with string key columns — sum, mean, min, max, std, var, median, first, last, count, size (call each aggregation separately)
  • Window Functions: Rolling and expanding window operations
  • Ranking Functions: Rank calculation with multiple methods and percentage change

📊 Advanced Analytics

  • Descriptive Statistics: describe(), skew(), kurtosis(), quantile(), mode(), nunique()
  • Correlation & Covariance: Full correlation/covariance matrices and pairwise calculations
  • Time Series Operations: DateTime parsing (strict: invalid or empty strings raise ValueError), component extraction, time differences, and shifting
  • String Operations: Case conversion, whitespace removal, replacement, splitting, pattern matching, length, and concatenation
  • Data Validation: Not null, range, pattern, uniqueness validation with comprehensive reporting

Performance & Optimization

  • SIMD Operations: x86_64 optimized numerical operations for blazing speed
  • Parallel Processing: Multi-core operations using Rayon for GroupBy, filtering, and sorting
  • Memory Optimization: String interning, lazy evaluation, and copy-on-write optimizations
  • Chunked Processing: Handle large datasets efficiently with streaming operations
  • Rust-backed Core: Lightweight, fast, and dependency-light
  • Cross-Platform Builds: Automated CI/CD with pre-built wheels for all major platforms

🛠️ Developer Experience

  • Comprehensive Documentation: Sphinx-generated API docs with tutorials and guides
  • Logging & Debugging: Built-in logging system with performance monitoring
  • Profiling Tools: Performance profiling and optimization insights
  • Development Tools: Pre-commit hooks, automated testing, and development scripts
  • 250+ Comprehensive Tests: Full test coverage running in well under one second locally

📦 Installation

pip install feathertail

✅ Cross-Platform Support: Pre-built wheels are available for Python 3.8-3.12 on:

  • Linux (x86_64)
  • macOS (ARM64/aarch64)
  • Windows (x86_64)

Building from Source

# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail

# Install dependencies and build
pip install maturin
maturin develop --release

# Or install in development mode
pip install -e .

🧑‍💻 Quickstart

Basic DataFrame Operations

import feathertail as ft

records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]

frame = ft.TinyFrame.from_dicts(records)
print(frame)

Output:

TinyFrame(rows=3, columns=4, cols={ 'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat' })
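
To make the inferred variants in that output concrete, here is a minimal pure-Python sketch of how column types could be derived from dict records. It is not feathertail's implementation: `infer_column_type` and `infer_schema` are hypothetical helpers, and the rules (a `None` or missing value makes a column optional, more than one value type yields Mixed) are assumptions based on the Supported Types table in this README.

```python
from typing import Any, Dict, List

# Hypothetical mapping from Python types to the variant names used in this README.
_BASE_NAMES = {int: "Int", float: "Float", bool: "Bool", str: "Str"}


def infer_column_type(values: List[Any]) -> str:
    """Return a variant name such as 'Int' or 'OptFloat' for one column."""
    has_null = any(v is None for v in values)
    # type() rather than isinstance() so that bool is not counted as int.
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) == 1:
        base = _BASE_NAMES.get(next(iter(kinds)), "Mixed")
    else:
        base = "Mixed"  # assumption: zero or several value types fall back to Mixed
    return "Opt" + base if has_null else base


def infer_schema(records: List[Dict[str, Any]]) -> Dict[str, str]:
    """Infer one variant name per column, keeping first-seen column order."""
    names: List[str] = []
    for record in records:
        for name in record:
            if name not in names:
                names.append(name)
    # A key missing from a record is treated like None (assumption).
    return {name: infer_column_type([r.get(name) for r in records]) for name in names}


records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
schema = infer_schema(records)
print(schema)
```

For the quickstart records, this sketch arrives at the same variant names as the printed TinyFrame summary.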

Advanced Filtering and Sorting

# Filter and sort data
filtered = frame.filter("age", ">", 25)
sorted_frame = frame.sort_values(["city", "age"], ascending=[True, False])
print(sorted_frame.to_dicts())

GroupBy Aggregations

# Group keys must be string columns. Build TinyGroupBy, then aggregate with the frame:
gb = ft.TinyGroupBy(frame, ["city"])
mean_age = gb.mean(frame, "age")
max_score = gb.max(frame, "score")
row_counts = gb.count(frame)
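
As a rough illustration of what a grouped mean computes, the sketch below groups plain dict records by a string key using only the standard library. It does not call feathertail; `group_mean` is a hypothetical helper, and skipping nulls (with `None` for a group that has no non-null values) is an assumption about null handling, not documented behaviour.

```python
from collections import defaultdict
from statistics import mean


def group_mean(records, key, column):
    """Mean of `column` per distinct value of `key`, ignoring None values."""
    buckets = defaultdict(list)
    for row in records:
        group = buckets[row[key]]  # touching the bucket registers the group
        if row[column] is not None:
            group.append(row[column])
    # Assumption: a group with no non-null values yields None.
    return {name: (mean(vals) if vals else None) for name, vals in buckets.items()}


records = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": None, "city": "Paris"},
    {"name": "Charlie", "age": 25, "city": "New York"},
]
mean_age_by_city = group_mean(records, "city", "age")
print(mean_age_by_city)
```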

Join Operations

# Inner join with another DataFrame
other_data = [
    {"city": "New York", "population": 8_000_000},
    {"city": "Paris", "population": 2_000_000},
]
other_frame = ft.TinyFrame.from_dicts(other_data)

joined = frame.join(other_frame, "city", "city", "inner")
print(joined.to_dicts())

Join semantics

  • Composite keys: A row is used for matching only if every join-key column is non-null (SQL-style). If any key component is null, that row does not appear in the join key index.
  • Column names: The result keeps one name per logical column. If the same basename appears as a non-join column on both sides, or if a join-key name on one frame collides with a non-key column on the other, feathertail raises ValueError—rename on one frame first. Automatic _x / _y suffixing (pandas-style) is not implemented yet.
  • cross_join: Left and right must have disjoint column names; overlaps raise ValueError.
  • Python object fallback: Fallback object storage from both sides is merged on join outputs so to_dicts() can resolve references. Conflicting reuse of the same internal id for different objects raises a runtime error (very rare).
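
The first two rules above can be sketched in plain Python. This is an illustration of the described semantics, not feathertail's code: `inner_join` is a hypothetical helper that works on lists of dicts and, for brevity, assumes the key columns have the same names on both sides.

```python
from collections import defaultdict


def inner_join(left, right, keys):
    """Inner-join two lists of dict records on the given key columns."""
    left_cols = {c for row in left for c in row} - set(keys)
    right_cols = {c for row in right for c in row} - set(keys)
    overlap = left_cols & right_cols
    if overlap:
        # No automatic _x / _y suffixing: colliding names are an error.
        raise ValueError(f"non-key columns collide: {sorted(overlap)}")

    # Index right rows by key tuple; rows with any null key part are left out.
    index = defaultdict(list)
    for row in right:
        key = tuple(row.get(k) for k in keys)
        if all(part is not None for part in key):
            index[key].append(row)

    joined = []
    for row in left:
        key = tuple(row.get(k) for k in keys)
        if any(part is None for part in key):
            continue  # a null key component never matches, as in SQL
        for match in index.get(key, []):
            joined.append({**row, **match})
    return joined


people = [
    {"city": "New York", "name": "Alice"},
    {"city": None, "name": "Bob"},
    {"city": "Paris", "name": "Charlie"},
]
cities = [
    {"city": "New York", "population": 8_000_000},
    {"city": None, "population": 0},
]
result = inner_join(people, cities, ["city"])
print(result)
```

Note that the two rows with a null `city` do not match each other, and Charlie is dropped because Paris has no row on the right side.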

Advanced Analytics

# Descriptive statistics
description = frame.describe("score")
print(description.to_dicts())

# Correlation analysis
correlation = frame.corr("age", "score")
print(f"Age-Score correlation: {correlation}")

# Time series operations
time_data = [
    {"timestamp": "2023-01-01 10:00:00", "value": 100},
    {"timestamp": "2023-01-01 11:00:00", "value": 120},
]
time_frame = ft.TinyFrame.from_dicts(time_data)
time_frame = time_frame.to_timestamps("timestamp")  # adds `timestamp_timestamp` (Unix seconds)
time_frame = time_frame.dt_year("timestamp")      # still parses from original string column
print(time_frame.to_dicts())
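
For intuition about strict parsing to Unix seconds, here is a standard-library sketch. It is not feathertail's parser: `to_unix_seconds` is a hypothetical helper, the accepted format is assumed from the sample rows, and treating the strings as UTC is an assumption (the README does not state a time zone).

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed input format, taken from the sample rows


def to_unix_seconds(text: str) -> int:
    """Parse a timestamp string strictly and return Unix seconds (UTC assumed)."""
    # strptime raises ValueError for invalid or empty strings.
    parsed = datetime.strptime(text, FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


time_data = [
    {"timestamp": "2023-01-01 10:00:00", "value": 100},
    {"timestamp": "2023-01-01 11:00:00", "value": 120},
]
seconds = [to_unix_seconds(row["timestamp"]) for row in time_data]
years = [datetime.strptime(row["timestamp"], FORMAT).year for row in time_data]
print(seconds, years)
```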

Window Functions

# Rolling window operations
data = [{"value": i} for i in range(1, 11)]
window_frame = ft.TinyFrame.from_dicts(data)
rolling_mean = window_frame.rolling_mean("value", 3)
print(rolling_mean.to_dicts())
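
The arithmetic behind a window-3 rolling mean can be checked with a few lines of plain Python. This sketch is independent of feathertail; `rolling_mean_list` is a hypothetical helper, and filling the first `window - 1` positions with `None` is an assumption about how incomplete windows are reported.

```python
def rolling_mean_list(values, window):
    """Trailing rolling mean; positions without a full window yield None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # assumption: incomplete windows produce no value
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


values = list(range(1, 11))
rolled = rolling_mean_list(values, 3)
print(rolled)
```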

String Operations

# String manipulation
text_data = [{"text": "  hello world  "}, {"text": "foo bar"}]
text_frame = ft.TinyFrame.from_dicts(text_data)
processed = text_frame.str_upper("text").str_strip("text")
print(processed.to_dicts())

Data Validation

# Data quality checks
validation = frame.validate_not_null("age")
validation_summary = frame.validation_summary("age")
print(f"Validation summary: {validation_summary}")

🚀 Performance Features

SIMD-Accelerated Operations

# Automatic SIMD optimization for numerical operations
large_data = [{"category": "A" if i % 2 == 0 else "B", "value": i * 1.5} for i in range(100000)]
large_frame = ft.TinyFrame.from_dicts(large_data)

# TinyGroupBy keys must be string columns; aggregates run over numeric columns
gb = ft.TinyGroupBy(large_frame, ["category"])
sum_result = gb.sum(large_frame, "value")

Parallel Processing

# Multi-core operations for large datasets
# Automatically uses all available CPU cores
filtered = large_frame.filter("value", ">", 50000)
sorted_data = large_frame.sort_values("value")

Memory Optimization

# String interning and lazy evaluation
# Memory usage is automatically optimized
frame = ft.TinyFrame.from_dicts(records)
# Operations are optimized for memory efficiency

🛠️ Developer Tools

Logging and Debugging

# Enable comprehensive logging
ft.init_logging_with_config("info", log_memory=True, log_performance=True, log_operations=True)

# Enable debug mode
ft.enable_debug()

# Enable profiling
ft.enable_profiling()

# Your operations will be logged and profiled
frame = ft.TinyFrame.from_dicts(records)  # `records` from the Quickstart, which has an "age" column
result = frame.filter("age", ">", 25)

# View profiling report
ft.print_profiling_report()

Performance Monitoring

# Get operation statistics
stats = ft.get_operation_stats("filter")
print(f"Filter operations: {stats}")

# Get overall performance metrics
overall_stats = ft.get_overall_stats()
print(f"Total operations: {overall_stats['total_operations']}")

⚙️ Supported Types

| Type | Column variants | Description |
| --- | --- | --- |
| int | Int, OptInt | 64-bit integers with optional null support |
| float | Float, OptFloat | 64-bit floats with optional null support |
| bool | Bool, OptBool | Boolean values with optional null support |
| str | Str, OptStr | UTF-8 strings with optional null support |
| mixed | Mixed, OptMixed | Mixed types with automatic Python object fallback |

cast_column and strings. Casting a non-optional Str column to int or float is strict: each cell must parse; otherwise ValueError is raised (values are not coerced to 0). Casting optional OptStr to numeric optional types maps unparseable strings to missing values where applicable.
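
The difference between the strict and the optional cast can be sketched in plain Python. These two functions are hypothetical stand-ins that mirror the rule described above, not the `cast_column` implementation, and they cover only the string-to-int case.

```python
from typing import List, Optional


def cast_str_to_int_strict(values: List[str]) -> List[int]:
    """Str -> Int: every cell must parse, otherwise ValueError propagates."""
    return [int(v) for v in values]  # no silent coercion to 0


def cast_optstr_to_optint(values: List[Optional[str]]) -> List[Optional[int]]:
    """OptStr -> OptInt: unparseable or missing cells become None."""
    out: List[Optional[int]] = []
    for v in values:
        try:
            out.append(None if v is None else int(v))
        except ValueError:
            out.append(None)
    return out


strict_ok = cast_str_to_int_strict(["1", "2", "3"])
lenient = cast_optstr_to_optint(["1", None, "abc"])
print(strict_ok, lenient)
```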


📚 Documentation


🏗️ Build System & CI/CD

Automated Cross-Platform Builds

feathertail uses GitHub Actions to automatically build and test wheels for all major platforms:

  • 15 build configurations covering Python 3.8-3.12
  • 3 operating systems: Linux (Ubuntu), macOS (ARM64), Windows
  • Automated testing with wheel installation verification
  • Artifact management with 30-day retention
  • PyPI deployment on version tags

Build Matrix

| Platform | Python Versions | Architecture |
| --- | --- | --- |
| Ubuntu | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64 |
| macOS | 3.8, 3.9, 3.10, 3.11, 3.12 | ARM64 (aarch64) |
| Windows | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64 |

Quality Assurance

  • Rust compilation with proper target architecture
  • Python wheel building with maturin
  • CI test matrix: pytest on Python 3.8–3.12 on Ubuntu, macOS, and Windows (release wheels built for the same range)
  • Installation testing from temp directories
  • Import verification to ensure module works correctly
  • Cross-platform compatibility testing

🧪 Testing

# Run all tests (Rust + Python; 250+ Python unit tests plus Rust tests)
make test

# Run specific test categories
python -m pytest tests/python/unit/test_tinyframe.py
python -m pytest tests/python/unit/test_joins.py
python -m pytest tests/python/unit/test_analytics.py

🏗️ Building from Source

# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail

# Set up development environment
make dev

# Build the package
make build

# Run tests
make test

# Build documentation
make docs

🐉 Why "feathertail"?

In Fourth Wing, a "feathertail" is a juvenile dragon — small, golden, and nonviolent, known for grace rather than brute force.

This library follows the same spirit: gentle on dependencies, elegant in design, and capable of handling complex data types with ease — but with the power and performance of a full-grown dragon when you need it.


📊 Performance Benchmarks

  • 250+ Python unit tests plus Rust tests run in well under one second locally
  • SIMD-accelerated numerical operations
  • Parallel processing for multi-core performance
  • Memory-optimized with string interning and lazy evaluation
  • Production-ready with comprehensive error handling and logging

❤️ Contributing

Contributions, ideas, and feedback are always welcome! Please see our Contributing Guide for details.


📄 License

MIT


🎯 Roadmap

  • Cross-platform PyPI builds - ✅ Automated builds for Linux, macOS, and Windows
  • Additional time series functions
  • More statistical distributions
  • Enhanced plotting integration
  • Database connectors
  • Arrow/Parquet integration

Built with ❤️ using Rust and Python
