A tiny, fast, Rust-backed transformation core for Python table data

🪶 feathertail

A high-performance Python DataFrame library powered by Rust, designed for flexibility, blazing speed, and intelligent type handling. Built for production with comprehensive features, advanced analytics, and enterprise-grade performance.


✨ Key Features

🚀 Core DataFrame Operations

  • ✅ Build TinyFrame from Python dict records (from_dicts)
  • ✅ Automatic type inference, including mixed-type and optional columns
  • ✅ Intelligent fallback to Python objects when Rust-native types aren't possible
  • ✅ Flexible fillna to handle missing data
  • ✅ Powerful cast_column to convert columns between types
  • ✅ Smart edit_column: edits that automatically adjust column type if needed
  • ✅ Drop or rename columns easily
  • ✅ Export back to Python dicts (to_dicts)
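To make the fillna and cast_column ideas from the list above concrete, here is a minimal plain-Python sketch of what such operations do conceptually on a single column. It does not call feathertail; the helper names mirror the feature names for readability only, and their signatures are assumptions.

```python
from typing import Any, Callable, List, Optional


def fillna(values: List[Optional[Any]], fill: Any) -> List[Any]:
    """Replace every null (None) with the given fill value."""
    return [fill if v is None else v for v in values]


def cast_column(
    values: List[Optional[Any]], target: Callable[[Any], Any]
) -> List[Optional[Any]]:
    """Convert each non-null value with `target`, leaving nulls untouched."""
    return [None if v is None else target(v) for v in values]


ages = [30, None, 25]
filled = fillna(ages, 0)
as_float = cast_column(ages, float)
print(filled)
print(as_float)
```

After filling, the column no longer contains nulls, which is the situation in which an optional column type could be narrowed to its non-optional variant.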

🔗 Advanced Data Operations

  • ✅ Join Operations: Inner, left, right, outer, and cross joins
  • ✅ Filtering & Sorting: Advanced filtering with multiple conditions and multi-column sorting
  • ✅ GroupBy Aggregations: Comprehensive statistical operations (sum, mean, min, max, std, var, median, first, last, count, size)
  • ✅ Window Functions: Rolling and expanding window operations
  • ✅ Ranking Functions: Rank calculation with multiple methods and percentage change
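Ranking and percentage change, mentioned in the last item above, can be illustrated without the library. The sketch below shows one common ranking method ("average", where ties share the mean of their positions) and a simple percentage change. Both function names are hypothetical and the null handling is an assumption, not feathertail's documented behavior.

```python
from typing import List, Optional


def rank_average(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_position = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_position
        i = j + 1
    return ranks


def pct_change(values: List[float]) -> List[Optional[float]]:
    """Relative change from the previous value; None where undefined."""
    result: List[Optional[float]] = [None] if values else []
    for prev, curr in zip(values, values[1:]):
        result.append(None if prev == 0 else (curr - prev) / prev)
    return result


print(rank_average([10, 20, 10, 30]))
print(pct_change([100, 120, 90]))
```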

📊 Advanced Analytics

  • ✅ Descriptive Statistics: describe(), skew(), kurtosis(), quantile(), mode(), nunique()
  • ✅ Correlation & Covariance: Full correlation/covariance matrices and pairwise calculations
  • ✅ Time Series Operations: DateTime parsing, component extraction, time differences, and shifting
  • ✅ String Operations: Case conversion, whitespace removal, replacement, splitting, pattern matching, length, and concatenation
  • ✅ Data Validation: Not null, range, pattern, uniqueness validation with comprehensive reporting
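As a reference point for the pairwise correlation mentioned above, here is a plain-Python sketch of the Pearson coefficient. It drops any pair in which either value is null and returns None when fewer than two complete pairs remain; both choices are assumptions made for this illustration, and the function is not part of feathertail.

```python
import math
from typing import List, Optional


def pearson(xs: List[Optional[float]], ys: List[Optional[float]]) -> Optional[float]:
    """Pearson correlation over pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None  # correlation is undefined for fewer than two pairs
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)


print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([30, None, 25], [95.5, 85.0, None]))
```

The second call has only one complete pair, so the sketch reports None rather than a number.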

⚡ Performance & Optimization

  • ✅ SIMD Operations: x86_64 optimized numerical operations for blazing speed
  • ✅ Parallel Processing: Multi-core operations using Rayon for GroupBy, filtering, and sorting
  • ✅ Memory Optimization: String interning, lazy evaluation, and copy-on-write optimizations
  • ✅ Chunked Processing: Handle large datasets efficiently with streaming operations
  • ✅ Rust-backed Core: Lightweight, fast, and dependency-light
  • ✅ Cross-Platform Builds: Automated CI/CD with pre-built wheels for all major platforms
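The chunked processing idea in the list above can be sketched in plain Python: consume a stream in fixed-size pieces so that only one chunk is held in memory at a time. The helper names and the chunk size are illustrative assumptions; this is not feathertail's streaming API.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def chunked_sum(values: Iterable[float], size: int) -> float:
    """Sum a stream one chunk at a time instead of materializing it."""
    total = 0.0
    for chunk in chunks(values, size):
        total += sum(chunk)
    return total


# The generator below is never converted into a full list.
stream = (i * 1.5 for i in range(100_000))
print(chunked_sum(stream, 10_000))
```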

🛠️ Developer Experience

  • ✅ Comprehensive Documentation: Sphinx-generated API docs with tutorials and guides
  • ✅ Logging & Debugging: Built-in logging system with performance monitoring
  • ✅ Profiling Tools: Performance profiling and optimization insights
  • ✅ Development Tools: Pre-commit hooks, automated testing, and development scripts
  • ✅ 239 Comprehensive Tests: Full test coverage running in 0.17 seconds

📦 Installation

pip install feathertail

✅ Cross-Platform Support: Pre-built wheels are available for Python 3.8 through 3.12 on:

  • Linux (x86_64)
  • macOS (ARM64/aarch64)
  • Windows (x86_64)

Building from Source

# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail

# Install dependencies and build
pip install maturin
maturin develop --release

# Or install in development mode
pip install -e .

🧑‍💻 Quickstart

Basic DataFrame Operations

import feathertail as ft

records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]

frame = ft.TinyFrame.from_dicts(records)
print(frame)

Output:

TinyFrame(rows=3, columns=4, cols={ 'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat' })
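The column types in the output above can be mimicked in plain Python to show how inference over dict records might work. This is a sketch under stated assumptions, not feathertail's implementation: the function names are hypothetical, a key missing from a row is treated as a null, and anything that is not a single int, float, bool, or str type falls back to "Mixed".

```python
from typing import Any, Dict, List

# Map exact Python types to the column type names used in this README.
_BASE_NAMES = {int: "Int", float: "Float", bool: "Bool", str: "Str"}


def infer_column_type(values: List[Any]) -> str:
    """Infer a column type name from raw Python values (illustrative only)."""
    has_null = any(v is None for v in values)
    # Exact types are used so that bool is not silently treated as int.
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) == 1:
        base = _BASE_NAMES.get(next(iter(kinds)), "Mixed")
    else:
        base = "Mixed"  # no values or several distinct types: fall back
    return f"Opt{base}" if has_null else base


def infer_schema(records: List[Dict[str, Any]]) -> Dict[str, str]:
    """Infer a type name for every key that appears in any record."""
    columns: Dict[str, List[Any]] = {}
    for row in records:
        for key in row:
            columns.setdefault(key, [])
    for row in records:
        for key in columns:
            columns[key].append(row.get(key))  # missing key counts as null
    return {key: infer_column_type(vals) for key, vals in columns.items()}


records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
schema = infer_schema(records)
print(schema)
```

For the quickstart records, the sketch yields the same four type names as the printed TinyFrame.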

Advanced Filtering and Sorting

# Filter and sort data
filtered = frame.filter("age", ">", 25)
sorted_frame = frame.sort_values(["city", "age"], ascending=[True, False])
print(sorted_frame.to_dicts())
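For readers who want to see the semantics spelled out, the filter and multi-column sort above can be approximated on plain dict records. The sketch assumes that nulls never satisfy a comparison and that nulls sort last for every key; feathertail's actual null handling may differ, and the helper names are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def filter_gt(rows: List[Row], column: str, threshold: float) -> List[Row]:
    """Keep rows whose value is present and greater than the threshold."""
    return [r for r in rows if r[column] is not None and r[column] > threshold]


def sort_rows(rows: List[Row], by: List[str], ascending: List[bool]) -> List[Row]:
    """Multi-column sort that places nulls last for every key."""
    result = list(rows)
    # Sort by the least significant key first. Python's sort is stable,
    # so earlier passes act as tie-breakers for later ones.
    for column, asc in reversed(list(zip(by, ascending))):
        present = [r for r in result if r[column] is not None]
        missing = [r for r in result if r[column] is None]
        present.sort(key=lambda r: r[column], reverse=not asc)
        result = present + missing
    return result


records = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": None, "city": "Paris"},
    {"name": "Charlie", "age": 25, "city": "New York"},
]
older = filter_gt(records, "age", 25)
ordered = sort_rows(records, ["city", "age"], [True, False])
print([r["name"] for r in older])
print([r["name"] for r in ordered])
```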

GroupBy Aggregations

# Comprehensive statistical aggregations
groupby = frame.groupby("city")
stats = groupby.agg([("age", "mean"), ("score", "max"), ("name", "count")])
print(stats.to_dicts())
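The aggregation above can be pictured as "bucket rows by key, then reduce each bucket". The following plain-Python sketch does exactly that for the same three aggregations. It skips nulls before aggregating and names output columns as column_function; both are assumptions for illustration rather than feathertail's documented behavior.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]

# Each aggregation receives only the non-null values of one group.
AGGREGATIONS: Dict[str, Callable[[List[Any]], Any]] = {
    "mean": lambda vals: sum(vals) / len(vals) if vals else None,
    "max": lambda vals: max(vals) if vals else None,
    "count": len,
}


def groupby_agg(rows: List[Row], key: str, specs: List[Tuple[str, str]]) -> List[Row]:
    """Group rows by `key` and apply each (column, function) pair."""
    groups: Dict[Any, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    output = []
    for group_key, members in groups.items():
        result: Row = {key: group_key}
        for column, func in specs:
            values = [m[column] for m in members if m[column] is not None]
            result[f"{column}_{func}"] = AGGREGATIONS[func](values)
        output.append(result)
    return output


records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
stats = groupby_agg(records, "city", [("age", "mean"), ("score", "max"), ("name", "count")])
print(stats)
```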

Join Operations

# Inner join with another DataFrame
other_data = [
    {"city": "New York", "population": 8_000_000},
    {"city": "Paris", "population": 2_000_000},
]
other_frame = ft.TinyFrame.from_dicts(other_data)

joined = frame.join(other_frame, "city", "city", "inner")
print(joined.to_dicts())
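An inner join such as the one above is commonly implemented as a hash join: index one side by its key, then probe that index with each row of the other side. The sketch below shows the idea on plain dict records; it is not feathertail's implementation, and the way name clashes are resolved (left values win) is an assumption.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def inner_join(left: List[Row], right: List[Row], left_on: str, right_on: str) -> List[Row]:
    """Hash join: keep only left rows that have a matching right row."""
    index: Dict[Any, List[Row]] = defaultdict(list)
    for row in right:
        index[row[right_on]].append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[left_on], []):
            # Copy right-hand columns except the join key, then let
            # left-hand values take priority on any name clash.
            extra = {k: v for k, v in right_row.items() if k != right_on}
            joined.append({**extra, **left_row})
    return joined


people = [
    {"name": "Alice", "city": "New York"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Charlie", "city": "New York"},
    {"name": "Dana", "city": "Oslo"},
]
cities = [
    {"city": "New York", "population": 8_000_000},
    {"city": "Paris", "population": 2_000_000},
]
result = inner_join(people, cities, "city", "city")
print(result)
```

Dana has no matching city, so an inner join drops that row; a left join would keep it with a null population.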

Advanced Analytics

# Descriptive statistics
description = frame.describe("score")
print(description.to_dicts())

# Correlation analysis
correlation = frame.corr("age", "score")
print(f"Age-Score correlation: {correlation}")

# Time series operations
time_data = [
    {"timestamp": "2023-01-01 10:00:00", "value": 100},
    {"timestamp": "2023-01-01 11:00:00", "value": 120},
]
time_frame = ft.TinyFrame.from_dicts(time_data)
time_frame = time_frame.to_timestamps("timestamp")
time_frame = time_frame.dt_year("timestamp_ts")
print(time_frame.to_dicts())
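The time series steps above (parse strings, extract a component, compute differences) map directly onto Python's standard datetime module. The sketch below uses only the standard library and assumes the "%Y-%m-%d %H:%M:%S" format seen in the sample data; it says nothing about which formats feathertail itself accepts.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

time_data = [
    {"timestamp": "2023-01-01 10:00:00", "value": 100},
    {"timestamp": "2023-01-01 11:00:00", "value": 120},
]

# Parse the string column into datetime objects.
parsed = [datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT) for row in time_data]

# Component extraction: the year of every timestamp.
years = [ts.year for ts in parsed]

# Time differences between consecutive rows, in seconds.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(parsed, parsed[1:])]

print(years)
print(gaps)
```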

Window Functions

# Rolling window operations
data = [{"value": i} for i in range(1, 11)]
window_frame = ft.TinyFrame.from_dicts(data)
rolling_mean = window_frame.rolling_mean("value", 3)
print(rolling_mean.to_dicts())
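To show what a rolling mean computes, here is a plain-Python version that maintains a running sum so each step costs constant time. Positions that do not yet have a full window are reported as None, which is an assumption for this sketch; the library may treat incomplete windows differently.

```python
from typing import List, Optional


def rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Mean over the trailing `window` values; None until the window fills."""
    result: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        result.append(running / window if i >= window - 1 else None)
    return result


values = list(range(1, 11))
means = rolling_mean(values, 3)
print(means)
```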

String Operations

# String manipulation
text_data = [{"text": "  hello world  "}, {"text": "foo bar"}]
text_frame = ft.TinyFrame.from_dicts(text_data)
processed = text_frame.str_upper("text").str_strip("text")
print(processed.to_dicts())
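The chained call above applies one string function per step to a single column. In plain Python the same effect can be sketched with a small column-mapping helper. The helper name is hypothetical, and passing nulls through unchanged is an assumption.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def map_column(rows: List[Row], column: str, func: Callable[[str], str]) -> List[Row]:
    """Return new rows with `func` applied to one column; nulls pass through."""
    return [
        {**row, column: None if row[column] is None else func(row[column])}
        for row in rows
    ]


text_data = [{"text": "  hello world  "}, {"text": "foo bar"}]
# Upper-case first, then strip surrounding whitespace.
processed = map_column(map_column(text_data, "text", str.upper), "text", str.strip)
print(processed)
```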

Data Validation

# Data quality checks
validation = frame.validate_not_null("age")
validation_summary = frame.validation_summary("age")
print(f"Validation summary: {validation_summary}")
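A not-null check boils down to counting missing values and remembering where they occur. The sketch below builds such a report in plain Python. The report's field names are invented for this illustration and should not be read as the structure feathertail returns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def validate_not_null(rows: List[Row], column: str) -> Dict[str, Any]:
    """Report which rows hold a null (or lack the key) in `column`."""
    failing = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {
        "column": column,
        "total_rows": len(rows),
        "null_count": len(failing),
        "failing_rows": failing,
        "passed": not failing,
    }


records = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": None},
    {"name": "Charlie", "age": 25},
]
report = validate_not_null(records, "age")
print(report)
```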

🚀 Performance Features

SIMD-Accelerated Operations

# Automatic SIMD optimization for numerical operations
large_data = [{"value": i * 1.5} for i in range(100000)]
large_frame = ft.TinyFrame.from_dicts(large_data)

# These operations use SIMD for maximum performance
sum_result = large_frame.groupby("value").agg([("value", "sum")])

Parallel Processing

# Multi-core operations for large datasets
# Automatically uses all available CPU cores
filtered = large_frame.filter("value", ">", 50000)
sorted_data = large_frame.sort_values("value")

Memory Optimization

# String interning and lazy evaluation
# Memory usage is automatically optimized
frame = ft.TinyFrame.from_dicts(records)
# Operations are optimized for memory efficiency

🛠️ Developer Tools

Logging and Debugging

# Enable comprehensive logging
ft.init_logging_with_config("info", log_memory=True, log_performance=True, log_operations=True)

# Enable debug mode
ft.enable_debug()

# Enable profiling
ft.enable_profiling()

# Your operations will be logged and profiled
# (records is the list of dicts defined in the Quickstart; it has an "age" column)
frame = ft.TinyFrame.from_dicts(records)
result = frame.filter("age", ">", 25)

# View profiling report
ft.print_profiling_report()

Performance Monitoring

# Get operation statistics
stats = ft.get_operation_stats("filter")
print(f"Filter operations: {stats}")

# Get overall performance metrics
overall_stats = ft.get_overall_stats()
print(f"Total operations: {overall_stats['total_operations']}")
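Per-operation statistics like those above can be collected with a timing decorator. The sketch below shows one way to do this in plain Python; the names tracked and operation_stats are hypothetical, and nothing here reflects how feathertail gathers its own metrics.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Any, Callable, Dict

# Hypothetical in-memory registry: operation name -> call count and total time.
operation_stats: Dict[str, Dict[str, float]] = defaultdict(
    lambda: {"calls": 0, "total_seconds": 0.0}
)


def tracked(name: str) -> Callable:
    """Decorator that records call count and elapsed time under `name`."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = operation_stats[name]
                entry["calls"] += 1
                entry["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@tracked("filter")
def filter_gt(values, threshold):
    return [v for v in values if v > threshold]


filter_gt([1, 5, 9], 4)
filter_gt([2, 3], 4)
print(dict(operation_stats))
```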

⚙️ Supported Types

| Type  | Column variants | Description                                      |
|-------|-----------------|--------------------------------------------------|
| int   | Int, OptInt     | 64-bit integers with optional null support       |
| float | Float, OptFloat | 64-bit floats with optional null support         |
| bool  | Bool, OptBool   | Boolean values with optional null support        |
| str   | Str, OptStr     | UTF-8 strings with optional null support         |
| mixed | Mixed, OptMixed | Mixed types with automatic Python object fallback |

📚 Documentation


🏗️ Build System & CI/CD

Automated Cross-Platform Builds

feathertail uses GitHub Actions to automatically build and test wheels for all major platforms:

  • 15 build configurations covering Python 3.8-3.12
  • 3 operating systems: Linux (Ubuntu), macOS (ARM64), Windows
  • Automated testing with wheel installation verification
  • Artifact management with 30-day retention
  • PyPI deployment on version tags

Build Matrix

| Platform | Python Versions            | Architecture    |
|----------|----------------------------|-----------------|
| Ubuntu   | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64          |
| macOS    | 3.8, 3.9, 3.10, 3.11, 3.12 | ARM64 (aarch64) |
| Windows  | 3.8, 3.9, 3.10, 3.11, 3.12 | x86_64          |

Quality Assurance

  • ✅ Rust compilation with proper target architecture
  • ✅ Python wheel building with maturin
  • ✅ Installation testing from temp directories
  • ✅ Import verification to ensure module works correctly
  • ✅ Cross-platform compatibility testing
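An import verification step like the one listed above can be reproduced locally with the standard library. The sketch below checks whether a top-level module can be located without importing it; the function name is hypothetical and this is not the project's actual CI script.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True only in an environment where the wheel has been installed.
print(is_importable("feathertail"))
```

Running this from a directory outside the source tree avoids picking up a local feathertail folder instead of the installed wheel.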

🧪 Testing

# Run all tests (239 tests in ~0.17 seconds)
make test

# Run specific test categories
python -m pytest tests/python/unit/test_tinyframe.py
python -m pytest tests/python/unit/test_joins.py
python -m pytest tests/python/unit/test_analytics.py

🏗️ Building from Source

# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail

# Set up development environment
make dev

# Build the package
make build

# Run tests
make test

# Build documentation
make docs

๐Ÿ‰ Why "feathertail"?

In Fourth Wing, a "feathertail" is a juvenile dragon โ€” small, golden, and nonviolent, known for grace rather than brute force.

This library follows the same spirit: gentle on dependencies, elegant in design, and capable of handling complex data types with ease โ€” but with the power and performance of a full-grown dragon when you need it.


📊 Performance Benchmarks

  • 239 comprehensive tests run in just 0.17 seconds
  • SIMD-accelerated numerical operations
  • Parallel processing for multi-core performance
  • Memory-optimized with string interning and lazy evaluation
  • Production-ready with comprehensive error handling and logging

โค๏ธ Contributing

Contributions, ideas, and feedback are always welcome! Please see our Contributing Guide for details.


📄 License

MIT


🎯 Roadmap

  • Cross-platform PyPI builds - ✅ Automated builds for Linux, macOS, and Windows
  • Additional time series functions
  • More statistical distributions
  • Enhanced plotting integration
  • Database connectors
  • Arrow/Parquet integration

Built with ❤️ using Rust and Python
