A tiny, fast, Rust-backed transformation core for Python table data

🪶 feathertail

A high-performance Python DataFrame library powered by Rust, designed for flexibility, blazing speed, and intelligent type handling. Built for production with comprehensive features, advanced analytics, and enterprise-grade performance.


✨ Key Features

🚀 Core DataFrame Operations

  • ✅ Build TinyFrame from Python dict records (from_dicts)
  • ✅ Automatic type inference, including mixed-type and optional columns
  • ✅ Intelligent fallback to Python objects when Rust-native types aren't possible
  • ✅ Flexible fillna to handle missing data
  • ✅ Powerful cast_column to convert columns between types
  • ✅ Smart edit_column: edits that automatically adjust column type if needed
  • ✅ Drop or rename columns easily
  • ✅ Export back to Python dicts (to_dicts)

🔗 Advanced Data Operations

  • ✅ Join Operations: Inner, left, right, outer, and cross joins
  • ✅ Filtering & Sorting: Advanced filtering with multiple conditions and multi-column sorting
  • ✅ GroupBy Aggregations: Comprehensive statistical operations (sum, mean, min, max, std, var, median, first, last, count, size)
  • ✅ Window Functions: Rolling and expanding window operations
  • ✅ Ranking Functions: Rank calculation with multiple methods and percentage change

📊 Advanced Analytics

  • ✅ Descriptive Statistics: describe(), skew(), kurtosis(), quantile(), mode(), nunique()
  • ✅ Correlation & Covariance: Full correlation/covariance matrices and pairwise calculations
  • ✅ Time Series Operations: DateTime parsing, component extraction, time differences, and shifting
  • ✅ String Operations: Case conversion, whitespace removal, replacement, splitting, pattern matching, length, and concatenation
  • ✅ Data Validation: Not null, range, pattern, uniqueness validation with comprehensive reporting

⚡ Performance & Optimization

  • ✅ SIMD Operations: x86_64 optimized numerical operations for blazing speed
  • ✅ Parallel Processing: Multi-core operations using Rayon for GroupBy, filtering, and sorting
  • ✅ Memory Optimization: String interning, lazy evaluation, and copy-on-write optimizations
  • ✅ Chunked Processing: Handle large datasets efficiently with streaming operations
  • ✅ Rust-backed Core: Lightweight, fast, and dependency-light
  • ✅ Cross-Platform Builds: Automated CI/CD with pre-built wheels for all major platforms

🛠️ Developer Experience

  • ✅ Comprehensive Documentation: Sphinx-generated API docs with tutorials and guides
  • ✅ Logging & Debugging: Built-in logging system with performance monitoring
  • ✅ Profiling Tools: Performance profiling and optimization insights
  • ✅ Development Tools: Pre-commit hooks, automated testing, and development scripts
  • ✅ 239 Comprehensive Tests: Full test coverage running in 0.17 seconds

📦 Installation

pip install feathertail

✅ Cross-Platform Support: Pre-built wheels are available for Python 3.8 through 3.12 on:

  • Linux (x86_64)
  • macOS (ARM64/aarch64)
  • Windows (x86_64)
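
Before running pip install, it can help to check whether the current interpreter matches one of the platforms listed above. The following is a minimal sketch using only the standard library; the helper name has_prebuilt_wheel is hypothetical, and the Python version range (3.8 through 3.12) is taken from the build matrix and wheel list on this page rather than from any feathertail API.

```python
import platform
import sys

# platform.machine() reports "AMD64" on Windows and may report "aarch64"
# on some ARM systems, so normalise to the names used in the list above.
_ARCH_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}

# (operating system, architecture) pairs with pre-built wheels, per the list above.
_WHEEL_TARGETS = {
    ("Linux", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "x86_64"),
}


def has_prebuilt_wheel(system, machine, python_version):
    """Return True if a pre-built wheel is expected for this combination.

    Hypothetical helper: it encodes the platform list and the 3.8-3.12
    version range from this page and does not query PyPI.
    """
    arch = _ARCH_ALIASES.get(machine.lower(), machine.lower())
    major_minor = tuple(python_version[:2])
    return (system, arch) in _WHEEL_TARGETS and (3, 8) <= major_minor <= (3, 12)


if __name__ == "__main__":
    ok = has_prebuilt_wheel(platform.system(), platform.machine(), sys.version_info)
    print("pre-built wheel expected" if ok else "expect to build from source")
```

If the check fails (for example on Intel macOS or Linux ARM64), the "Building from Source" steps are the likely fallback.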

Building from Source

# Clone the repository
git clone https://github.com/eddiethedean/feathertail.git
cd feathertail

# Install dependencies and build
pip install maturin
maturin develop --release

# Or install in development mode
pip install -e .

🧑‍💻 Quickstart

Basic DataFrame Operations

import feathertail as ft

records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]

frame = ft.TinyFrame.from_dicts(records)
print(frame)

Output:

TinyFrame(rows=3, columns=4, cols={ 'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat' })

Advanced Filtering and Sorting

# Filter and sort data
filtered = frame.filter("age", ">", 25)
sorted_frame = frame.sort_values(["city", "age"], ascending=[True, False])
print(sorted_frame.to_dicts())

GroupBy Aggregations

# Comprehensive statistical aggregations
groupby = frame.groupby("city")
stats = groupby.agg([("age", "mean"), ("score", "max"), ("name", "count")])
print(stats.to_dicts())
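
To make the aggregation above concrete, here is a plain-Python sketch of what grouping by city and computing the mean age, maximum score, and name count could produce for the Quickstart records. It does not use feathertail. The function name aggregate_by and the output keys (age_mean, score_max, name_count) are hypothetical, and skipping None values inside each group is an assumption about null handling, not documented library behaviour.

```python
from collections import defaultdict
from statistics import mean

records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]


def aggregate_by(rows, key):
    """Group rows by `key`, then compute mean age, max score, and name count."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    summary = []
    for group_key, members in groups.items():
        # Assumption: nulls are skipped rather than propagated.
        ages = [m["age"] for m in members if m["age"] is not None]
        scores = [m["score"] for m in members if m["score"] is not None]
        names = [m["name"] for m in members if m["name"] is not None]
        summary.append({
            key: group_key,
            "age_mean": mean(ages) if ages else None,
            "score_max": max(scores) if scores else None,
            "name_count": len(names),
        })
    return summary


for row in aggregate_by(records, "city"):
    print(row)
```

Note that the Paris group has no non-null ages, so the sketch reports None for its mean; how feathertail represents that case is not stated here.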

Join Operations

# Inner join with another DataFrame
other_data = [
    {"city": "New York", "population": 8_000_000},
    {"city": "Paris", "population": 2_000_000},
]
other_frame = ft.TinyFrame.from_dicts(other_data)

joined = frame.join(other_frame, "city", "city", "inner")
print(joined.to_dicts())
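
For readers unfamiliar with join semantics, the inner join above can be sketched in plain Python as a hash join over lists of dicts. This is an illustration only, not feathertail's implementation; the function inner_join is hypothetical, and letting left-hand values win when column names collide is an assumption.

```python
def inner_join(left, right, left_on, right_on):
    """Return rows of `left` combined with every matching row of `right`."""
    # Build a lookup from join-key value to the right-hand rows that carry it.
    index = {}
    for row in right:
        index.setdefault(row[right_on], []).append(row)

    joined = []
    for row in left:
        # Rows without a match are dropped: that is what makes the join "inner".
        for match in index.get(row[left_on], []):
            joined.append({**match, **row})
    return joined


people = [
    {"name": "Alice", "city": "New York"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Charlie", "city": "New York"},
]
cities = [
    {"city": "New York", "population": 8_000_000},
    {"city": "Paris", "population": 2_000_000},
]

for row in inner_join(people, cities, "city", "city"):
    print(row)
```

Each person is paired with the population of their city, so the result has three rows.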

Advanced Analytics

# Descriptive statistics
description = frame.describe("score")
print(description.to_dicts())

# Correlation analysis
correlation = frame.corr("age", "score")
print(f"Age-Score correlation: {correlation}")

# Time series operations
time_data = [
    {"timestamp": "2023-01-01 10:00:00", "value": 100},
    {"timestamp": "2023-01-01 11:00:00", "value": 120},
]
time_frame = ft.TinyFrame.from_dicts(time_data)
time_frame = time_frame.to_timestamps("timestamp")
time_frame = time_frame.dt_year("timestamp_ts")
print(time_frame.to_dicts())

Window Functions

# Rolling window operations
data = [{"value": i} for i in range(1, 11)]
window_frame = ft.TinyFrame.from_dicts(data)
rolling_mean = window_frame.rolling_mean("value", 3)
print(rolling_mean.to_dicts())
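
As a reference point for the rolling mean above, the sketch below computes a trailing window average in plain Python. It is not feathertail code. Emitting None for the first window - 1 positions is an assumption; how the library fills rows that lack a full window is not specified here.

```python
def rolling_mean(values, window):
    """Trailing moving average over `values` with a fixed window size."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Assumption: positions without a full window yield None.
            result.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


values = list(range(1, 11))
print(rolling_mean(values, 3))
# [None, None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```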

String Operations

# String manipulation
text_data = [{"text": "  hello world  "}, {"text": "foo bar"}]
text_frame = ft.TinyFrame.from_dicts(text_data)
processed = text_frame.str_upper("text").str_strip("text")
print(processed.to_dicts())

Data Validation

# Data quality checks
validation = frame.validate_not_null("age")
validation_summary = frame.validation_summary("age")
print(f"Validation summary: {validation_summary}")

🚀 Performance Features

SIMD-Accelerated Operations

# Automatic SIMD optimization for numerical operations
large_data = [{"value": i * 1.5} for i in range(100000)]
large_frame = ft.TinyFrame.from_dicts(large_data)

# These operations use SIMD for maximum performance
sum_result = large_frame.groupby("value").agg([("value", "sum")])

Parallel Processing

# Multi-core operations for large datasets
# Automatically uses all available CPU cores
filtered = large_frame.filter("value", ">", 50000)
sorted_data = large_frame.sort_values("value")

Memory Optimization

# String interning and lazy evaluation
# Memory usage is automatically optimized
frame = ft.TinyFrame.from_dicts(records)
# Operations are optimized for memory efficiency

🛠️ Developer Tools

Logging and Debugging

# Enable comprehensive logging
ft.init_logging_with_config("info", log_memory=True, log_performance=True, log_operations=True)

# Enable debug mode
ft.enable_debug()

# Enable profiling
ft.enable_profiling()

# Your operations will be logged and profiled
frame = ft.TinyFrame.from_dicts(records)  # records from the Quickstart; they include the "age" column
result = frame.filter("age", ">", 25)

# View profiling report
ft.print_profiling_report()

Performance Monitoring

# Get operation statistics
stats = ft.get_operation_stats("filter")
print(f"Filter operations: {stats}")

# Get overall performance metrics
overall_stats = ft.get_overall_stats()
print(f"Total operations: {overall_stats['total_operations']}")

⚙️ Supported Types

Type     Column variants    Description
int      Int, OptInt        64-bit integers with optional null support
float    Float, OptFloat    64-bit floats with optional null support
bool     Bool, OptBool      Boolean values with optional null support
str      Str, OptStr        UTF-8 strings with optional null support
mixed    Mixed, OptMixed    Mixed types with automatic Python object fallback
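
The mapping from Python values to column variants can be illustrated with a small plain-Python sketch. This is a guess at the general idea, not the Rust implementation. The functions infer_column_type and infer_schema are hypothetical, and treating a column that mixes int and float as Mixed, treating missing keys as nulls, and treating an all-None column as OptMixed are assumptions.

```python
def infer_column_type(values):
    """Pick a column variant name (for example 'Int' or 'OptStr') for a list of values."""
    has_null = any(v is None for v in values)
    kinds = set()
    for v in values:
        if v is None:
            continue
        # bool is checked before int because bool is a subclass of int in Python.
        if isinstance(v, bool):
            kinds.add("Bool")
        elif isinstance(v, int):
            kinds.add("Int")
        elif isinstance(v, float):
            kinds.add("Float")
        elif isinstance(v, str):
            kinds.add("Str")
        else:
            kinds.add("Mixed")
    # Assumption: anything other than exactly one kind falls back to Mixed.
    base = kinds.pop() if len(kinds) == 1 else "Mixed"
    return "Opt" + base if has_null else base


def infer_schema(records):
    """Infer a variant per column; keys missing from a record count as None."""
    columns = {}
    for record in records:
        for key in record:
            columns.setdefault(key, None)
    return {
        name: infer_column_type([r.get(name) for r in records])
        for name in columns
    }


records = [
    {"name": "Alice", "age": 30, "city": "New York", "score": 95.5},
    {"name": "Bob", "age": None, "city": "Paris", "score": 85.0},
    {"name": "Charlie", "age": 25, "city": "New York", "score": None},
]
print(infer_schema(records))
# {'name': 'Str', 'age': 'OptInt', 'city': 'Str', 'score': 'OptFloat'}
```

For the Quickstart records, the sketch yields the same column variants that the TinyFrame printout reports.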

📚 Documentation


🏗️ Build System & CI/CD

Automated Cross-Platform Builds

feathertail uses GitHub Actions to automatically build and test wheels for all major platforms:

  • 15 build configurations covering Python 3.8-3.12
  • 3 operating systems: Linux (Ubuntu), macOS (ARM64), Windows
  • Automated testing with wheel installation verification
  • Artifact management with 30-day retention
  • PyPI deployment on version tags

Build Matrix

Platform    Python Versions               Architecture
Ubuntu      3.8, 3.9, 3.10, 3.11, 3.12    x86_64
macOS       3.8, 3.9, 3.10, 3.11, 3.12    ARM64 (aarch64)
Windows     3.8, 3.9, 3.10, 3.11, 3.12    x86_64
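
The count of 15 build configurations follows from 3 platforms times 5 Python versions. The sketch below enumerates that matrix and derives the wheel filename each configuration would produce, using the platform tags visible in the 0.4.2 wheel filenames listed on this page. The helper wheel_name is hypothetical and is not part of the project's CI scripts.

```python
from itertools import product

VERSION = "0.4.2"  # release shown in the wheel list on this page

# Platform tags as they appear in the published 0.4.2 wheel filenames.
PLATFORM_TAGS = {
    "Ubuntu": "manylinux_2_34_x86_64",
    "macOS": "macosx_11_0_arm64",
    "Windows": "win_amd64",
}
PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11", "3.12"]


def wheel_name(python_version, platform_tag, version=VERSION):
    """Build a wheel filename of the form name-version-pytag-abitag-platform.whl."""
    py_tag = "cp" + python_version.replace(".", "")
    return f"feathertail-{version}-{py_tag}-{py_tag}-{platform_tag}.whl"


build_matrix = [
    (os_name, py, wheel_name(py, tag))
    for (os_name, tag), py in product(PLATFORM_TAGS.items(), PYTHON_VERSIONS)
]

print(len(build_matrix))  # 15
for os_name, py, wheel in build_matrix:
    print(f"{os_name:8} {py:5} {wheel}")
```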

Quality Assurance

  • ✅ Rust compilation with proper target architecture
  • ✅ Python wheel building with maturin
  • ✅ Installation testing from temp directories
  • ✅ Import verification to ensure module works correctly
  • ✅ Cross-platform compatibility testing

🧪 Testing

# Run all tests (239 tests in ~0.17 seconds)
make test

# Run specific test categories
python -m pytest tests/python/unit/test_tinyframe.py
python -m pytest tests/python/unit/test_joins.py
python -m pytest tests/python/unit/test_analytics.py

🏗️ Building from Source

# Clone your fork of the repository (replace your-username with your GitHub account)
git clone https://github.com/your-username/feathertail.git
cd feathertail

# Set up development environment
make dev

# Build the package
make build

# Run tests
make test

# Build documentation
make docs

🐉 Why "feathertail"?

In Fourth Wing, a "feathertail" is a juvenile dragon: small, golden, and nonviolent, known for grace rather than brute force.

This library follows the same spirit: gentle on dependencies, elegant in design, and capable of handling complex data types with ease, with the power and performance of a full-grown dragon when you need it.


📊 Performance Benchmarks

  • 239 comprehensive tests run in just 0.17 seconds
  • SIMD-accelerated numerical operations
  • Parallel processing for multi-core performance
  • Memory-optimized with string interning and lazy evaluation
  • Production-ready with comprehensive error handling and logging

❤️ Contributing

Contributions, ideas, and feedback are always welcome! Please see our Contributing Guide for details.


📄 License

MIT


🎯 Roadmap

  • ✅ Cross-platform PyPI builds: automated builds for Linux, macOS, and Windows
  • Additional time series functions
  • More statistical distributions
  • Enhanced plotting integration
  • Database connectors
  • Arrow/Parquet integration

Built with ❤️ using Rust and Python

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • feathertail-0.4.2-cp312-cp312-win_amd64.whl (453.9 kB): CPython 3.12, Windows x86-64
  • feathertail-0.4.2-cp312-cp312-manylinux_2_34_x86_64.whl (620.3 kB): CPython 3.12, manylinux (glibc 2.34+) x86-64
  • feathertail-0.4.2-cp312-cp312-macosx_11_0_arm64.whl (547.5 kB): CPython 3.12, macOS 11.0+ ARM64
  • feathertail-0.4.2-cp311-cp311-win_amd64.whl (455.3 kB): CPython 3.11, Windows x86-64
  • feathertail-0.4.2-cp311-cp311-manylinux_2_34_x86_64.whl (622.8 kB): CPython 3.11, manylinux (glibc 2.34+) x86-64
  • feathertail-0.4.2-cp311-cp311-macosx_11_0_arm64.whl (550.6 kB): CPython 3.11, macOS 11.0+ ARM64
  • feathertail-0.4.2-cp310-cp310-win_amd64.whl (455.4 kB): CPython 3.10, Windows x86-64
  • feathertail-0.4.2-cp310-cp310-manylinux_2_34_x86_64.whl (622.1 kB): CPython 3.10, manylinux (glibc 2.34+) x86-64
  • feathertail-0.4.2-cp310-cp310-macosx_11_0_arm64.whl (550.5 kB): CPython 3.10, macOS 11.0+ ARM64
  • feathertail-0.4.2-cp39-cp39-win_amd64.whl (456.6 kB): CPython 3.9, Windows x86-64
  • feathertail-0.4.2-cp39-cp39-manylinux_2_34_x86_64.whl (622.8 kB): CPython 3.9, manylinux (glibc 2.34+) x86-64
  • feathertail-0.4.2-cp39-cp39-macosx_11_0_arm64.whl (551.9 kB): CPython 3.9, macOS 11.0+ ARM64
  • feathertail-0.4.2-cp38-cp38-win_amd64.whl (455.9 kB): CPython 3.8, Windows x86-64
  • feathertail-0.4.2-cp38-cp38-manylinux_2_34_x86_64.whl (621.1 kB): CPython 3.8, manylinux (glibc 2.34+) x86-64
  • feathertail-0.4.2-cp38-cp38-macosx_11_0_arm64.whl (550.7 kB): CPython 3.8, macOS 11.0+ ARM64

