Elegant data operations for DataFrames - add.to(), add.transform(), add.synthetic()
additory - Rust Core
Elegant data operations for DataFrames.
Three Functions Only
- add.to() - Add data FROM an external source
- add.transform() - Transform data WITHIN a DataFrame
- add.synthetic() - Create or augment with synthetic data
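The split between the three functions can be illustrated with a toy example. The sketch below is not the crate's API: the `Row` type and the helpers `row`, `add_from_reference`, `add_derived`, and `synthetic_sequence` are hypothetical names invented for illustration, using only the Rust standard library. It only shows the idea behind each role: data coming FROM outside, data derived WITHIN, and data created from nothing.

```rust
use std::collections::HashMap;

/// Toy row type for illustration only; not the crate's DataFrame abstraction.
type Row = HashMap<String, String>;

/// Hypothetical helper: build a row from (column, value) pairs.
fn row(pairs: &[(&str, &str)]) -> Row {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// "to" role: copy `column` FROM an external reference table, matching on `key`.
/// Rows whose key has no match in the reference are left unchanged.
fn add_from_reference(rows: &mut [Row], reference: &[Row], key: &str, column: &str) {
    // Index the reference rows by key value so each lookup is a hash probe.
    let index: HashMap<&str, &str> = reference
        .iter()
        .filter_map(|r| Some((r.get(key)?.as_str(), r.get(column)?.as_str())))
        .collect();
    for row in rows.iter_mut() {
        let found = row
            .get(key)
            .and_then(|k| index.get(k.as_str()))
            .map(|v| (*v).to_string());
        if let Some(value) = found {
            row.insert(column.to_string(), value);
        }
    }
}

/// "transform" role: derive a new column from values already WITHIN each row.
fn add_derived<F>(rows: &mut [Row], new_column: &str, f: F)
where
    F: Fn(&Row) -> String,
{
    for row in rows.iter_mut() {
        let value = f(&*row);
        row.insert(new_column.to_string(), value);
    }
}

/// "synthetic" role: create rows with no input data, here a simple 1..=n sequence.
fn synthetic_sequence(n: usize, column: &str) -> Vec<Row> {
    (1..=n)
        .map(|i| {
            let value = i.to_string();
            row(&[(column, value.as_str())])
        })
        .collect()
}
```

For example, calling `add_from_reference(&mut orders, &prices, "sku", "price")` fills in `price` only for orders whose `sku` appears in `prices`, while `synthetic_sequence(3, "id")` returns three rows with `id` values `"1"`, `"2"`, and `"3"`.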
Project Structure
rust-core/
├── src/
│ ├── lib.rs # Main library entry
│ ├── core/ # Core functionality
│ │ ├── dataframe.rs # DataFrame abstraction
│ │ ├── types.rs # Type definitions
│ │ └── errors.rs # Error types
│ ├── utils/ # Utilities
│ │ ├── validation.rs # Input validation
│ │ ├── logging.rs # Logging system
│ │ └── type_detection.rs # Type detection
│ ├── to/ # add.to() implementation
│ ├── transform/ # add.transform() implementation
│ └── synthetic/ # add.synthetic() implementation
├── Cargo.toml
└── README.md
Building
cargo build
cargo test
cargo build --release
Status
Phase 1: Core Infrastructure - ✅ Complete
Phase 2: Initial Features - In Progress
Core Infrastructure (Complete)
- ✅ Project structure
- ✅ Error types
- ✅ Core types (Mode, FetchColumn, UniversalParams)
- ✅ DataFrame abstraction
- ✅ Validation utilities
- ✅ Logging utilities
- ✅ Type detection utilities
Implemented Features
- ✅ add.to() LOOKUP mode - Add columns from reference DataFrame (5 tests passing)
- ✅ add.to() @merge mode - Merge multiple DataFrames (9 tests passing)
- ✅ add.transform() @filter mode - Filter rows and select columns (10 tests passing)
- ✅ add.transform() @sort mode - Sort rows by column(s) (8 tests passing)
- ✅ add.transform() @transpose mode - Transpose DataFrame (6 tests passing)
- ✅ add.transform() @aggregate mode - Group and aggregate data (10 tests passing)
- ✅ add.transform() @split mode - Split text column into multiple columns (7 tests passing)
- ✅ add.transform() @calc mode - Calculate new columns from expressions (8 tests passing)
- ✅ add.transform() @extract mode - Extract datetime components (9 tests passing)
- ✅ add.transform() @onehot mode - One-hot encoding for categorical columns (7 tests passing)
- ✅ add.transform() @label mode - Label encoding for categorical columns (6 tests passing)
- ✅ add.transform() @harmonize mode - Unit conversions (8 tests passing)
- ✅ add.transform() @knn mode - K-Nearest Neighbors imputation (27 tests passing, pure Python)
- ✅ add.synthetic() @new mode - Create synthetic DataFrames with 7 distributions + date/time + patterns (18 tests passing)
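To make two of the encoding modes listed above concrete, here is a minimal standard-library sketch of label encoding and one-hot encoding. It is not the crate's implementation: the function names `label_encode` and `one_hot` are hypothetical, and assigning codes in sorted category order is an assumption of this sketch rather than documented behavior of @label or @onehot.

```rust
use std::collections::BTreeMap;

/// Label encoding: replace each category with an integer code.
/// Assumption of this sketch: codes follow sorted category order.
fn label_encode(values: &[&str]) -> (Vec<usize>, BTreeMap<String, usize>) {
    let mut codes: BTreeMap<String, usize> = BTreeMap::new();
    // First pass: collect the distinct categories.
    for v in values {
        codes.entry(v.to_string()).or_insert(0);
    }
    // BTreeMap iterates in sorted key order, so enumerate yields stable codes.
    for (i, code) in codes.values_mut().enumerate() {
        *code = i;
    }
    // Second pass: translate every value into its code.
    let encoded = values.iter().map(|v| codes[*v]).collect();
    (encoded, codes)
}

/// One-hot encoding: one 0/1 indicator column per distinct category.
/// Returns the category names (column order) and one indicator row per input value.
fn one_hot(values: &[&str]) -> (Vec<String>, Vec<Vec<u8>>) {
    let (codes, mapping) = label_encode(values);
    let categories: Vec<String> = mapping.keys().cloned().collect();
    let rows = codes
        .iter()
        .map(|&c| {
            let mut row = vec![0u8; categories.len()];
            row[c] = 1; // set the indicator for this value's category
            row
        })
        .collect();
    (categories, rows)
}
```

With the input `["red", "blue", "red", "green"]`, `label_encode` yields the codes `[2, 0, 2, 1]` (blue = 0, green = 1, red = 2), and `one_hot` yields the columns `blue, green, red` with the first row `[0, 0, 1]`.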
Test Status
- Total Tests: 163 passing
- Core Tests: 25 passing
- LOOKUP Tests: 5 passing
- @merge Tests: 9 passing
- @filter Tests: 10 passing
- @sort Tests: 8 passing
- @transpose Tests: 6 passing
- @aggregate Tests: 10 passing
- @split Tests: 7 passing
- @calc Tests: 8 passing
- @extract Tests: 9 passing
- @onehot Tests: 7 passing
- @label Tests: 6 passing
- @harmonize Tests: 8 passing
- @knn Tests: 27 passing (pure Python)
- @new (synthetic) Tests: 18 passing
Next Steps
- ✅ Implement add.to() LOOKUP mode
- ✅ Implement add.to() @merge mode
- ✅ Implement add.transform() @filter mode
- ✅ Implement add.transform() @sort mode
- ✅ Implement add.transform() @transpose mode
- ✅ Implement add.transform() @aggregate mode
- ✅ Implement add.transform() @split mode
- ✅ Implement add.transform() @calc mode (basic version)
- ✅ Implement add.transform() @extract mode (datetime only)
- ✅ Implement add.transform() @onehot mode
- ✅ Implement add.transform() @label mode
- ✅ Implement add.transform() @harmonize mode
- ✅ Implement add.transform() @knn mode (pure Python)
- ✅ Implement add.synthetic() @new mode (basic version)
- Expand add.synthetic() @new mode (more distributions, patterns, dates)
- Implement add.synthetic() augment mode
- Implement add.synthetic() @analyze mode
- Add Python bindings (PyO3)
- Enhance @calc with expression namespaces
- Add text extraction to @extract mode
Documentation
See shadow_library/ for comprehensive documentation of all modules.
License
MIT
File details
Details for the file additory-0.1.3a2-cp313-cp313-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: additory-0.1.3a2-cp313-cp313-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 11.4 MB
- Tags: CPython 3.13, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c10edae18e8a8a2c702940cdfeaa7c2b887de7a02a3236b9e72cc07f4fa66eb0 |
| MD5 | e0ed335716bf21dbcac49c6d1baec791 |
| BLAKE2b-256 | 691d747adc4641556f6a1e144e3174f319515e2bf0fa0f56e70a4678c8f85012 |