Advanced data manipulation with XWNode integration, async operations, and universal format conversion
🚀 xwdata: Universal Data Engine with XWNode Integration
Company: eXonware.com
Author: Eng. Muhammad AlShehri
Email: connect@exonware.com
Version: 0.1.0.1
🎯 What is xwdata?
xwdata is a data manipulation engine that combines format-agnostic operations, graph navigation (XWNode), and engine orchestration in one async-first library. Load from any supported format, modify data safely with copy-on-write semantics, and save to any supported format, all through one API.
The Problem We Solve
Traditional data libraries force you to:
- ❌ Learn different APIs for each format (json, yaml, xml, etc.)
- ❌ Write custom code for format conversions
- ❌ Deal with mutable state causing bugs
- ❌ Handle format-specific quirks manually
- ❌ Build complex navigation logic for nested data
The xwdata Solution
✅ One API for all formats - Load JSON, save as YAML, convert to XML
✅ Ultra-fast multi-format - 0.15-0.21ms for all 30+ formats
✅ V8 advanced features - Partial access, typed loading, canonical hashing (all formats!)
✅ XWNode integration - Powerful path navigation and graph operations
✅ Copy-on-write semantics - Safe concurrent access, immutable by default
✅ Universal metadata - Perfect roundtrips preserve format-specific features
✅ Async by design - High-performance async operations throughout
✅ Engine orchestration - Reuses xwsystem serialization (30+ formats!)
✅ Reference resolution - Automatic handling of $ref, @href, *anchors
✅ Beats V7 performance - 24-67% faster on medium/large files!
⚡ Quick Start
Installation
# Lite (Default) - Core Only
pip install exonware-xwdata
# Lazy (Recommended for Development) - Auto-install on demand
pip install exonware-xwdata[lazy]
# Full (Recommended for Production) - All dependencies pre-installed
pip install exonware-xwdata[full]
Basic Usage
import asyncio
from exonware.xwdata import XWData
# === Synchronous Creation ===
# From native Python data
data = XWData({'name': 'Alice', 'age': 30, 'city': 'NYC'})
# Get values (async)
name = asyncio.run(data.get('name'))  # 'Alice'
# === Async Operations ===
async def main():
    # Load from file (any format!)
    data = await XWData.load('config.json')
    # Navigate and modify (copy-on-write!)
    data = await data.set('api.timeout', 30)
    data = await data.set('api.retries', 3)
    # Save to different format
    await data.save('config.yaml')  # JSON → YAML conversion!
    await data.save('config.xml')   # → XML too!
asyncio.run(main())
🌟 Key Features
1. Format-Agnostic Operations
# Load from any format
data = await XWData.load('config.json') # JSON
data = await XWData.load('config.yaml') # YAML
data = await XWData.load('config.xml') # XML
data = await XWData.load('config.toml') # TOML
# Save to any format
await data.save('output.json') # → JSON
await data.save('output.yaml') # → YAML
await data.save('output.xml') # → XML
Supported Formats:
- Text: JSON, YAML, XML, TOML, CSV, INI
- Extended: JSON5 (with comments), JSONL (streaming)
- Binary: BSON, MessagePack, Pickle (via xwsystem)
- Schema-based: Avro, Protobuf, Parquet (via xwsystem)
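To make the "one API for all formats" idea concrete, here is a minimal standard-library sketch of dispatching on the file extension. It is not how xwdata or xwsystem is implemented: `FORMATS`, `load_any`, and `save_any` are hypothetical names, only JSON and INI are covered, and the INI path assumes a two-level mapping of sections to key/value pairs.

```python
import configparser
import io
import json
from pathlib import Path


def _load_json(text):
    return json.loads(text)


def _dump_json(data):
    return json.dumps(data, indent=2, sort_keys=True)


def _load_ini(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # INI values always come back as strings.
    return {section: dict(parser[section]) for section in parser.sections()}


def _dump_ini(data):
    # Assumes {section: {key: value}}; deeper nesting is not representable in INI.
    parser = configparser.ConfigParser()
    for section, values in data.items():
        parser[section] = {key: str(value) for key, value in values.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical registry: file extension -> (loader, dumper).
FORMATS = {
    ".json": (_load_json, _dump_json),
    ".ini": (_load_ini, _dump_ini),
}


def load_any(path):
    """Pick a loader from the file extension and return native Python data."""
    path = Path(path)
    loader, _ = FORMATS[path.suffix.lower()]
    return loader(path.read_text(encoding="utf-8"))


def save_any(data, path):
    """Pick a dumper from the file extension and write the data."""
    path = Path(path)
    _, dumper = FORMATS[path.suffix.lower()]
    path.write_text(dumper(data), encoding="utf-8")
```

With a registry like this, converting between formats is just `save_any(load_any('config.json'), 'config.ini')`; supporting a new format means registering one more loader/dumper pair.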
2. XWNode Integration - Powerful Navigation
# Create from nested data
data = XWData({
    'users': [
        {'name': 'Alice', 'age': 30},
        {'name': 'Bob', 'age': 25}
    ]
})
# Navigate with paths
alice_age = await data.get('users.0.age') # 30
# Check existence
has_email = await data.exists('users.0.email') # False
# Copy-on-write mutations
data = await data.set('users.0.city', 'NYC')
data = await data.delete('users.1')
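For readers curious how dotted paths such as `users.0.age` can be resolved, the following is a small sketch over plain dicts and lists. It is an illustration only, not XWNode's implementation; `get_path` and `path_exists` are hypothetical helpers, and the sketch assumes that numeric path segments index into lists.

```python
_MISSING = object()  # sentinel so a stored None still counts as "exists"


def get_path(data, path, default=None):
    """Walk a dotted path through nested dicts and lists."""
    current = data
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            return default
    return current


def path_exists(data, path):
    """Return True if every segment of the path can be resolved."""
    return get_path(data, path, _MISSING) is not _MISSING


users = {'users': [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]}
print(get_path(users, 'users.0.age'))       # 30
print(path_exists(users, 'users.0.email'))  # False
```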
3. Copy-on-Write Semantics - Safe Concurrency
# Original data
data1 = XWData({'counter': 0})
# Modify creates new instance
data2 = await data1.set('counter', 1)
data3 = await data1.set('counter', 2)
# Original unchanged
assert await data1.get('counter') == 0
assert await data2.get('counter') == 1
assert await data3.get('counter') == 2
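The behaviour above can be sketched with plain Python containers. The function below is a hypothetical `set_path` helper, not xwdata's code: it copies only the containers along the modified path and shares everything else with the original, which is one common way to implement copy-on-write.

```python
def set_path(data, path, value):
    """Return a new structure with `value` at `path`, leaving `data` untouched.

    Only containers on the path are shallow-copied; untouched branches are
    shared between the old and new structure.
    """
    head, _, rest = path.partition(".")
    if isinstance(data, list):
        index = int(head)
        new_list = list(data)  # shallow copy of this level only
        new_list[index] = set_path(data[index], rest, value) if rest else value
        return new_list
    # Non-dict values in the middle of a path are replaced by a fresh dict.
    new_dict = dict(data) if isinstance(data, dict) else {}
    child = new_dict.get(head, {})
    new_dict[head] = set_path(child, rest, value) if rest else value
    return new_dict


original = {'api': {'timeout': 10}, 'users': [{'name': 'Alice'}]}
updated = set_path(original, 'api.timeout', 30)
print(original['api']['timeout'])             # 10
print(updated['api']['timeout'])              # 30
print(updated['users'] is original['users'])  # True
```

Sharing untouched branches keeps each update cheap even for large documents, at the cost of requiring that callers never mutate the shared containers in place.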
4. Multi-Source Merging
# Merge multiple sources intelligently
data = XWData([
    {'base': 'config'},          # Base dict
    'overrides.yaml',            # Load and merge file
    existing_xwdata_instance,    # Merge another XWData
    {'final': 'override'}        # Final overrides
], merge_strategy='deep')
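What a "deep" merge strategy typically means can be sketched as follows. This is an assumption about the semantics, not a description of xwdata's merge code: nested dicts are merged key by key, while any other value (including lists) from a later source replaces the earlier one. `deep_merge` and `merge_all` are hypothetical names.

```python
from functools import reduce


def deep_merge(base, override):
    """Recursively merge two mappings; later values win on conflicts."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)  # do not mutate the inputs
        for key, value in override.items():
            merged[key] = deep_merge(base[key], value) if key in base else value
        return merged
    # Scalars, lists, and mismatched types: the override replaces the base.
    return override


def merge_all(sources):
    """Fold a list of mappings left to right, so later sources override earlier ones."""
    return reduce(deep_merge, sources, {})


config = merge_all([
    {'db': {'host': 'localhost', 'port': 5432}},
    {'db': {'port': 6543}},
    {'debug': True},
])
print(config)  # {'db': {'host': 'localhost', 'port': 6543}, 'debug': True}
```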
5. Async-First Design
# All I/O operations are async
import asyncio
async def process_configs():
    # Load multiple files concurrently
    config1, config2 = await asyncio.gather(
        XWData.load('config1.json'),
        XWData.load('config2.yaml'),
    )
    # Merge them
    merged = await config1.merge(config2)
    # Transform
    transformed = await merged.transform(lambda d: {
        k.upper(): v for k, v in d.items()
    })
    # Save results
    await transformed.save('result.json')
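As background on loading several files concurrently, here is a standard-library sketch that does not use xwdata at all. It assumes plain JSON files and hypothetical helper names (`load_json_async`, `load_many`): blocking file reads are pushed to the default thread pool and awaited together with `asyncio.gather`.

```python
import asyncio
import json


def _read_json(path):
    # Ordinary blocking read; runs in a worker thread.
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


async def load_json_async(path):
    """Run the blocking read in the default executor so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, _read_json, path)


async def load_many(paths):
    """Start all loads at once; results come back in the same order as `paths`."""
    return await asyncio.gather(*(load_json_async(path) for path in paths))
```

Awaiting each load one after another would also work, but the reads would then run sequentially rather than overlapping.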
🏗️ Architecture
Engine Pattern (Inspired by xwquery)
XWData (facade) → XWDataEngine (orchestrator) → Services
                        ↓
            XWSerializer (xwsystem - reuse!)
                        ↓
            Format Strategies (metadata & references)
                        ↓
            XWNode (xwnode - navigation)
Components:
- XWData - User-facing facade with fluent API
- XWDataEngine - Core orchestrator (the brain)
- XWSerializer - Format I/O from xwsystem (reused, not duplicated)
- FormatStrategies - Lightweight format-specific logic (50 lines each)
- XWDataNode - Extends XWNode with COW and metadata
- Services - Metadata, References, Caching, Monitoring
No Handler Duplication: xwdata doesn't reimplement serialization - it orchestrates xwsystem's battle-tested serializers and adds data manipulation features on top!
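The facade/engine split can be illustrated with a deliberately tiny sketch. The class names `DataFacade` and `Engine` are hypothetical stand-ins, not the real XWData or XWDataEngine classes, and the only "serializer" here is the standard `json` module: the facade holds the data and exposes the user-facing methods, while the engine owns the format registry and performs the actual work.

```python
import json


class Engine:
    """Hypothetical orchestrator: owns the serializer registry and does the work."""

    def __init__(self):
        # Reuse existing serializers instead of reimplementing them.
        self._parsers = {"json": json.loads}
        self._writers = {"json": lambda data: json.dumps(data, sort_keys=True)}

    def parse(self, text, fmt):
        return self._parsers[fmt](text)

    def serialize(self, data, fmt):
        return self._writers[fmt](data)


class DataFacade:
    """Hypothetical facade: a thin user-facing wrapper that delegates to the engine."""

    _engine = Engine()

    def __init__(self, value):
        self._value = value

    @classmethod
    def parse(cls, text, fmt):
        return cls(cls._engine.parse(text, fmt))

    def serialize(self, fmt):
        return self._engine.serialize(self._value, fmt)

    def to_native(self):
        return self._value


doc = DataFacade.parse('{"b": 1, "a": 2}', "json")
print(doc.serialize("json"))  # {"a": 2, "b": 1}
```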
📚 Advanced Features
Universal Metadata - Perfect Roundtrips
# Preserves format-specific semantics
data = await XWData.load('schema.json') # Has $ref, @id
await data.save('schema.xml') # Converts to @href, preserves meaning
await data.save('schema.json') # Perfect roundtrip!
Reference Resolution
from exonware.xwdata import XWData, XWDataConfig, ReferenceConfig
# Configure reference resolution
config = XWDataConfig.default()
config.reference = ReferenceConfig.eager() # Resolve immediately
# Load file with $ref, @href, *anchor references
data = await XWData.load('schema.json', config=config)
# References automatically detected and resolved!
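To show what eager resolution of `$ref` involves, here is a sketch limited to local JSON Pointer references such as `#/definitions/user`. It is not xwdata's resolver: `resolve_pointer` and `resolve_refs` are hypothetical helpers, remote references and the `@href` and `*anchor` forms are not handled, and circular references raise an error instead of being resolved lazily.

```python
def resolve_pointer(root, pointer):
    """Follow a local JSON Pointer ('#' or '#/a/b/0') from the document root."""
    if not pointer.startswith("#"):
        raise ValueError(f"only local references are supported: {pointer}")
    tokens = pointer[2:].split("/") if pointer.startswith("#/") else []
    current = root
    for token in tokens:
        # Unescape per RFC 6901: '~1' is '/', '~0' is '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        current = current[int(token)] if isinstance(current, list) else current[token]
    return current


def resolve_refs(node, root, _seen=()):
    """Return a copy of `node` with every {'$ref': ...} replaced by its target."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            if ref in _seen:
                raise ValueError(f"circular reference: {ref}")
            return resolve_refs(resolve_pointer(root, ref), root, _seen + (ref,))
        return {key: resolve_refs(value, root, _seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, root, _seen) for item in node]
    return node


schema = {
    "definitions": {"user": {"type": "object"}},
    "properties": {"owner": {"$ref": "#/definitions/user"}},
}
print(resolve_refs(schema, schema)["properties"]["owner"])  # {'type': 'object'}
```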
Performance Caching
config = XWDataConfig.fast() # Enable all caching
# First load - cache miss
data1 = await XWData.load('large.json', config=config)
# Second load - cache hit (instant!)
data2 = await XWData.load('large.json', config=config)
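One simple way such a load cache can work is to key entries by file path and invalidate them when the file changes. The sketch below is an assumption for illustration, not xwdata's caching service; `load_cached` is a hypothetical helper that handles JSON only and uses modification time plus size as its freshness check.

```python
import json
import os

# Hypothetical in-memory cache: absolute path -> ((mtime_ns, size), parsed data).
_cache = {}


def load_cached(path):
    """Parse a JSON file once and reuse the result until the file changes."""
    key = os.path.abspath(path)
    stat = os.stat(key)
    stamp = (stat.st_mtime_ns, stat.st_size)
    hit = _cache.get(key)
    if hit is not None and hit[0] == stamp:
        return hit[1]  # cache hit: no parsing needed
    with open(key, encoding="utf-8") as handle:
        data = json.load(handle)
    _cache[key] = (stamp, data)
    return data
```

Note that a cache hit returns the same object as the first call, so this approach is only safe when callers treat loaded data as immutable, which fits the copy-on-write model described earlier.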
Streaming Large Files
# Stream large JSONL files
async for chunk in XWData.stream_load('huge_data.jsonl'):
    process(chunk)
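The streaming pattern can be sketched with an async generator over a JSONL file. This is a standard-library illustration, not the implementation of `stream_load`: `stream_jsonl` and its `chunk_size` parameter are hypothetical, and the file reads themselves are ordinary blocking calls, with the generator yielding control to the event loop between chunks.

```python
import asyncio
import json


async def stream_jsonl(path, chunk_size=1000):
    """Yield lists of parsed records without loading the whole file into memory."""
    batch = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            batch.append(json.loads(line))
            if len(batch) >= chunk_size:
                yield batch
                batch = []
                await asyncio.sleep(0)  # let other tasks run between chunks
    if batch:
        yield batch  # final partial chunk


async def count_records(path):
    total = 0
    async for chunk in stream_jsonl(path, chunk_size=500):
        total += len(chunk)
    return total
```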
🎓 Configuration
Presets
from exonware.xwdata import XWDataConfig
# Default balanced configuration
config = XWDataConfig.default()
# High security for untrusted data
config = XWDataConfig.strict()
# High performance for speed
config = XWDataConfig.fast()
# Development mode with debugging
config = XWDataConfig.development()
Custom Configuration
from exonware.xwdata import (
    XWDataConfig, SecurityConfig, PerformanceConfig,
    ReferenceConfig, MetadataConfig, COWConfig
)
config = XWDataConfig(
    security=SecurityConfig(max_file_size_mb=50),
    performance=PerformanceConfig.fast(),
    reference=ReferenceConfig.lazy(),
    metadata=MetadataConfig.full(),
    cow=COWConfig.immutable()
)
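The preset-plus-override pattern used above can be modelled with frozen dataclasses. The sketch below is purely illustrative: `ConfigSketch` and `SecuritySketch` are hypothetical classes, and every field name and default value in them is invented for the example rather than taken from XWDataConfig.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SecuritySketch:
    max_file_size_mb: int = 100  # invented default, for illustration only


@dataclass(frozen=True)
class ConfigSketch:
    security: SecuritySketch = field(default_factory=SecuritySketch)
    enable_cache: bool = False

    @classmethod
    def default(cls):
        return cls()

    @classmethod
    def fast(cls):
        # Preset that trades memory for speed by turning caching on.
        return cls(enable_cache=True)

    @classmethod
    def strict(cls):
        # Preset with a tighter file size limit for untrusted input.
        return cls(security=SecuritySketch(max_file_size_mb=10))


# Start from a preset, then override one section without mutating the preset.
custom = replace(ConfigSketch.fast(), security=SecuritySketch(max_file_size_mb=50))
print(custom.enable_cache, custom.security.max_file_size_mb)  # True 50
```

Frozen dataclasses keep presets immutable, so a shared preset cannot be changed by accident; `dataclasses.replace` returns a modified copy instead.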
🔧 Development
# Install in development mode
pip install -e .
# Run tests
python tests/runner.py
# Run specific test layers
python tests/runner.py --core # Fast core tests
python tests/runner.py --unit # Unit tests
python tests/runner.py --integration # Integration tests
# Run verification
python tests/verify_installation.py
🚀 Project Phases
Current Phase: 🧪 Version 0 - Experimental Stage
- Focus: Engine architecture, async operations, xwsystem integration
- Status: 🟢 ACTIVE - Foundation complete with engine pattern
Development Roadmap:
- Version 1 (Q1 2026): Production Ready - Enterprise deployment
- Version 2 (Q2 2026): Mars Standard Draft - Cross-platform interoperability
- Version 3 (Q3 2026): RUST Core & Facades - High-performance multi-language
- Version 4 (Q4 2026): Mars Standard Implementation - Full compliance
📖 View Complete Project Phases
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run the test suite
- Submit a pull request
📄 License
MIT License - see LICENSE file for details.
🔗 eXonware Ecosystem
xwdata integrates seamlessly with:
- xwsystem - Core utilities, serialization (24+ formats), security
- xwnode - Node structures (57 strategies), graph operations
- xwquery - Query languages (35+ languages) - Coming soon!
- xwschema - Schema validation - Coming soon!
Built with ❤️ by eXonware.com - Making data manipulation effortless