A meta-package to install the complete dot ecosystem.

The Dot Ecosystem

"What started as a single, humble function evolved into a complete, coherent ecosystem for manipulating data structures, a journey in API design guided by the principles of purity, pedagogy, and the principle of least power."

The dot ecosystem is a suite of composable tools for working with nested data structures like JSON, YAML, and Python dictionaries. Each tool follows the Unix philosophy: it does one thing exceptionally well, and they're designed to work together seamlessly.

Installation

# Install from PyPI
pip install dotsuite

Install from Source

# Clone and install
git clone https://github.com/queelius/dotsuite.git
cd dotsuite
pip install -e .

# For development with testing tools
make install-dev

Publishing to PyPI

# Build and publish (for maintainers)
make build         # Build distribution packages
make publish-test  # Publish to TestPyPI
make publish       # Publish to PyPI

Motivation

It always starts with a simple problem. You have a nested dictionary or JSON payload, and you need to get a value buried deep inside. You write data['user']['contacts'][0]['email'] and pray that no key or index is missing along the way, lest your program crash with a KeyError or an IndexError. This leads to brittle, defensive code full of try/except blocks.

What began as a simple helper function, dotget, evolved through questions and insights into a complete ecosystem. The result is a mathematically grounded, pedagogically structured collection of tools that makes data manipulation predictable, safe, and expressive.

The Four Pillars

The ecosystem is built on four fundamental pillars, each answering a core question about data:

Depth Pillar: "Where is the data?"

Tools for finding and extracting values from within documents.

Truth Pillar: "Is this assertion true?"

Tools for asking boolean questions and validating data.

Shape Pillar: "How should the data be transformed?"

Tools for reshaping and modifying data structures.

Collections Pillar: "How do documents relate?"

Tools for lifting single-document operations to collections via boolean and relational algebra.

Quick Start

import sys
sys.path.insert(0, 'src')  # If running from repo root

# Import from the four pillars
from depth.dotget.core import get
from depth.dotstar.core import search
from truth.dotquery.core import Q
from shape.dotmod.core import set_

# Simple exact addressing
data = {"users": [{"name": "Alice", "role": "admin"}]}
name = get(data, "users.0.name")  # "Alice"

# Pattern matching with wildcards
all_names = search(data, "users.*.name")  # ["Alice"]

# Boolean logic queries (fluent factory)
is_admin = Q("users.0.role").equals("admin").check(data)  # True

# Immutable modifications
new_data = set_(data, "users.0.status", "active")

The Tools

Depth Pillar: Addressing & Extraction

Tool | Purpose | Example
dotget | Simple exact paths | get(data, "user.name")
dotstar | Wildcard patterns | search(data, "users.*.name")
dotselect | Advanced selection with predicates | find_first(data, "users[role=admin].name")
dotpath | Extensible path engine | Powers other tools, JSONPath-compatible

Philosophy: Start simple with dotget for known paths, add dotstar for patterns, use dotselect for complex queries. The dotpath engine underpins them all with extensible, Turing-complete addressing.
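
The step from exact paths to wildcard patterns can be sketched in a few lines. This is a minimal illustration, not the dotstar implementation: the name search_sketch and its exact semantics (a "*" segment fans out over every dict value or list item, and missing segments are silently dropped) are assumptions made for the example.

```python
def search_sketch(data, path):
    """Return every value matched by a dot path; '*' matches any key or index.

    Hypothetical helper for illustration only, not part of dotsuite.
    """
    current = [data]  # all nodes reached so far
    for segment in path.split("."):
        next_level = []
        for node in current:
            if segment == "*":
                # Fan out over every child of a container node.
                if isinstance(node, dict):
                    next_level.extend(node.values())
                elif isinstance(node, list):
                    next_level.extend(node)
            elif isinstance(node, dict) and segment in node:
                next_level.append(node[segment])
            elif (isinstance(node, list) and segment.isdigit()
                  and int(segment) < len(node)):
                next_level.append(node[int(segment)])
            # Anything else is a dead end and contributes no matches.
        current = next_level
    return current


data = {"users": [{"name": "Alice", "role": "admin"}, {"name": "Bob"}]}
print(search_sketch(data, "users.*.name"))  # ['Alice', 'Bob']
print(search_sketch(data, "users.*.role"))  # ['admin']
print(search_sketch(data, "users.5.name"))  # []
```

An exact path is the special case with no "*" segments, which returns a list of at most one value.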

Truth Pillar: Logic & Validation

Tool | Purpose | Example
dotexists | Path existence | check(data, "user.email")
dotequals | Path equals a value | equals(data, "user.role", "admin")
dotany | Existential quantifier | any_match(users, "role", "admin")
dotall | Universal quantifier | all_match(users, "active", True)
dotquery | Compositional logic engine | Q("users.*.role").equals("admin")

Philosophy: Boolean questions should be separate from data extraction. Start with dotexists or dotequals for simple checks, lift to collections with dotany/dotall, and compose complex logic with dotquery.
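
As a rough sketch of that progression, the following shows an existence check, an equality check, and their lifting to collections with any() and all(). All function names here carry a _sketch suffix because they are hypothetical stand-ins; the real dotexists, dotequals, dotany, and dotall signatures may differ.

```python
_MISSING = object()  # sentinel: distinguishes "absent" from a stored None


def _resolve(data, path):
    """Walk a dot path; return _MISSING if any segment is absent."""
    for segment in path.split("."):
        if isinstance(data, dict) and segment in data:
            data = data[segment]
        elif (isinstance(data, list) and segment.isdigit()
              and int(segment) < len(data)):
            data = data[int(segment)]
        else:
            return _MISSING
    return data


def exists_sketch(data, path):
    return _resolve(data, path) is not _MISSING


def equals_sketch(data, path, value):
    # A missing path never equals anything.
    return _resolve(data, path) == value


def any_match_sketch(docs, path, value):
    return any(equals_sketch(doc, path, value) for doc in docs)


def all_match_sketch(docs, path, value):
    return all(equals_sketch(doc, path, value) for doc in docs)


doc = {"user": {"email": None}}
print(exists_sketch(doc, "user.email"))  # True: the key exists, value is None
print(exists_sketch(doc, "user.phone"))  # False

users = [{"role": "admin", "active": True}, {"role": "guest", "active": True}]
print(any_match_sketch(users, "role", "admin"))  # True
print(all_match_sketch(users, "role", "admin"))  # False
```

Using a sentinel rather than a None default is what lets the existence check answer a different question from extraction: a path that holds None still exists.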

Shape Pillar: Transformation & Mutation

Tool | Purpose | Example
dotmod | Surgical modifications | set_(data, "user.status", "inactive")
dotbatch | Atomic transactions | Apply multiple changes safely
dotpipe | Data transformation pipelines | Reshape documents into new forms
dotpluck | Extract multiple values | Project selected paths into a new structure

Philosophy: Immutable by default. dotmod for precise edits, dotbatch for transactional safety, dotpipe for creating new data shapes, dotpluck for projection.
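
One simple way to get "immutable by default" behaviour is to copy first and edit the copy. The sketch below takes that approach with copy.deepcopy; set_sketch and batch_sketch are hypothetical names, and the real dotmod and dotbatch may use a different strategy (for example, structural sharing instead of a full copy).

```python
import copy


def set_sketch(data, path, value):
    """Return a modified deep copy of data; the input is left untouched.

    Assumes every node along the path is a dict or a list. Missing dict
    keys are created; an out-of-range list index raises IndexError.
    """
    result = copy.deepcopy(data)
    node = result
    segments = path.split(".")
    for segment in segments[:-1]:
        if isinstance(node, list):
            node = node[int(segment)]
        else:
            node = node.setdefault(segment, {})
    last = segments[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return result


def batch_sketch(data, changes):
    """Apply (path, value) pairs in order; all succeed or the caller keeps data."""
    result = data
    for path, value in changes:
        result = set_sketch(result, path, value)
    return result


original = {"users": [{"name": "Alice"}]}
updated = set_sketch(original, "users.0.status", "active")
print(updated)   # {'users': [{'name': 'Alice', 'status': 'active'}]}
print(original)  # {'users': [{'name': 'Alice'}]}
```

Because every step works on a fresh copy, a failure partway through batch_sketch raises before anything is returned, so the caller never sees a half-applied result.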

Collections Pillar: Boolean & Relational Algebra

Tool | Purpose | Domain
dotfilter | Boolean algebra on document collections | Filter, intersect, union with lazy evaluation
dotrelate | Relational operations | Join, project, union collections like database tables

Philosophy: Lift single-document operations to collections. dotfilter provides set operations with boolean logic, while dotrelate enables database-style joins and projections.
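
To make the two ideas concrete, here is a minimal sketch of a lazy filter and a hash-based inner join over lists of dictionaries. The names filter_sketch and join_sketch, the inner-join semantics, and the rule that left-hand fields win on a name clash are all assumptions for illustration, not the dotfilter or dotrelate API.

```python
def filter_sketch(docs, predicate):
    """Lazily yield the documents that satisfy a single-document predicate."""
    return (doc for doc in docs if predicate(doc))


def join_sketch(left, right, left_on, right_on):
    """Lazily yield merged rows for every left/right pair with equal keys."""
    # Index the right side once so each left row is a dictionary lookup.
    index = {}
    for row in right:
        if right_on in row:
            index.setdefault(row[right_on], []).append(row)
    for row in left:
        if left_on not in row:
            continue  # rows without the join key cannot match
        for match in index.get(row[left_on], []):
            yield {**match, **row}  # left-hand fields win on clashes


users = [{"user_id": 1, "name": "Alice"}, {"user_id": 2, "name": "Bob"}]
orders = [{"user_id": 1, "item": "book"}, {"user_id": 1, "item": "pen"}]

admins = filter_sketch(users, lambda u: u["name"] == "Alice")
print(list(admins))  # [{'user_id': 1, 'name': 'Alice'}]

for row in join_sketch(users, orders, "user_id", "user_id"):
    print(row["name"], row["item"])  # Alice book, then Alice pen
```

Both functions return iterators, so no output is produced until the result is consumed.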

Design Principles

  • Compositionality: Every tool composes cleanly with others
  • Immutability: Original data is never modified
  • Pedagogical: Simple tools graduate to powerful ones
  • Single Purpose: Each tool has one clear responsibility
  • Interoperability: Common patterns work across all tools
  • Performance: Lazy evaluation and efficient algorithms
  • Safety: Graceful handling of missing paths and malformed data

Common Patterns

The "Steal This Code" Philosophy

Many tools are intentionally simple enough that you can copy their core logic rather than add a dependency:

# The essence of dotget
def get(data, path, default=None):
    try:
        for segment in path.split('.'):
            data = data[int(segment)] if segment.isdigit() and isinstance(data, list) else data[segment]
        return data
    except (KeyError, IndexError, TypeError):
        return default

Command-Line First

Every tool works from the command line, making them perfect for shell scripts and data pipelines:

# Check if any user is an admin
cat users.json | dotany "users.*.role" --equals admin && echo "Admin found"

# Extract all email addresses
cat contacts.json | dotstar "contacts.*.email" > emails.txt

# Join users with their orders
dotrelate join --left-on user_id --right-on user_id users.jsonl orders.jsonl

Dual APIs: Programmatic and Declarative

dotquery offers both a fluent Python builder and a declarative DSL:

from truth.dotquery.core import Q, Query

# Programmatic (fluent factory with operator overloading)
query = Q("role").equals("admin") & Q("login_count").greater(10)

# Declarative (DSL string)
query = Query("equals role admin and greater login_count 10")

# Both produce equivalent ASTs
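
The fluent style above relies on operator overloading to build a tree of conditions instead of evaluating them immediately. A minimal sketch of that technique follows; the class names (Expr, Field, Equals, Greater, And, Or, Not) are hypothetical and are not the dotquery internals, and the lookup only handles dictionary keys.

```python
from dataclasses import dataclass
from typing import Any


def _lookup(doc, path):
    """Follow a dot path through nested dicts; None if anything is missing."""
    for segment in path.split("."):
        if not isinstance(doc, dict) or segment not in doc:
            return None
        doc = doc[segment]
    return doc


class Expr:
    """Base class: &, | and ~ build new AST nodes rather than booleans."""
    def __and__(self, other): return And(self, other)
    def __or__(self, other): return Or(self, other)
    def __invert__(self): return Not(self)


@dataclass(frozen=True)
class Equals(Expr):
    path: str
    value: Any
    def check(self, doc): return _lookup(doc, self.path) == self.value


@dataclass(frozen=True)
class Greater(Expr):
    path: str
    value: Any
    def check(self, doc):
        found = _lookup(doc, self.path)
        return found is not None and found > self.value


@dataclass(frozen=True)
class And(Expr):
    left: Expr
    right: Expr
    def check(self, doc): return self.left.check(doc) and self.right.check(doc)


@dataclass(frozen=True)
class Or(Expr):
    left: Expr
    right: Expr
    def check(self, doc): return self.left.check(doc) or self.right.check(doc)


@dataclass(frozen=True)
class Not(Expr):
    inner: Expr
    def check(self, doc): return not self.inner.check(doc)


class Field:
    """Fluent entry point: Field('role').equals('admin') returns an AST node."""
    def __init__(self, path): self.path = path
    def equals(self, value): return Equals(self.path, value)
    def greater(self, value): return Greater(self.path, value)


query = Field("role").equals("admin") & Field("login_count").greater(10)
print(query == And(Equals("role", "admin"), Greater("login_count", 10)))  # True
print(query.check({"role": "admin", "login_count": 12}))  # True
print((~query).check({"role": "admin", "login_count": 12}))  # False
```

Because the nodes are plain frozen dataclasses, two queries built by different routes compare equal whenever their trees match, which is one way a fluent builder and a parsed DSL string could be shown to agree.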

From Simple to Sophisticated

The ecosystem is designed as a learning journey:

  1. Hello World: dotget, dotexists: O(1) mental load
  2. Patterns: Add dotstar, dotmod, dotequals: wildcards and basic changes
  3. Quantifiers: dotany, dotall, dotpluck, dotbatch: lift to collections, batch edits
  4. Power User: dotselect, dotquery, dotpipe: complex selection and reshaping
  5. Expert: dotpath, dotfilter, dotrelate: extensible engine and relational algebra

Each stage builds on the previous, with no tool becoming obsolete. A dotget call is still the right choice when you know the exact path.

Mathematical Foundation

The ecosystem is built on solid mathematical principles:

  • Addressing forms a free algebra on selectors (Turing-complete via user-defined reducers)
  • Logic implements Boolean algebra with homomorphic lifting to set operations
  • Transformations are endofunctors on document spaces with monoid composition
  • Collections lift via functorial map/filter operations preserving algebraic structure

This ensures predictable composition, parallelizability, and mathematical correctness.
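
Two of these claims can be checked on a small example. The sketch below is an illustration under assumptions, not code from the ecosystem: select and compose are hypothetical helpers. It shows that combining predicates with and/or/not corresponds to intersecting, uniting, and complementing the sets of matching documents, and that transformations compose associatively with an identity element.

```python
from functools import reduce


def select(docs, predicate):
    """Indices of the documents that satisfy the predicate."""
    return {i for i, doc in enumerate(docs) if predicate(doc)}


def compose(*steps):
    """Left-to-right composition of transformations; compose() is the identity."""
    return lambda doc: reduce(lambda acc, step: step(acc), steps, doc)


docs = [
    {"role": "admin", "active": True},
    {"role": "admin", "active": False},
    {"role": "guest", "active": True},
]
is_admin = lambda d: d.get("role") == "admin"
is_active = lambda d: d.get("active") is True
everything = set(range(len(docs)))

# Boolean operations on predicates map to set operations on results.
assert select(docs, lambda d: is_admin(d) and is_active(d)) == select(docs, is_admin) & select(docs, is_active)
assert select(docs, lambda d: is_admin(d) or is_active(d)) == select(docs, is_admin) | select(docs, is_active)
assert select(docs, lambda d: not is_admin(d)) == everything - select(docs, is_admin)

# Transformations form a monoid under composition.
add_flag = lambda d: {**d, "flag": True}
drop_role = lambda d: {k: v for k, v in d.items() if k != "role"}
rename = lambda d: {("enabled" if k == "active" else k): v for k, v in d.items()}

left = compose(compose(add_flag, drop_role), rename)
right = compose(add_flag, compose(drop_role, rename))
assert left(docs[0]) == right(docs[0])  # associativity
assert compose()(docs[0]) == docs[0]    # identity element
print(left(docs[0]))  # {'enabled': True, 'flag': True}
```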

Individual Tool Documentation

Each tool has comprehensive documentation under docs/tools/.

Production-Ready Alternative

While dotsuite focuses on pedagogy and simplicity, for production use cases requiring advanced features like streaming, complex path operations, and S-expression queries, consider JAF (Just Another Flow). JAF implements similar concepts to dotfilter and dotpipe in a feature-complete, production-ready package with:

  • Lazy streaming evaluation for large datasets
  • Advanced path system with regex, fuzzy matching, and wildcards
  • S-expression query language
  • Index-preserving result sets for powerful set operations
  • Support for multiple data sources (files, directories, stdin, compressed)

Think of dotsuite as the "learn by building" approach and JAF as the "battle-tested solution". Both are valuable, for different purposes.

Contributing

The dot ecosystem welcomes contributions. Each tool lives in its own directory with its own tests and documentation. See CONTRIBUTING.md for guidelines.

License

MIT License. Use freely, modify as needed, and contribute back when you can.


The dot ecosystem: from simple paths to sophisticated data algebras, one tool at a time.
