jn: a plugin-based, universal data streaming system built on the JSONL/NDJSON format.

JN (Junction) - Agent-Native ETL with JSON Pipelines

Version: 4.0.0-alpha1
Status: Active Development

A lightweight ETL framework where JSON Lines is the universal data format. Built for agents and humans who need to move data between formats, APIs, databases, and commands without writing bespoke scripts.

Philosophy

Three core principles:

  1. JSON Lines Everywhere - Universal data interchange format on the CLI
  2. Discoverable Without Execution - Tools are files on disk with parseable headers
  3. Automatic Pipeline Construction - Framework wires together sources → filters → targets
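
As a rough illustration of the first principle, the sketch below reads and writes JSON Lines using only the Python standard library. The helper names `read_ndjson` and `write_ndjson` are hypothetical and are not part of jn's API; they only show why a one-record-per-line format streams well.

```python
import io
import json
from typing import IO, Iterable, Iterator


def read_ndjson(stream: IO[str]) -> Iterator[dict]:
    """Yield one record per non-blank line without loading the whole stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_ndjson(records: Iterable[dict], stream: IO[str]) -> None:
    """Write each record as a single compact JSON line."""
    for record in records:
        stream.write(json.dumps(record, separators=(",", ":")) + "\n")


# Usage: keep only records with age > 25, one line in, one line out.
source = io.StringIO('{"name":"Alice","age":30}\n{"name":"Bob","age":25}\n')
sink = io.StringIO()
write_ndjson((r for r in read_ndjson(source) if r["age"] > 25), sink)
```

Because every stage handles one line at a time, memory use stays flat regardless of input size.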

Inspired by Kelly Brazil's jc philosophy: make every command line tool speak JSON.

Quick Start

Installation

# Clone the repository
git clone https://github.com/botassembly/jn.git
cd jn

# Install in development mode
pip install -e .

# Verify installation
jn --version

Your First Pipeline

# Convert CSV to JSON
jn cat data.csv | jn put output.json

# Preview first 5 records
jn cat data.csv --limit 5

# Transform and filter
echo '{"name":"Alice","age":30}' | jn cat - | jq '.age' | jn put -

⚠️ Golden Path: Use JN Commands, Not Direct Plugin Calls

✅ CORRECT - Use jn commands:

jn cat data.csv | jn put output.json         # Auto-detects format
jn cat data.csv | jn filter '.age > 25'      # Streams efficiently
jn cat 'data.txt~csv?delimiter=tab'          # Format override with params (quoted: ? is a shell glob character)
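
The format-override address in the last example can be read as `path~format?key=value`. That grammar is inferred from this single example, not from a specification, so the following is only a sketch of how such an address might be split. `SourceSpec` and `parse_source` are hypothetical names.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qsl


class SourceSpec(NamedTuple):
    path: str
    format: Optional[str]
    params: dict


def parse_source(spec: str) -> SourceSpec:
    """Split 'path~format?key=value' into its parts (assumed grammar)."""
    path, sep, override = spec.rpartition("~")
    if not sep:
        # No '~' present: treat the whole string as a plain path or URL.
        return SourceSpec(spec, None, {})
    fmt, _, query = override.partition("?")
    return SourceSpec(path, fmt or None, dict(parse_qsl(query)))
```

Splitting on `~` first keeps a `?` inside an ordinary URL (for example a query string) from being mistaken for format parameters.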

❌ WRONG - Don't call plugins directly:

python csv_.py --mode read < data.csv        # Bypasses framework!
uv run csv_.py --mode read < data.csv        # Loses backpressure!

Why? Direct plugin calls bypass:

  • Automatic backpressure and memory efficiency
  • Early termination support (| head -n 10 stops upstream)
  • Parallel multi-stage execution
  • Better error messages and plugin discovery
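
Backpressure and early termination both come from ordinary OS pipes. The sketch below demonstrates the effect with Python's `subprocess` module: an endless producer is stopped as soon as the reader has what it needs. The producer script and the `take_first` helper are hypothetical stand-ins, not jn code.

```python
import itertools
import json
import subprocess
import sys

# Hypothetical upstream stage: emits NDJSON records forever.
PRODUCER_SOURCE = """
import itertools, json
for n in itertools.count():
    print(json.dumps({"n": n}))
"""


def take_first(limit: int) -> tuple[list[dict], int]:
    """Read `limit` records from an endless producer, then hang up on it."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER_SOURCE],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the producer's broken-pipe report
        text=True,
    )
    assert producer.stdout is not None
    records = [json.loads(line) for line in itertools.islice(producer.stdout, limit)]
    # Closing the read end is what `| head -n 10` does: the producer's next
    # write fails, so it exits instead of running forever.
    producer.stdout.close()
    exit_code = producer.wait(timeout=30)
    return records, exit_code
```

While the reader is busy, the producer blocks once the pipe buffer fills, which is the backpressure part. Once the reader closes the pipe, the producer exits (typically with a non-zero status), which is the early-termination part.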

Core Commands

Data Exploration

jn cat <source> - Read any source, output NDJSON

jn cat data.csv              # CSV file
jn cat config.yaml           # YAML file
jn cat https://api.com/data  # HTTP API
jn cat data.json --limit 10  # Preview first 10 records
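
To make the "read any source, output NDJSON" idea concrete, here is a minimal sketch of a CSV source with a record limit, using only the standard library. It is not jn's `csv_reader`; the function names and the choice to leave all values as strings are assumptions.

```python
import csv
import io
import itertools
import json
from typing import IO, Iterable, Iterator, Optional


def csv_to_records(stream: IO[str], delimiter: str = ",") -> Iterator[dict]:
    """Yield one dict per CSV row, keyed by the header row."""
    yield from csv.DictReader(stream, delimiter=delimiter)


def to_ndjson(records: Iterable[dict], limit: Optional[int] = None) -> Iterator[str]:
    """Serialize records lazily; `limit=None` means no limit."""
    for record in itertools.islice(records, limit):
        yield json.dumps(record)


# Usage: preview the first record of a small in-memory CSV.
data = io.StringIO("name,age\nAlice,30\nBob,25\n")
preview = list(to_ndjson(csv_to_records(data), limit=1))
```

Note that the `csv` module has no type information, so `age` arrives as the string `"30"` in this sketch.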

jn put <output> - Write NDJSON to any format

jn cat data.csv | jn put output.json    # CSV → JSON
jn cat api | jn put data.yaml            # API → YAML
echo '{"x":1}' | jn put -                # Format to stdout
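
The writer side can be pictured as "pick a format from the extension, then render the records". The sketch below assumes a tiny hard-coded extension table and assumes that `-` means JSON Lines on stdout; neither detail is confirmed by this README, and `detect_format` and `render` are hypothetical names.

```python
import csv
import io
import json
from pathlib import Path

# Hypothetical extension table, for this sketch only.
FORMATS = {".json": "json", ".csv": "csv", ".jsonl": "ndjson"}


def detect_format(target: str) -> str:
    """Choose an output format from the target's extension."""
    if target == "-":
        return "ndjson"  # assumption: stdout defaults to JSON Lines
    suffix = Path(target).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"no writer registered for {suffix!r}")
    return FORMATS[suffix]


def render(records: list[dict], fmt: str) -> str:
    """Render records as a JSON array, CSV text, or JSON Lines."""
    if fmt == "json":
        return json.dumps(records, indent=2) + "\n"
    if fmt == "csv":
        buffer = io.StringIO()
        # Union of keys in first-seen order becomes the header row.
        fieldnames = list(dict.fromkeys(key for r in records for key in r))
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    if fmt == "ndjson":
        return "".join(json.dumps(r) + "\n" for r in records)
    raise ValueError(f"unknown format {fmt!r}")
```

A JSON array or a CSV header needs the full record set (or at least all keys) up front, which is one reason NDJSON is the format used between stages.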

jn run <input> [filters...] <output> - Automatic pipeline

jn run data.csv output.json              # Simple conversion
jn run data.csv '.name' summary.xml      # With jq filter
jn run 'ls -la' output.csv               # Command output
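
One way to think about `jn run` is as a planner: the first argument is the source, the last is the target, and anything in between is a filter. The sketch below encodes that reading and prints an equivalent explicit pipeline. The `Stage` type, the function names, and the assumed equivalence to `jn cat | jn filter | jn put` are illustrative, not taken from jn's source.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    role: str  # "source", "filter", or "target"
    spec: str


def plan_pipeline(args: list[str]) -> list[Stage]:
    """Classify positional arguments as source, filters, and target."""
    if len(args) < 2:
        raise ValueError("need at least an input and an output")
    first, *filters, last = args
    return [
        Stage("source", first),
        *(Stage("filter", expr) for expr in filters),
        Stage("target", last),
    ]


# Assumed mapping from stage role to the explicit jn command.
COMMANDS = {"source": "jn cat", "filter": "jn filter", "target": "jn put"}


def as_shell(stages: list[Stage]) -> str:
    """Render the plan as an explicit shell pipeline."""
    return " | ".join(f"{COMMANDS[s.role]} {shlex.quote(s.spec)}" for s in stages)
```

For example, `as_shell(plan_pipeline(["users.csv", "select(.age > 18)", "adults.json"]))` yields a three-stage pipeline with the filter expression safely quoted.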

Plugin Discovery

jn discover - List all available plugins

jn discover                   # All plugins
jn discover --type source     # Only sources
jn discover --category readers  # Only readers

jn show <plugin> - Plugin details

jn show csv_reader           # Show plugin info
jn show csv_reader --examples # Show usage examples
jn show csv_reader --test    # Run plugin tests

jn which <extension> - Find plugin for extension

jn which .csv                # → csv_reader
jn which .yaml               # → yaml_reader

Plugin Development

jn create <type> <name> - Scaffold new plugin

jn create source my_reader --handles .txt
jn create filter my_transform
jn create target my_writer --handles .out

jn test <plugin> - Run plugin tests

jn test csv_reader           # Run built-in tests
jn test my_plugin --verbose  # Detailed output

jn validate <file> - Check plugin structure

jn validate plugins/readers/my_reader.py
jn validate my_plugin.py --strict
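
What `jn validate` checks is not spelled out here. As a minimal sketch of a structural check that never executes the plugin, the code below parses the source with Python's `ast` module and verifies that a top-level `run` function exists. The rule set and the `validate_plugin_source` name are assumptions.

```python
import ast

# Assumed minimum: a plugin must define run(); examples() and test() are optional.
REQUIRED_FUNCTIONS = {"run"}


def validate_plugin_source(source: str) -> list[str]:
    """Return a list of problems; an empty list means the structure looks fine."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return [
        f"missing required function: {name}()"
        for name in sorted(REQUIRED_FUNCTIONS - defined)
    ]
```

Parsing instead of importing means a plugin with missing third-party dependencies can still be validated.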

Supported Formats

Format   Extension      Reader   Writer
CSV      .csv
TSV      .tsv
JSON     .json
NDJSON   .jsonl
YAML     .yaml, .yml
XML      .xml
TOML     .toml

Plugin Ecosystem

19 Built-in Plugins:

Readers (8):

  • csv_reader - CSV/TSV files
  • json_reader - JSON/NDJSON files
  • yaml_reader - YAML files
  • xml_reader - XML files
  • toml_reader - TOML config files
  • http_get - HTTP APIs (GET)
  • ls - Directory listings
  • Plus shell command parsers: ps, find, env, df, ping, netstat, dig

Writers (6):

  • csv_writer - CSV/TSV output
  • json_writer - JSON array output
  • yaml_writer - YAML output
  • xml_writer - XML output

Filters (1):

  • jq_filter - jq expression evaluation

Examples

Data Conversion

# CSV to multiple formats
jn cat sales.csv | jn put sales.json
jn cat sales.csv | jn put sales.yaml
jn cat sales.csv | jn put sales.xml

# API to database-ready CSV
jn cat https://api.com/users | jq '.items[]' | jn put users.csv

# Config file normalization
jn cat config.yaml | jn put config.json

System Monitoring

# Process list to JSON
jn cat 'ps aux' | jn put processes.json

# Disk usage analysis
jn cat 'df -h' | jq 'select(.use_percent > 80)' | jn put full_disks.csv

# Network connections
jn cat 'netstat -an' | jn put connections.json
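
The shell command parsers turn column-oriented text into records. Below is a minimal sketch of that idea for `df -h` output, run against a hard-coded sample so it does not depend on the host. Only `use_percent` appears in the example above; the other field names and the `parse_df` helper are assumptions.

```python
from typing import Iterator

SAMPLE_DF_OUTPUT = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   45G  5.0G  90% /
tmpfs           7.8G     0  7.8G   0% /dev/shm
"""


def parse_df(output: str) -> Iterator[dict]:
    """Yield one record per filesystem line, skipping the header."""
    for line in output.strip().splitlines()[1:]:
        # Split into at most 6 fields so mount points with spaces stay intact.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue
        filesystem, size, used, available, percent, mounted_on = fields
        digits = percent.rstrip("%")
        yield {
            "filesystem": filesystem,
            "size": size,
            "used": used,
            "available": available,
            "use_percent": int(digits) if digits.isdigit() else None,
            "mounted_on": mounted_on,
        }


# Equivalent of select(.use_percent > 80) on the sample.
full_disks = [
    r["mounted_on"]
    for r in parse_df(SAMPLE_DF_OUTPUT)
    if r["use_percent"] is not None and r["use_percent"] > 80
]
```

Converting `Use%` to an integer is what makes a numeric comparison such as `.use_percent > 80` meaningful downstream.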

Data Pipelines

# Filter and transform
jn run users.csv 'select(.age > 18)' adults.json

# Multi-step processing
jn cat data.csv | \
  jq 'select(.amount > 100)' | \
  jq '{customer, total: .amount}' | \
  jn put high_value.xml

# Combine sources
cat <(jn cat file1.csv) <(jn cat file2.yaml) | jn put combined.json
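
For readers less familiar with jq, the two filter steps in the multi-step example correspond roughly to the Python generator below. It assumes `amount` has already been parsed as a number; `high_value` is a hypothetical name.

```python
from typing import Iterable, Iterator


def high_value(records: Iterable[dict], threshold: float = 100) -> Iterator[dict]:
    """Mimic select(.amount > 100) followed by {customer, total: .amount}."""
    for record in records:
        amount = record.get("amount")
        # Skip records with a missing or non-numeric amount.
        if isinstance(amount, (int, float)) and amount > threshold:
            yield {"customer": record.get("customer"), "total": amount}


orders = [
    {"customer": "acme", "amount": 150, "region": "eu"},
    {"customer": "globex", "amount": 50, "region": "us"},
]
result = list(high_value(orders))
```

Like the jq version, the generator keeps only matching records and drops every field except the two that are projected.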

Creating Custom Plugins

1. Scaffold Plugin

jn create source my_api --description "Custom API reader"
# Created: plugins/readers/my_api.py

2. Implement Logic

Edit plugins/readers/my_api.py:

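from typing import Iterator, Optional


def run(config: Optional[dict] = None) -> Iterator[dict]:
    """Fetch data from custom API."""
    import requests  # third-party; imported only when the plugin runs

    # A timeout and status check keep a stalled or failing API from
    # hanging the pipeline or passing an error payload on as data.
    response = requests.get('https://my-api.com/data', timeout=30)
    response.raise_for_status()
    data = response.json()

    # Yield one record per item so downstream stages can stream them.
    for item in data['items']:
        yield item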

3. Add Tests

def examples() -> list[dict]:
    return [
        {
            "description": "Fetch user data",
            "input": "",
            "expected": [
                {"id": 1, "name": "Alice"},
                {"id": 2, "name": "Bob"}
            ]
        }
    ]

4. Test & Validate

jn test my_api
jn validate plugins/readers/my_api.py

5. Use in Pipelines

jn cat my_api | jn put users.csv

Architecture

Function-Based Plugins - No classes, just functions:

  • run(config) - Main processing logic
  • examples() - Test cases (optional)
  • test() - Built-in tests (optional)

Regex-Based Discovery - No Python imports needed:

  • Plugins discovered by scanning filesystem
  • Metadata parsed from # META: headers
  • Fast discovery (~10ms for 19 plugins)
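
The exact `# META:` syntax is not shown in this README. Assuming one `# META: key=value` pair per line, discovery without imports could look like the sketch below: files are read as text and matched with a regular expression, so no plugin code runs. `parse_meta` and `discover` are hypothetical names.

```python
import re
from pathlib import Path

# Assumed header shape: "# META: key=value", one pair per line.
META_LINE = re.compile(
    r"^#[ \t]*META:[ \t]*(?P<key>\w+)[ \t]*=[ \t]*(?P<value>.+?)[ \t]*$",
    re.MULTILINE,
)


def parse_meta(source: str) -> dict[str, str]:
    """Extract META key/value pairs from plugin source text."""
    return {m["key"]: m["value"] for m in META_LINE.finditer(source)}


def discover(plugin_dir: Path) -> dict[str, dict[str, str]]:
    """Map plugin name to metadata by scanning .py files, never importing them."""
    return {
        path.stem: parse_meta(path.read_text(encoding="utf-8"))
        for path in sorted(plugin_dir.rglob("*.py"))
    }
```

Since nothing is imported, a plugin whose dependencies are not installed can still be listed.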

Subprocess Isolation - Each plugin runs independently:

  • PEP 723 inline dependencies
  • UV manages per-plugin environments
  • No dependency conflicts

Unix Pipes - Standard composition:

plugin1 < input.txt | plugin2 | plugin3 > output.json

Development Status

Current (v4.0.0-alpha1):

  • ✅ Core pipeline framework
  • ✅ 19 working plugins
  • ✅ Full CLI (10 commands)
  • ✅ Plugin creation tools
  • ✅ 105 tests passing (78% coverage)

Coming Soon:

  • Excel reader/writer
  • Database plugins (PostgreSQL, MySQL, SQLite)
  • S3 integration
  • API authentication
  • Advanced filters (aggregations, group-by)

Contributing

See docs/plugins.md for plugin authoring guide.

License

MIT License - see LICENSE file for details.

Acknowledgments

  • Kelly Brazil - jc philosophy and JSON CLI tooling
  • Anthropic - MCP inspiration for agent-native design
  • UV - Modern Python packaging
