jn: a plugin-based, universal data streaming system built on the JSON Lines (JSONL/NDJSON) format.

JN (Junction) - Agent-Native ETL with JSON Pipelines

Version: 4.0.0-alpha1
Status: Active Development

A lightweight ETL framework where JSON Lines is the universal data format. Built for agents and humans who need to move data between formats, APIs, databases, and commands without writing bespoke scripts.

Philosophy

Three core principles:

  1. JSON Lines Everywhere - Universal data interchange format on the CLI
  2. Discoverable Without Execution - Tools are files on disk with parseable headers
  3. Automatic Pipeline Construction - Framework wires together sources → filters → targets
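
To make the first principle concrete, here is a minimal sketch of what "JSON Lines everywhere" means for a single stage: one JSON object per line in, one per line out, processed lazily. The helper names read_ndjson and write_ndjson are illustrative assumptions, not part of jn.

```python
import io
import json
from typing import IO, Iterable, Iterator


def read_ndjson(stream: IO[str]) -> Iterator[dict]:
    """Yield one record per non-blank line, without loading the whole input."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_ndjson(records: Iterable[dict], stream: IO[str]) -> None:
    """Write each record as compact JSON on its own line."""
    for record in records:
        stream.write(json.dumps(record, separators=(",", ":")) + "\n")


# Usage: keep only adults, streaming from one text stream to another.
source = io.StringIO('{"name":"Alice","age":30}\n{"name":"Bob","age":17}\n')
sink = io.StringIO()
write_ndjson((r for r in read_ndjson(source) if r["age"] > 18), sink)
print(sink.getvalue(), end="")
```

Because both helpers work on any text stream, the same functions could be pointed at sys.stdin and sys.stdout inside a pipeline stage.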

Inspired by Kelly Brazil's jc philosophy: make every command line tool speak JSON.

Quick Start

Installation

# Clone the repository
git clone https://github.com/yourusername/jn.git
cd jn

# Install in development mode
pip install -e .

# Verify installation
jn --version

Your First Pipeline

# Convert CSV to JSON
jn cat data.csv | jn put output.json

# Preview first 5 records
jn cat data.csv --limit 5

# Transform and filter
echo '{"name":"Alice","age":30}' | jn cat - | jq '.age' | jn put -

⚠️ Golden Path: Use JN Commands, Not Direct Plugin Calls

✅ CORRECT - Use jn commands:

jn cat data.csv | jn put output.json         # Auto-detects format
jn cat data.csv | jn filter '.age > 25'      # Streams efficiently
jn cat data.txt~csv?delimiter=tab            # Format override with params

❌ WRONG - Don't call plugins directly:

python csv_.py --mode read < data.csv        # Bypasses framework!
uv run csv_.py --mode read < data.csv        # Loses backpressure!

Why? Calling a plugin directly bypasses what the framework provides:

  • Automatic backpressure and memory efficiency
  • Early termination support (| head -n 10 stops upstream)
  • Parallel multi-stage execution
  • Better error messages and plugin discovery
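
The backpressure and early-termination behavior comes from ordinary OS pipes, which the following sketch illustrates. It is not jn's implementation: the endless producer is a stand-in for an upstream stage, and the take helper is a hypothetical name. The producer blocks whenever the pipe buffer is full, and exits once the reader closes its end.

```python
import subprocess
import sys

# Stand-in upstream stage: an endless NDJSON stream.
PRODUCER = (
    "import itertools, json, sys\n"
    "for i in itertools.count():\n"
    "    sys.stdout.write(json.dumps({'n': i}) + '\\n')\n"
)


def take(n: int) -> list[str]:
    """Read n lines from the endless producer, then stop it by closing the pipe."""
    proc = subprocess.Popen(
        [sys.executable, "-c", PRODUCER],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the producer's broken-pipe traceback
        text=True,
    )
    lines = [proc.stdout.readline().rstrip("\n") for _ in range(n)]
    proc.stdout.close()    # downstream is done; the producer's next write fails
    proc.wait(timeout=10)  # the producer exits instead of running forever
    return lines


if __name__ == "__main__":
    print(take(3))
```

The producer never generates more than a pipe buffer's worth of records ahead of the reader, which is why memory use stays flat regardless of input size.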

Core Commands

Data Exploration

jn cat <source> - Read any source, output NDJSON

jn cat data.csv              # CSV file
jn cat config.yaml           # YAML file
jn cat https://api.com/data  # HTTP API
jn cat data.json --limit 10  # Preview first 10 records

jn put <output> - Write NDJSON to any format

jn cat data.csv | jn put output.json    # CSV → JSON
jn cat api | jn put data.yaml            # API → YAML
echo '{"x":1}' | jn put -                # Format to stdout

jn run <input> [filters...] <output> - Automatic pipeline

jn run data.csv output.json              # Simple conversion
jn run data.csv '.name' summary.xml      # With jq filter
jn run 'ls -la' output.csv               # Command output
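
One way to read the jn run argument layout is as shorthand for an explicit cat | filter | put pipeline. The sketch below expands the arguments on that assumption; expand_run is a hypothetical helper, and how jn actually wires the stages is not documented here.

```python
import shlex


def expand_run(args: list[str]) -> str:
    """Expand `jn run` style arguments into an equivalent explicit pipeline."""
    if len(args) < 2:
        raise ValueError("need at least an input and an output")
    # First argument is the source, last is the target, the rest are filters.
    source, *filters, target = args
    stages = [["jn", "cat", source]]
    stages += [["jn", "filter", expr] for expr in filters]
    stages.append(["jn", "put", target])
    # shlex.join quotes each argument so the result is safe to paste into a shell.
    return " | ".join(shlex.join(stage) for stage in stages)


print(expand_run(["data.csv", ".name", "summary.xml"]))
```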

Plugin Discovery

jn discover - List all available plugins

jn discover                   # All plugins
jn discover --type source     # Only sources
jn discover --category readers # Only readers

jn show <plugin> - Plugin details

jn show csv_reader           # Show plugin info
jn show csv_reader --examples # Show usage examples
jn show csv_reader --test    # Run plugin tests

jn which <extension> - Find plugin for extension

jn which .csv                # → csv_reader
jn which .yaml               # → yaml_reader
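
Conceptually, this lookup is a mapping from file extension to plugin name. A minimal sketch follows, assuming a hypothetical in-memory registry; the plugin names come from the built-in reader list, and the .tsv, .jsonl, and .yml entries are assumptions based on the formats each reader is described as handling.

```python
from pathlib import Path
from typing import Optional

# Hypothetical registry; jn builds its own from plugin metadata on disk.
READERS = {
    ".csv": "csv_reader",
    ".tsv": "csv_reader",
    ".json": "json_reader",
    ".jsonl": "json_reader",
    ".yaml": "yaml_reader",
    ".yml": "yaml_reader",
    ".xml": "xml_reader",
    ".toml": "toml_reader",
}


def which(name: str) -> Optional[str]:
    """Return the reader registered for an extension or file name, if any."""
    ext = name if name.startswith(".") else Path(name).suffix
    return READERS.get(ext.lower())


print(which(".csv"), which("config.yaml"), which("notes.txt"))
```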

Plugin Development

jn create <type> <name> - Scaffold new plugin

jn create source my_reader --handles .txt
jn create filter my_transform
jn create target my_writer --handles .out

jn test <plugin> - Run plugin tests

jn test csv_reader           # Run built-in tests
jn test my_plugin --verbose  # Detailed output

jn validate <file> - Check plugin structure

jn validate plugins/readers/my_reader.py
jn validate my_plugin.py --strict

Supported Formats

Format   Extension      Reader  Writer
CSV      .csv
TSV      .tsv
JSON     .json
NDJSON   .jsonl
YAML     .yaml, .yml
XML      .xml
TOML     .toml

Plugin Ecosystem

19 Built-in Plugins:

Readers (8):

  • csv_reader - CSV/TSV files
  • json_reader - JSON/NDJSON files
  • yaml_reader - YAML files
  • xml_reader - XML files
  • toml_reader - TOML config files
  • http_get - HTTP APIs (GET)
  • ls - Directory listings
  • Plus shell command parsers (ps, find, env, df, ping, netstat, dig)

Writers (6):

  • csv_writer - CSV/TSV output
  • json_writer - JSON array output
  • yaml_writer - YAML output
  • xml_writer - XML output

Filters (1):

  • jq_filter - jq expression evaluation

Examples

Data Conversion

# CSV to multiple formats
jn cat sales.csv | jn put sales.json
jn cat sales.csv | jn put sales.yaml
jn cat sales.csv | jn put sales.xml

# API to database-ready CSV
jn cat https://api.com/users | jq '.items[]' | jn put users.csv

# Config file normalization
jn cat config.yaml | jn put config.json

System Monitoring

# Process list to JSON
jn cat 'ps aux' | jn put processes.json

# Disk usage analysis
jn cat 'df -h' | jq 'select(.use_percent > 80)' | jn put full_disks.csv

# Network connections
jn cat 'netstat -an' | jn put connections.json

Data Pipelines

# Filter and transform
jn run users.csv 'select(.age > 18)' adults.json

# Multi-step processing
jn cat data.csv | \
  jq 'select(.amount > 100)' | \
  jq '{customer, total: .amount}' | \
  jn put high_value.xml

# Combine sources
cat <(jn cat file1.csv) <(jn cat file2.yaml) | jn put combined.json

Creating Custom Plugins

1. Scaffold Plugin

jn create source my_api --description "Custom API reader"
# Created: plugins/readers/my_api.py

2. Implement Logic

Edit plugins/readers/my_api.py:

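from typing import Iterator, Optional


def run(config: Optional[dict] = None) -> Iterator[dict]:
    """Fetch data from custom API."""
    import requests

    # Placeholder endpoint; replace with the real API URL.
    response = requests.get('https://my-api.com/data', timeout=30)
    response.raise_for_status()  # Fail loudly on HTTP errors
    data = response.json()

    # Emit one record per item so downstream stages can stream them.
    for item in data['items']:
        yield item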

3. Add Tests

def examples() -> list[dict]:
    return [
        {
            "description": "Fetch user data",
            "input": "",
            "expected": [
                {"id": 1, "name": "Alice"},
                {"id": 2, "name": "Bob"}
            ]
        }
    ]
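
To show how example cases like these can drive a test run, here is a hedged sketch of a checker that compares a plugin's output with each case's "expected" list. The check_examples function and the invoke callable are assumptions for illustration; how jn test feeds "input" to a plugin is not specified here.

```python
from typing import Callable, Iterable, Iterator


def check_examples(
    invoke: Callable[[str], Iterable[dict]],
    examples: list[dict],
) -> Iterator[tuple[str, bool]]:
    """Yield (description, passed) for each example case."""
    for case in examples:
        actual = list(invoke(case["input"]))
        yield case["description"], actual == case["expected"]


# Stand-in plugin for illustration: ignores its input and returns fixed records.
def fake_plugin(_input_text: str) -> Iterator[dict]:
    yield {"id": 1, "name": "Alice"}
    yield {"id": 2, "name": "Bob"}


cases = [
    {
        "description": "Fetch user data",
        "input": "",
        "expected": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    }
]
results = list(check_examples(fake_plugin, cases))
print(results)
```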

4. Test & Validate

jn test my_api
jn validate plugins/readers/my_api.py

5. Use in Pipelines

jn cat my_api | jn put users.csv

Architecture

Function-Based Plugins - No classes, just functions:

  • run(config) - Main processing logic
  • examples() - Test cases (optional)
  • test() - Built-in tests (optional)

Regex-Based Discovery - No Python imports needed:

  • Plugins discovered by scanning filesystem
  • Metadata parsed from # META: headers
  • Fast discovery (~10ms for 19 plugins)
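
A minimal sketch of discovery without imports: read the plugin file as text and pull metadata out with a regular expression. The "# META: key=value" line shape and the key names below are assumptions for illustration; the exact header format jn expects may differ.

```python
import re

# Assumed header shape: "# META: key=value", one pair per comment line.
META_LINE = re.compile(
    r"^#[ \t]*META:[ \t]*(?P<key>\w+)[ \t]*=[ \t]*(?P<value>.+?)[ \t]*$",
    re.MULTILINE,
)


def parse_meta(source: str) -> dict[str, str]:
    """Collect META key/value pairs from plugin source text without importing it."""
    return {m["key"]: m["value"] for m in META_LINE.finditer(source)}


PLUGIN_SOURCE = '''#!/usr/bin/env python3
# META: type=source
# META: handles=.csv,.tsv
def run(config=None):
    yield {}
'''
meta = parse_meta(PLUGIN_SOURCE)
print(meta)
```

Since nothing is imported or executed, a broken or slow plugin cannot affect discovery of the others.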

Subprocess Isolation - Each plugin runs independently:

  • PEP 723 inline dependencies
  • UV manages per-plugin environments
  • No dependency conflicts

Unix Pipes - Standard composition:

plugin1 < input.txt | plugin2 | plugin3 > output.json
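
The same composition can be reproduced from Python by connecting each process's stdout to the next one's stdin. This is a sketch of the general technique, not jn's pipeline code; the two inline stages are stand-ins for plugin scripts, and run_pipeline is a hypothetical helper.

```python
import subprocess
import sys

# Two stand-in stages; in practice each would be a plugin script.
UPPER = "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"
NUMBER = (
    "import sys\n"
    "for i, line in enumerate(sys.stdin, 1): sys.stdout.write(f'{i}:{line}')"
)


def run_pipeline(stages: list[str], data: str) -> str:
    """Connect stages stdout -> stdin, feed data to the first, return the last's output."""
    procs = []
    upstream = subprocess.PIPE
    for code in stages:
        proc = subprocess.Popen(
            [sys.executable, "-c", code],
            stdin=upstream,
            stdout=subprocess.PIPE,
            text=True,
        )
        if procs:
            # Parent drops its copy so only the child holds the read end.
            procs[-1].stdout.close()
        procs.append(proc)
        upstream = proc.stdout
    # Fine for small inputs; large inputs would need a writer thread to avoid deadlock.
    procs[0].stdin.write(data)
    procs[0].stdin.close()
    output = procs[-1].stdout.read()
    for proc in procs:
        proc.wait()
    return output


if __name__ == "__main__":
    print(run_pipeline([UPPER, NUMBER], "a\nb\n"), end="")
```

All stages run concurrently as separate processes, so each one starts working as soon as the first line arrives from its upstream neighbor.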

Development Status

Current (v4.0.0-alpha1):

  • ✅ Core pipeline framework
  • ✅ 19 working plugins
  • ✅ Full CLI (10 commands)
  • ✅ Plugin creation tools
  • ✅ 105 tests passing (78% coverage)

Coming Soon:

  • Excel reader/writer
  • Database plugins (PostgreSQL, MySQL, SQLite)
  • S3 integration
  • API authentication
  • Advanced filters (aggregations, group-by)

Contributing

See docs/plugins.md for plugin authoring guide.

License

MIT License - see LICENSE file for details.

Acknowledgments

  • Kelly Brazil - jc philosophy and JSON CLI tooling
  • Anthropic - MCP inspiration for agent-native design
  • UV - Modern Python packaging
