CLI tool for ad-hoc data joining

Project description

Cobble

A CLI tool for ad-hoc data joining, filtering, and analysis. Pipe in CSV, JSON, or plain text and query it with a shell-friendly syntax.

Install

pip install cobblequery

Quick Start

# Pipe CSV and filter
cat users.csv | cobble '| search role=admin | select name,email'

# Read a file directly (skip stdin)
cobble -s '| from data.csv | head 5'

# Pipe JSON output from another tool
kubectl get pods -o json | cobble '| select name,status'

CLI Usage

usage: cobble [-h] [--dry-run] [-i] [-s] [-q QUERY_FILE] [query]

positional arguments:
  query                 Query to run

optional arguments:
  -h, --help            show this help message and exit
  --dry-run             Parse and validate query without running
  -i, --interactive     Do an interactive edit of the query
  -s, --no-stdin        Dont insert the stdin generator
  -q QUERY_FILE, --query-file QUERY_FILE
                        Read query text from a file

Queries are pipe-delimited chains of commands:

| command1 args | command2 args | ...
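To make the chain structure concrete, here is a minimal Python sketch of how such a query could be split into stages. It is not Cobble's actual parser: the function name `split_stages` is hypothetical, and the rule that pipes inside double quotes, `[ ... ]`, or `{ ... }` belong to the enclosing command is an assumption based on the join and append examples in this README.

```python
def split_stages(query):
    """Split a query string on top-level '|' characters.

    Assumption: pipes inside double quotes, [...] or {...} stay attached
    to their command (sub-pipelines, quoted expressions).
    """
    stages, current = [], []
    depth = 0          # nesting level of [ ] and { }
    in_quotes = False  # inside a double-quoted string
    escaped = False    # previous character was a backslash
    for ch in query:
        if escaped:
            current.append(ch)
            escaped = False
            continue
        if ch == "\\":
            current.append(ch)
            escaped = True
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch in "[{":
                depth += 1
            elif ch in "]}":
                depth -= 1
            elif ch == "|" and depth == 0:
                # Top-level pipe: close the current stage.
                stages.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    stages.append("".join(current).strip())
    return [s for s in stages if s]


stages = split_stages(
    "| from users.csv | join dept_id [ | from departments.csv ] | select name,dept_name"
)
```

In this sketch the bracketed sub-pipeline stays inside the `join` stage, so `stages` holds three entries rather than four.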

Examples

Filtering rows

# Exact match
cat servers.csv | cobble '| search env=prod'

# Numeric comparison
cat orders.csv | cobble '| search total>100'

# Multiple conditions (AND)
cat orders.csv | cobble '| search status=shipped,total>=50'

# Regex match
cat logs.csv | cobble '| search path~=/api/v[23]'

# Not equal
cat users.csv | cobble '| search role!=guest'

# Python expression for complex logic
cat data.csv | cobble '| py "age > 21 and status == \"active\""'

Selecting and renaming fields

# Pick specific fields
cat users.csv | cobble '| select name,email,role'

# Rename a field
cat data.csv | cobble '| rename new_name=old_name'

Sorting

# Sort ascending
cat scores.csv | cobble '| sort name'

# Sort descending
cat scores.csv | cobble '| sort -score'

# Multi-field sort
cat employees.csv | cobble '| sort department,-salary'
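The multi-field form can be read as "sort by department ascending, then by salary descending within each department". A rough Python equivalent is sketched below. The helper `sort_rows` is hypothetical and only illustrates the ordering; it says nothing about how Cobble implements `sort`.

```python
def sort_rows(rows, spec):
    """Sort dict rows by a spec such as 'department,-salary'."""
    result = list(rows)
    # Apply keys from right to left. Python's sort is stable, so the
    # leftmost key ends up with the highest precedence.
    for key in reversed(spec.split(",")):
        descending = key.startswith("-")
        field = key[1:] if descending else key
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


employees = [
    {"name": "a", "department": "eng", "salary": 100},
    {"name": "b", "department": "ops", "salary": 90},
    {"name": "c", "department": "eng", "salary": 120},
]
ordered = sort_rows(employees, "department,-salary")
```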

Limiting results

# First 10 rows
cat huge.csv | cobble '| head 10'

# Rows 5 through 15
cat data.csv | cobble '| slice 5,15'

Aggregation

# Count and sum with auto-named output fields (field_operation)
cat sales.csv | cobble '| agg by=region sum(revenue) count(id)'
# Output: {"region": "west", "revenue_sum": 48000, "id_count": 12}

# Named output fields
cat sales.csv | cobble '| agg by=region, total=sum(revenue), n=count(id)'
# Output: {"region": "west", "total": 48000, "n": 12}

# No group-by (aggregate everything)
cat sales.csv | cobble '| agg total=sum(revenue)'

# Multiple group-by fields
cat data.csv | cobble '| agg by=year,quarter avg(revenue) min(cost) max(cost)'

Available aggregation functions: sum, count, avg, min, max, first, last, dc (distinct count), values, unique_values.
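For readers who think in Python, the grouping and the auto-naming convention (`field_operation`) shown above can be sketched as follows. The function `agg` here is a hypothetical stand-in covering only a few of the listed functions, and it assumes the values are already numeric.

```python
from collections import defaultdict


def agg(rows, by, specs):
    """Group rows by the `by` fields and apply (function, field) specs.

    Output fields are named <field>_<function>, matching the
    auto-naming shown in the examples above.
    """
    funcs = {
        "sum": sum,
        "count": len,
        "min": min,
        "max": max,
        "avg": lambda values: sum(values) / len(values),
    }
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in by)].append(row)
    out = []
    for key, members in groups.items():
        result = dict(zip(by, key))
        for func, field in specs:
            values = [m[field] for m in members]
            result[f"{field}_{func}"] = funcs[func](values)
        out.append(result)
    return out


sales = [
    {"region": "west", "revenue": 100, "id": 1},
    {"region": "west", "revenue": 50, "id": 2},
    {"region": "east", "revenue": 70, "id": 3},
]
summary = agg(sales, ["region"], [("sum", "revenue"), ("count", "id")])
```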

Joining datasets

# Join users with their departments (left join, first match)
cobble -s '| from users.csv | join dept_id [ | from departments.csv ] | select name,dept_name'

# Join with different key names (source_key:target_key)
cobble -s '| from orders.csv | join customer_id:id [ | from customers.csv ] | select order_id,name'

# Inner join (only matching rows)
cobble -s '| from orders.csv | join product_id, type=inner [ | from products.csv ]'

# Outer join (all rows from both sides)
cobble -s '| from left.csv | join id, type=outer [ | from right.csv ]'

# Expand join (one output row per match, like SQL)
cobble -s '| from students.csv | join class_id, target=expand [ | from enrollments.csv ]'

# Join with field selection
cobble -s '| from users.csv | join team_id [ | from teams.csv ] | select name,team_name'

Join types: left (default), inner, outer. Join targets: first (default), last, expand, agg, agg_str.
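The difference between the `first` and `expand` targets, and between left and inner joins, is easiest to see on a small example. The sketch below is plain Python, not Cobble's code. The function `join` is hypothetical, it covers only `left`/`inner` and `first`/`expand`, and it assumes the left row's values win when both sides share a field name.

```python
def join(left, right, key, how="left", target="first"):
    """Join two lists of dict rows on a shared key field."""
    if how not in ("left", "inner") or target not in ("first", "expand"):
        raise ValueError("this sketch covers only left/inner and first/expand")
    # Index the right-hand rows by key so lookups are O(1).
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    out = []
    for row in left:
        matches = index.get(row[key], [])
        if not matches:
            # Left join keeps unmatched rows; inner join drops them.
            if how == "left":
                out.append(dict(row))
            continue
        if target == "first":
            matches = matches[:1]
        for match in matches:
            # Assumption: left-hand values take precedence on conflicts.
            out.append({**match, **row})
    return out


users = [{"name": "ann", "dept_id": 1}, {"name": "bob", "dept_id": 2}]
depts = [
    {"dept_id": 1, "dept_name": "eng"},
    {"dept_id": 1, "dept_name": "eng2"},
]
left_first = join(users, depts, "dept_id")
inner_first = join(users, depts, "dept_id", how="inner")
left_expand = join(users, depts, "dept_id", target="expand")
```

With these inputs, `left_first` keeps bob without a `dept_name`, `inner_first` drops him, and `left_expand` emits one row per matching department for ann.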

Computed fields

# Add a new field (fields are available by name directly)
cat products.csv | cobble '| set margin="price - cost"'

# String manipulation
cat users.csv | cobble '| set domain="email.split(\"@\")[1]"'

# The value. prefix also works for dot-access on nested data
cat data.json | cobble '| set full="value.first + \" \" + value.last"'
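Conceptually, "fields are available by name directly" means the expression is evaluated with the row's fields as variables. A minimal Python sketch of that idea follows. The helper `set_field` is hypothetical, it assumes values have already been converted to numbers where needed, and it is not a description of how Cobble evaluates expressions.

```python
def set_field(row, name, expression):
    """Return a copy of `row` with `name` set to the evaluated expression."""
    # The row's fields become local variables for the expression.
    # eval() executes arbitrary code, so only pass trusted expressions.
    scope = dict(row)
    new_row = dict(row)
    new_row[name] = eval(expression, {"__builtins__": {}}, scope)
    return new_row


with_margin = set_field({"price": 10, "cost": 4}, "margin", "price - cost")
with_domain = set_field({"email": "ann@example.org"}, "domain", 'email.split("@")[1]')
```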

Combining datasets

# Append rows from another file
cobble -s '| from jan.csv | append { | from feb.csv }'

Generating data

# Generate a numbered sequence
cobble -s '| range end=100 | set squared="i ** 2"'

Real-world examples

# Find top 5 customers by total spend
cobble -s '| from orders.csv | agg by=customer_id sum(total) count(total) | sort -total_sum | head 5'

# Join server metrics with inventory, filter to production
cobble -s '| from metrics.csv
    | join hostname [ | from inventory.csv ]
    | search env=prod
    | sort -cpu
    | select hostname,cpu,memory,team'

# Aggregate log counts by status code, show top errors
cat access.log.csv | cobble '| search status>=400 | agg by=status count(path) | sort -path_count'

# Compare two CSVs - find entries present in both files
cobble -s '| from new.csv | join id, type=inner [ | from old.csv ] | select id,name'

Quoting rules

Values with spaces or special characters should be quoted: "my value"
Expressions containing operators should be quoted: "Total Sales>100"
Escape quotes inside quoted strings with backslash: "say \"hello\""

Command Reference

Command	Aliases	Description
`from`		Read from file or stdin (`-`)
`search`	`s`	Filter rows by field matching
`py`	`where`	Filter rows with Python expressions
`select`	`table`	Pick specific fields
`set`	`eval`	Add/modify fields with Python expressions
`sort`		Sort by fields (`-field` for descending)
`agg`	`aggregate`, `stats`	Group and aggregate
`join`		Join with another dataset
`slice`	`head`, `limit`	Limit row range
`rename`		Rename fields
`unpack`		Expand list fields into rows
`append`		Concatenate another pipeline
`range`		Generate numbered rows
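The README has no example for `unpack`, so here is a hedged Python sketch of what "expand list fields into rows" usually means. The function name, the decision to pass non-list values through unchanged, and the dropping of rows with an empty list are all assumptions, not documented Cobble behavior.

```python
def unpack(rows, field):
    """Yield one row per element of a list-valued field."""
    for row in rows:
        value = row.get(field)
        if isinstance(value, list):
            # One output row per list element; an empty list yields nothing.
            for item in value:
                yield {**row, field: item}
        else:
            # Assumption: non-list values pass through unchanged.
            yield dict(row)


rows = [{"host": "a", "tags": ["web", "prod"]}, {"host": "b", "tags": "db"}]
expanded = list(unpack(rows, "tags"))
```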

Search Operators

Operator	Example	Description
`=`	`field=value`	Exact match (numeric-aware)
`!=`	`field!=value`	Not equal
`>`	`field>10`	Greater than
`>=`	`field>=10`	Greater or equal
`<`	`field<10`	Less than
`<=`	`field<=10`	Less or equal
`~=`	`field~=pat.*`	Regex match

Numeric comparisons coerce types automatically: the string "25" read from a CSV is compared as the number 25, not as text.
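The practical effect of numeric-aware comparison is that "25" > "9" holds, whereas a plain string comparison would say the opposite. The Python sketch below illustrates one way to get that behavior. The names `coerce` and `matches` are hypothetical, and the fallback to string comparison and the use of `re.search` for `~=` are assumptions rather than documented Cobble semantics.

```python
import operator
import re

OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def coerce(value):
    """Return value as a float when it looks numeric, else unchanged."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return value


def matches(row, field, op, expected):
    """Evaluate one search condition against a dict row."""
    actual = row.get(field)
    if op == "~=":
        # Assumption: the pattern may match anywhere in the value.
        return re.search(expected, str(actual)) is not None
    left, right = coerce(actual), coerce(expected)
    if type(left) is not type(right):
        # Mixed numeric and text: fall back to comparing as strings.
        left, right = str(actual), str(expected)
    return OPS[op](left, right)


numeric_gt = matches({"age": "25"}, "age", ">", "9")
string_gt = "25" > "9"
```

Here `numeric_gt` is true while `string_gt` is false, which is the difference the note above describes.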

Project details

Release history

1.1.0 (this version): Feb 15, 2026
1.0.2: Dec 23, 2023
1.0.1: Nov 24, 2023
1.0.0: Nov 24, 2023
0.1.1: Jul 16, 2023
0.1.0: Mar 24, 2020

Download files

Source Distribution

cobblequery-1.1.0.tar.gz (17.7 kB)

Uploaded Feb 15, 2026 Source

File details

Details for the file cobblequery-1.1.0.tar.gz.

File metadata

Download URL: cobblequery-1.1.0.tar.gz
Upload date: Feb 15, 2026
Size: 17.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.6

File hashes

Hashes for cobblequery-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fda46ffae99666af620ed0df7e6de780cfac6d2f3ab3b135bf2a3e27cf9d9e9c`
MD5	`5cd44af5773102b25ce2860a9b32ac21`
BLAKE2b-256	`ca11b6040e1df2531554e2695c2084a3f4cc968ee4d77a1279009e1cfbffe6ba`
