Some CLI tools for text file processing

sodatools

A CLI toolkit for columnar text processing. Pipe-friendly, composable tools that operate on whitespace-delimited data from stdin.

pip install sodatools-core

Every tool is available both as a subcommand (soda <tool>) and as a direct sd<tool> entry point (e.g. sdalign, sdkut, sduniq). The sd prefix keeps the direct names out of the way of coreutils.

Tools

Tool         Description
-----------  -----------
align        Buffer and align columns to fixed widths (auto right-justify numbers)
coltest      Filter lines by column test expressions (2gt50, 1seqOK, 3m^abc)
cutw         Truncate lines to terminal or explicit width
delta        Compute row-to-row differences for numeric columns
events       Convert per-row state labels into bounded events with start/stop times
filter       Filter noisy or invalid rows by column behavior (MAD, run-length, rate, range, clock, monotonicity)
kut          Select, reorder, and format columns (ranges, exclusions, literals)
nf           Enforce a specific number of fields per line (drop, pad, or truncate)
plot         Interactive scatter plots from columnar data (matplotlib)
radix        Convert integer columns between bases (dec, hex, bin, oct)
sample       Sample rows (every N, random, percent) or resample by time interval
sample-data  Generate sample datasets for testing (weather, small, noisy, etc.)
smooth       Sliding window mean/median smoothing for numeric columns
stats        Column statistics with formatted table output (mean, std, trend, outliers)
tag          Assign named states to rows based on column conditions
uniq         Consecutive deduplication with optional column keys and fuzzy matching

Examples

Tag temperature readings into states and convert to events:

soda sample-data weather | soda tag -H1 hot: 2gt25 cold: 2lt15 normal: 2ge15 2le25 \
  | soda events -H1 --gap 5m --print-header
state  start                stop                 duration  count
hot    2024-01-01T04:35:00  2024-01-01T10:55:00  6.3h      77
normal 2024-01-01T11:00:00  2024-01-01T18:00:00  7.0h      85
cold   2024-01-01T18:05:00  2024-01-02T04:30:00  10.4h     126
...
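
To make the tag-to-events step concrete, here is a minimal Python sketch of the idea: consecutive rows with the same state are merged into one event with a start, a stop, and a row count. It is not sodatools' implementation. The function name rows_to_events, the sample rows, and the reading of --gap as "start a new event when two rows are further apart than this" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def rows_to_events(rows, gap=timedelta(minutes=5)):
    """Merge time-ordered (timestamp, state) rows into events.

    Hypothetical helper: a new event starts when the state changes or
    when the time since the previous row exceeds `gap` (assumed meaning).
    """
    events = []
    for ts, state in rows:
        last = events[-1] if events else None
        if last and last["state"] == state and ts - last["stop"] <= gap:
            # Same state, no large gap: extend the current event.
            last["stop"] = ts
            last["count"] += 1
        else:
            events.append({"state": state, "start": ts, "stop": ts, "count": 1})
    return events

# Made-up sample rows at 5-minute spacing.
t0 = datetime.fromisoformat("2024-01-01T04:35:00")
rows = [(t0 + timedelta(minutes=5 * i), s)
        for i, s in enumerate(["hot", "hot", "hot", "normal", "normal"])]

for ev in rows_to_events(rows):
    hours = (ev["stop"] - ev["start"]).total_seconds() / 3600
    print(ev["state"], ev["start"].isoformat(), ev["stop"].isoformat(),
          f"{hours:.1f}h", ev["count"])
```

With this definition an event's duration is stop minus start, which is consistent with the sample output above (for example, 77 rows at 5-minute spacing span 6.3 hours).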

Select columns, compute deltas, and align:

soda sample-data weather | soda kut 1 2 3 -H1 | soda delta 2 3 -H1 | soda align -H1
timestamp            temp  humidity
-------------------  ----  --------
2024-01-01T00:05:00  0.03     -0.29
2024-01-01T00:10:00  0.12      0.51
2024-01-01T00:15:00  0.18     -0.87
...
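
For readers who want to see what a row-to-row difference amounts to, the following Python sketch reproduces the idea on already-split fields. It is an illustration only: row_deltas is a hypothetical name, the input rows are made up, and dropping the first row (which has no predecessor) and formatting to two decimals are assumptions chosen to resemble the output above.

```python
def row_deltas(rows, columns):
    """Replace the given 1-based columns with the difference from the previous row.

    Hypothetical helper: the first row has no predecessor, so it is dropped
    (an assumption, not confirmed sodatools behavior).
    """
    out = []
    prev = None
    for fields in rows:
        if prev is not None:
            new = list(fields)
            for c in columns:
                diff = float(fields[c - 1]) - float(prev[c - 1])
                new[c - 1] = f"{diff:.2f}"
            out.append(new)
        prev = fields
    return out

# Made-up input: timestamp, temp, humidity.
rows = [
    ["2024-01-01T00:00:00", "10.00", "50.00"],
    ["2024-01-01T00:05:00", "10.03", "49.71"],
    ["2024-01-01T00:10:00", "10.15", "50.22"],
]
for fields in row_deltas(rows, columns=[2, 3]):
    print("  ".join(fields))
```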

Clean noisy sensor data by removing MAD (median absolute deviation) outliers and short-lived mode glitches, respecting time gaps:

soda sample-data noisy | soda filter 2:mad 3:run --gap 30s --drop | soda align
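
As background on the MAD test, this Python sketch flags outliers with the conventional modified z-score based on the median absolute deviation. It is a generic textbook version, not sodatools' code: the function name mad_outlier_mask, the 0.6745 scaling constant, and the 3.5 threshold are common conventions assumed here, and the tool's actual defaults may differ.

```python
import statistics

def mad_outlier_mask(values, threshold=3.5):
    """Return a list of booleans, True where a value looks like an outlier.

    Uses the modified z-score 0.6745 * (x - median) / MAD.
    The constant and threshold are conventional choices (assumptions here).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations collapse to zero; nothing can be ranked as an outlier.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

# Made-up readings with one obvious spike.
readings = [10.0, 10.1, 9.9, 10.2, 50.0, 10.0, 9.8]
kept = [v for v, bad in zip(readings, mad_outlier_mask(readings)) if not bad]
print(kept)
```

The median-based statistic is used instead of mean and standard deviation because a single large spike barely moves the median, so the spike cannot hide itself by inflating the spread.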

Sample every 10th row, compute statistics:

soda sample-data weather | soda sample every 10 -H1 | soda stats -H1

Convert hex register dumps to binary with aligned output:

echo -e "0xff\n0x0a\n0x80" | soda radix --from hex --to bin -b 8 --align
11111111
00001010
10000000
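
The same conversion can be checked independently in plain Python, which is handy when validating a pipeline. This is a sketch using only built-ins, not sodatools code; hex_to_bin is a hypothetical helper name, and its bits parameter is assumed to play the role of -b 8 as zero-padding to a fixed width.

```python
def hex_to_bin(token: str, bits: int = 8) -> str:
    """Convert one hex token (with or without a 0x prefix) to zero-padded binary."""
    # int() with base 16 accepts an optional "0x" prefix.
    return format(int(token, 16), f"0{bits}b")

for token in ["0xff", "0x0a", "0x80"]:
    print(hex_to_bin(token))
# prints:
# 11111111
# 00001010
# 10000000
```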

Keep only the lines where column 2 is greater than 50:

soda sample-data small | soda coltest 2gt50 -H1
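
To illustrate how a compact expression such as 2gt50 can be read as "column 2, greater than, 50", here is a small Python sketch of a parser for the numeric comparisons that appear in this README (gt, ge, lt, le). It is not the coltest implementation: compile_test is a hypothetical name, the grammar is inferred from the examples, and the string and regex forms (1seqOK, 3m^abc) are deliberately left out because their exact semantics are not stated here.

```python
import operator
import re

# Only the numeric operators seen in the examples; an assumed subset.
_OPS = {"gt": operator.gt, "ge": operator.ge, "lt": operator.lt, "le": operator.le}
_EXPR = re.compile(r"^(\d+)(gt|ge|lt|le)(-?\d+(?:\.\d+)?)$")

def compile_test(expr):
    """Turn an expression like '2gt50' into a predicate over a text line."""
    m = _EXPR.match(expr)
    if m is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    col = int(m.group(1))          # 1-based column index
    op = _OPS[m.group(2)]
    value = float(m.group(3))

    def test(line):
        fields = line.split()      # whitespace-delimited, like the default input
        return op(float(fields[col - 1]), value)

    return test

is_big = compile_test("2gt50")
lines = ["a 12", "b 73", "c 50"]
print([ln for ln in lines if is_big(ln)])  # ['b 73']
```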

Resample time series to 1-hour intervals, then plot temperature and humidity:

soda sample-data weather | soda sample interval 1h -H1 | soda plot -H1 2 c=red / 3 c=blue

Enforce 3 fields per line, pad short rows, smooth, and align:

soda sample-data small | soda nf 3 -H1 --pad '?' | soda smooth 3 -p 1 | soda align
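
For intuition about the smoothing step, the following Python sketch applies a centered sliding-window mean or median to a list of numbers. It is a generic illustration under assumptions: the function name smooth, the centered window, and the shrinking window at the edges are choices made here and are not confirmed to match sodatools' behavior or the meaning of its flags.

```python
from statistics import mean, median

def smooth(values, window=3, func=mean):
    """Centered sliding-window smoothing; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(func(values[lo:hi]))
    return out

print(smooth([1, 2, 3, 4, 5], window=3))               # mean smoothing
print(smooth([1, 1, 9, 1, 1], window=3, func=median))  # median removes the spike
```

A median window is usually the better choice for isolated spikes, because a single extreme value never becomes the middle element; a mean window spreads the spike over its neighbors instead.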

Common flags

Most tools share these flags:

Flag         Alias  Description
-----------  -----  -----------
--delimiter  -d     Input field delimiter (default: whitespace)
--header     -H     First N lines are headers
--align             Adaptively align output columns
--no-fail           Skip problematic rows instead of erroring
--example           Show usage examples and exit

Plugins

Third-party packages can add new subcommands to soda by registering them under the sodatools.commands entry-point group. Core discovers plugins at startup; any failures are reported to stderr without crashing the CLI.

# your plugin's pyproject.toml
[project.entry-points."sodatools.commands"]
mytool = "sodatools_myplugin.cli:mytool"   # exposes `soda mytool`

[project.scripts]
sdmytool = "sodatools_myplugin.entrypoints:sdmytool"  # optional direct entry

The registered object must be a Typer-compatible command function with the same shape as the built-ins: a plain function whose parameters use typer.Option / typer.Argument defaults. Plugins whose names collide with built-ins are skipped with a warning.
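
The discovery behavior described above (load everything in the sodatools.commands group, skip name collisions with a warning, report failures to stderr without crashing) can be sketched with the standard library as follows. This is an illustration of the pattern, not the code in sodatools-core: discover_plugins and its parameters are hypothetical names, and it assumes Python 3.10 or later for entry_points(group=...).

```python
import sys
from importlib.metadata import entry_points

GROUP = "sodatools.commands"  # group name taken from the README

def discover_plugins(builtin_names, eps=None):
    """Return {name: loaded object} for usable plugins (hypothetical helper).

    `eps` lets callers inject entry points for testing; by default the
    installed packages are scanned.
    """
    if eps is None:
        eps = entry_points(group=GROUP)
    commands = {}
    for ep in eps:
        if ep.name in builtin_names:
            print(f"warning: plugin {ep.name!r} collides with a built-in, skipped",
                  file=sys.stderr)
            continue
        try:
            commands[ep.name] = ep.load()
        except Exception as exc:  # a broken plugin must not take down the CLI
            print(f"warning: failed to load plugin {ep.name!r}: {exc}",
                  file=sys.stderr)
    return commands

if __name__ == "__main__":
    # With no plugins installed this prints an empty list.
    print(sorted(discover_plugins(builtin_names={"align", "kut", "uniq"})))
```

Catching a broad Exception around ep.load() is deliberate in this sketch: importing third-party code can fail in arbitrary ways, and the stated goal is to report the failure and keep the remaining commands available.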

Changelog

See CHANGELOG.md for release notes.

License

MIT

Download files

Source Distribution

sodatools_core-0.0.6.tar.gz (117.3 kB)

Uploaded Source

Built Distribution

sodatools_core-0.0.6-py3-none-any.whl (69.3 kB)

Uploaded Python 3

File details

Details for the file sodatools_core-0.0.6.tar.gz.

File metadata

  • Download URL: sodatools_core-0.0.6.tar.gz
  • Size: 117.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.7 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for sodatools_core-0.0.6.tar.gz
Algorithm    Hash digest
-----------  -----------
SHA256       ff08bb07610786aec3afdff22216cbca183c8dda1a5faf3d6f92c1a08c08bb0e
MD5          e682797e892d570685c21ec22a8579b0
BLAKE2b-256  be6b1b9c72653833d6c7f750f99189685c954bafb5b9d5d1e4b8b8d78d3b6d40

File details

Details for the file sodatools_core-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: sodatools_core-0.0.6-py3-none-any.whl
  • Size: 69.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.7 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for sodatools_core-0.0.6-py3-none-any.whl
Algorithm    Hash digest
-----------  -----------
SHA256       9eee49342e215a681c597ba61a178c38142e6b408bc71beadecf56145ce2a2fc
MD5          5c5d4b27963c7175af8a483d2229cff8
BLAKE2b-256  8205c3f0e33d6be6eb9747148a256992dbfc4911356df6921a5be78513638f3b
