🔩 DataForge

A minimal, deterministic, LLM-friendly, line-based file manipulation DSL.

Install

pip install dfgscript

After installation, three entry points are available:

dataforge script.dfg          # canonical command
build     script.dfg          # alias matching the spec
py -m dataforge script.dfg    # module invocation (py is the Windows launcher; use python -m dataforge elsewhere)

Quick example

create-file "notes.txt"
1+Hello world
2+This is a test file
3+It has some content
4+End of file
end-file

Save the script as notes.dfg, then run it:

dataforge notes.dfg           # write the file
dataforge notes.dfg --dry-run # preview without writing

Multi-file support (v0.3)

A single .dfg script can create and edit multiple files at once. Each block is explicitly closed with end-file:

create-file "notes.txt"
1+Hello
end-file

create-file "Hi.txt"
1+Hi
end-file

Rules:

  • Every block must be closed with end-file before opening a new one
  • A single block without end-file is still valid (v0.2 backwards compatibility)
  • If a block fails fatally, execution stops and later blocks are not run
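
The block rules above can be sketched in a few lines of Python. This is an illustration only, not the package's parser: `split_blocks` and the `HEADER` pattern are hypothetical names, and the sketch assumes that a block header is one of the three file-level commands followed by a double-quoted path.

```python
import re

# Hypothetical header pattern: a file-level command plus a quoted path.
HEADER = re.compile(r'^(create-file|replace-file|change-file)\s+"([^"]+)"\s*$')


def split_blocks(script_text):
    """Return a list of (command, path, body_lines) tuples, one per block."""
    blocks = []
    current = None  # the block being collected, or None between blocks
    for raw in script_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # Rule 1: the previous block must be closed before a new one opens.
            if current is not None:
                raise ValueError(f"block for {current[1]!r} is missing end-file")
            current = (header.group(1), header.group(2), [])
        elif line == "end-file":
            if current is None:
                raise ValueError("end-file without an open block")
            blocks.append(current)
            current = None
        elif current is not None:
            current[2].append(raw)
    if current is not None:
        # Rule 2: a lone block may omit end-file (v0.2 compatibility).
        if blocks:
            raise ValueError(f"block for {current[1]!r} is missing end-file")
        blocks.append(current)
    return blocks


demo = 'create-file "notes.txt"\n1+Hello\nend-file\n\ncreate-file "Hi.txt"\n1+Hi\nend-file\n'
for command, path, body in split_blocks(demo):
    print(command, path, body)
```

Raising on the first malformed block mirrors the third rule in spirit: once something is wrong, nothing after it is processed.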

DSL reference

File-level commands

create-file "path"   # create new — error if file exists
replace-file "path"  # unconditional overwrite
change-file "path"   # patch an existing file (creates if absent)
end-file             # close the current block (required for multi-file scripts)

Line-level operations

N+<text>      # write/replace line N  (expands file with "" if N > length)
N-"<text>"    # delete line N  only if content matches exactly (warns otherwise)
N><text>      # insert <text> after line N  (shifts tail down)
$+<text>      # append as new final line

Comments & blank lines

# This is a comment — ignored by the parser

Blank lines in a .dfg are also ignored. To write an actual blank line into a target file, use N+ with no text, for example: 3+
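
To make the line grammar concrete, here is a minimal sketch of how a single script line could be classified. It is written only from the reference above and is not the code in parser.py; `parse_op`, `LINE_OP`, and the operation names (`set`, `delete`, `insert_after`, `append`) are hypothetical.

```python
import re

# Hypothetical pattern: a line number or "$", one operator symbol, then the text.
LINE_OP = re.compile(r'^(\d+|\$)([+\->])(.*)$')


def parse_op(line):
    """Classify one script line; return None for comments and blank lines."""
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    m = LINE_OP.match(line)
    if m is None:
        raise ValueError(f"not a line-level operation: {line!r}")
    target, symbol, text = m.groups()
    if target == "$":
        if symbol != "+":
            raise ValueError("only $+ is documented for the end-of-file target")
        return ("append", None, text)
    number = int(target)
    if symbol == "+":
        return ("set", number, text)  # empty text stands for a blank line
    if symbol == ">":
        return ("insert_after", number, text)
    # Deletion: the expected content is wrapped in double quotes.
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        raise ValueError(f"delete expects quoted text: {line!r}")
    return ("delete", number, text[1:-1])


for sample in ["# a comment", "", "3+", '2-"This is a test file"', "1>Inserted", "$+--- EOF ---"]:
    print(repr(sample), "->", parse_op(sample))
```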


Edit example

change-file "notes.txt"

# Delete line 2 only if it still matches exactly
2-"This is a test file"
2+This is an edited file

# Delete line 3 only if it matches
3-"It has some content"

# Insert a new line after line 1
1>Inserted line here

# Append to end
$+--- EOF ---
end-file

Result:

  1. Hello world
  2. Inserted line here
  3. This is an edited file
  4. End of file
  5. --- EOF ---
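
The result above can be replayed with a small stand-in for the interpreter. The sketch assumes that operations run strictly in order and that a deletion shifts later lines up; the real interpreter may resolve line numbers differently. `apply_ops` and the operation tuples are hypothetical names.

```python
def apply_ops(lines, ops):
    """Apply (kind, n, text) operations in order to a copy of `lines`."""
    out = list(lines)
    warnings = []
    for kind, n, text in ops:
        if kind == "set":
            # Pad with empty lines when N is past the current end.
            while len(out) < n:
                out.append("")
            out[n - 1] = text
        elif kind == "delete":
            # Delete only on an exact match; otherwise record a warning.
            if 1 <= n <= len(out) and out[n - 1] == text:
                del out[n - 1]
            else:
                warnings.append(f"line {n} does not match {text!r}; not deleted")
        elif kind == "insert_after":
            out.insert(n, text)  # index n is the slot right after line n
        elif kind == "append":
            out.append(text)
        else:
            raise ValueError(f"unknown operation: {kind}")
    return out, warnings


original = ["Hello world", "This is a test file", "It has some content", "End of file"]
ops = [
    ("delete", 2, "This is a test file"),
    ("set", 2, "This is an edited file"),
    ("delete", 3, "It has some content"),
    ("insert_after", 1, "Inserted line here"),
    ("append", None, "--- EOF ---"),
]
result, warnings = apply_ops(original, ops)
for number, line in enumerate(result, start=1):
    print(f"{number}. {line}")
print(warnings)
```

Under these assumptions the five output lines match the listed result, and the 3- deletion takes the warning path: by the time it runs, "It has some content" has already been overwritten by the preceding 2+ operation, so line 3 holds "End of file".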

CLI flags

Flag                    Description
--dry-run / --preview   Show resulting file(s); do not write anything
--backup-dir PATH       Save a timestamped .bak copy before each write
--log PATH              Append log output to a file
--verbose / -v          Debug-level logging of every operation
--version               Print version and exit
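
For readers who want to see how these flags map onto argparse, the following is a rough sketch rather than the contents of cli.py. `build_parser` is a hypothetical helper; the destination names, defaults, and help strings are assumptions, and the version string is taken from the changelog.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="dataforge")
    parser.add_argument("script", help="path to a .dfg script")
    # Two spellings of the same switch share one destination.
    parser.add_argument("--dry-run", "--preview", dest="dry_run", action="store_true",
                        help="show resulting file(s); do not write anything")
    parser.add_argument("--backup-dir", metavar="PATH", default=None,
                        help="save a timestamped .bak copy before each write")
    parser.add_argument("--log", metavar="PATH", default=None,
                        help="append log output to a file")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="debug-level logging of every operation")
    parser.add_argument("--version", action="version", version="%(prog)s 0.3.0")
    return parser


args = build_parser().parse_args(["notes.dfg", "--preview", "--backup-dir", ".backups"])
print(args.script, args.dry_run, args.backup_dir, args.log, args.verbose)
```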

Python API

from pathlib import Path

from dataforge import parse, parse_one, run

# Multi-file: parse() returns a list of DfgScript blocks
scripts = parse(Path("edit.dfg").read_text())
results = run(scripts, dry_run=True)   # → { "path": ["line", ...], ... }

# Single-file convenience
script  = parse_one(Path("single.dfg").read_text())
results = run(script, backup_dir=".backups")

Security

  • Path traversal (..) is blocked by default
  • No shell code is ever executed
  • Atomic rename semantics — original only replaced on full success
  • Transactional apply — writes go to a temp file first, then rename over the original
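
The temp-file-then-rename pattern and the traversal check can be sketched as follows. This is a generic illustration of the technique, not the interpreter's code; `is_traversal` and `atomic_write` are hypothetical names, and the check only looks at path components as the running platform splits them.

```python
import os
import tempfile
from pathlib import PurePath


def is_traversal(path):
    """True if any component of `path` is '..'."""
    return ".." in PurePath(path).parts


def atomic_write(path, lines):
    """Write `lines` to a temp file beside `path`, then rename it over `path`."""
    if is_traversal(path):
        raise ValueError(f"path traversal blocked: {path!r}")
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as handle:
            handle.write("".join(line + "\n" for line in lines))
        # os.replace is atomic on POSIX when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure the original stays untouched; remove the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "notes.txt")
    atomic_write(target, ["old content"])
    atomic_write(target, ["Hello", "world"])
    with open(target, encoding="utf-8") as handle:
        written = handle.read()
    leftovers = os.listdir(workdir)
print(repr(written), leftovers)
```

Writing the temp file next to the target, rather than in the system temp directory, is what keeps the final rename a single-filesystem operation.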

Project structure

dataforge_pkg/
├── pyproject.toml          ← pip install .
├── README.md
├── README.celes
├── tests.py
└── dataforge/
    ├── __init__.py         ← public Python API
    ├── __main__.py         ← py -m dataforge
    ├── parser.py           ← .dfg → DfgScript AST
    ├── interpreter.py      ← executes the AST
    └── cli.py              ← argparse CLI

Changelog

v0.3.0

  • Multi-file support — a single .dfg can now target multiple files
  • New end-file terminator closes each block explicitly
  • parse() now returns list[DfgScript] instead of a single DfgScript
  • parse_one() added as a convenience wrapper for single-block scripts
  • run() now accepts either a single DfgScript or a list[DfgScript]
  • run() now returns dict[path, list[str]]
  • Full backwards compatibility with v0.2 single-block .dfg files

v0.2.0

  • Initial release

License

Server-Lab Open-Control License (SOCL) — Copyright © 2025 Sourasish Das.
