🔩 DataForge

A minimal, deterministic, LLM-friendly, line-based file manipulation DSL.

Install

pip install dfgscript

Afterwards three entry points are available:

dataforge script.dfg          # canonical command
build     script.dfg          # alias matching the spec
py -m dataforge script.dfg    # module invocation

Quick example

create-file "notes.txt"
1+Hello world
2+This is a test file
3+It has some content
4+End of file
end-file

Save the script as notes.dfg, then run it:

dataforge notes.dfg           # write the file
dataforge notes.dfg --dry-run # preview without writing
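
Because the syntax is line-based, a script like the one above is easy to generate from code. The following is a minimal sketch; render_create_file is a hypothetical helper, not part of the dataforge package, and it uses only the create-file, N+ and end-file syntax shown in this README. It assumes the path contains no double quotes.

```python
def render_create_file(path: str, lines: list[str]) -> str:
    """Render a create-file block (hypothetical helper, not part of dataforge)."""
    out = [f'create-file "{path}"']
    # DSL line numbers are 1-based: N+<text> writes line N.
    for number, text in enumerate(lines, start=1):
        out.append(f"{number}+{text}")
    out.append("end-file")
    return "\n".join(out) + "\n"


script = render_create_file("notes.txt", ["Hello world", "End of file"])
print(script)
```

The printed text can be saved as a .dfg file and previewed with --dry-run before anything is written.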

Multi-file support (v0.3)

A single .dfg script can create and edit multiple files at once. Each block is explicitly closed with end-file:

create-file "notes.txt"
1+Hello
end-file

create-file "Hi.txt"
1+Hi
end-file

Rules:

  • Every block must be closed with end-file before opening a new one
  • A single block without end-file is still valid (v0.2 backwards compatibility)
  • If a block fails fatally, execution stops and later blocks are not run
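
The block rules above can be sketched in a few lines of Python. This is an illustration only, not DataForge's parser: split_blocks and FILE_COMMANDS are hypothetical names, and reading "a single block without end-file" as "the only block in the script" is an assumption.

```python
from typing import Optional

# File-level commands listed in the DSL reference of this README.
FILE_COMMANDS = (
    "create-file", "replace-file", "change-file", "remove-file",
    "new-folder", "set-folder", "remove-folder",
)


def split_blocks(text: str) -> list[list[str]]:
    """Group script lines into blocks, one per file-level command."""
    blocks: list[list[str]] = []
    current: Optional[list[str]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        if line.startswith("box "):
            continue  # box declarations are out of scope for this sketch
        if line == "end-file":
            if current is None:
                raise ValueError("end-file without an open block")
            blocks.append(current)
            current = None
        elif line.startswith(FILE_COMMANDS):
            if current is not None:
                raise ValueError("previous block is missing end-file")
            current = [raw]
        elif current is None:
            raise ValueError(f"operation outside a block: {line!r}")
        else:
            current.append(raw)
    if current is not None:
        # Assumption: only a lone block may omit end-file (v0.2 style).
        if blocks:
            raise ValueError("last block is missing end-file")
        blocks.append(current)
    return blocks
```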

DSL reference

File-level commands

create-file "path"    # create new file; error if it already exists
replace-file "path"   # overwrite file unconditionally
change-file "path"    # patch existing file (creates if absent)
remove-file "path"    # delete a file; error if it does not exist
new-folder "path"     # create a folder; error if it already exists
set-folder "path"     # create a folder + parents silently (mkdir -p)
remove-folder "path"  # delete a folder and all its contents
end-file              # close the current block

Line-level operations

N+<text>      # write/replace line N  (expands file with "" if N > length)
N-            # delete line N unconditionally (error if line N does not exist)
N-"text"      # delete line N only if it matches "text" exactly (error otherwise)
N><text>      # insert <text> after line N  (shifts tail down)
$+<text>      # append as new final line
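
To make the semantics concrete, here is a minimal Python sketch of each operation applied to a list of lines. It is not DataForge's interpreter: the function names are hypothetical, and the delete behaviour follows the v0.3.4 changelog entry (both N- and N-"text" raise if line N does not exist).

```python
from typing import Optional


def write_line(lines: list[str], n: int, text: str) -> None:
    """N+<text>: write or replace line n (1-based), padding with "" if needed."""
    while len(lines) < n:
        lines.append("")
    lines[n - 1] = text


def delete_line(lines: list[str], n: int, expected: Optional[str] = None) -> None:
    """N- or N-"text": delete line n; with expected, only if the content matches."""
    if not 1 <= n <= len(lines):
        raise IndexError(f"line {n} does not exist")
    if expected is not None and lines[n - 1] != expected:
        raise ValueError(f"line {n} no longer matches {expected!r}")
    del lines[n - 1]


def insert_after(lines: list[str], n: int, text: str) -> None:
    """N><text>: insert text after line n, shifting the tail down."""
    lines.insert(n, text)


def append_line(lines: list[str], text: str) -> None:
    """$+<text>: append text as the new final line."""
    lines.append(text)


lines = ["Hello world", "This is a test file"]
write_line(lines, 4, "End of file")                       # pads line 3 with ""
insert_after(lines, 1, "Inserted")                        # becomes line 2
delete_line(lines, 3, expected="This is a test file")     # conditional delete
append_line(lines, "--- EOF ---")
print(lines)
```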

Comments & blank lines

# This is a comment, ignored by the parser

Blank lines in a .dfg are also ignored. To write an actual blank line into a target file, use N+ with no text, for example: 3+


Edit example

change-file "notes.txt"

# Delete line 2 only if it still matches exactly
2-"This is a test file"
2+This is an edited file

# Delete line 3 only if it matches
3-"It has some content"

# Insert a new line after line 1
1>Inserted line here

# Append to end
$+--- EOF ---
end-file

Result:

  1. Hello world
  2. Inserted line here
  3. This is an edited file
  4. End of file
  5. --- EOF ---

Scaffold example

DataForge can bootstrap an entire project structure in one script:

new-folder "myapp"
end-file

new-folder "myapp/src"
end-file

new-folder "myapp/tests"
end-file

create-file "myapp/src/main.py"
1+# entry point
end-file

create-file "myapp/README.md"
1+# myapp
end-file

Save the script as scaffold.dfg, then run it:

dataforge scaffold.dfg

Box variables

Declare a box to store a value and use [name] to interpolate it anywhere: in paths, line operations, even folder names.

box name = "RICK"

create-file "[name].txt"
1+[name] is the name
end-file

Boxes can be used across multiple blocks in the same script:

box project = "myapp"
box author  = "Sourasish"

new-folder "[project]"
end-file

new-folder "[project]/src"
end-file

create-file "[project]/README.md"
1+# [project]
2+Author: [author]
end-file

Rules:

  • Declare with box name = "value"; values are always quoted strings
  • Interpolate with [name] anywhere in paths or line content
  • Re-declaring a box overrides its previous value
  • Using an undeclared box is a parse error
  • Inline comments after the value are allowed: box name = "RICK" # the author
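
The interpolation rules above can be approximated in Python as follows. This is a sketch under assumptions, not the library's code: interpolate and BOX_REF are hypothetical names, and the pattern for a valid box name (letters, digits, underscore) is a guess, since this README does not define it.

```python
import re

# Assumed box-name grammar; the README does not specify one.
BOX_REF = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def interpolate(text: str, boxes: dict[str, str]) -> str:
    """Replace every [name] with its box value; undeclared names are an error."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in boxes:
            raise KeyError(f"undeclared box: {name}")
        return boxes[name]

    return BOX_REF.sub(substitute, text)


boxes = {"project": "myapp", "author": "Sourasish"}
print(interpolate("[project]/README.md", boxes))
print(interpolate("2+Author: [author]", boxes))
```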

CLI flags

Flag                    Description
--dry-run / --preview   Show resulting file(s); do not write anything
--backup-dir PATH       Save a timestamped .bak copy before each write
--log PATH              Append log output to a file
--verbose / -v          Debug-level logging of every operation
--version               Print version and exit

Python API

from pathlib import Path

from dataforge import parse, parse_one, run

# Multi-file: parse() returns a list of DfgScript blocks
scripts = parse(Path("edit.dfg").read_text())
results = run(scripts, dry_run=True)   # → {"path": ["line", ...], ...}

# Single-file convenience
script  = parse_one(Path("single.dfg").read_text())
results = run(script, backup_dir=".backups")
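
Since run() returns a mapping of path to resulting lines, a dry-run result can be rendered for review with plain Python. The helper below is a hypothetical sketch (format_preview is not part of dataforge) and works on any dict of that shape; the sample dict stands in for a real run() result.

```python
def format_preview(results: dict[str, list[str]]) -> str:
    """Render a {path: [line, ...]} mapping as a numbered, per-file listing."""
    chunks = []
    for path, lines in results.items():
        chunks.append(f"== {path} ==")
        for number, text in enumerate(lines, start=1):
            chunks.append(f"{number:>3}| {text}")
    return "\n".join(chunks)


# Stand-in for the value returned by run(scripts, dry_run=True).
sample = {"notes.txt": ["Hello world", "End of file"]}
print(format_preview(sample))
```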

Security

  • Path traversal (..) is blocked by default
  • No shell code is ever executed
  • Atomic rename semantics: the original is only replaced on full success
  • Transactional apply: writes go to a temp file first, which is then renamed over the original
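
The write-to-temp-then-rename pattern and the ".." check described above are standard techniques. The sketch below shows one common way to do both with the Python standard library; it is not taken from DataForge's source, and has_traversal and atomic_write are hypothetical names.

```python
import os
import tempfile
from pathlib import PurePosixPath


def has_traversal(path: str) -> bool:
    """Return True if any component of the path is '..'."""
    return ".." in PurePosixPath(path.replace("\\", "/")).parts


def atomic_write(path: str, lines: list[str]) -> None:
    """Write lines to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("\n".join(lines) + "\n")
        # os.replace swaps the file in one step, so the original stays
        # untouched unless the new content was written completely.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


print(has_traversal("../etc/passwd"))
print(has_traversal("myapp/src/main.py"))
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes the replacement atomic on most platforms.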

Project structure

dataforge_pkg/
├── pyproject.toml          ← pip install .
├── README.md
├── README.celes
├── tests.py
└── dataforge/
    ├── __init__.py         ← public Python API
    ├── __main__.py         ← py -m dataforge
    ├── parser.py           ← .dfg → DfgScript AST
    ├── interpreter.py      ← executes the AST
    └── cli.py              ← argparse CLI

Changelog

v0.3.4

  • N- now has two forms:
    • N-: unconditional delete (great for scaffolds and scripts that own the file)
    • N-"text": conditional delete (safe for LLMs editing existing files; errors if content has drifted)
  • Both forms error if line N does not exist
  • [box] interpolation works inside N-"text" conditions

v0.3.3

  • Added box name = "value": declare a variable (box) anywhere in a script
  • Added [name] interpolation: use box values in paths and all line operations
  • Using an undeclared box is a parse error
  • Inline comments on box declarations are supported
  • DataForge is now a full scaffold template engine

v0.3.2

  • N- no longer requires a text match; it now deletes line N unconditionally
  • N- warns and skips if line N is out of range
  • Syntax simplified from N-"text" to just N-

v0.3.2

  • Added new-folder "path": create a folder; error if it already exists
  • Added set-folder "path": create a folder and all parents silently (mkdir -p)
  • Added remove-folder "path": delete a folder and all its contents
  • Folder commands respect --dry-run and --backup-dir
  • Line-level operations inside folder commands are a parse error
  • DataForge is now a scaffold tool for LLMs, scripts, and project generation

v0.3.1

  • Added remove-file "path": deletes a file; errors if it does not exist
  • Respects --backup-dir and --dry-run like all other commands
  • Line-level operations inside a remove-file block are a parse error

v0.3.0

  • Multi-file support: a single .dfg can now target multiple files
  • New end-file terminator closes each block explicitly
  • parse() now returns list[DfgScript] instead of a single DfgScript
  • parse_one() added as a convenience wrapper for single-block scripts
  • run() now accepts either a single DfgScript or a list[DfgScript]
  • run() now returns dict[path, list[str]]
  • Full backwards compatibility with v0.2 single-block .dfg files

v0.2.0

  • Initial release

License

Server-Lab Open-Control License (SOCL). Copyright © 2025 Sourasish Das.
