A minimal, deterministic, LLM-friendly, line-based file manipulation DSL
DataForge
Install
pip install dfgscript
Afterwards three entry points are available:
dataforge script.dfg # canonical command
build script.dfg # alias matching the spec
py -m dataforge script.dfg # module invocation
Quick example
create-file "notes.txt"
1+Hello world
2+This is a test file
3+It has some content
4+End of file
end-file
dataforge notes.dfg # write the file
dataforge notes.dfg --dry-run # preview without writing
Multi-file support (v0.3)
A single .dfg script can create and edit multiple files at once. Each block
is explicitly closed with end-file:
create-file "notes.txt"
1+Hello
end-file
create-file "Hi.txt"
1+Hi
end-file
Rules:
- Every block must be closed with end-file before opening a new one
- A single block without end-file is still valid (v0.2 backwards compatibility)
- If a block fails fatally, execution stops and later blocks are not run
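The block rules above can be sketched in a few lines of Python. This is an illustrative model only, not DataForge's parser: the function name split_blocks is hypothetical, and the sketch assumes that "a single block without end-file" means a script containing exactly one block.

```python
# Hypothetical sketch of the end-file rules; not DataForge's actual parser.
OPENERS = (
    "create-file", "replace-file", "change-file", "remove-file",
    "new-folder", "set-folder", "remove-folder",
)

def split_blocks(text):
    """Group the lines of a .dfg script into blocks (lists of lines)."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines, comments and box declarations are not part of any block here.
        if not line or line.startswith("#") or line.startswith("box "):
            continue
        if line.split(" ", 1)[0] in OPENERS:
            if current is not None:
                raise ValueError("previous block must be closed with end-file")
            current = [line]
        elif line == "end-file":
            if current is None:
                raise ValueError("end-file without an open block")
            blocks.append(current)
            current = None
        else:
            if current is None:
                raise ValueError(f"operation outside a block: {line!r}")
            current.append(raw)
    if current is not None:
        # Assumption: an unterminated block is only accepted when it is the sole block.
        if blocks:
            raise ValueError("last block is missing end-file")
        blocks.append(current)
    return blocks

if __name__ == "__main__":
    script = 'create-file "notes.txt"\n1+Hello\nend-file\ncreate-file "Hi.txt"\n1+Hi\nend-file\n'
    for block in split_blocks(script):
        print(block)
```

The sketch raises ValueError for every rule violation; the error types DataForge itself reports are not documented here.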
DSL reference
File-level commands
create-file "path" # create new file; error if it already exists
replace-file "path" # overwrite file unconditionally
change-file "path" # patch existing file (creates if absent)
remove-file "path" # delete a file; error if it does not exist
new-folder "path" # create a folder; error if it already exists
set-folder "path" # create a folder + parents silently (mkdir -p)
remove-folder "path" # delete a folder and all its contents
end-file # close the current block
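For readers who think in Python, the existence rules of several commands above correspond roughly to standard-library calls. The mapping below is a sketch for orientation, not DataForge's implementation; the function names are hypothetical, and the behavior of remove-folder on a missing folder is an assumption (the reference above does not specify it).

```python
import os
import shutil

# Hypothetical standard-library analogues of the file-level commands.

def create_file(path, lines):
    # Mode "x" fails with FileExistsError if the file already exists.
    with open(path, "x", encoding="utf-8") as fh:
        fh.write("".join(line + "\n" for line in lines))

def remove_file(path):
    # Raises FileNotFoundError if the file does not exist.
    os.remove(path)

def new_folder(path):
    # Raises FileExistsError if the folder already exists.
    os.mkdir(path)

def set_folder(path):
    # Like mkdir -p: creates parents, silent if the folder already exists.
    os.makedirs(path, exist_ok=True)

def remove_folder(path):
    # Deletes the folder and everything in it.
    # Assumption: a missing folder is an error here (FileNotFoundError).
    shutil.rmtree(path)
```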
Line-level operations
N+<text> # write/replace line N (expands file with "" if N > length)
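N- # delete line N unconditionally (errors if line N does not exist, as of v0.3.4)
N-"text" # delete line N only if it matches "text" exactly (errors if the content has drifted)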
N><text> # insert <text> after line N (shifts tail down)
$+<text> # append as new final line
Comments & blank lines
# This is a comment; ignored by the parser
Blank lines in a .dfg are also ignored. To write an actual blank line into
a target file use N+ with no text: 3+
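To make the line-level semantics concrete, here is a small Python model that applies one operation to a list of lines. It is a sketch under stated assumptions, not DataForge's interpreter: apply_op is a hypothetical name, out-of-range deletes raise an error (following the v0.3.4 changelog entry), inserting after a line that does not exist is treated as an error, and how several operations in one block interact is deliberately not modeled.

```python
import re

# Matches N+text, N-, N-"text", N>text and $+text (hypothetical grammar sketch).
OP_RE = re.compile(r'^(\d+|\$)([+\->])(.*)$')

def apply_op(lines, op):
    """Return a new list with one line-level operation applied (1-based N)."""
    m = OP_RE.match(op)
    if m is None:
        raise ValueError(f"not a line operation: {op!r}")
    target, kind, text = m.groups()
    out = list(lines)
    if target == "$":
        if kind != "+":
            raise ValueError("only $+ is described in the reference")
        out.append(text)                  # $+text: append as new final line
        return out
    n = int(target)
    if n < 1:
        raise ValueError("line numbers start at 1")
    if kind == "+":
        while len(out) < n:               # pad with empty lines if N > length
            out.append("")
        out[n - 1] = text                 # N+ with no text writes a blank line
    elif kind == ">":
        if n > len(out):                  # assumption: must insert after an existing line
            raise IndexError(f"line {n} does not exist")
        out.insert(n, text)
    else:
        if n > len(out):
            raise IndexError(f"line {n} does not exist")
        if text:                          # N-"text": conditional delete
            if not (len(text) >= 2 and text[0] == text[-1] == '"'):
                raise ValueError("condition must be a quoted string")
            if out[n - 1] != text[1:-1]:
                raise ValueError(f"line {n} has drifted")
        del out[n - 1]
    return out

if __name__ == "__main__":
    base = ["Hello world", "This is a test file"]
    print(apply_op(base, "1>Inserted line here"))
    print(apply_op(base, '2-"This is a test file"'))
    print(apply_op(base, "4+padded"))
```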
Edit example
change-file "notes.txt"
# Delete line 2 only if it still matches exactly
2-"This is a test file"
2+This is an edited file
# Delete line 3 only if it matches
3-"It has some content"
# Insert a new line after line 1
1>Inserted line here
# Append to end
$+--- EOF ---
end-file
Result:
Hello world
Inserted line here
This is an edited file
End of file
--- EOF ---
Scaffold example
DataForge can bootstrap an entire project structure in one script:
new-folder "myapp"
end-file
new-folder "myapp/src"
end-file
new-folder "myapp/tests"
end-file
create-file "myapp/src/main.py"
1+# entry point
end-file
create-file "myapp/README.md"
1+# myapp
end-file
dataforge scaffold.dfg
Box variables
Declare a box to store a value and use [name] to interpolate it anywhere: in paths, line operations, even folder names.
box name = "RICK"
create-file "[name].txt"
1+[name] is the name
end-file
Boxes can be used across multiple blocks in the same script:
box project = "myapp"
box author = "Sourasish"
new-folder "[project]"
end-file
new-folder "[project]/src"
end-file
create-file "[project]/README.md"
1+# [project]
2+Author: [author]
end-file
Rules:
- Declare with box name = "value"; values are always quoted strings
- Interpolate with [name] anywhere in paths or line content
- Re-declaring a box overrides its previous value
- Using an undeclared box is a parse error
- Inline comments after the value are allowed: box name = "RICK" # the author
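The box rules above amount to a small substitution step. The following Python sketch models them with two regular expressions; it is not DataForge's code, the names declare and interpolate are hypothetical, and it assumes every [word] in the text is a box reference (how DataForge treats literal brackets is not covered here).

```python
import re

# Hypothetical patterns for: box name = "value"  # optional comment
BOX_DECL = re.compile(r'^box\s+(\w+)\s*=\s*"([^"]*)"\s*(?:#.*)?$')
BOX_REF = re.compile(r'\[(\w+)\]')

def declare(boxes, line):
    """Record a box declaration; re-declaring overrides the old value."""
    m = BOX_DECL.match(line.strip())
    if m is None:
        raise ValueError(f"not a box declaration: {line!r}")
    boxes[m.group(1)] = m.group(2)

def interpolate(text, boxes):
    """Replace each [name] with its box value; undeclared names are errors."""
    def substitute(match):
        name = match.group(1)
        if name not in boxes:
            raise KeyError(f"undeclared box: {name}")
        return boxes[name]
    return BOX_REF.sub(substitute, text)

if __name__ == "__main__":
    boxes = {}
    declare(boxes, 'box project = "myapp"')
    declare(boxes, 'box author = "Sourasish"  # inline comment')
    print(interpolate("[project]/README.md", boxes))
    print(interpolate("Author: [author]", boxes))
```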
CLI flags
| Flag | Description |
|---|---|
| --dry-run / --preview | Show resulting file(s); do not write anything |
| --backup-dir PATH | Save a timestamped .bak copy before each write |
| --log PATH | Append log output to a file |
| --verbose / -v | Debug-level logging of every operation |
| --version | Print version and exit |
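When DataForge is driven from another Python program, the flags in the table can be assembled into an argument list. The helper below is a hypothetical convenience, not part of the package; it uses only the flags listed above and assumes, as in the quick example, that flags follow the script path.

```python
import shutil
import subprocess

def dataforge_argv(script, dry_run=False, backup_dir=None, log=None, verbose=False):
    """Build an argv list for the dataforge command (hypothetical helper)."""
    argv = ["dataforge", script]
    if dry_run:
        argv.append("--dry-run")
    if backup_dir is not None:
        argv += ["--backup-dir", backup_dir]
    if log is not None:
        argv += ["--log", log]
    if verbose:
        argv.append("--verbose")
    return argv

if __name__ == "__main__":
    argv = dataforge_argv("notes.dfg", dry_run=True, verbose=True)
    if shutil.which(argv[0]) is None:
        # The command is only available after installing the package.
        print("dataforge is not on PATH; would have run:", " ".join(argv))
    else:
        subprocess.run(argv)
```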
Python API
from dataforge import parse, run
# Multi-file: parse() returns a list of DfgScript blocks
scripts = parse(open("edit.dfg").read())
results = run(scripts, dry_run=True) # → { "path": ["line", ...], ... }
# Single-file convenience
from dataforge import parse_one
script = parse_one(open("single.dfg").read())
results = run(script, backup_dir=".backups")
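Since run() is documented as returning a mapping from path to a list of lines, a dry-run result can be rendered for review with a few lines of plain Python. The helper below is a hypothetical sketch that works on any dict of that shape; it does not import dataforge, and the sample data stands in for a real result.

```python
def render_preview(results):
    """Format a {path: [line, ...]} mapping as a numbered listing."""
    chunks = []
    for path, lines in results.items():
        chunks.append(f"=== {path} ({len(lines)} lines) ===")
        chunks.extend(f"{i:>4} | {line}" for i, line in enumerate(lines, start=1))
    return "\n".join(chunks)

if __name__ == "__main__":
    # Stand-in for: results = run(parse(text), dry_run=True)
    sample = {"notes.txt": ["Hello world", "End of file"]}
    print(render_preview(sample))
```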
Security
- Path traversal (..) is blocked by default
- No shell code is ever executed
- Atomic rename semantics: original only replaced on full success
- Transactional apply: writes go to a temp file first, then rename over the original
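Two of the guarantees above, the traversal check and the temp-file-then-rename write, follow a common pattern that can be sketched with the standard library. This is a generic illustration of that pattern, not DataForge's implementation; has_traversal and atomic_write are hypothetical names, and the exact checks DataForge performs may differ.

```python
import os
import tempfile
from pathlib import PurePosixPath

def has_traversal(path):
    """True if any path component is '..' (either slash style)."""
    return ".." in PurePosixPath(path.replace("\\", "/")).parts

def atomic_write(path, lines):
    """Write lines to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as fh:
            fh.write("".join(line + "\n" for line in lines))
        # os.replace swaps the file in one step, so the original is only
        # replaced once the new content has been written in full.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

if __name__ == "__main__":
    print(has_traversal("../etc/passwd"))
    print(has_traversal("myapp/src/main.py"))
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes the final step a single replace rather than a copy.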
Project structure
dataforge_pkg/
├── pyproject.toml      ← pip install .
├── README.md
├── README.celes
├── tests.py
└── dataforge/
    ├── __init__.py     ← public Python API
    ├── __main__.py     ← py -m dataforge
    ├── parser.py       ← .dfg → DfgScript AST
    ├── interpreter.py  ← executes the AST
    └── cli.py          ← argparse CLI
Changelog
v0.3.4
- N- now has two forms:
  - N- is an unconditional delete (great for scaffolds and scripts that own the file)
  - N-"text" is a conditional delete (safe for LLMs editing existing files; errors if content has drifted)
- Both forms error if line N does not exist
- [box] interpolation works inside N-"text" conditions
v0.3.3
- Added box name = "value": declare a variable (box) anywhere in a script
- Added [name] interpolation: use box values in paths and all line operations
- Using an undeclared box is a parse error
- Inline comments on box declarations are supported
- DataForge is now a full scaffold template engine
v0.3.2
- N- no longer requires a text match; it now deletes line N unconditionally
- N- warns and skips if line N is out of range
- Syntax simplified from N-"text" to just N-
- Added new-folder "path": create a folder; error if it already exists
- Added set-folder "path": create a folder and all parents silently (mkdir -p)
- Added remove-folder "path": delete a folder and all its contents
- Folder commands respect --dry-run and --backup-dir
- Line-level operations inside folder commands are a parse error
- DataForge is now a scaffold tool for LLMs, scripts, and project generation
v0.3.1
- Added remove-file "path": deletes a file; errors if it does not exist
- Respects --backup-dir and --dry-run like all other commands
- Line-level operations inside a remove-file block are a parse error
v0.3.0
- Multi-file support: a single .dfg can now target multiple files
- New end-file terminator closes each block explicitly
- parse() now returns list[DfgScript] instead of a single DfgScript
- parse_one() added as a convenience wrapper for single-block scripts
- run() now accepts either a single DfgScript or a list[DfgScript]
- run() now returns dict[path, list[str]]
- Full backwards compatibility with v0.2 single-block .dfg files
v0.2.0
- Initial release
License
Server-Lab Open-Control License (SOCL). Copyright © 2025 Sourasish Das.