aiar (AI Archive)
A simple LLM-friendly archive format and utility for creating self-extracting shell archives.
Inspired by the classic Unix shar (shell archive), aiar is a format and a tool for bundling a directory structure and its files into a single, executable bash/zsh script. It's designed to make sending and receiving file collections in a chat-based or text-only environment—like interacting with an LLM—as simple as copying and pasting a single block of text.
Purpose
The primary purpose of the aiar format is to package a project's files into a single text block for use with a Large Language Model. This allows an LLM to receive or transmit a collection of files within a text-only interface, bypassing the need for binary archive formats like .zip. (Yes, I once had an LLM try to send me a Base64-encoded zip file, I kid you not. And, no, it wasn't a valid zip file.)
Key Features
- Single File: The entire archive is one text file. Easy to copy, paste, and save.
- LLM-Friendly: The format is simple for an LLM to generate or consume. Because the file content is never executed, the LLM doesn't need to worry about shell-escaping special characters.
- Self-Contained: The extraction logic is bundled with the data. No external tools like zip or tar are needed to unpack it.
The aiar Format
An aiar script has two main parts, separated by an exit 0 command.
- The Unpacker Logic: A bash script that reads its own file, line by line. It looks for a unique separator line that denotes the start of a new file. This part is optional if you extract with the aiar tool, and leaving it out may even be preferable if you don't want to run code that came directly from an LLM.
- The exit 0 Guard: This command prevents the shell from ever trying to execute the data section below it.
- The Data Payload: The raw, unescaped contents of your files, each preceded by the unique separator line.
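The line-by-line scan described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the aiar implementation: the function name `parse_payload` is hypothetical, it works on an in-memory string rather than on the script file itself, and it returns the decoded contents instead of writing them to disk.

```python
import base64
from typing import Dict, List


def parse_payload(text: str, separator: str) -> Dict[str, bytes]:
    """Split a data payload into {filepath: content} (hypothetical helper)."""
    sections: List[list] = []  # each entry: [kind, path, list_of_body_lines]
    for line in text.splitlines(keepends=True):
        if line.startswith(separator):
            # A separator line looks like "<separator><t|b>:<filepath>".
            header = line[len(separator):].rstrip("\r\n")
            kind, _, path = header.partition(":")
            if kind not in ("t", "b") or not path:
                raise ValueError(f"bad separator line: {line!r}")
            sections.append([kind, path, []])
        elif sections:
            # Lines before the first separator are ignored.
            sections[-1][2].append(line)

    files: Dict[str, bytes] = {}
    for kind, path, body_lines in sections:
        body = "".join(body_lines)
        # Binary entries are base64 text; text entries are kept verbatim.
        files[path] = base64.b64decode(body) if kind == "b" else body.encode("utf-8")
    return files


SEP = "++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:"
demo = (
    f'SEPARATOR="{SEP}"\n'
    f"{SEP}t:example/hello.txt\n"
    "Hello from aiar!\n"
    f"{SEP}b:example/blob.bin\n"
    "AAEC\n"
)
files = parse_payload(demo, SEP)
print(sorted(files))
```

Because the body lines are never interpreted, characters such as `$` or backticks pass through unchanged, which is the property the format relies on.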
Supported Formats
The aiar tool supports multiple output formats:
- bare (.aiar) - Data-only format without self-extraction script. Most LLM-friendly for creation.
- bash (.sh) - Self-extracting Bash script (default). Compatible with Linux/macOS/WSL.
- python (.py) - Self-extracting Python script. Requires Python 3.6+.
- nodejs (.js) - Self-extracting Node.js script. Requires Node.js.
- powershell (.ps1) - Self-extracting PowerShell script. Requires PowerShell 5.1+.
All formats support both text and binary files and can be extracted using the aiar extract command.
Example Archives
Below are examples of each format. Each demonstrates the same two-file archive structure.
Bare Format (.aiar)
The bare format is the simplest and most LLM-friendly. It contains only the data with a separator definition, no extraction logic. This format can be extracted using aiar extract archive.aiar.
Note for LLMs: To create a bare format archive:
- Start with SEPARATOR="..." (use a unique string not found in file contents)
- Mark each file with :t: for text or :b: for binary
- Use format: SEPARATOR:t:filepath followed by file content
- The contents of the binary blobs are base64 encoded.
SEPARATOR="++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:"
++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/README.md
# Example Project
This is a sample file in the bare aiar format.
++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/hello.txt
Hello from aiar!
This file can contain any text content.
Special characters like $, #, ', ", `, $(cmd) are preserved literally.
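The steps in the note above can also be followed programmatically. The sketch below is an illustration only: `make_bare_archive` is a hypothetical helper, not part of the aiar tool, and it makes two simplifying assumptions that are flagged in the comments (anything that is not valid UTF-8 is treated as binary, and a missing final newline is added to a body).

```python
import base64
import uuid
from typing import Dict


def make_bare_archive(files: Dict[str, bytes]) -> str:
    """Build a bare-format archive string (hypothetical helper)."""
    entries = []
    for path, data in files.items():
        try:
            # Assumption: valid UTF-8 is stored as a text ("t") entry.
            entries.append(("t", path, data.decode("utf-8")))
        except UnicodeDecodeError:
            # Everything else becomes a base64-encoded binary ("b") entry.
            entries.append(("b", path, base64.encodebytes(data).decode("ascii")))

    # Draw random separators until one does not occur in any body.
    while True:
        separator = f"++++++++++--------:{uuid.uuid4()}:"
        if not any(separator in body for _, _, body in entries):
            break

    parts = [f'SEPARATOR="{separator}"\n']
    for kind, path, body in entries:
        parts.append(f"{separator}{kind}:{path}\n")
        # Simplification: make sure the next separator starts on its own line.
        parts.append(body if not body or body.endswith("\n") else body + "\n")
    return "".join(parts)


archive = make_bare_archive({
    "example/hello.txt": b"Hello from aiar!\n",
    "example/blob.bin": b"\xff\x00",
})
print(archive)
```

Drawing a fresh UUID for each archive is what makes it reasonably safe to archive other archives, since a nested archive carries a different separator.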
Bash Format (.sh)
The Bash format is a self-extracting shell script. Save as archive.sh and run with bash archive.sh.
#!/bin/bash
# aiar: AI Archive - Self-extracting script generated by aiar-py. #
#
# Note to LLMs: this archive contains multiple files with separator lines. #
# Text files are contained between separators verbatim, binary files are base64-encoded. #
# Every line ends with a '#' character to guard against a possible CRLF line ending that #
# would otherwise cause the script to fail in cases where CRLF line endings are not supported. #
# Choose a random separator to avoid conflicts when archiving archives. #
#
SEPARATOR="++++++++++--------:8c7163c6-4902-46b0-9629-f75517de083c:" #
writing=false #
#
# Function to report errors and exit cleanly #
handle_error() { #
    echo "Error: $1" >&2 #
    exit 1 #
} #
#
# Function to close the previous file descriptor and wait for bg processes #
close_previous_fd() { #
    if [ "$writing" = true ]; then #
        exec 3>&- #
        # Wait for any background process (like base64) to finish #
        wait 2>/dev/null || true #
    fi #
    writing=false #
} #
#
# IFS= keeps leading/trailing whitespace; the || test handles a final line without a newline #
while IFS= read -r line || [ -n "$line" ]; do #
    if [[ "$line" == "$SEPARATOR"* ]]; then #
        close_previous_fd #
#
        payload="${line#$SEPARATOR}" #
        IFS=':' read -r type filepath <<< "$payload" #
        # Strip any trailing carriage returns (DOS line endings) #
        filepath="${filepath%$'\r'}" #
#
        if [ -n "$filepath" -a ! -e "$filepath" ]; then #
            echo "Creating: $filepath" #
            mkdir -p "$(dirname "$filepath")" || handle_error "Cannot create directory for '$filepath'." #
#
            if [ "$type" == "b" ]; then #
                # Use process substitution to pipe output to base64 decoder #
                # Wrap the entire pipeline in a single process that can be waited on #
                # Use sed to strip any trailing carriage returns from base64 input #
                exec 3> >( #
                    error_file="$(mktemp)" #
                    trap "rm -f \"$error_file\"" EXIT #
                    sed 's/\r$//' | base64 -d > "$filepath" 2>"$error_file" #
                    if [ -s "$error_file" ]; then #
                        echo "Error: base64 decoding failed for '$filepath':" >&2 #
                        cat "$error_file" >&2 #
                        rm -f "$filepath" #
                        exit 1 #
                    fi #
                ) || handle_error "Cannot start base64 process for '$filepath'." #
                writing=true #
            elif [ "$type" == "t" ]; then #
                exec 3>"$filepath" || handle_error "Cannot open '$filepath' for writing." #
                writing=true #
            else #
                handle_error "Invalid file type '$type' in separator." #
            fi #
        else #
            echo "Skipping already existing file: '$filepath'" #
        fi #
    elif [ "$writing" = true ]; then #
        # printf writes the line as-is, even if it looks like an echo option such as -n #
        printf '%s\n' "$line" >&3 #
    fi #
done < "$0" #
#
close_previous_fd # Close the very last file #
#
echo "Extraction complete." #
exit 0 #
#
# --- DATA --- #
#
++++++++++--------:8c7163c6-4902-46b0-9629-f75517de083c:t:example/hello.txt
Hello from aiar!
She said, "He's going to the store for $5."
++++++++++--------:8c7163c6-4902-46b0-9629-f75517de083c:t:example/README.md
# Example Project
This file includes special characters: $PATH, #comment, 'quotes', "quotes", `backticks`, $(cmd)
All are preserved literally.
Python Format (.py)
The Python format is a self-extracting Python script. Save as archive.py and run with python archive.py. Files are embedded as commented lines with # prefix.
import sys, os, re, base64
from pathlib import Path

SEPARATOR="++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:"
SEP = re.escape(SEPARATOR)

def _safe_dest(rel: str) -> Path:
    p = Path(rel)
    if p.is_absolute():
        raise ValueError(f"Absolute path not allowed: {rel}")
    dest = (Path(".") / p).resolve()
    if Path(".").resolve() not in (set(dest.parents) | {dest}):
        raise ValueError(f"Path escapes output root: {rel}")
    return dest

def extract_all():
    with open(__file__, "r", encoding="utf-8") as f:
        script_content = f.read()
    pat = re.compile(
        rf"^# ?{SEP}([tb]):([^\n]+)\n(.*?)(?=^# ?{SEP}[tb]:|\Z)",
        re.DOTALL | re.MULTILINE,
    )
    any_found = False
    for ftype, path, body in pat.findall(script_content):
        any_found = True
        path = path.strip()
        try:
            dest = _safe_dest(path)
        except ValueError as e:
            print(f"Warning: {e}. Skipping.")
            continue
        if dest.exists():
            print(f"Skipping already existing file: '{dest}'")
            continue
        print(f"Creating: {dest}")
        dest.parent.mkdir(parents=True, exist_ok=True)
        uncommented_body = re.sub(r"^# ?", "", body, flags=re.MULTILINE)
        if ftype == "t":
            with open(dest, "w", encoding="utf-8", newline="\n") as out:
                out.write(uncommented_body)
        else:  # binary
            with open(dest, "wb") as out:
                out.write(base64.b64decode(uncommented_body.strip().encode("ascii"), validate=False))
    if not any_found:
        print("Error: No payload sections found in data block.")
        sys.exit(1)

extract_all()
print("Extraction complete.")
sys.exit(0)
# ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/README.md
# # Example Project
#
# This file includes special characters: $PATH, #comment, 'quotes', "quotes", `backticks`, $(cmd)
# All are preserved literally.
#
# ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/hello.txt
# Hello from aiar!
# She said, "He's going to the store for $5."
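The Python, Node.js, and PowerShell formats all embed the payload as comment lines so the interpreter never sees file content as code. A rough sketch of that round trip follows. The helper names `comment_out` and `uncomment` are hypothetical, and the exact bytes aiar emits may differ; the point is only that prefixing each line and later stripping one prefix restores the original text.

```python
import re


def comment_out(body: str, prefix: str = "#") -> str:
    """Prefix every line so the payload is inert inside a script."""
    # Non-empty lines get "<prefix> <line>"; empty lines get the bare prefix.
    return "\n".join(f"{prefix} {line}" if line else prefix
                     for line in body.split("\n"))


def uncomment(block: str, prefix: str = "#") -> str:
    """Strip one leading prefix (and one optional space) from each line."""
    return re.sub(rf"^{re.escape(prefix)} ?", "", block, flags=re.MULTILINE)


sample = "# Example Project\n\nUses $PATH and #comment\n"
embedded = comment_out(sample)
print(embedded)
print(uncomment(embedded) == sample)
```

Only one prefix is removed per line, so content that itself starts with `#` (a Markdown heading, for example) survives the round trip.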
Node.js Format (.js)
The Node.js format is a self-extracting Node.js script. Save as archive.js and run with node archive.js. Files are embedded as commented lines with // prefix.
#!/usr/bin/env node
const fs = require('fs');
const path = require('path');

function escapeRegex(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const SEPARATOR = "++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:";
const SEP = escapeRegex(SEPARATOR);

function safeDest(rel) {
  if (path.isAbsolute(rel)) {
    throw new Error(`Absolute path not allowed: ${rel}`);
  }
  const root = process.cwd();
  const dest = path.resolve(root, rel);
  // Compare via path.relative: a plain prefix test would accept a sibling like "<root>2".
  const relToRoot = path.relative(root, dest);
  if (relToRoot === '..' || relToRoot.startsWith('..' + path.sep) || path.isAbsolute(relToRoot)) {
    throw new Error(`Path escapes output root: ${rel}`);
  }
  return dest;
}

function extractAll() {
  const scriptContent = fs.readFileSync(__filename, 'utf8');
  // JavaScript has no \Z anchor; (?![\s\S]) matches only at the end of the input.
  const pat = new RegExp(
    `^// ?${SEP}([tb]):([^\\n]+)\\n(.*?)(?=(^// ?${SEP}[tb]:|(?![\\s\\S])))`,
    'gms'
  );
  const matches = [...scriptContent.matchAll(pat)];
  if (matches.length === 0) {
    console.error("Error: No payload sections found in data block.");
    process.exit(1);
  }
  for (const match of matches) {
    const [, ftype, relPath, body] = match;
    const cleanPath = relPath.trim();
    let dest;
    try {
      dest = safeDest(cleanPath);
    } catch (e) {
      console.warn(`Warning: ${e.message}. Skipping.`);
      continue;
    }
    if (fs.existsSync(dest)) {
      console.log(`Skipping already existing file: '${dest}'`);
      continue;
    }
    console.log(`Creating: ${dest}`);
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    const uncommentedBody = body.replace(/^\/\/ ?/gm, '');
    if (ftype === 't') {
      fs.writeFileSync(dest, uncommentedBody, { encoding: 'utf8' });
    } else {
      const buffer = Buffer.from(uncommentedBody.trim(), 'base64');
      fs.writeFileSync(dest, buffer);
    }
  }
}

extractAll();
console.log("Extraction complete.");
process.exit(0);
// ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/README.md
// # Example Project
//
// This file includes special characters: $PATH, #comment, 'quotes', "quotes", `backticks`, $(cmd)
// All are preserved literally.
//
// ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/hello.txt
// Hello from aiar!
// She said, "He's going to the store for $5."
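The self-extracting scripts reject absolute paths and paths that would land outside the directory they are run from. One detail worth care in any such check is that a plain string-prefix comparison treats `/tmp/out2` as if it were inside `/tmp/out`. The sketch below shows one way to make the test component-wise in Python. The helper name `is_inside` is hypothetical, the example paths assume a POSIX system, and symlinks are not resolved.

```python
import os


def is_inside(root: str, rel: str) -> bool:
    """Return True if rel, resolved against root, stays inside root."""
    if os.path.isabs(rel):
        return False
    root_abs = os.path.abspath(root)
    dest = os.path.abspath(os.path.join(root_abs, rel))
    # commonpath compares whole path components, not raw characters.
    return os.path.commonpath([root_abs, dest]) == root_abs


print(is_inside("/tmp/out", "a/b.txt"))
print(is_inside("/tmp/out", "../out2/x"))
# The naive prefix test wrongly accepts the sibling directory:
print("/tmp/out2/x".startswith("/tmp/out"))
```

An archive received from an untrusted source can name any path it likes, so a check of this kind is what keeps extraction confined to the current directory.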
PowerShell Format (.ps1)
The PowerShell format is a self-extracting PowerShell script. Save as archive.ps1 and run with powershell -ExecutionPolicy Bypass -File archive.ps1. Files are embedded as commented lines with # prefix.
#Requires -Version 5.1
$SEPARATOR="++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:"

function Escape-Regex {
    param([string]$String)
    return [System.Text.RegularExpressions.Regex]::Escape($String)
}

function Safe-Dest {
    param([string]$RelativePath)
    if ([System.IO.Path]::IsPathRooted($RelativePath)) {
        throw "Absolute path not allowed: $RelativePath"
    }
    $resolvedPath = [System.IO.Path]::GetFullPath((Join-Path -Path $PWD.Path -ChildPath $RelativePath))
    if (-not $resolvedPath.StartsWith($PWD.Path)) {
        throw "Path escapes output root: $RelativePath"
    }
    return $resolvedPath
}

function Extract-All {
    $scriptPath = $PSCommandPath
    $scriptContent = Get-Content -Path $scriptPath -Raw
    $sep = Escape-Regex "$SEPARATOR"
    $pattern = "(?ms)^#\s?$sep([tb]):([^\n]+)\n(.*?)(?=(^#\s?$sep[tb]:|\Z))"
    $matches = [System.Text.RegularExpressions.Regex]::Matches($scriptContent, $pattern)
    if ($matches.Count -eq 0) {
        Write-Error "No payload sections found in data block."
        exit 1
    }
    foreach ($match in $matches) {
        $ftype = $match.Groups[1].Value
        $relPath = $match.Groups[2].Value.Trim()
        $body = $match.Groups[3].Value
        try {
            $dest = Safe-Dest -RelativePath $relPath
        } catch {
            Write-Warning "Warning: $_. Skipping."
            continue
        }
        if (Test-Path -LiteralPath $dest) {
            Write-Output "Skipping already existing file: '$dest'"
            continue
        }
        Write-Output "Creating: $dest"
        $null = New-Item -ItemType Directory -Force -Path (Split-Path -Path $dest -Parent)
        $uncommentedBody = $body -replace '(?m)^#\s?' , ''
        if ($ftype -eq 't') {
            Set-Content -Path $dest -Value $uncommentedBody -NoNewline -Encoding utf8
        } elseif ($ftype -eq 'b') {
            $cleanBase64String = $uncommentedBody -replace '\s'
            $bytes = [System.Convert]::FromBase64String($cleanBase64String)
            [System.IO.File]::WriteAllBytes($dest, $bytes)
        } else {
            Write-Warning "Unknown file type '$ftype' for '$relPath'. Skipping."
        }
    }
}

Extract-All
Write-Output "Extraction complete."
exit 0
# --- PAYLOAD ---
# ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/README.md
# # Example Project
#
# This file includes special characters: $PATH, #comment, 'quotes', "quotes", `backticks`, $(cmd)
# All are preserved literally.
#
# ++++++++++--------:a1b2c3d4-5678-90ab-cdef-1234567890ab:t:example/hello.txt
# Hello from aiar!
# She said, "He's going to the store for $5."
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file aiar-0.1.7.tar.gz.
File metadata
- Download URL: aiar-0.1.7.tar.gz
- Upload date:
- Size: 21.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d4be6991e20aff3b3c5a2bf25348ab59fcef54709691365eb6e0d44556365cc |
| MD5 | d8eaa45734f6af9a5eaf418f6fa6c33d |
| BLAKE2b-256 | bf1a0e649b3cc77258e5ecf4c6df500c793f7e1e162224fff1946f65539c6422 |
Provenance
The following attestation bundles were made for aiar-0.1.7.tar.gz:
Publisher: publish.yml on owebeeone/aiar
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aiar-0.1.7.tar.gz
- Subject digest: 3d4be6991e20aff3b3c5a2bf25348ab59fcef54709691365eb6e0d44556365cc
- Sigstore transparency entry: 607696217
- Sigstore integration time:
- Permalink: owebeeone/aiar@c55b1847358d30cc57a7a0a8a3f37ebb8f986828
- Branch / Tag: refs/tags/v0.1.7
- Owner: https://github.com/owebeeone
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c55b1847358d30cc57a7a0a8a3f37ebb8f986828
- Trigger Event: release
File details
Details for the file aiar-0.1.7-py3-none-any.whl.
File metadata
- Download URL: aiar-0.1.7-py3-none-any.whl
- Upload date:
- Size: 18.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5def6f4d30d9815ba8bd4b6fdef2e3098ee2b05dcccc269971ee4cba3c80129f |
| MD5 | 8339a4213fea54f1b658e639b84342d5 |
| BLAKE2b-256 | 2f9818a5a38ffd60acca522ecf42473e794ded8c2caf896be3ce116f2119b25c |
Provenance
The following attestation bundles were made for aiar-0.1.7-py3-none-any.whl:
Publisher: publish.yml on owebeeone/aiar
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aiar-0.1.7-py3-none-any.whl
- Subject digest: 5def6f4d30d9815ba8bd4b6fdef2e3098ee2b05dcccc269971ee4cba3c80129f
- Sigstore transparency entry: 607696221
- Sigstore integration time:
- Permalink: owebeeone/aiar@c55b1847358d30cc57a7a0a8a3f37ebb8f986828
- Branch / Tag: refs/tags/v0.1.7
- Owner: https://github.com/owebeeone
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c55b1847358d30cc57a7a0a8a3f37ebb8f986828
- Trigger Event: release