Pip-installable client for the Glyph Forge API
Glyph Forge
A Python framework for turning LLM plaintext into styled DOCX documents. Inspired by HTML/CSS and Tailwind design patterns — schemas define the baseline formatting, inline markup handles the overrides, and AI agents handle the rest.
Installation
pip install glyph-forge
Quick Start
from glyph_forge import ForgeClient, create_workspace
ws = create_workspace()
client = ForgeClient()
# Build schema from a reference DOCX
schema = client.build_schema_from_docx(ws, docx_path="template.docx", save_as="my_schema")
# Generate a new DOCX from plaintext
docx_path = client.run_schema(
    ws,
    schema=schema,
    plaintext="Your content here...",
    dest_name="output.docx",
)
Tools & Their Purpose
Glyph Forge has several tools. Each one does a specific job. Understanding what each tool is (and isn't) for is key to building reliable workflows.
Schemas — Your Baseline Formatter
What it does: A schema maps heuristic types (headings, paragraphs, lists, tables) to styling rules. When you run a schema against plaintext, Glyph classifies each line by its structural role and applies the matching style.
What it is NOT: A schema is not AI. It does not understand meaning, context, or semantics. It matches structural patterns — a short title-cased line is a heading, a line starting with • is a bullet, and so on.
When to use it: Always. The schema is the foundation of every Glyph workflow. Start here.
Performance: Schemas compile in milliseconds. No API key, no network calls, no latency.
# Schema selectors target structural patterns.
# H-SHORT matches short headings (title case, ALL CAPS, <=6 words).
{
    "type": "H-SHORT",
    "style": {"font": {"bold": true, "size": 18}}
}
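To make "structural patterns, not meaning" concrete, here is a minimal sketch of a line classifier built only from the rules mentioned above (a bullet prefix, short title-case or ALL CAPS lines). It is an illustration under those assumptions, not Glyph Forge's heuristic engine; the function name `classify_line` and the `BLANK` label are hypothetical.

```python
def classify_line(line: str) -> str:
    """Toy structural classifier. NOT Glyph Forge's actual heuristic engine."""
    stripped = line.strip()
    if not stripped:
        return "BLANK"  # hypothetical label for empty lines
    if stripped.startswith("•"):
        return "L-BULLET"
    words = stripped.split()
    has_letters = any(ch.isalpha() for ch in stripped)
    is_all_caps = stripped.isupper()
    # Title case here means: every word that starts with a letter starts uppercase.
    is_title_case = has_letters and all(
        w[0].isupper() for w in words if w[0].isalpha()
    )
    ends_like_sentence = stripped.endswith((".", "!", "?"))
    if len(words) <= 6 and (is_all_caps or is_title_case) and not ends_like_sentence:
        return "H-SHORT"
    return "P-BODY"


sample = [
    "Professional Summary",
    "Seasoned engineer with ten years of experience.",
    "• Led a team of five",
    "EDUCATION",
]
for line in sample:
    print(f"{classify_line(line):9} {line}")
```

Note that nothing in the sketch looks at what the words mean: "Professional Summary" and "Education" are both just short headings, which is the gap inline markup and the agents are meant to fill.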
Inline Markup — Context-Aware Overrides
What it does: Inline markup lets you (or an LLM) embed styling instructions directly in the plaintext. Block markup ($glyph-{utilities}) wraps entire paragraphs. Inline markup ([utilities]text[/]) styles specific words or phrases.
What it is NOT: A replacement for schemas. Markup handles exceptions and overrides — it is not meant to style every line of a document from scratch.
When to use it: When you need to style something a schema can't identify on its own. A schema knows what a heading looks like structurally, but it doesn't know that "Professional Summary" is a section you want bolded in blue. An LLM does.
Cascade rule: Inline markup always overrides schema styles. [bold,color-FF0000] on a word wins over whatever the schema says for that line.
$glyph-font-size-11
This is normal body text, but [bold,color-FF0000]this phrase[/] stands out.
$glyph
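The cascade rule can be sketched as a dictionary merge in which inline utilities are applied last. The snippet below is a simplified illustration, not the library's parser: the flat style dictionary, the helper names `parse_utilities` and `split_runs`, and the assumption that a utility's value is whatever follows its last hyphen are all hypothetical.

```python
import re

# Matches [utilities]text[/]; the opening tag may not contain "/" or brackets.
INLINE_TAG = re.compile(r"\[([^\[\]/]+)\](.*?)\[/\]")


def parse_utilities(spec: str) -> dict:
    """Turn 'bold,color-FF0000' into {'bold': True, 'color': 'FF0000'}."""
    style = {}
    for util in spec.split(","):
        util = util.strip()
        name, sep, value = util.rpartition("-")
        if sep:
            style[name] = value   # e.g. color-FF0000, font-size-11
        else:
            style[util] = True    # bare flag such as bold
    return style


def split_runs(line: str, base_style: dict) -> list:
    """Split a line into (text, style) runs; inline utilities override base_style."""
    runs, pos = [], 0
    for match in INLINE_TAG.finditer(line):
        if match.start() > pos:
            runs.append((line[pos:match.start()], dict(base_style)))
        merged = {**base_style, **parse_utilities(match.group(1))}  # inline wins
        runs.append((match.group(2), merged))
        pos = match.end()
    if pos < len(line):
        runs.append((line[pos:], dict(base_style)))
    return runs


base = {"bold": False, "color": "000000", "font-size": "11"}
line = "This is normal body text, but [bold,color-FF0000]this phrase[/] stands out."
for text, style in split_runs(line, base):
    print(repr(text), style)
```

Only the tagged run picks up `bold` and the red color; the text before and after it keeps the baseline style, and the `font-size` the inline tag did not mention is inherited unchanged.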
Plaintext Agent (Markup Agent) — LLM-Powered Styling
What it does: You describe what you want in natural language, and the agent rewrites the plaintext with the appropriate $glyph blocks and [utilities]text[/] inline tags inserted.
What it is NOT: A content generator. It does not write or rewrite your text. It wraps existing text in markup.
When to use it: When you already have an established schema and want to apply styling that requires understanding the meaning of the text — things like "bold the professional summary" or "make the warning section red."
Requires: API key. This is an AI agent, so it adds processing time.
# The agent reads the plaintext, understands the request, and inserts markup
response = client.ask(message="Make the professional summary bold", current_plaintext=plaintext)
marked_up = response["plaintext"]
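Because the agent is only supposed to wrap existing text, one inexpensive safeguard is to strip the markup from its output and compare the result with the input. The following is a sketch of that check, assuming the markup syntax shown in this README (`$glyph...` marker lines and `[utilities]text[/]` tags); `strip_markup` and `content_unchanged` are hypothetical helpers, not part of the client.

```python
import re

# Matches a closing [/] or an opening tag such as [bold] or [bold,color-FF0000].
INLINE_TAGS = re.compile(r"\[/\]|\[[A-Za-z][\w,\-]*\]")


def strip_markup(text: str) -> str:
    """Remove $glyph marker lines and inline tags, leaving only the content."""
    kept = []
    for line in text.splitlines():
        if line.strip().startswith("$glyph"):
            continue  # block open/close markers carry no content
        kept.append(INLINE_TAGS.sub("", line))
    return "\n".join(kept)


def content_unchanged(original: str, marked_up: str) -> bool:
    """True if the marked-up text has the same content as the original."""
    return strip_markup(marked_up) == strip_markup(original)


original = "Professional Summary\nSeasoned engineer with ten years of experience."
agent_output = (
    "$glyph-bold\n"
    "Professional Summary\n"
    "$glyph\n"
    "Seasoned [bold]engineer[/] with ten years of experience."
)
print(content_unchanged(original, agent_output))
```

The comparison is exact, so an agent that adds or removes blank lines would fail it; normalizing whitespace first is a reasonable relaxation if that turns out to be too strict.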
Schema Agent — Developer Scaffolding
What it does: Helps you quickly draft or edit schemas through natural language prompts. You describe the document structure you want, and it generates selector JSON.
What it is NOT: A source of truth. Schemas generated by this agent should be reviewed, tested, and stored in your backend. Do not use agent-generated schemas in production without human review.
When to use it: During development, to bootstrap a schema quickly. Think of it like a code generator — useful to get started, but you own the output.
Requires: API key.
XML Agent — Experimental Final Polish
What it does: Operates directly on the unzipped DOCX XML structure. An LLM identifies a target element in the XML and writes modifications to it.
What it is NOT: A content generator or primary formatter. Do not use it to style an entire document. Its job is surgical, targeted edits — a final polish step when the schema and markup aren't enough.
When to use it: Rarely. When you need something that can't be expressed through schemas or markup — for example, modifying a specific XML attribute that Glyph's styling utilities don't cover. In theory an LLM can write anything to a DOCX with this method, but it requires precision.
Status: Beta. The accuracy and reliability of direct XML writing are still being researched.
Requires: API key.
Form Detection — Heuristic Line Classification
What it does: Classifies each line of plaintext by its structural form (H-SHORT, L-BULLET, P-BODY, T-ROW, etc.) using the same heuristic engine that powers schemas. Returns a list of classifications with confidence scores.
What it is NOT: AI. This is the same deterministic heuristic engine used by schemas, exposed as a standalone tool.
When to use it: When you want to understand what Glyph "sees" in your plaintext before building a schema. Also useful for filtering — extract only headings, or only list items, from a large document.
Performance: Local, milliseconds, no API key.
result = client.detect_forms(ws, text=text, forms=["H-SHORT", "L-BULLET"])
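Since each classification carries a confidence score, a caller can drop low-confidence matches before acting on them. The sketch below runs on hand-written sample data; the `form` and `confidence` key names and the 0.8 threshold are assumptions for illustration (only `classifications` and `text` appear elsewhere in this README), so check the actual response shape before relying on them.

```python
def confident_lines(result: dict, form: str, min_confidence: float = 0.8) -> list:
    """Return the text of classifications matching `form` at or above the threshold."""
    return [
        c["text"]
        for c in result.get("classifications", [])
        if c.get("form") == form and c.get("confidence", 0.0) >= min_confidence
    ]


# Hand-written stand-in for a detect_forms() response (shape is assumed).
sample_result = {
    "classifications": [
        {"text": "Professional Summary", "form": "H-SHORT", "confidence": 0.95},
        {"text": "Built APIs", "form": "H-SHORT", "confidence": 0.55},
        {"text": "• Led a team of five", "form": "L-BULLET", "confidence": 0.99},
    ]
}

print(confident_lines(sample_result, "H-SHORT"))
```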
Document Chunking — Heading-Bounded Splitting
What it does: Splits plaintext or DOCX files at heading boundaries, producing independent chunks that can be processed one at a time.
What it is NOT: Semantic chunking. It splits at structural heading boundaries detected by heuristics, not by topic or meaning.
When to use it: To reduce LLM context window usage. Instead of sending a 50-page document to an LLM, chunk it and process one section at a time. Works with both plaintext files and DOCX files.
Performance: Local, milliseconds, no API key.
result = client.chunk_plaintext_text(ws, text=text)
for chunk in result["chunks"]:
    llm_response = call_llm(chunk["plaintext"])  # Each chunk fits in context
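For intuition about what "heading-bounded" means, here is a small sketch that starts a new chunk whenever a line looks like a heading. The heading test is a deliberately crude stand-in for Glyph's heuristics, and `looks_like_heading`, `chunk_at_headings`, and the `heading` key are hypothetical; only the `plaintext` key mirrors the example above.

```python
def looks_like_heading(line: str) -> bool:
    """Crude heading test: short, title case or ALL CAPS, no closing punctuation."""
    stripped = line.strip()
    words = stripped.split()
    if not words or len(words) > 6:
        return False
    if stripped.endswith((".", ":", ",", ";")):
        return False
    return stripped.isupper() or all(w[0].isupper() for w in words)


def chunk_at_headings(text: str) -> list:
    """Split text into chunks, starting a new chunk at each heading line."""
    groups, current = [], []
    for line in text.splitlines():
        if looks_like_heading(line) and current:
            groups.append(current)
            current = []
        current.append(line)
    if current:
        groups.append(current)

    chunks = []
    for lines in groups:
        body = "\n".join(lines).strip()
        if body:  # skip groups that contain only blank lines
            heading = lines[0].strip() if looks_like_heading(lines[0]) else None
            chunks.append({"heading": heading, "plaintext": body})
    return chunks


document = (
    "Summary\n"
    "Seasoned engineer with ten years of experience.\n"
    "\n"
    "Work History\n"
    "• Led a team of five.\n"
    "• Shipped three products.\n"
)
for chunk in chunk_at_headings(document):
    print(chunk["heading"], "->", len(chunk["plaintext"]), "chars")
```

The split is purely positional: two sections on the same topic stay separate if a heading sits between them, and one section covering two topics stays together if no heading divides it.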
Tool Summary
| Tool | AI? | API Key? | Speed | Purpose |
|---|---|---|---|---|
| Schema | No | No | Milliseconds | Baseline structural styling |
| Inline Markup | No | No | Milliseconds | Embedded style overrides in plaintext |
| Plaintext Agent | Yes | Yes | Seconds | LLM applies markup based on meaning |
| Schema Agent | Yes | Yes | Seconds | LLM drafts/edits schemas |
| XML Agent | Yes | Yes | Seconds | Direct DOCX XML modifications (beta) |
| Form Detection | No | No | Milliseconds | Classify lines by heuristic form |
| Chunking | No | No | Milliseconds | Split documents at heading boundaries |
| Indexing | No | No | Milliseconds | Structured document index with sections and form-annotated segments |
Workflow Patterns
Pattern 1: Schema Only (Fastest)
The simplest path. Good when your document structure is consistent and predictable.
LLM writes plaintext --> Schema styles heuristics (milliseconds, no AI) --> DOCX
schema = client.build_schema_from_docx(ws, docx_path="template.docx")
docx_path = client.run_schema(ws, schema=schema, plaintext=plaintext)
Pattern 2: Schema + Markup Agent (Most Common)
Schema handles the structural baseline, then the markup agent adds context-aware overrides.
LLM writes plaintext --> Markup agent inserts styling (seconds, requires API key) --> Schema compiles (milliseconds) --> DOCX
schema = client.build_schema_from_docx(ws, docx_path="template.docx")
# Agent understands "professional summary" semantically and inserts markup
response = client.ask(
    message="Bold the professional summary and make section headers dark blue",
    current_plaintext=plaintext,
)
marked_up_plaintext = response["plaintext"]
docx_path = client.run_schema(ws, schema=schema, plaintext=marked_up_plaintext)
Pattern 3: Chunk + Process (Large Documents)
For documents that exceed LLM context windows, chunk first, process per-section, reassemble.
Document --> Chunk at headings (milliseconds) --> Process each chunk (per-chunk LLM calls) --> Reassemble --> Schema --> DOCX
chunks = client.chunk_plaintext_text(ws, text=full_document)
processed_sections = []
for chunk in chunks["chunks"]:
    result = call_llm(chunk["plaintext"])  # Your LLM call
    processed_sections.append(result)
final_plaintext = "\n".join(processed_sections)
docx_path = client.run_schema(ws, schema=schema, plaintext=final_plaintext)
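Because the chunks are independent, the per-chunk calls do not have to run one after another. The sketch below is one possible variation on the loop above, not something the library provides: it uses the standard library's `ThreadPoolExecutor.map`, which returns results in input order, so reassembly stays correct. `call_llm` here is a local stand-in that upper-cases the first line, and `process_chunks` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def call_llm(section: str) -> str:
    """Stand-in for a real LLM call: upper-cases the section's first line."""
    head, sep, rest = section.partition("\n")
    return head.upper() + sep + rest


def process_chunks(chunks: list, max_workers: int = 4) -> str:
    """Process chunk plaintexts concurrently and rejoin them in original order."""
    sections = [chunk["plaintext"] for chunk in chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which call finishes first.
        processed = list(pool.map(call_llm, sections))
    return "\n".join(processed)


chunks = [
    {"plaintext": "Intro\nfirst section body"},
    {"plaintext": "Details\nsecond section body"},
]
print(process_chunks(chunks))
```

Whether concurrency is worthwhile depends on your LLM provider's rate limits; with a strict limit, the sequential loop may be the safer choice.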
Pattern 4: Detect + Filter (Pre-Processing)
Use form detection to extract or filter specific content types before processing.
Document --> Detect forms (milliseconds) --> Filter by type --> Process subset
result = client.detect_forms(ws, text=text, forms=["H-SHORT", "H-SECTION-N"])
headings = [c["text"] for c in result["classifications"]]
# Use headings as a table of contents, outline, or navigation structure
Pattern 5: Index + Annotate (Structured Pre-Processing)
Build a structured index of your document with heading-bounded sections and form-annotated segments. Extract exactly what you need before calling an LLM.
Document --> Index at headings (milliseconds) --> Annotate segments (milliseconds) --> Extract specific forms (filter by type) --> LLM
result = client.index_document(
    ws,
    text=full_document,
    annotate_forms=["L-BULLET", "T-ROW"],
)
for sec in result["sections"]:
    bullets = [s for s in sec["segments"] if s["form"] == "L-BULLET"]
    if bullets:
        # Only send bullet content to the LLM; this reduces tokens and noise
        for b in bullets:
            llm_response = call_llm(b["content"])
Building a Workflow: Resume Builder Example
Here's how the tools layer together for a real use case.
Step 1 — Build your schema. Look at your reference resume DOCX. Note the font sizes, colors, and spacing for headings, body text, bullet lists. Create selectors for each:
schema = client.build_schema_from_docx(ws, docx_path="resume_template.docx", save_as="resume")
At this point, LLM plaintext -> schema -> DOCX already produces a solid result with correct heading sizes, bullet formatting, and paragraph spacing. This runs in milliseconds with no API key.
Step 2 — Add markup for semantic styling. A schema knows what a heading looks like structurally, but it doesn't know that "Professional Summary" should be styled differently from "Education." Use the markup agent:
response = client.ask(
    message="Bold the professional summary section header and make it dark blue",
    current_plaintext=resume_plaintext,
)
The agent returns the same plaintext, but now the professional summary line is wrapped in $glyph-bold-color-1F4E78 block markup. When the schema runs, that markup overrides the default heading style for just that section.
Step 3 — Compile. Feed the marked-up plaintext through the schema:
docx_path = client.run_schema(
    ws,
    schema=schema,
    plaintext=response["plaintext"],
    dest_name="resume_output.docx",
)
The key insight: Schemas are fast and deterministic. Agents are smart but slow. Layer them — schema for the 90% that's structural, agents for the 10% that requires understanding.
CLI Usage
# Build schema from template
glyph-forge build template.docx -o ./output
# Build and run in one command
glyph-forge build-and-run template.docx input.txt -o ./output
# Run existing schema
glyph-forge run schema.json input.txt -o ./output
# Detect forms in plaintext
glyph-forge detect-forms document.txt --forms H-SHORT,L-BULLET
# Chunk a document
glyph-forge chunk report.txt
glyph-forge chunk report.docx
# Index a document with annotations
glyph-forge index document.txt --annotate-forms L-BULLET,T-ROW
glyph-forge index report.docx
Documentation
Full documentation: glyphapi.ai
License
Apache License 2.0 - see LICENSE for details.
Copyright 2025 Devpro LLC