Pip-installable client for the Glyph Forge API

Glyph Forge

A Python framework for turning LLM plaintext into styled DOCX documents. Inspired by HTML/CSS and Tailwind design patterns — schemas define the baseline formatting, inline markup handles the overrides, and AI agents handle the rest.

Installation

pip install glyph-forge

Quick Start

from glyph_forge import ForgeClient, create_workspace

ws = create_workspace()
client = ForgeClient()

# Build schema from a reference DOCX
schema = client.build_schema_from_docx(ws, docx_path="template.docx", save_as="my_schema")

# Generate a new DOCX from plaintext
docx_path = client.run_schema(
    ws,
    schema=schema,
    plaintext="Your content here...",
    dest_name="output.docx"
)

Tools & Their Purpose

Glyph Forge has several tools. Each one does a specific job. Understanding what each tool is (and isn't) for is key to building reliable workflows.

Schemas — Your Baseline Formatter

What it does: A schema maps heuristic types (headings, paragraphs, lists, tables) to styling rules. When you run a schema against plaintext, Glyph classifies each line by its structural role and applies the matching style.

What it is NOT: A schema is not AI. It does not understand meaning, context, or semantics. It matches structural patterns: a short title-cased line is a heading, a line starting with a bullet marker is a bullet, and so on.

When to use it: Always. The schema is the foundation of every Glyph workflow. Start here.

Performance: Schemas compile in milliseconds. No API key, no network calls, no latency.

# Schema selectors target structural patterns
{
    "type": "H-SHORT",       # Short headings (title case, ALL CAPS, <=6 words)
    "style": {"font": {"bold": true, "size": 18}}
}
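
To make the "structural patterns, not meaning" point concrete, here is a minimal sketch of how a line could be classified and matched to a selector. It is not Glyph's actual engine: the classify_line and style_for helpers, the bullet regex, and the L-BULLET and P-BODY styles are assumptions made for illustration, and only the H-SHORT selector shape comes from the snippet above.

```python
import re

# Hypothetical selectors in the shape shown above; not a real Glyph schema.
SELECTORS = [
    {"type": "H-SHORT", "style": {"font": {"bold": True, "size": 18}}},
    {"type": "L-BULLET", "style": {"font": {"size": 11}}},
    {"type": "P-BODY", "style": {"font": {"size": 11}}},
]

# Assumed bullet markers: hyphen, asterisk, or a bullet character.
BULLET_RE = re.compile(r"^\s*[-*\u2022]\s+")


def classify_line(line: str) -> str:
    """Toy structural classifier; Glyph's real heuristics are more involved."""
    stripped = line.strip()
    if BULLET_RE.match(line):
        return "L-BULLET"
    words = stripped.split()
    is_short = 0 < len(words) <= 6
    looks_like_title = stripped.isupper() or stripped.istitle()
    if is_short and looks_like_title and not stripped.endswith("."):
        return "H-SHORT"
    return "P-BODY"


def style_for(line: str, selectors: list[dict]) -> dict:
    """Return the style of the first selector whose type matches the line."""
    form = classify_line(line)
    for selector in selectors:
        if selector["type"] == form:
            return selector["style"]
    return {}


for sample in ["Professional Summary", "- Built a data pipeline"]:
    print(sample, "->", classify_line(sample), style_for(sample, SELECTORS))
```

Note that nothing here looks at what the words mean; "Professional Summary" is a heading only because it is short and title-cased.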

Inline Markup — Context-Aware Overrides

What it does: Inline markup lets you (or an LLM) embed styling instructions directly in the plaintext. Block markup ($glyph-{utilities}) wraps entire paragraphs. Inline markup ([utilities]text[/]) styles specific words or phrases.

What it is NOT: A replacement for schemas. Markup handles exceptions and overrides — it is not meant to style every line of a document from scratch.

When to use it: When you need to style something a schema can't identify on its own. A schema knows what a heading looks like structurally, but it doesn't know that "Professional Summary" is a section you want bolded in blue. An LLM does.

Cascade rule: Inline markup always overrides schema styles. [bold,color-FF0000] on a word wins over whatever the schema says for that line.

$glyph-font-size-11
This is normal body text, but [bold,color-FF0000]this phrase[/] stands out.
$glyph
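
The cascade rule can be sketched in a few lines. The parser below is an illustrative assumption, not Glyph's implementation: parse_inline, utilities_to_style, and resolve are hypothetical names, the style dictionary is flattened for brevity, and only the three utilities that appear in this README (bold, color-RRGGBB, font-size-N) are handled.

```python
import re

# Matches [utilities]text[/]; the opening tag must not start with "/".
INLINE_RE = re.compile(r"\[([^\]/][^\]]*)\](.*?)\[/\]")


def parse_inline(text: str) -> list[tuple[str, list[str]]]:
    """Split text into (segment, utilities) runs; plain segments get []."""
    runs = []
    pos = 0
    for match in INLINE_RE.finditer(text):
        if match.start() > pos:
            runs.append((text[pos:match.start()], []))
        utilities = [u.strip() for u in match.group(1).split(",")]
        runs.append((match.group(2), utilities))
        pos = match.end()
    if pos < len(text):
        runs.append((text[pos:], []))
    return runs


def utilities_to_style(utilities: list[str]) -> dict:
    """Translate the utilities seen in this README; unknown ones are ignored."""
    style = {}
    for utility in utilities:
        if utility == "bold":
            style["bold"] = True
        elif utility.startswith("color-"):
            style["color"] = utility[len("color-"):]
        elif utility.startswith("font-size-"):
            style["size"] = int(utility[len("font-size-"):])
    return style


def resolve(schema_style: dict, utilities: list[str]) -> dict:
    """Cascade: start from the schema style, then let markup overwrite it."""
    return {**schema_style, **utilities_to_style(utilities)}


line = "This is normal body text, but [bold,color-FF0000]this phrase[/] stands out."
base = {"bold": False, "size": 11, "color": "000000"}  # assumed schema style
for segment, utilities in parse_inline(line):
    print(repr(segment), resolve(base, utilities))
```

Because the markup values are merged last, they win wherever a key collides with the schema style, and every key the markup does not mention keeps its schema value.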

Plaintext Agent (Markup Agent) — LLM-Powered Styling

What it does: You describe what you want in natural language, and the agent rewrites the plaintext with the appropriate $glyph blocks and [utilities]text[/] inline tags inserted.

What it is NOT: A content generator. It does not write or rewrite your text. It wraps existing text in markup.

When to use it: When you already have an established schema and want to apply styling that requires understanding the meaning of the text — things like "bold the professional summary" or "make the warning section red."

Requires: API key. This is an AI agent, so it adds processing time.

# The agent reads the plaintext, understands the request, and inserts markup
response = client.ask(message="Make the professional summary bold", current_plaintext=plaintext)
marked_up = response["plaintext"]  # The rewritten plaintext, with markup inserted
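
Since the agent is meant to wrap text rather than rewrite it, one cheap safeguard is to strip the markup from its output and compare the result with the original. The helpers below are a hypothetical sketch, not part of glyph_forge; they assume block markers sit on their own lines starting with $glyph and that inline tags follow the [utilities]text[/] form. Text that legitimately contains square brackets (for example "[1]") would be stripped too, so treat a mismatch as a prompt for review rather than proof of an error.

```python
import re

# Matches the closing tag [/] or any opening tag such as [bold,color-FF0000].
INLINE_TAG_RE = re.compile(r"\[/\]|\[[^\]\n]+\]")


def strip_markup(marked_up: str) -> str:
    """Remove $glyph block marker lines and inline tags, keeping the text."""
    kept = []
    for line in marked_up.splitlines():
        if line.strip().startswith("$glyph"):
            continue  # block opener ($glyph-...) or closer ($glyph)
        kept.append(INLINE_TAG_RE.sub("", line))
    return "\n".join(kept)


def _normalize(text: str) -> str:
    """Collapse whitespace so line-break differences do not count as edits."""
    return " ".join(text.split())


def content_preserved(original: str, marked_up: str) -> bool:
    """True when the marked-up text equals the original once markup is removed."""
    return _normalize(strip_markup(marked_up)) == _normalize(original)


original = "Professional Summary\nEngineer with ten years of experience."
marked = (
    "$glyph-bold-color-1F4E78\n"
    "Professional Summary\n"
    "$glyph\n"
    "Engineer with [bold]ten years[/] of experience."
)
print(content_preserved(original, marked))
```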

Schema Agent — Developer Scaffolding

What it does: Helps you quickly draft or edit schemas through natural language prompts. You describe the document structure you want, and it generates selector JSON.

What it is NOT: A source of truth. Schemas generated by this agent should be reviewed, tested, and stored in your backend. Do not use agent-generated schemas in production without human review.

When to use it: During development, to bootstrap a schema quickly. Think of it like a code generator — useful to get started, but you own the output.

Requires: API key.
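
Part of reviewing an agent-drafted schema can be automated with a structural check before a human reads it. The validator below is a sketch under assumptions: it treats a schema as a list of selectors, each with a "type" string and a "style" object as in the H-SHORT example earlier. The real schema format may contain more than this, and validate_selectors is a hypothetical helper, not a glyph_forge function.

```python
import json


def validate_selectors(selectors: object) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    if not isinstance(selectors, list):
        return ["schema selectors should be a list"]
    problems = []
    for index, selector in enumerate(selectors):
        if not isinstance(selector, dict):
            problems.append(f"selector {index}: expected an object")
            continue
        selector_type = selector.get("type")
        if not isinstance(selector_type, str) or not selector_type:
            problems.append(f"selector {index}: missing or empty 'type'")
        if not isinstance(selector.get("style"), dict):
            problems.append(f"selector {index}: 'style' should be an object")
    return problems


# Stand-in for agent output; the second selector is deliberately malformed.
draft = json.loads(
    '[{"type": "H-SHORT", "style": {"font": {"bold": true, "size": 18}}},'
    ' {"type": "", "style": "bold"}]'
)
for problem in validate_selectors(draft):
    print(problem)
```

A check like this only catches malformed structure. Whether the styles are the ones you want still needs a human look at the rendered DOCX.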

XML Agent — Experimental Final Polish

What it does: Operates directly on the unzipped DOCX XML structure. An LLM identifies a target element in the XML and writes modifications to it.

What it is NOT: A content generator or primary formatter. Do not use it to style an entire document. Its job is surgical, targeted edits — a final polish step when the schema and markup aren't enough.

When to use it: Rarely. When you need something that can't be expressed through schemas or markup — for example, modifying a specific XML attribute that Glyph's styling utilities don't cover. In theory an LLM can write anything to a DOCX with this method, but it requires precision.

Status: Beta. The accuracy and reliability of direct XML writing are still being researched.

Requires: API key.
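
For readers unfamiliar with what "operating on the unzipped DOCX XML" involves: a DOCX file is a ZIP archive whose main body lives in word/document.xml. The sketch below shows the kind of targeted attribute edit described above, using only the Python standard library. It does not use the XML Agent or any glyph_forge call, and the archive it builds is a stub containing only document.xml, not a complete DOCX that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the conventional "w:" prefix on output

# One paragraph with a single black run; stands in for a real document body.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    '<w:rPr><w:color w:val="000000"/></w:rPr><w:t>Warning</w:t>'
    "</w:r></w:p></w:body></w:document>"
)


def make_stub_archive() -> bytes:
    """Build an in-memory ZIP holding only word/document.xml (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def recolor_runs(archive_bytes: bytes, new_color: str) -> bytes:
    """Set w:val on every w:color element and return the rewritten archive."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as source, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/document.xml":
                root = ET.fromstring(data)
                for color in root.iter(f"{{{W_NS}}}color"):
                    color.set(f"{{{W_NS}}}val", new_color)
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            target.writestr(item.filename, data)
    return out.getvalue()


updated = recolor_runs(make_stub_archive(), "FF0000")
with zipfile.ZipFile(io.BytesIO(updated)) as archive:
    print(archive.read("word/document.xml").decode("utf-8"))
```

Even this small edit has to preserve namespaces and every other archive member untouched, which is why the section above describes XML writing as requiring precision.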

Form Detection — Heuristic Line Classification

What it does: Classifies each line of plaintext by its structural form (H-SHORT, L-BULLET, P-BODY, T-ROW, etc.) using the same heuristic engine that powers schemas. Returns a list of classifications with confidence scores.

What it is NOT: AI. This is the same deterministic heuristic engine used by schemas, exposed as a standalone tool.

When to use it: When you want to understand what Glyph "sees" in your plaintext before building a schema. Also useful for filtering — extract only headings, or only list items, from a large document.

Performance: Local, milliseconds, no API key.

result = client.detect_forms(ws, text=text, forms=["H-SHORT", "L-BULLET"])

Document Chunking — Heading-Bounded Splitting

What it does: Splits plaintext or DOCX files at heading boundaries, producing independent chunks that can be processed one at a time.

What it is NOT: Semantic chunking. It splits at structural heading boundaries detected by heuristics, not by topic or meaning.

When to use it: To reduce LLM context window usage. Instead of sending a 50-page document to an LLM, chunk it and process one section at a time. Works with both plaintext files and DOCX files.

Performance: Local, milliseconds, no API key.

result = client.chunk_plaintext_text(ws, text=text)
for chunk in result["chunks"]:
    llm_response = call_llm(chunk["plaintext"])  # Each chunk fits in context
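
To illustrate what heading-bounded (as opposed to semantic) splitting means, here is a toy chunker. It is a sketch under assumptions, not Glyph's implementation: the is_heading heuristic is a simplified stand-in, and the "heading" key in each chunk is hypothetical. Only the "plaintext" key mirrors the result shape used in the loop above.

```python
def is_heading(line: str) -> bool:
    """Toy heuristic: short, title-cased or ALL CAPS, no sentence punctuation."""
    stripped = line.strip()
    words = stripped.split()
    return (
        0 < len(words) <= 6
        and (stripped.isupper() or stripped.istitle())
        and not stripped.endswith((".", ":", ","))
    )


def chunk_at_headings(text: str) -> list[dict]:
    """Start a new chunk at every heading line; keep the heading in its chunk."""
    groups = []
    current = {"heading": None, "lines": []}
    for line in text.splitlines():
        if is_heading(line):
            if current["heading"] is not None or current["lines"]:
                groups.append(current)
            current = {"heading": line.strip(), "lines": []}
        else:
            current["lines"].append(line)
    if current["heading"] is not None or current["lines"]:
        groups.append(current)

    chunks = []
    for group in groups:
        lines = ([group["heading"]] if group["heading"] else []) + group["lines"]
        chunks.append({"heading": group["heading"], "plaintext": "\n".join(lines).strip()})
    return chunks


document = (
    "Introduction\n"
    "Glyph turns plaintext into documents.\n"
    "\n"
    "Installation Steps\n"
    "Run the installer and restart."
)
for chunk in chunk_at_headings(document):
    print(repr(chunk["plaintext"]))
```

The split depends only on which lines look like headings, so two unrelated topics under one heading stay in the same chunk.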

Tool Summary

Tool             AI?   API Key?   Speed          Purpose
Schema           No    No         Milliseconds   Baseline structural styling
Inline Markup    No    No         Milliseconds   Embedded style overrides in plaintext
Plaintext Agent  Yes   Yes        Seconds        LLM applies markup based on meaning
Schema Agent     Yes   Yes        Seconds        LLM drafts/edits schemas
XML Agent        Yes   Yes        Seconds        Direct DOCX XML modifications (beta)
Form Detection   No    No         Milliseconds   Classify lines by heuristic form
Chunking         No    No         Milliseconds   Split documents at heading boundaries

Workflow Patterns

Pattern 1: Schema Only (Fastest)

The simplest path. Good when your document structure is consistent and predictable.

LLM writes plaintext --> Schema styles heuristics --> DOCX
                         (milliseconds, no AI)

schema = client.build_schema_from_docx(ws, docx_path="template.docx")
docx_path = client.run_schema(ws, schema=schema, plaintext=plaintext)

Pattern 2: Schema + Markup Agent (Most Common)

Schema handles the structural baseline, then the markup agent adds context-aware overrides.

LLM writes plaintext --> Markup agent inserts styling --> Schema compiles --> DOCX
                         (seconds, requires API key)      (milliseconds)

schema = client.build_schema_from_docx(ws, docx_path="template.docx")

# Agent understands "professional summary" semantically and inserts markup
response = client.ask(
    message="Bold the professional summary and make section headers dark blue",
    current_plaintext=plaintext,
)
marked_up_plaintext = response["plaintext"]

docx_path = client.run_schema(ws, schema=schema, plaintext=marked_up_plaintext)

Pattern 3: Chunk + Process (Large Documents)

For documents that exceed LLM context windows, chunk first, process per-section, reassemble.

Document --> Chunk at headings --> Process each chunk --> Reassemble --> Schema --> DOCX
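             (milliseconds)       (per-chunk LLM calls)

# Build the schema first, as in Pattern 1
schema = client.build_schema_from_docx(ws, docx_path="template.docx")
chunks = client.chunk_plaintext_text(ws, text=full_document)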

processed_sections = []
for chunk in chunks["chunks"]:
    result = call_llm(chunk["plaintext"])  # Your LLM call
    processed_sections.append(result)

final_plaintext = "\n".join(processed_sections)
docx_path = client.run_schema(ws, schema=schema, plaintext=final_plaintext)

Pattern 4: Detect + Filter (Pre-Processing)

Use form detection to extract or filter specific content types before processing.

Document --> Detect forms --> Filter by type --> Process subset
             (milliseconds)

result = client.detect_forms(ws, text=text, forms=["H-SHORT", "H-SECTION-N"])
headings = [c["text"] for c in result["classifications"]]
# Use headings as a table of contents, outline, or navigation structure
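
Since form detection is described as returning confidence scores, a filter step might also drop low-confidence matches before building an outline. The sketch below runs on mock data rather than a real detect_forms result. Only the "classifications" list and the "text" key appear in this README; the "form" and "confidence" key names, the sample values, and the build_outline helper are assumptions for illustration.

```python
# Stand-in for a detect_forms result. "form" and "confidence" are
# hypothetical key names; check the actual response before relying on them.
sample_result = {
    "classifications": [
        {"text": "Overview", "form": "H-SHORT", "confidence": 0.93},
        {"text": "1. Scope", "form": "H-SECTION-N", "confidence": 0.88},
        {"text": "Notes", "form": "H-SHORT", "confidence": 0.41},
    ]
}


def build_outline(result: dict, min_confidence: float = 0.5) -> list[str]:
    """Keep the text of classifications at or above the confidence threshold."""
    return [
        item["text"]
        for item in result["classifications"]
        if item.get("confidence", 1.0) >= min_confidence
    ]


print(build_outline(sample_result))
```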

Building a Workflow: Resume Builder Example

Here's how the tools layer together for a real use case.

Step 1 — Build your schema. Look at your reference resume DOCX. Note the font sizes, colors, and spacing for headings, body text, bullet lists. Create selectors for each:

schema = client.build_schema_from_docx(ws, docx_path="resume_template.docx", save_as="resume")

At this point, LLM plaintext -> schema -> DOCX already produces a solid result with correct heading sizes, bullet formatting, and paragraph spacing. This runs in milliseconds with no API key.

Step 2 — Add markup for semantic styling. A schema knows what a heading looks like structurally, but it doesn't know that "Professional Summary" should be styled differently from "Education." Use the markup agent:

response = client.ask(
    message="Bold the professional summary section header and make it dark blue",
    current_plaintext=resume_plaintext,
)

The agent returns the same plaintext, but the professional summary line is now wrapped in a $glyph-bold-color-1F4E78 block. When the schema runs, that markup overrides the default heading style for just that section.

Step 3 — Compile. Feed the marked-up plaintext through the schema:

docx_path = client.run_schema(
    ws,
    schema=schema,
    plaintext=response["plaintext"],
    dest_name="resume_output.docx"
)

The key insight: Schemas are fast and deterministic. Agents are smart but slow. Layer them — schema for the 90% that's structural, agents for the 10% that requires understanding.


CLI Usage

# Build schema from template
glyph-forge build template.docx -o ./output

# Build and run in one command
glyph-forge build-and-run template.docx input.txt -o ./output

# Run existing schema
glyph-forge run schema.json input.txt -o ./output

# Detect forms in plaintext
glyph-forge detect-forms document.txt --forms H-SHORT,L-BULLET

# Chunk a document
glyph-forge chunk report.txt
glyph-forge chunk report.docx

Documentation

Full documentation: glyphapi.ai

License

Apache License 2.0 - see LICENSE for details.

Copyright 2025 Devpro LLC
