AILANG Parse Python SDK

Python client for the AILANG Parse document parsing API. It parses 13 formats and generates 8, with zero dependencies for Office formats and pluggable AI for PDFs.

Install

pip install ailang-parse

Quick Start

from ailang_parse import DocParse

client = DocParse(api_key="dp_your_key_here")

# Parse a document
result = client.parse("report.docx")
print(f"{len(result.blocks)} blocks, format: {result.format}")

for block in result.blocks:
    if block.type == "heading":
        print(f"  H{block.level}: {block.text}")
    elif block.type == "table":
        print(f"  Table: {len(block.headers)} cols, {len(block.rows)} rows")
    elif block.type == "change":
        print(f"  {block.change_type} by {block.author}: {block.text}")
    else:
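        # Not every block type has a text field (see Block Types), so fall back to ""
        print(f"  {block.type}: {(getattr(block, 'text', '') or '')[:80]}")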

Parse Documents

# Parse with different output formats
result = client.parse("report.docx")                        # Block ADT (default)
result = client.parse("report.docx", output_format="markdown")  # Markdown
result = client.parse("report.docx", output_format="html")      # HTML

# Access structured data
print(result.status)          # "success"
print(result.filename)        # "report.docx"
print(result.format)          # "zip-office"
print(result.blocks)          # List[Block]
print(result.metadata.title)  # Document title
print(result.metadata.author) # Document author
print(result.summary.tables)  # Number of tables found

Supported Formats

formats = client.formats()
print(formats.parse)       # ['docx', 'pptx', 'xlsx', 'odt', 'odp', 'ods', 'html', 'md', 'csv', 'epub', 'pdf', 'png', 'jpg']
print(formats.generate)    # ['docx', 'pptx', 'xlsx', 'odt', 'odp', 'ods', 'html', 'md']
print(formats.ai_required) # ['pdf', 'png', 'jpg', 'gif', 'bmp', 'tiff']
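
These lists can drive a pre-flight check before a file is uploaded. The sketch below is not part of the SDK: `classify_extension` is a hypothetical helper, and it assumes the lists are plain lists of lowercase extensions, as in the comments above.

```python
from pathlib import Path

def classify_extension(path, parse_formats, ai_required):
    """Return 'unsupported', 'ai', or 'native' for a file path.

    Hypothetical helper: parse_formats and ai_required are plain lists of
    lowercase extensions, shaped like the output of client.formats().
    """
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in parse_formats:
        return "unsupported"
    return "ai" if ext in ai_required else "native"

# Values copied from the example output above.
parse_formats = ["docx", "pptx", "xlsx", "odt", "odp", "ods", "html",
                 "md", "csv", "epub", "pdf", "png", "jpg"]
ai_required = ["pdf", "png", "jpg", "gif", "bmp", "tiff"]

for name in ["report.docx", "scan.PDF", "notes.txt"]:
    print(name, classify_extension(name, parse_formats, ai_required))
```

In real use the two lists would come from `formats.parse` and `formats.ai_required` rather than being hard-coded.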

Block Types

AILANG Parse returns 9 block types:

Type      Fields                             Description
text      text, style, level                 Paragraphs, code blocks
heading   text, level (1-6)                  Document headings
table     headers, rows                      Tables with merge tracking
list      items, ordered                     Ordered/unordered lists
image     description, mime, data_length     Embedded images
audio     transcription, mime                Audio transcriptions
video     description, mime                  Video descriptions
section   kind, children                     Slides, sheets, headers/footers
change    change_type, author, date, text    Track changes
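
Because each type carries different fields, code that prints or logs blocks needs to pick the right field per type. The following is a minimal sketch of one way to do that. The `preview` helper and the `PREVIEW_FIELD` mapping are illustrative names, not SDK features, and the sample objects are stand-ins for real blocks.

```python
from types import SimpleNamespace

# Which field holds the human-readable text for each block type, taken
# from the table above. Types without such a field (table, list, section)
# are rendered as a placeholder.
PREVIEW_FIELD = {
    "text": "text",
    "heading": "text",
    "change": "text",
    "image": "description",
    "video": "description",
    "audio": "transcription",
}

def preview(block, width=60):
    """Return a short one-line preview for any block type."""
    field = PREVIEW_FIELD.get(block.type)
    if field is None:
        return f"<{block.type}>"
    value = getattr(block, field, "") or ""
    return value[:width]

# Stand-in objects for illustration; real blocks come from client.parse().
samples = [
    SimpleNamespace(type="heading", text="Quarterly Report", level=1),
    SimpleNamespace(type="image", description="Bar chart of revenue",
                    mime="image/png", data_length=2048),
    SimpleNamespace(type="table", headers=[], rows=[]),
]
for block in samples:
    print(f"{block.type}: {preview(block)}")
```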

Table cells

Table cells can be simple strings or merged cells:

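for block in result.blocks:
    if block.type == "table":
        for cell in block.headers:
            if isinstance(cell, str):  # simple cell: plain string
                print(f"  {cell}")
            else:  # merged cell: carries span metadata
                print(f"  {cell.text} (colspan={cell.col_span}, merged={cell.merged})")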
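
When the merge metadata is not needed, both cell kinds can be reduced to plain text and the table turned into a list of records. This is a sketch under two assumptions that the text above does not confirm: that `rows` is a list of rows, each a list of cells, and that merged cells expose a `text` attribute as in the example. `cell_text` and `table_to_records` are hypothetical helper names.

```python
from types import SimpleNamespace

def cell_text(cell):
    """Return a cell's text, whether it is a plain string or a cell object."""
    return cell if isinstance(cell, str) else cell.text

def table_to_records(headers, rows):
    """Convert a table into a list of dicts keyed by header text.

    Assumes rows is a list of rows, each a list of cells.
    """
    names = [cell_text(c) for c in headers]
    return [dict(zip(names, (cell_text(c) for c in row))) for row in rows]

# Stand-in data for illustration; real values come from a table block.
headers = ["Name", SimpleNamespace(text="Qty", col_span=1, merged=False)]
rows = [
    ["Apple", "3"],
    ["Pear", SimpleNamespace(text="5", col_span=1, merged=False)],
]
records = table_to_records(headers, rows)
print(records)
```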

Nested sections

Section blocks contain child blocks (slides, sheets, headers/footers):

for block in result.blocks:
    if block.type == "section":
        print(f"Section: {block.kind}")  # "slide", "sheet", "header", "footer", etc.
        for child in block.children:
            # Children such as tables or lists have no text field, so fall back to ""
            print(f"  {child.type}: {(getattr(child, 'text', '') or '')[:50]}")
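
The loop above only descends one level. If sections can contain further sections (an assumption; the text does not say how deep nesting goes), a small recursive generator visits every block. `walk` is a hypothetical helper, and the sample objects stand in for real blocks.

```python
from types import SimpleNamespace

def walk(blocks, depth=0):
    """Yield (depth, block) for every block, descending into sections."""
    for block in blocks:
        yield depth, block
        if block.type == "section":
            yield from walk(block.children, depth + 1)

# Stand-in document for illustration; real blocks come from client.parse().
footer = SimpleNamespace(type="section", kind="footer", children=[
    SimpleNamespace(type="text", text="Page 1"),
])
slide = SimpleNamespace(type="section", kind="slide", children=[
    SimpleNamespace(type="text", text="Agenda"),
    footer,
])
document = [SimpleNamespace(type="heading", text="Deck", level=1), slide]

for depth, block in walk(document):
    print("  " * depth + block.type)
```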

API Key Management

Key generation uses the device auth flow (v0.10.0+). Direct generation is no longer available.

# Get a key via device auth flow:
#   1. POST /api/v1/auth/device       → {device_code, user_code, verification_url}
#   2. User opens verification_url, signs in, clicks Approve
#   3. POST /api/v1/auth/device/poll  → {api_key, tier}

# Check usage
usage = client.keys.usage(key_id="abc123", user_id="user123")
print(f"Requests today: {usage.usage.requests_today} / {usage.quota.requests_per_day}")
print(f"Pages this month: {usage.usage.pages_this_month} / {usage.quota.pages_per_month}")

# Rotate (new key, old one revoked, same tier)
new_key = client.keys.rotate(key_id="abc123", user_id="user123")
print(new_key.key)  # New key

# Revoke
client.keys.revoke(key_id="abc123", user_id="user123")
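
The three device-flow steps listed above can be sketched as a polling loop. The endpoint paths and response field names come from the comments above; everything else is an assumption, in particular the poll request body (`device_code`) and the rule that a reply without `api_key` means approval is still pending. The transport is passed in as a `post` function so the sketch stays independent of any HTTP library.

```python
import time

def obtain_api_key(post, poll_interval=5.0, max_attempts=60, sleep=time.sleep):
    """Run the device flow and return the final poll response as a dict.

    post(path, payload) is a caller-supplied function that sends a JSON
    POST and returns the decoded JSON body.
    ASSUMPTIONS (not confirmed by the docs): the poll request body carries
    the device_code, and a reply without "api_key" means "still pending".
    """
    # Step 1: start the flow.
    device = post("/api/v1/auth/device", {})
    # Step 2: the user approves in a browser.
    print(f"Open {device['verification_url']} to approve "
          f"(user code: {device['user_code']})")
    # Step 3: poll until the key is issued or we give up.
    for _ in range(max_attempts):
        reply = post("/api/v1/auth/device/poll",
                     {"device_code": device["device_code"]})
        if "api_key" in reply:
            return reply
        sleep(poll_interval)
    raise TimeoutError("device authorization was not approved in time")
```

A real `post` could be a thin wrapper around `urllib.request` or `requests` that prefixes the API base URL. Check the API reference for the actual poll payload and pending response before relying on this.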

Migrating from Unstructured

Change the import and the server URL:

# Before
from unstructured_client import UnstructuredClient
client = UnstructuredClient(server_url="https://api.unstructured.io")

# After
from ailang_parse import UnstructuredClient
client = UnstructuredClient(
    server_url="https://ailang-dev-docparse-api-ejjw6zt3bq-ew.a.run.app"
)

# All existing code works unchanged
elements = client.general.partition(file="report.docx")
for el in elements:
    print(f"{el.type}: {el.text[:80]}")
    print(f"  metadata: {el.metadata.filename}")

Error Handling

from ailang_parse import DocParse, DocParseError, AuthError, QuotaError

client = DocParse(api_key="dp_invalid")

try:
    result = client.parse("file.docx")
except AuthError as e:
    print(f"Bad key: {e}")           # 401
except QuotaError as e:
    print(f"Quota exceeded: {e}")    # 429
except DocParseError as e:
    print(f"API error ({e.status_code}): {e}")
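
For transient failures, a retry wrapper with exponential backoff is a common pattern. The sketch below is generic and not part of the SDK; `call_with_retry` is a hypothetical helper. Whether retrying on `QuotaError` helps depends on the kind of limit: a short-lived rate limit may clear within seconds, while a daily or monthly quota will not.

```python
import time

def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retry_on exception, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Usage with the SDK (not executed here):
#   result = call_with_retry(lambda: client.parse("file.docx"), retry_on=QuotaError)
```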

Configuration

client = DocParse(
    api_key="dp_your_key",
    base_url="https://your-deployment.run.app",  # Custom endpoint
    timeout=120,                                   # Request timeout (seconds)
)
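
To keep keys out of source code, these arguments can be read from the environment. This is a sketch only: the variable names `AILANG_PARSE_API_KEY`, `AILANG_PARSE_BASE_URL`, and `AILANG_PARSE_TIMEOUT` are hypothetical, and nothing above indicates that the SDK reads them itself.

```python
import os

def client_kwargs_from_env(env=None):
    """Build DocParse keyword arguments from environment variables.

    The variable names are hypothetical; the SDK is not known to read them.
    Raises KeyError if the API key variable is missing.
    """
    env = os.environ if env is None else env
    kwargs = {"api_key": env["AILANG_PARSE_API_KEY"]}
    if env.get("AILANG_PARSE_BASE_URL"):
        kwargs["base_url"] = env["AILANG_PARSE_BASE_URL"]
    if env.get("AILANG_PARSE_TIMEOUT"):
        kwargs["timeout"] = int(env["AILANG_PARSE_TIMEOUT"])
    return kwargs

# Usage with the SDK (not executed here):
#   client = DocParse(**client_kwargs_from_env())
```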
