AILANG Parse Python SDK
Python client for the AILANG Parse document parsing API. Parse 13 formats, generate 8 — zero dependencies for Office, pluggable AI for PDFs.
Install
pip install ailang-parse
Quick Start
from ailang_parse import DocParse
client = DocParse(api_key="dp_your_key_here")
# Parse a document
result = client.parse("report.docx")
print(f"{len(result.blocks)} blocks, format: {result.format}")
for block in result.blocks:
    if block.type == "heading":
        print(f" H{block.level}: {block.text}")
    elif block.type == "table":
        print(f" Table: {len(block.headers)} cols, {len(block.rows)} rows")
    elif block.type == "change":
        print(f" {block.change_type} by {block.author}: {block.text}")
    else:
        print(f" {block.type}: {block.text[:80]}")
Parse Documents
# Parse with different output formats
result = client.parse("report.docx") # Block ADT (default)
result = client.parse("report.docx", output_format="markdown") # Markdown
result = client.parse("report.docx", output_format="html") # HTML
# Access structured data
print(result.status) # "success"
print(result.filename) # "report.docx"
print(result.format) # "zip-office"
print(result.blocks) # List[Block]
print(result.metadata.title) # Document title
print(result.metadata.author) # Document author
print(result.summary.tables) # Number of tables found
Supported Formats
formats = client.formats()
print(formats.parse) # ['docx', 'pptx', 'xlsx', 'odt', 'odp', 'ods', 'html', 'md', 'csv', 'epub', 'pdf', 'png', 'jpg']
print(formats.generate) # ['docx', 'pptx', 'xlsx', 'odt', 'odp', 'ods', 'html', 'md']
print(formats.ai_required) # ['pdf', 'png', 'jpg', 'gif', 'bmp', 'tiff']
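One way the ai_required list might be used is to check a file's extension before parsing, so the caller knows whether an AI backend will be involved. The sketch below is illustrative only: the helper name `needs_ai` is hypothetical and not part of the SDK, and the list is copied from the example output above rather than fetched from the API.

```python
from pathlib import Path

# Copied from the example output of client.formats() above; in real code,
# pass formats.ai_required instead of hard-coding the list.
AI_REQUIRED = ["pdf", "png", "jpg", "gif", "bmp", "tiff"]


def needs_ai(path, ai_required=AI_REQUIRED):
    """Return True if the file extension is in the AI-required list (hypothetical helper)."""
    ext = Path(path).suffix.lower().lstrip(".")
    return ext in ai_required


print(needs_ai("report.docx"))  # False
print(needs_ai("scan.PDF"))     # True
```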
Block Types
AILANG Parse returns 9 block types:
| Type | Fields | Description |
|---|---|---|
| text | text, style, level | Paragraphs, code blocks |
| heading | text, level (1-6) | Document headings |
| table | headers, rows | Tables with merge tracking |
| list | items, ordered | Ordered/unordered lists |
| image | description, mime, data_length | Embedded images |
| audio | transcription, mime | Audio transcriptions |
| video | description, mime | Video descriptions |
| section | kind, children | Slides, sheets, headers/footers |
| change | change_type, author, date, text | Track changes |
Table cells
Table cells can be simple strings or merged cells:
for block in result.blocks:
    if block.type == "table":
        for cell in block.headers:
            print(f" {cell.text} (colspan={cell.col_span}, merged={cell.merged})")
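Because a cell may be either a plain string or a merged-cell object, code that reads `cell.text` directly could fail on the string case. A minimal sketch of one way to handle both follows. The `MergedCell` class here is a stand-in written for this example (its field names follow the snippet above), and `cell_text` is a hypothetical helper, not an SDK function.

```python
from dataclasses import dataclass


@dataclass
class MergedCell:
    """Stand-in for the SDK's merged-cell object; field names follow the example above."""
    text: str
    col_span: int = 1
    merged: bool = False


def cell_text(cell):
    """Return the display text for a cell that is either a str or a cell object."""
    if isinstance(cell, str):
        return cell
    return cell.text


headers = ["Name", MergedCell(text="Q1 totals", col_span=2, merged=True)]
print([cell_text(c) for c in headers])  # ['Name', 'Q1 totals']
```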
Nested sections
Section blocks contain child blocks (slides, sheets, headers/footers):
for block in result.blocks:
    if block.type == "section":
        print(f"Section: {block.kind}") # "slide", "sheet", "header", "footer", etc.
        for child in block.children:
            print(f" {child.type}: {child.text[:50]}")
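If sections can themselves contain sections, a recursive walk may be more convenient than a single nested loop. The sketch below uses a `FakeBlock` dataclass as a stand-in for SDK blocks and a hypothetical `walk` generator; neither is part of the SDK. It also reads `text` defensively, on the assumption that child blocks such as tables may not carry a usable text value.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class FakeBlock:
    """Stand-in for an SDK block, for illustration only."""
    type: str
    text: str = ""
    kind: str = ""
    children: List["FakeBlock"] = field(default_factory=list)


def walk(blocks, depth=0) -> Iterator[Tuple[int, FakeBlock]]:
    """Yield (depth, block) pairs depth-first, descending into section children."""
    for block in blocks:
        yield depth, block
        if block.type == "section":
            yield from walk(block.children, depth + 1)


doc = [
    FakeBlock(type="heading", text="Deck"),
    FakeBlock(type="section", kind="slide", children=[
        FakeBlock(type="text", text="Hello"),
        FakeBlock(type="table"),
    ]),
]

for depth, block in walk(doc):
    # Sections are labelled by kind; other blocks by (possibly empty) text.
    label = block.kind if block.type == "section" else (getattr(block, "text", "") or "")[:50]
    print("  " * depth + f"{block.type}: {label}")
```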
API Key Management
Key generation uses the device auth flow (v0.10.0+). Direct generation is no longer available.
# Get a key via device auth flow:
# 1. POST /api/v1/auth/device → {device_code, user_code, verification_url}
# 2. User opens verification_url, signs in, clicks Approve
# 3. POST /api/v1/auth/device/poll → {api_key, tier}
# Check usage
usage = client.keys.usage(key_id="abc123", user_id="user123")
print(f"Requests today: {usage.usage.requests_today} / {usage.quota.requests_per_day}")
print(f"Pages this month: {usage.usage.pages_this_month} / {usage.quota.pages_per_month}")
# Rotate (new key, old one revoked, same tier)
new_key = client.keys.rotate(key_id="abc123", user_id="user123")
print(new_key.key) # New key
# Revoke
client.keys.revoke(key_id="abc123", user_id="user123")
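The polling step of the device auth flow described above could be sketched as follows. This is not the SDK's implementation. The endpoint path and the `api_key` and `tier` response fields come from the steps above; the request payload shape (`{"device_code": ...}`), the idea that a pending response simply lacks `api_key`, and the `post` transport hook are all assumptions made for illustration.

```python
import time


def wait_for_api_key(post, device_code, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the device endpoint until an api_key appears or attempts run out.

    `post(path, payload)` is a caller-supplied function that sends a JSON POST
    and returns the decoded response as a dict (hypothetical transport hook).
    """
    for _ in range(max_attempts):
        # Assumed payload shape; check the API reference for the real one.
        response = post("/api/v1/auth/device/poll", {"device_code": device_code})
        if response.get("api_key"):
            return response["api_key"], response.get("tier")
        sleep(interval)  # not approved yet; wait before asking again
    raise TimeoutError("device authorization was not approved in time")


# Demo with a fake transport that approves on the third poll.
# The "pending" response body is made up for this example.
calls = {"n": 0}


def fake_post(path, payload):
    calls["n"] += 1
    if calls["n"] < 3:
        return {"status": "pending"}
    return {"api_key": "dp_example", "tier": "free"}


print(wait_for_api_key(fake_post, "dev123", sleep=lambda s: None))
```

Passing the transport and the sleep function in as arguments keeps the loop testable without network access or real delays.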
Migrating from Unstructured
Change the import and point server_url at the AILANG Parse endpoint:
# Before
from unstructured_client import UnstructuredClient
client = UnstructuredClient(server_url="https://api.unstructured.io")
# After
from ailang_parse import UnstructuredClient
client = UnstructuredClient(
    server_url="https://ailang-dev-docparse-api-ejjw6zt3bq-ew.a.run.app"
)
# All existing code works unchanged
elements = client.general.partition(file="report.docx")
for el in elements:
    print(f"{el.type}: {el.text[:80]}")
    print(f" metadata: {el.metadata.filename}")
Error Handling
from ailang_parse import DocParse, DocParseError, AuthError, QuotaError
client = DocParse(api_key="dp_invalid")
try:
    result = client.parse("file.docx")
except AuthError as e:
    print(f"Bad key: {e}") # 401
except QuotaError as e:
    print(f"Quota exceeded: {e}") # 429
except DocParseError as e:
    print(f"API error ({e.status_code}): {e}")
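Where a 429 reflects a short-lived rate limit rather than an exhausted daily quota, retrying with exponential backoff may help. The sketch below shows that pattern under stated assumptions: `parse_with_retry` is a hypothetical wrapper, not an SDK function, and `QuotaError` is redefined locally only so the example runs on its own. In real code, import it from ailang_parse and pass `client.parse` as the `parse` argument.

```python
import time


class QuotaError(Exception):
    """Stand-in for ailang_parse.QuotaError so the sketch runs on its own."""


def parse_with_retry(parse, path, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call parse(path), retrying with exponential backoff on QuotaError."""
    for attempt in range(retries + 1):
        try:
            return parse(path)
        except QuotaError:
            if attempt == retries:
                raise  # out of retries; surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


# Demo with a fake parse function that succeeds on the third call.
attempts = []


def flaky_parse(path):
    attempts.append(path)
    if len(attempts) < 3:
        raise QuotaError("429")
    return "parsed " + path


delays = []
print(parse_with_retry(flaky_parse, "file.docx", sleep=delays.append))
print(delays)  # [1.0, 2.0]
```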
Configuration
client = DocParse(
    api_key="dp_your_key",
    base_url="https://your-deployment.run.app", # Custom endpoint
    timeout=120, # Request timeout (seconds)
)
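To keep the key out of source code, the same three settings could be read from environment variables. This is a sketch only: the variable names (`AILANG_PARSE_API_KEY`, `AILANG_PARSE_BASE_URL`, `AILANG_PARSE_TIMEOUT`) and the helper `client_kwargs_from_env` are hypothetical, and the SDK is not known to read any of them itself.

```python
import os


def client_kwargs_from_env(env=os.environ):
    """Build keyword arguments for DocParse(...) from environment variables."""
    kwargs = {"api_key": env["AILANG_PARSE_API_KEY"]}  # required; KeyError if unset
    if "AILANG_PARSE_BASE_URL" in env:
        kwargs["base_url"] = env["AILANG_PARSE_BASE_URL"]
    if "AILANG_PARSE_TIMEOUT" in env:
        kwargs["timeout"] = int(env["AILANG_PARSE_TIMEOUT"])
    return kwargs


# Demo with an explicit mapping instead of the real environment.
print(client_kwargs_from_env({
    "AILANG_PARSE_API_KEY": "dp_your_key",
    "AILANG_PARSE_TIMEOUT": "120",
}))
```

The result can then be unpacked into the constructor shown above, for example `DocParse(**client_kwargs_from_env())`.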