AI-powered document intelligence platform - Turn your documents into structured data with a single line of code.
ByteIT Python SDK
Python client for ByteIT — AI-powered document parsing. Extract structured text from PDFs, Word files, images, and more with a single API call.
Installation
pip install byteit
Requires Python 3.8+ and an API key from byteit.ai.
Quick Start
from byteit import ByteITClient, OutputFormat
client = ByteITClient(api_key="your_api_key")
result = client.parse("document.pdf")
print(result.decode())
client.parse returns the result as raw bytes. Pass output="result.md" to save it directly to disk.
Usage
Parse and save
# Returns bytes
result = client.parse("invoice.pdf", result_format=OutputFormat.JSON)
# Save to file
client.parse(
    "invoice.pdf",
    result_format=OutputFormat.MD,
    output="invoice.md",
)
Output formats: OutputFormat.MD (default), OutputFormat.TXT,
OutputFormat.JSON, OutputFormat.HTML, OutputFormat.EXCEL
Excel output note: OutputFormat.EXCEL extracts tables into one or more Excel
files. Because a document can contain multiple tables, we return the Excel
files bundled in a single .zip archive. If you pass the output parameter
with result_format=OutputFormat.EXCEL, the output path should end with .zip
instead of .xlsx.
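Because the Excel result arrives as a .zip archive, it usually has to be unpacked before the workbooks can be opened. The following is a minimal sketch using only the standard library. `extract_excel_archive` is a hypothetical helper name, not part of the SDK, and it assumes the bytes returned for OutputFormat.EXCEL form a regular zip archive, as described above.

```python
import io
import zipfile
from pathlib import Path


def extract_excel_archive(data: bytes, dest_dir: str) -> list:
    """Unpack an in-memory zip archive and return the extracted file paths.

    Hypothetical helper (not part of the byteit SDK). `data` is assumed to be
    the bytes returned by parse() when result_format is OutputFormat.EXCEL.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest)
        # Skip directory entries; keep only real files.
        return [dest / info.filename for info in archive.infolist() if not info.is_dir()]
```

With a result obtained from client.parse, a call such as `extract_excel_archive(result, "tables")` would write the contained files into a `tables` directory.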
Async (non-blocking)
Submit a job and check back later — useful for large files or batch workflows.
# Submit without waiting
job = client.parse_async("document.pdf")
# Poll status
status = client.get_job_status(job.id)
# status.processing_status: "pending" | "processing" | "completed" | "failed"
# Fetch full job details when needed
details = client.get_job_details(job.id)
# Download when ready
if status.is_completed:
    result = client.get_job_result(job.id)
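The snippet above checks the status once. For batch workflows, a polling loop with a timeout is a common pattern. The sketch below is one way to write it, not something the SDK provides: `wait_for_job`, `poll_interval`, and `timeout` are hypothetical names, and the helper relies only on `get_job_status`, `get_job_result`, and the `processing_status` values shown above.

```python
import time


def wait_for_job(client, job_id, poll_interval=2.0, timeout=300.0):
    """Poll until the job completes, fails, or the timeout elapses.

    Hypothetical helper (not part of the byteit SDK). `client` is any object
    exposing get_job_status() and get_job_result() as shown above.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_job_status(job_id)
        if status.processing_status == "completed":
            return client.get_job_result(job_id)
        if status.processing_status == "failed":
            raise RuntimeError(f"Job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not finished after {timeout} seconds")
        time.sleep(poll_interval)
```

A call such as `result = wait_for_job(client, job.id)` would then block until the result bytes are available.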
Job management
job_list = client.get_jobs()
for job in job_list.jobs:
    print(f"{job.id} {job.processing_status} {job.result_format}")
Processing options
from byteit import ProcessingOptions
result = client.parse(
    "document.pdf",
    processing_options=ProcessingOptions(languages=["de", "en"], page_range="1-5"),
)
Or pass a plain dict:
result = client.parse("doc.pdf", processing_options={"languages": ["de"]})
API key from environment
import os
client = ByteITClient(api_key=os.environ["BYTEIT_API_KEY"])
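Indexing `os.environ` directly raises a bare `KeyError` when the variable is unset. The sketch below shows one way to fail with a clearer message instead. `read_api_key` is a hypothetical helper, not part of the SDK, and the variable name `BYTEIT_API_KEY` is simply the one used in the example above.

```python
import os


def read_api_key(var_name="BYTEIT_API_KEY"):
    """Return the API key from the environment or raise a descriptive error.

    Hypothetical helper (not part of the byteit SDK).
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your ByteIT API key"
        )
    return key
```

The client could then be created with `ByteITClient(api_key=read_api_key())`.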
Context manager
with ByteITClient(api_key="your_key") as client:
    result = client.parse("doc.pdf")
Supported File Types
| Documents | Images |
|---|---|
| PDF .pdf | PNG .png |
| Word .docx | JPEG .jpg .jpeg |
| PowerPoint .pptx | TIFF .tiff |
| HTML .html | BMP .bmp |
| Markdown .md | |
| Plain text .txt | |
| JSON .json | |
| XML .xml | |
Error Handling
All exceptions inherit from ByteITError.
from byteit.exceptions import (
    AuthenticationError,
    ValidationError,
    RateLimitError,
    JobProcessingError,
    ByteITError,
)

try:
    result = client.parse("document.pdf")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print("Bad request:", e.message)
except RateLimitError:
    print("Rate limit hit; retry later")
except JobProcessingError as e:
    print("Processing failed:", e.message)
except ByteITError as e:
    # Catch-all for any other SDK error; keep this clause last.
    print("Unexpected error:", e.message)
| Exception | When raised |
|---|---|
| AuthenticationError | Invalid or missing API key |
| APIKeyError | API key rejected (403) |
| ValidationError | Bad request parameters |
| ResourceNotFoundError | Job not found |
| RateLimitError | Rate limit exceeded |
| JobProcessingError | Job failed during processing |
| ServerError | Server-side error (5xx) |
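Transient failures such as RateLimitError are often handled by retrying with exponential backoff. The sketch below is a generic illustration rather than SDK behavior: `call_with_retry`, `attempts`, and `base_delay` are hypothetical names, and the exception class to retry on is passed in so the helper does not depend on the SDK.

```python
import time


def call_with_retry(func, retry_on, attempts=3, base_delay=1.0):
    """Call func(), retrying on `retry_on` with exponential backoff.

    Hypothetical helper (not part of the byteit SDK). The delay doubles after
    each failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the original error.
            time.sleep(base_delay * 2 ** attempt)
```

Under these assumptions, a rate-limited parse could be wrapped as `call_with_retry(lambda: client.parse("document.pdf"), RateLimitError)`.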
API Reference
ByteITClient(api_key)
| Method | Description |
|---|---|
| parse(input, ...) | Parse a document, block until complete, return bytes |
| parse_async(input, ...) | Submit a job, return ParseJob immediately |
| get_job_details(job_id) | Get full ParseJob details |
| get_job_status(job_id) | Get current JobStatus |
| get_job_result(job_id) | Download result as bytes |
| get_jobs() | List all jobs as JobList |
parse(input, output=None, processing_options=None, result_format=OutputFormat.MD) → bytes
| Param | Type | Description |
|---|---|---|
| input | str \| Path \| InputConnector | File to parse |
| output | str \| Path \| None | Save result to disk (optional) |
| processing_options | ProcessingOptions \| dict \| None | Languages, page range, etc. |
| result_format | OutputFormat | OutputFormat.MD, OutputFormat.TXT, OutputFormat.JSON, OutputFormat.HTML, OutputFormat.EXCEL |
When result_format is OutputFormat.EXCEL, the returned bytes represent a
.zip archive containing the generated Excel files.
parse_async(input, processing_options=None, result_format=OutputFormat.MD) → ParseJob
Same parameters as parse, minus output. Returns a ParseJob without waiting.
ParseJob properties
| Property | Type | Description |
|---|---|---|
| id | str | Unique job identifier |
| processing_status | str | pending / processing / completed / failed |
| result_format | str | Output format |
| is_completed | bool | True when result is ready |
| is_failed | bool | True if job failed |
| metadata | DocumentMetadata | Filename, page count, language, etc. |
Notebook Integration
Results are automatically rendered when running in Jupyter:
- OutputFormat.MD → rendered Markdown
- OutputFormat.HTML → rendered HTML
- OutputFormat.JSON → interactive tree
- OutputFormat.TXT → code block
To disable auto-display, pass output="file.md".
Resources
- Studio: studio.byteit.ai — Process and test with a graphical user interface.
- Colab notebook: Quick demo
- Pricing: byteit.ai/pricing — 1,000 free credits
- Support: byteit.ai/support
- LinkedIn: ByteIT on LinkedIn
Licensed under Apache 2.0. © 2026 ByteIT GmbH.