AI-powered document intelligence platform - Turn your documents into structured data with a single line of code.

ByteIT API Library

Turn your data into AI - Transform documents into structured data with a single line of code.

ByteIT is an AI-powered document intelligence platform that extracts clean, structured data from PDF, Word, Excel, and many other file formats. This Python SDK provides a simple, developer-first interface to ByteIT's document processing capabilities.

Why ByteIT?

Lightning Fast - Process documents in under 2 seconds
AI-Powered - Advanced ML models trained on millions of documents
Simple API - Parse documents in one line: client.parse("document.pdf")
Developer First - Clean code, full type hints, comprehensive SDKs
Enterprise Security - End-to-end encryption and GDPR compliance
Smart Extraction - Extract text, tables, forms, and structured data with AI precision

Quick Start

Installation

pip install byteit

Basic Usage

from byteit import ByteITClient

# Initialize client
client = ByteITClient(api_key="your_api_key")

# Parse a document
result = client.parse("invoice.pdf")
print(result.decode())

That's it. Your document is now structured text.

Features

Parse Any Document

# Local files
result = client.parse("contract.pdf")

# Different formats
txt_result = client.parse("doc.pdf", output_format="txt")
json_result = client.parse("doc.pdf", output_format="json")
md_result = client.parse("doc.pdf", output_format="md")
html_result = client.parse("doc.pdf", output_format="html")

# Save to file
client.parse("doc.pdf", output="result.txt")

S3 Integration

Process files directly from S3 without downloading - perfect for high-volume workflows:

from byteit.connectors import S3InputConnector

# Parse from S3
result = client.parse(
    S3InputConnector(
        source_bucket="my-documents",
        source_path_inside_bucket="invoices/jan-2024.pdf"
    )
)

Job Management

Track and retrieve processing jobs:

# List all jobs
jobs = client.get_all_jobs()
for job in jobs:
    print(f"{job.id}: {job.processing_status}")

# Get specific job
job = client.get_job_by_id("job_123")

# Download result later
if job.is_completed:
    result = client.get_result(job.id)
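
When a job is not yet complete, the calls above can be combined into a polling loop. The following is a minimal sketch, not part of the SDK: `wait_for_job` is a hypothetical helper, and the timeout and interval values are assumptions. It relies only on `get_job_by_id`, `is_completed`, and `get_result` as documented here. The README does not say how a failed job is reported, so this sketch does not handle that case separately.

```python
import time
from typing import Any, Callable


def wait_for_job(
    client: Any,
    job_id: str,
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll a job until `is_completed` is true, then return its result.

    Raises TimeoutError if the job is still not completed once `timeout`
    seconds have passed. `client` is any object exposing the documented
    `get_job_by_id` and `get_result` methods.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = client.get_job_by_id(job_id)
        if job.is_completed:
            return client.get_result(job.id)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout}s")
        sleep(interval)
```

Usage would look like `result = wait_for_job(client, "job_123", timeout=120)`. The `sleep` parameter exists only so the loop can be tested without real delays.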

Context Manager

Automatic resource cleanup:

with ByteITClient(api_key="your_key") as client:
    result = client.parse("document.pdf")
# The session is closed automatically when the block exits

API Reference

ByteITClient

ByteITClient(api_key: str)

Initialize the ByteIT client.

Parameters:

api_key (str): Your ByteIT API key

Methods:

`parse(input, output_format="txt", output=None)`

Parse a document and return the result.

Parameters:

input (str | Path | InputConnector): File to parse
    - str or Path: Local file path
    - S3InputConnector: For S3 files
output_format (str): Output format - "txt", "json", "md", or "html" (default: "txt")
output (str | Path | None): Optional file path to save result

Returns: bytes - Parsed content

Example:

result = client.parse("doc.pdf", output_format="json")
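
Because `parse` returns `bytes`, the caller has to decode the payload. The sketch below shows one way to turn JSON output into Python objects. `parse_as_json` is a hypothetical helper, not an SDK function, and it assumes the `"json"` output is UTF-8 encoded JSON; the README documents neither the encoding nor the JSON schema.

```python
import json
from typing import Any


def parse_as_json(client: Any, path: str) -> Any:
    """Parse `path` with output_format="json" and decode the returned bytes.

    Assumption: the payload is UTF-8 encoded JSON. The README only states
    that `parse` returns `bytes`.
    """
    raw = client.parse(path, output_format="json")
    return json.loads(raw.decode("utf-8"))
```

Usage would look like `data = parse_as_json(client, "doc.pdf")`. The shape of `data` depends on the service response.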

`get_all_jobs()`

Get all jobs for your account.

Returns: List[Job] - List of Job objects

`get_job_by_id(job_id: str)`

Get a specific job by ID.

Parameters:

job_id (str): The job ID

Returns: Job - Job object

`get_result(job_id: str)`

Download result for a completed job.

Parameters:

job_id (str): The job ID

Returns: bytes - Result content
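
The three job methods above can be combined to fetch every finished result in one pass. This is a minimal sketch under the documented signatures only. `collect_completed_results` is a hypothetical name, and it assumes `get_all_jobs` returns a manageable number of jobs, since the README does not mention pagination.

```python
from typing import Any, Dict


def collect_completed_results(client: Any) -> Dict[str, bytes]:
    """Return a mapping of job id to result bytes for every completed job.

    Jobs whose `is_completed` flag is false are skipped.
    """
    results: Dict[str, bytes] = {}
    for job in client.get_all_jobs():
        if job.is_completed:
            results[job.id] = client.get_result(job.id)
    return results
```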

Connectors

LocalFileInputConnector

Read files from local filesystem.

from byteit.connectors import LocalFileInputConnector

connector = LocalFileInputConnector("path/to/file.pdf")
result = client.parse(connector)

S3InputConnector

Read files from Amazon S3 using IAM role authentication - files never pass through your machine.

Prerequisites:

Contact ByteIT support to set up AWS connection
Provide IAM role ARN for ByteIT to assume
Grant role read access to your bucket

from byteit.connectors import S3InputConnector

connector = S3InputConnector(
    source_bucket="my-bucket",
    source_path_inside_bucket="documents/file.pdf"
)
result = client.parse(connector)

Error Handling

The ByteIT SDK provides specific exceptions for different error scenarios:

from byteit.exceptions import (
    APIKeyError,           # Invalid API key
    AuthenticationError,   # Authentication failed
    ValidationError,       # Invalid parameters
    ResourceNotFoundError, # Job/resource not found
    RateLimitError,        # Rate limit exceeded
    ServerError,           # Server-side error (5xx)
    JobProcessingError,    # Job processing failed
)

try:
    result = client.parse("document.pdf")
except ValidationError as e:
    print(f"Invalid input: {e.message}")
except RateLimitError:
    print("Rate limit exceeded - please wait")
except JobProcessingError as e:
    print(f"Processing failed: {e.message}")

All exceptions inherit from ByteITError:

from byteit.exceptions import ByteITError

try:
    result = client.parse("document.pdf")
except ByteITError as e:
    print(f"ByteIT error: {e.message}")
    print(f"Status code: {e.status_code}")
    print(f"Response: {e.response}")
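
Transient errors such as `RateLimitError` or `ServerError` are often worth retrying. The sketch below is a generic retry wrapper with exponential backoff. `call_with_retry` is a hypothetical helper, the attempt count and delays are assumptions, and the README does not say whether the SDK already retries internally.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on the given exception types with exponential backoff.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
    The exception from the final attempt is re-raised unchanged.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the exceptions imported as shown above, usage would look like `call_with_retry(lambda: client.parse("document.pdf"), retry_on=(RateLimitError, ServerError))`.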

Advanced Usage

Batch Processing

Process multiple files in sequence with a single client:

files = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
results = []

for file in files:
    result = client.parse(file, output_format="json")
    results.append(result)
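
In the loop above, one failing file stops the whole batch. A possible variant, sketched here with a hypothetical `parse_many` helper, records failures per file and keeps going. It catches `Exception` to stay self-contained; in real code the documented `ByteITError` base class is the narrower choice.

```python
from typing import Any, Dict, Iterable, Tuple


def parse_many(
    client: Any, files: Iterable[str], output_format: str = "json"
) -> Tuple[Dict[str, bytes], Dict[str, Exception]]:
    """Parse each file in turn, collecting results and failures separately."""
    results: Dict[str, bytes] = {}
    errors: Dict[str, Exception] = {}
    for path in files:
        try:
            results[path] = client.parse(path, output_format=output_format)
        except Exception as exc:  # narrow this to ByteITError when using the SDK
            errors[path] = exc
    return results, errors
```

Usage would look like `results, errors = parse_many(client, files)`, after which `errors` lists the files to inspect or retry.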

Custom Output Paths

Organize results systematically:

from pathlib import Path

input_dir = Path("inputs")
output_dir = Path("outputs")
output_dir.mkdir(exist_ok=True)

for pdf_file in input_dir.glob("*.pdf"):
    output_file = output_dir / f"{pdf_file.stem}.txt"
    client.parse(pdf_file, output=output_file)
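
When the output format varies, the file extension can follow it. The helper below is a sketch: `output_path_for` is a hypothetical name, and using the format name as the extension is an assumption based on the four documented `output_format` values.

```python
from pathlib import Path

# The four values documented for the output_format parameter.
SUPPORTED_FORMATS = ("txt", "json", "md", "html")


def output_path_for(source: Path, output_dir: Path, output_format: str = "txt") -> Path:
    """Build `<output_dir>/<source stem>.<output_format>` for a given input file."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return output_dir / f"{source.stem}.{output_format}"
```

Inside the loop above, this would be used as `client.parse(pdf_file, output_format="json", output=output_path_for(pdf_file, output_dir, "json"))`.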

S3 Workflow

High-volume cloud processing:

from byteit.connectors import S3InputConnector

# Process multiple S3 files
s3_files = [
    "invoices/2024-01.pdf",
    "invoices/2024-02.pdf",
    "invoices/2024-03.pdf",
]

for s3_path in s3_files:
    connector = S3InputConnector(
        source_bucket="my-documents",
        source_path_inside_bucket=s3_path
    )
    result = client.parse(connector, output_format="json")
    # Process result...

Configuration

Environment Variables

Set your API key via environment variable:

export BYTEIT_API_KEY="your_api_key_here"

Then read it when creating the client:

import os
from byteit import ByteITClient

client = ByteITClient(api_key=os.getenv("BYTEIT_API_KEY"))
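
Note that `os.getenv` returns `None` when the variable is unset, and the README does not say how the client reacts to a missing key. A small guard can fail early with a clear message. `load_api_key` below is a hypothetical helper, not part of the SDK.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "BYTEIT_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key
```

Usage would look like `client = ByteITClient(api_key=load_api_key())`.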

Custom Base URL

For testing or custom deployments:

from byteit import ByteITClient

# Set custom URL (for development/testing)
ByteITClient.BASE_URL = "http://localhost:8000"
client = ByteITClient(api_key="test_key")

Testing

The SDK includes comprehensive unit and integration tests.

Run Unit Tests

pytest

Run Integration Tests

Integration tests require a running ByteIT API and a valid API key:

export BYTEIT_API_KEY="your_api_key"
pytest -m integration

Run All Tests

pytest -m ""

Requirements

Python 3.8+
requests library

About ByteIT

ByteIT transforms unstructured documents into clean, structured data with AI-powered precision. Built for scale, designed for developers.

Get started today with 1,000 free pages per month.

Support & Resources

Website: https://byteit.ai
Pricing: https://byteit.ai/pricing
Support: https://byteit.ai/support
Contact: https://byteit.ai/contact
LinkedIn: ByteIT on LinkedIn

Legal

Privacy Policy: https://byteit.ai/privacy-policy
Terms of Service: https://byteit.ai/terms
Impressum: https://byteit.ai/impressum

This project is licensed under the terms specified in the LICENSE file.

Project details

Release history

1.1.0 - May 6, 2026
1.0.1 - Apr 29, 2026
1.0.0 - Mar 25, 2026
0.1.2 - Jan 31, 2026
0.1.1 - Jan 24, 2026 (this version)
0.1.0 - Jan 18, 2026

Download files

Source Distribution

byteit-0.1.1.tar.gz (27.3 kB)

Uploaded Jan 24, 2026 Source

Built Distribution

byteit-0.1.1-py3-none-any.whl (23.6 kB)

Uploaded Jan 24, 2026 Python 3

File details

Details for the file byteit-0.1.1.tar.gz.

File metadata

Download URL: byteit-0.1.1.tar.gz
Upload date: Jan 24, 2026
Size: 27.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.18

File hashes

Hashes for byteit-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`9fab9a173aebd209c749f52a5c4f345cfac0c6a2c05e6fe1c7e7aea9c339ef3e`
MD5	`2bce777027b3cd3881270cf5fc4bd386`
BLAKE2b-256	`37ce87ef92c1011bed1f246f4325048d8bdbe6cc08e5f2e58b8d027129930d43`

File details

Details for the file byteit-0.1.1-py3-none-any.whl.

File metadata

Download URL: byteit-0.1.1-py3-none-any.whl
Upload date: Jan 24, 2026
Size: 23.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.18

File hashes

Hashes for byteit-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`839656ebc633bd80e963e2ee49b4e27b8f776d26b532b77064a73cac3a761def`
MD5	`c94b50d48060ea9ba28152fad6f8c45c`
BLAKE2b-256	`b8651a9b0ecc23158aab997c68ac3c166605a4d2d8984cd4e4f35be5e9b8387c`
