Python client for Adobe PDF to Word conversion using Adobe's online services

Project description

Adobe Helper

Adobe Helper is a Python library for converting PDF files to Word (DOCX) format using Adobe's online conversion services. It provides a clean, async API with automatic session management, rate limiting, and quota tracking.

⚠️ Current Status

This project is ~98% complete. The architecture, all modules, and the examples are implemented and covered by unit tests. However, the library cannot perform actual conversions until Adobe's API endpoints have been discovered.

Recent Updates (2025-10-21)

✅ Multi-Tenant Architecture

Automatic tenant discovery during session initialization
Dynamic endpoint switching per session
Support for multiple regions and tenant IDs
Each session discovers its own numeric tenant ID from Adobe's servers

✅ Logging Enhancement

Examples now include proper logging configuration
Real-time visibility into conversion progress
Better debugging and troubleshooting support

See docs/discovery/API_DISCOVERY.md for instructions on discovering Adobe's actual API endpoints using Chrome DevTools.

Features

✨ Easy to Use

Simple async API with context manager support
Automatic session management and rotation
Built-in retry logic with exponential backoff
Bypass local usage limits (mimics clearing browser data)
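
To make the "retry logic with exponential backoff" bullet concrete, here is a minimal sketch of the general technique. It is not the library's own implementation: retry_with_backoff and all of its parameters are hypothetical names chosen for illustration.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    """Retry an async operation, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # base, 2*base, 4*base, ... capped at max_delay, plus up to 10% jitter
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

The small random jitter keeps several clients that failed at the same moment from retrying in lockstep.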

📊 Smart Management

Optional usage tracking with daily limits
Intelligent rate limiting with human-like delays
Automatic session rotation for unlimited conversions
Fresh session creation (like incognito mode)
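
One way to picture "rate limiting with human-like delays" is a limiter that enforces a minimum gap between requests and adds a random extra pause. The sketch below assumes nothing about adobe/rate_limiter.py; JitteredRateLimiter and its defaults are hypothetical.

```python
import asyncio
import random
import time
from typing import Optional


class JitteredRateLimiter:
    """Enforce a minimum, slightly randomized gap between requests."""

    def __init__(self, min_interval: float = 2.0, jitter: float = 1.0) -> None:
        self.min_interval = min_interval
        self.jitter = jitter
        # Monotonic timestamp of the previous request, None before the first
        self._last_request: Optional[float] = None
        self._lock = asyncio.Lock()

    async def wait(self) -> float:
        """Sleep until the next request is allowed; return seconds slept."""
        async with self._lock:
            slept = 0.0
            if self._last_request is not None:
                target_gap = self.min_interval + random.uniform(0, self.jitter)
                elapsed = time.monotonic() - self._last_request
                slept = max(0.0, target_gap - elapsed)
                if slept:
                    await asyncio.sleep(slept)
            self._last_request = time.monotonic()
            return slept
```

A caller would await limiter.wait() immediately before each network request. Creating the limiter inside the running coroutine avoids event-loop binding issues on older Python versions.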

🔒 Reliable

Streaming upload/download for large files
File integrity verification
Comprehensive error handling
Progress tracking support
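
As an illustration of what "file integrity verification" could mean for a converted file, the following sketch checks that a DOCX is an intact ZIP archive containing the two parts every Word document has. This is an assumption about a reasonable check, not a description of what the library does; looks_like_valid_docx is a hypothetical helper.

```python
import zipfile
from pathlib import Path


def looks_like_valid_docx(path: Path) -> bool:
    """Return True if the file is an intact ZIP with the core DOCX parts."""
    if not path.is_file() or path.stat().st_size == 0:
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the first member with a bad CRC, or None
            if archive.testzip() is not None:
                return False
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    # Every DOCX package carries a content-type map and a main document part
    return {"[Content_Types].xml", "word/document.xml"} <= names
```

A truncated download typically fails at the ZipFile constructor, because the ZIP central directory sits at the end of the file.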

🚀 Fast

Async/await throughout
HTTP/2 support via httpx
Concurrent batch processing

Installation

Using uv (Recommended)

# Clone the repository
git clone https://github.com/karlorz/adobe-helper.git
cd adobe-helper

# Install with uv
uv sync --all-extras

Using pip

# Clone the repository
git clone https://github.com/karlorz/adobe-helper.git
cd adobe-helper

# Install in development mode
pip install -e .

Quick Start

Basic Usage

import asyncio
import logging
from pathlib import Path
from adobe import AdobePDFConverter

# Configure logging to see conversion progress
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)

async def main():
    # Convert a PDF to Word (bypasses local limits by default)
    async with AdobePDFConverter(
        bypass_local_limits=True  # Mimics clearing browser data
    ) as converter:
        output_file = await converter.convert_pdf_to_word(
            Path("document.pdf")
        )
        print(f"Converted: {output_file}")

asyncio.run(main())

Batch Conversion

import asyncio
from pathlib import Path

from adobe import AdobePDFConverter

async def batch_convert():
    pdf_files = [
        Path("doc1.pdf"),
        Path("doc2.pdf"),
        Path("doc3.pdf"),
    ]

    async with AdobePDFConverter() as converter:
        for pdf_file in pdf_files:
            try:
                output = await converter.convert_pdf_to_word(pdf_file)
                print(f"✓ {pdf_file.name} -> {output.name}")
            except Exception as e:
                # Report the failure and continue with the remaining files
                print(f"✗ {pdf_file.name}: {e}")

asyncio.run(batch_convert())
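
The loop above converts files one at a time. For comparison, here is a hedged sketch of a concurrent variant that caps how many conversions run at once. convert_concurrently is a hypothetical helper that accepts any async convert callable; whether a single AdobePDFConverter instance is safe to share across concurrent calls is not stated here, so treat that as an assumption to verify.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, Dict, List, Union


async def convert_concurrently(
    convert: Callable[[Path], Awaitable[Path]],
    pdf_files: List[Path],
    max_concurrent: int = 3,
) -> Dict[Path, Union[Path, BaseException]]:
    """Run `convert` on every file, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(pdf: Path) -> Path:
        async with semaphore:
            return await convert(pdf)

    # return_exceptions=True collects failures alongside successes
    # instead of raising the first error out of gather()
    results = await asyncio.gather(
        *(worker(pdf) for pdf in pdf_files), return_exceptions=True
    )
    return dict(zip(pdf_files, results))
```

Inside an `async with AdobePDFConverter() as converter:` block this could be called as `await convert_concurrently(converter.convert_pdf_to_word, pdf_files)`, after which each value in the result is either an output path or the exception raised for that file.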

Advanced Configuration

import asyncio
from pathlib import Path

from adobe import AdobePDFConverter

async def advanced_convert():
    # Custom configuration
    converter = AdobePDFConverter(
        session_dir=Path(".cache"),      # Custom cache directory
        use_session_rotation=True,       # Enable session rotation
        track_usage=True,                # Track daily quota
        enable_rate_limiting=True,       # Rate limiting
    )

    try:
        await converter.initialize()

        # Convert with custom output path
        output = await converter.convert_pdf_to_word(
            Path("input.pdf"),
            output_path=Path("output/converted.docx"),
        )
        print(f"Converted: {output}")

        # Check usage stats
        usage = converter.get_usage_summary()
        print(f"Daily usage: {usage['count']}/{usage['limit']}")

    finally:
        # Always release the session, even if the conversion fails
        await converter.close()

asyncio.run(advanced_convert())
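
The usage summary above exposes a count and a limit. A minimal sketch of how such per-day tracking might be persisted is shown below. DailyUsageTracker, its JSON file layout, and the extra "remaining" key are hypothetical and are not taken from adobe/usage_tracker.py.

```python
import json
from datetime import date
from pathlib import Path


class DailyUsageTracker:
    """Count conversions per calendar day in a small JSON file."""

    def __init__(self, state_file: Path, daily_limit: int) -> None:
        self.state_file = state_file
        self.daily_limit = daily_limit

    def _load(self) -> dict:
        today = date.today().isoformat()
        try:
            state = json.loads(self.state_file.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            state = {}
        # A stored date other than today means the counter has expired
        if state.get("date") != today:
            state = {"date": today, "count": 0}
        return state

    def record_conversion(self) -> None:
        state = self._load()
        state["count"] += 1
        self.state_file.parent.mkdir(parents=True, exist_ok=True)
        self.state_file.write_text(json.dumps(state))

    def summary(self) -> dict:
        state = self._load()
        return {
            "count": state["count"],
            "limit": self.daily_limit,
            "remaining": max(0, self.daily_limit - state["count"]),
        }
```

Storing the date next to the count means the counter resets itself on the first read of a new day, with no scheduled cleanup needed.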

Endpoint Discovery CLI

Use the bundled helper to capture endpoints and keep discovery files synced:

# Show available commands
python -m adobe.cli.api_discovery_helper --help

# Create or refresh the project discovery template
python -m adobe.cli.api_discovery_helper template

# Validate captured URLs and sync project ↔ user cache copies
python -m adobe.cli.api_discovery_helper update

# Installed entry point (after `pip install .`)
adobe-api-discovery checklist

See docs/discovery/API_DISCOVERY.md for the full walkthrough.

The helper stores discovered endpoints in ~/.adobe-helper by default. When the home directory is not writable, it automatically falls back to ./.adobe-helper (or the system temp directory), which is useful in containerized or sandboxed environments.
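
The fallback behaviour described above can be sketched as "try each candidate directory in order and keep the first one that is writable". The helper names, the candidate order, and the temp-directory subfolder name below are assumptions for illustration, not the helper's actual code.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional


def first_writable_dir(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that can be created and written to."""
    for candidate in candidates:
        try:
            candidate.mkdir(parents=True, exist_ok=True)
        except OSError:
            continue  # e.g. read-only home directory in a container
        if os.access(candidate, os.W_OK):
            return candidate
    return None


def default_candidates() -> List[Path]:
    """Assumed search order: home, working directory, then system temp."""
    return [
        Path.home() / ".adobe-helper",
        Path.cwd() / ".adobe-helper",
        Path(tempfile.gettempdir()) / "adobe-helper",  # hypothetical name
    ]
```

Calling first_writable_dir(default_candidates()) returns None only when none of the locations is usable, which lets the caller raise a clear error instead of failing later on a write.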

Architecture

Core Components

adobe/
├── client.py              # Main AdobePDFConverter class
├── auth.py                # Session management
├── session_cycling.py     # Anonymous session rotation
├── cookie_manager.py      # Cookie persistence
├── upload.py              # File upload handler
├── conversion.py          # Conversion workflow manager
├── download.py            # File download handler
├── rate_limiter.py        # Rate limiting with backoff
├── usage_tracker.py       # Free tier quota tracking
├── models.py              # Pydantic data models
├── exceptions.py          # Custom exceptions
├── constants.py           # Configuration constants
├── urls.py                # API endpoints
└── utils.py               # Helper functions

Data Flow

PDF File → Upload → Conversion Job → Poll Status → Download DOCX
           ↓         ↓                 ↓             ↓
        Validate  Create Job      Wait/Poll    Stream Download
        Retry     Track Status    Adaptive      Verify
                                  Polling       Integrity
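
The "Wait/Poll" stage with adaptive polling can be sketched as a loop that checks job status frequently at first and then backs off. Everything here is hypothetical: poll_until_done, the fetch_status callable, and the status strings "done" and "failed" are placeholders, since the real status values depend on the endpoints still to be discovered.

```python
import asyncio
from typing import Awaitable, Callable


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[str]],
    initial_interval: float = 1.0,
    max_interval: float = 10.0,
    growth: float = 1.5,
    timeout: float = 300.0,
) -> str:
    """Poll `fetch_status` with a growing interval until a terminal state."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    interval = initial_interval
    while True:
        status = await fetch_status()
        if status in {"done", "failed"}:  # placeholder terminal states
            return status
        if loop.time() + interval > deadline:
            raise TimeoutError(f"job still '{status}' after {timeout}s")
        await asyncio.sleep(interval)
        # Poll quickly at first, then back off for long-running jobs
        interval = min(interval * growth, max_interval)
```

Short jobs get a fast answer, while long jobs generate far fewer status requests than fixed-interval polling would.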

Examples

See the examples/adobe/ directory for complete examples:

basic_usage.py - Simple conversion with bypass enabled
batch_convert.py - Sequential and concurrent batch processing
advanced_usage.py - Advanced configuration and error handling

Legacy bypass/reset scripts now live under archive/docs/ for reference.

Bypassing Usage Limits

By default, the library now bypasses local usage tracking and relies on Adobe's server-side limits with automatic session rotation:

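import asyncio
from pathlib import Path
from adobe import AdobePDFConverter

async def convert_all(pdf_files):
    # Automatic session rotation (recommended for batch processing)
    async with AdobePDFConverter(
        bypass_local_limits=True,   # Default: True
        use_session_rotation=True,  # Auto-rotate sessions
    ) as converter:
        for pdf in pdf_files:
            await converter.convert_pdf_to_word(pdf)

asyncio.run(convert_all([Path("doc1.pdf"), Path("doc2.pdf")]))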

For more details, see BYPASS_LIMITS.md.

Quick reset: Call AdobePDFConverter.reset_session_data() (or use AdobePDFConverter.create_with_fresh_session()) to clear all local state; the legacy helper script now resides in archive/docs/.

API Discovery Required

⚠️ Important: Before this library can perform actual conversions, you need to discover Adobe's API endpoints using Chrome DevTools.

See docs/discovery/API_DISCOVERY.md for detailed instructions.

Discovered endpoint files are cached automatically: any discovered_endpoints.json found in docs/discovery/ or archive/discovery/ is copied into ~/.adobe-helper/ on first run, and a template is generated if missing.
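
The caching step described above amounts to "copy the first discovered_endpoints.json that exists into the cache directory". A minimal sketch under that reading follows. seed_endpoint_cache is a hypothetical function, and leaving an existing cached copy untouched is an assumption based on the phrase "on first run".

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional

ENDPOINTS_FILENAME = "discovered_endpoints.json"


def seed_endpoint_cache(
    source_dirs: Iterable[Path], cache_dir: Path
) -> Optional[Path]:
    """Copy the first discovered-endpoints file found into the cache dir.

    Returns the cached file path, or None if no source file exists.
    An existing cached copy is left untouched.
    """
    cached = cache_dir / ENDPOINTS_FILENAME
    if cached.exists():
        return cached
    for source_dir in source_dirs:
        candidate = source_dir / ENDPOINTS_FILENAME
        if candidate.is_file():
            cache_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(candidate, cached)
            return cached
    return None
```

With the paths named in the text, a call might look like `seed_endpoint_cache([Path("docs/discovery"), Path("archive/discovery")], Path.home() / ".adobe-helper")`.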

Development

Setup Development Environment

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and setup
git clone https://github.com/karlorz/adobe-helper.git
cd adobe-helper
uv sync --all-extras --dev

Run Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=adobe --cov-report=html

# Run specific test file
uv run pytest tests/test_models.py -v

Code Quality

# Format code
uv run black adobe/ tests/

# Lint code
uv run ruff check adobe/ tests/

# Type checking
uv run mypy adobe/

Project Status

✅ Completed (Phases 1-10)

Project setup and architecture
Data models with Pydantic validation
Custom exception hierarchy
Session management and rotation
Cookie management
Rate limiting with adaptive backoff
Usage tracking
File upload handler
Conversion workflow manager
File download handler
Main client class
Example scripts with logging
Unit tests (30 tests, 100% pass rate)
Documentation
Multi-tenant architecture with automatic discovery ✨ NEW
Dynamic endpoint switching per session ✨ NEW

🔄 Remaining

API endpoint discovery (critical - see docs/discovery/API_DISCOVERY.md)
Integration tests with real API
CLI tool (optional)
Browser automation fallback (optional)

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Add tests
Run code quality checks
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Disclaimer

This library is for legitimate use only. Please respect Adobe's Terms of Service and rate limits. The library includes built-in rate limiting and quota tracking to prevent abuse.

Acknowledgments

Inspired by Adobe's online PDF conversion services
Built with httpx, pydantic, and modern Python async patterns
Developed using uv for fast dependency management

Support

📫 Issues: GitHub Issues
📖 Documentation: See examples/ and AGENTS.md
💬 Discussions: GitHub Discussions

Project details

Maintainers

karlorz

Release history

1.2.0 - Oct 22, 2025
1.1.2 - Oct 21, 2025 (this version)
1.1.1 - Oct 21, 2025
1.1.0 - Oct 21, 2025
1.0.6 - Oct 17, 2025
1.0.5 - Oct 17, 2025

Download files

Source Distribution

adobe_helper-1.1.2.tar.gz (181.5 kB)

Uploaded Oct 21, 2025 Source

Built Distribution

adobe_helper-1.1.2-py3-none-any.whl (58.9 kB)

Uploaded Oct 21, 2025 Python 3

File details

Details for the file adobe_helper-1.1.2.tar.gz.

File metadata

Download URL: adobe_helper-1.1.2.tar.gz
Upload date: Oct 21, 2025
Size: 181.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.4

File hashes

Hashes for adobe_helper-1.1.2.tar.gz
Algorithm	Hash digest
SHA256	`38118d7b456ad61ad5b14f87c25aa9ce9598183a9d5ba99062deff9d3f82ea18`
MD5	`2490aae2f975fd329eb0eb8f3f113928`
BLAKE2b-256	`02d8ad1ca388ad3a9c66bffb9c280f54cdc1f1835895ce9d0b974dee41f09137`
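
The digests above can be used to confirm that a downloaded file matches what was published. The sketch below uses only the standard library; sha256_of and matches_published_hash are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files stay out of memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution listed here, the check would be `matches_published_hash(Path("adobe_helper-1.1.2.tar.gz"), "38118d7b456ad61ad5b14f87c25aa9ce9598183a9d5ba99062deff9d3f82ea18")`.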

File details

Details for the file adobe_helper-1.1.2-py3-none-any.whl.

File metadata

Download URL: adobe_helper-1.1.2-py3-none-any.whl
Upload date: Oct 21, 2025
Size: 58.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.4

File hashes

Hashes for adobe_helper-1.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ecf6de21a4ffa54a27d085ff8c7422c4cec6d68723195f9d608c4e6c14a9a191`
MD5	`e115bd38e2c6466c18ba91d5dba264f9`
BLAKE2b-256	`fdb5de161a3f851911f6e38bdefd00a89334a8e6f16431e39d3ba79d1ae3c2bc`
