Get descriptions of images from OpenAI, Azure OpenAI, and Anthropic Claude models with support for local files and batch processing.

TextFromImage

A Python library for obtaining detailed descriptions of images from AI models, including OpenAI's GPT models, Azure OpenAI, and Anthropic Claude. It suits applications that need image understanding, accessibility features, or content analysis. It accepts both local files and URLs and can process images in batches.

🌟 Key Features

  • 🤖 Multiple AI Providers: Support for OpenAI, Azure OpenAI, and Anthropic Claude
  • 🌐 Flexible Input: Support for both URLs and local file paths
  • 📦 Batch Processing: Process multiple images (up to 20) concurrently
  • 🔄 Flexible Integration: Easy-to-use API with multiple initialization options
  • 🎯 Custom Prompting: Configurable prompts for targeted descriptions
  • 🔑 Secure Authentication: Multiple authentication methods including environment variables
  • 🛠️ Model Selection: Support for different model versions and configurations
  • 📝 Type Hints: Full typing support for better development experience
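
Because the same `image_path` argument accepts both URLs and local paths, it can help to validate inputs before spending an API call. The sketch below is not part of the library: `classify_image_source` and `find_missing_files` are hypothetical helpers, and the rule "anything that is not http(s) is a local file" is an assumption about how inputs might be told apart.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse


def classify_image_source(image_path: str) -> str:
    """Return "url" for http(s) links and "file" for everything else."""
    scheme = urlparse(image_path).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    return "file"


def find_missing_files(image_paths: List[str]) -> List[str]:
    """Return the local paths that do not exist on disk; URLs are skipped."""
    return [
        p for p in image_paths
        if classify_image_source(p) == "file" and not Path(p).is_file()
    ]
```

Running `find_missing_files` over a list before a batch call surfaces mistyped local paths early, without touching the network.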

📦 Installation

pip install textfromimage

# With Azure support (quotes keep shells such as zsh from expanding the brackets)
pip install "textfromimage[azure]"

# With all optional dependencies
pip install "textfromimage[all]"

🚀 Quick Start

import textfromimage

# Initialize with API key
textfromimage.openai.init(api_key="your-openai-api-key")

# Process single image (URL or local file)
image_url = 'https://example.com/image.jpg'
local_image = '/path/to/local/image.jpg'

# Get description from URL
url_description = textfromimage.openai.get_description(image_path=image_url)

# Get description from local file
local_description = textfromimage.openai.get_description(image_path=local_image)

# Batch processing
image_paths = [
    'https://example.com/image1.jpg',
    '/path/to/local/image2.jpg',
    'https://example.com/image3.jpg'
]

batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    concurrent_limit=3  # Process 3 images at a time
)

# Process results
for result in batch_results:
    if result.success:
        print(f"Success for {result.image_path}: {result.description}")
    else:
        print(f"Failed for {result.image_path}: {result.error}")
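
Key Features mentions a limit of up to 20 images per batch. The README does not say what happens when a longer list is passed, so one cautious option is to split the list yourself. This is a minimal sketch; `chunk_paths` and `MAX_BATCH_SIZE` are hypothetical names, not library API.

```python
from typing import Iterator, List

# Assumption: taken from the "up to 20" figure in Key Features.
MAX_BATCH_SIZE = 20


def chunk_paths(image_paths: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of image_paths with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(image_paths), size):
        yield image_paths[start:start + size]
```

Each chunk can then be passed to `get_description_batch` in turn and the result lists concatenated.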

💡 Advanced Usage

🤖 Multiple Provider Support

# Anthropic Claude Integration
textfromimage.claude.init(api_key="your-anthropic-api-key")

# Single image (image_url as defined in Quick Start)
claude_description = textfromimage.claude.get_description(
    image_path=image_url,
    model="claude-3-sonnet-20240229"
)

# Batch processing
claude_results = textfromimage.claude.get_description_batch(
    image_paths=image_paths,
    model="claude-3-sonnet-20240229",
    concurrent_limit=3
)

# Azure OpenAI Integration
textfromimage.azure_openai.init(
    api_key="your-azure-openai-api-key",
    api_base="https://your-azure-endpoint.openai.azure.com/",
    deployment_name="your-deployment-name"
)

# Single image with system prompt (image_url as defined in Quick Start)
azure_description = textfromimage.azure_openai.get_description(
    image_path=image_url,
    system_prompt="Analyze this image in detail"
)

# Batch processing
azure_results = textfromimage.azure_openai.get_description_batch(
    image_paths=image_paths,
    system_prompt="Analyze each image in detail",
    concurrent_limit=3
)

🔧 Configuration Options

# Environment Variable Configuration
import os
os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-api-key'
os.environ['AZURE_OPENAI_API_KEY'] = 'your-azure-openai-api-key'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://your-azure-endpoint.openai.azure.com/'
os.environ['AZURE_OPENAI_DEPLOYMENT'] = 'your-deployment-name'

# Custom options for batch processing
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    model='gpt-4-vision-preview',
    prompt="Describe the main elements of each image",
    max_tokens=300,
    concurrent_limit=5
)
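
When relying on environment variables, a missing key usually shows up only when the first request fails. The sketch below checks up front. The variable names are the ones listed above; `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical helpers, and grouping the variables by provider this way is an assumption, not documented library behavior.

```python
import os
from typing import Dict, List, Mapping

# Assumption: each provider needs exactly the variables shown in this README.
REQUIRED_ENV_VARS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY"],
    "azure_openai": [
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_DEPLOYMENT",
    ],
}


def missing_env_vars(provider: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]
```

Calling `missing_env_vars("azure_openai")` at startup and failing fast on a non-empty result gives a clearer message than a late authentication error.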

📋 Parameters and Types

from dataclasses import dataclass
from typing import List, Optional

# Single image processing parameters
def get_description(
    image_path: str,
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview"
) -> str: ...

# Batch processing result type
@dataclass
class BatchResult:
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str

# Batch processing parameters
def get_description_batch(
    image_paths: List[str],
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview",
    concurrent_limit: int = 3
) -> List[BatchResult]: ...
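
To make the role of `concurrent_limit` concrete, here is one way a bounded-concurrency batch call could be written with the standard library. This is an illustrative sketch, not the library's actual implementation: `describe_batch` and `describe_one` are hypothetical names, and `BatchResult` is redefined locally with the same fields as above so the example is self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BatchResult:
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str


def describe_batch(
    image_paths: List[str],
    describe_one: Callable[[str], str],
    concurrent_limit: int = 3,
) -> List[BatchResult]:
    """Run describe_one on every path with at most concurrent_limit workers."""

    def run(path: str) -> BatchResult:
        # Capture per-image failures so one bad image does not abort the batch.
        try:
            return BatchResult(True, describe_one(path), None, path)
        except Exception as exc:
            return BatchResult(False, None, str(exc), path)

    with ThreadPoolExecutor(max_workers=concurrent_limit) as pool:
        # Executor.map yields results in input order, regardless of finish order.
        return list(pool.map(run, image_paths))
```

Threads fit here because each call would spend most of its time waiting on a network response. Returning a result object per image, rather than raising, matches the `success` / `error` pattern shown in the result type.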

🔍 Error Handling

import textfromimage
from textfromimage.utils import BatchResult

# Example inputs; replace with your own
image_path = '/path/to/local/image.jpg'
image_paths = [image_path, 'https://example.com/image.jpg']

# Single image processing
try:
    description = textfromimage.openai.get_description(image_path=image_path)
except ValueError as e:
    print(f"Image processing error: {e}")
except RuntimeError as e:
    print(f"API error: {e}")

# Batch processing error handling
results = textfromimage.openai.get_description_batch(image_paths)
successful = [r for r in results if r.success]
failed = [r for r in results if not r.success]

for result in failed:
    print(f"Failed to process {result.image_path}: {result.error}")
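
Since failures in a batch are reported per image, transient errors can be retried without resubmitting the images that already succeeded. The helper below is a hedged sketch: `retry_failed` is a hypothetical function, and it assumes only that each result exposes `image_path` and `success`, as the batch result type above does.

```python
from typing import Any, Callable, Dict, List, Sequence


def retry_failed(
    batch_fn: Callable[[List[str]], Sequence[Any]],
    image_paths: List[str],
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Call batch_fn up to max_attempts times, resubmitting only failed paths.

    Returns the latest result for each path, keyed by image_path.
    Duplicate paths in the input collapse into a single entry.
    """
    final: Dict[str, Any] = {}
    pending = list(image_paths)
    for _ in range(max_attempts):
        if not pending:
            break
        results = batch_fn(pending)
        for result in results:
            final[result.image_path] = result
        pending = [r.image_path for r in results if not r.success]
    return final
```

For example, `retry_failed(textfromimage.openai.get_description_batch, image_paths)` would keep the last error for any image that still fails after three attempts. A delay between attempts is omitted here for brevity.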

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.
