Get descriptions of images from OpenAI, Azure OpenAI, and Anthropic Claude models with support for local files and batch processing.

TextFromImage

A Python library for obtaining detailed descriptions of images from OpenAI's GPT models, Azure OpenAI, and Anthropic Claude. It is suited to applications that need image understanding, accessibility features, or content analysis. It accepts both local files and URLs and can process images in batches.

🌟 Key Features

  • 🤖 Multiple AI Providers: Support for OpenAI, Azure OpenAI, and Anthropic Claude
  • 🌐 Flexible Input: Support for both URLs and local file paths
  • 📦 Batch Processing: Process multiple images (up to 20) concurrently
  • 🔄 Flexible Integration: Easy-to-use API with multiple initialization options
  • 🎯 Custom Prompting: Configurable prompts for targeted descriptions
  • 🔑 Secure Authentication: Multiple authentication methods including environment variables
  • 🛠️ Model Selection: Support for different model versions and configurations
  • 📝 Type Hints: Full typing support for better development experience
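
The "Flexible Input" feature means a single list can mix URLs and local paths. The following is a minimal sketch of how a caller might sort such a list and catch missing local files before spending an API call. The helper names (`classify_image_source`, `missing_local_files`) are hypothetical and not part of TextFromImage; how the library itself tells the two kinds of input apart is not documented here.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse


def classify_image_source(image_path: str) -> str:
    """Return 'url' for http(s) addresses and 'file' for anything else.

    Hypothetical helper: only the URL scheme is inspected, so a Windows
    path such as 'C:\\images\\a.jpg' is still treated as a file.
    """
    scheme = urlparse(image_path).scheme.lower()
    return "url" if scheme in ("http", "https") else "file"


def missing_local_files(image_paths: List[str]) -> List[str]:
    """List the local paths that do not point at an existing file."""
    return [
        path
        for path in image_paths
        if classify_image_source(path) == "file" and not Path(path).is_file()
    ]
```

Running `missing_local_files(image_paths)` before a batch call is one way to report bad paths up front instead of discovering them in the batch results.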

📦 Installation

pip install textfromimage

# With Azure support
pip install textfromimage[azure]

# With all optional dependencies
pip install textfromimage[all]

🚀 Quick Start

import textfromimage

# Initialize with API key
textfromimage.openai.init(api_key="your-openai-api-key")

# Process single image (URL or local file)
image_url = 'https://example.com/image.jpg'
local_image = '/path/to/local/image.jpg'

# Get description from URL
url_description = textfromimage.openai.get_description(image_path=image_url)

# Get description from local file
local_description = textfromimage.openai.get_description(image_path=local_image)

# Batch processing
image_paths = [
    'https://example.com/image1.jpg',
    '/path/to/local/image2.jpg',
    'https://example.com/image3.jpg'
]

batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    concurrent_limit=3  # Process 3 images at a time
)

# Process results
for result in batch_results:
    if result.success:
        print(f"Success for {result.image_path}: {result.description}")
    else:
        print(f"Failed for {result.image_path}: {result.error}")
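
Key Features mentions a limit of 20 images for batch processing. The sketch below shows one way to split a longer list into groups that stay within that limit. It assumes the limit applies to the number of paths passed per call, which this README does not state explicitly; `chunked` and `MAX_BATCH_SIZE` are hypothetical names, not part of the library.

```python
from typing import Iterator, List

# Limit quoted in Key Features; assumed here to apply per batch call.
MAX_BATCH_SIZE = 20


def chunked(paths: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


# Possible usage with the batch call shown above:
# all_results = []
# for chunk in chunked(image_paths):
#     all_results.extend(
#         textfromimage.openai.get_description_batch(image_paths=chunk)
#     )
```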

💡 Advanced Usage

🤖 Multiple Provider Support

# Anthropic Claude Integration
textfromimage.claude.init(api_key="your-anthropic-api-key")

# Single image (image_url is defined in the Quick Start)
claude_description = textfromimage.claude.get_description(
    image_path=image_url,
    model="claude-3-sonnet-20240229"
)

# Batch processing
claude_results = textfromimage.claude.get_description_batch(
    image_paths=image_paths,
    model="claude-3-sonnet-20240229",
    concurrent_limit=3
)

# Azure OpenAI Integration
textfromimage.azure_openai.init(
    api_key="your-azure-openai-api-key",
    api_base="https://your-azure-endpoint.openai.azure.com/",
    deployment_name="your-deployment-name"
)

# Single image with system prompt (image_url is defined in the Quick Start)
azure_description = textfromimage.azure_openai.get_description(
    image_path=image_url,
    system_prompt="Analyze this image in detail"
)

# Batch processing
azure_results = textfromimage.azure_openai.get_description_batch(
    image_paths=image_paths,
    system_prompt="Analyze each image in detail",
    concurrent_limit=3
)
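
With several providers configured, a caller may want to fall back to the next one when an API call fails. The following is a rough sketch of that idea, not a feature of TextFromImage. `describe_with_fallback` is a hypothetical helper, and it assumes that API failures surface as `RuntimeError`, as the Error Handling section suggests for OpenAI; the other providers may behave differently.

```python
from typing import Callable, List, Sequence, Tuple


def describe_with_fallback(
    image_path: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, describe) pair in order; return the first success.

    Only RuntimeError is caught (assumed to mean an API failure). A
    ValueError (bad image) propagates, since another provider is
    unlikely to fix an unreadable input.
    """
    errors: List[str] = []
    for name, describe in providers:
        try:
            return name, describe(image_path)
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Possible usage, after the init calls shown above:
# providers = [
#     ("openai", lambda p: textfromimage.openai.get_description(image_path=p)),
#     ("claude", lambda p: textfromimage.claude.get_description(image_path=p)),
# ]
# name, description = describe_with_fallback(image_url, providers)
```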

🔧 Configuration Options

# Environment Variable Configuration
import os
os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-api-key'
os.environ['AZURE_OPENAI_API_KEY'] = 'your-azure-openai-api-key'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'your-azure-endpoint'
os.environ['AZURE_OPENAI_DEPLOYMENT'] = 'your-deployment-name'

# Custom options for batch processing
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    model='gpt-4-vision-preview',
    prompt="Describe the main elements of each image",
    max_tokens=300,
    concurrent_limit=5
)
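
When relying on environment variables, it can help to check that they are set before the first request. The sketch below uses the variable names listed above. The grouping of variables per provider is an assumption inferred from the `init` parameters, and `missing_env_vars` is a hypothetical helper, not a library function.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names come from the configuration example above; which
# provider needs which variable is an assumption.
REQUIRED_ENV_VARS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY"],
    "azure_openai": [
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_DEPLOYMENT",
    ],
}


def missing_env_vars(
    provider: str, environ: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the variables for `provider` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]


# Possible usage:
# missing = missing_env_vars("azure_openai")
# if missing:
#     raise SystemExit(f"Set these variables first: {', '.join(missing)}")
```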

📋 Parameters and Types

from dataclasses import dataclass
from typing import List, Optional

# Single image processing parameters
def get_description(
    image_path: str,
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview"
) -> str: ...

# Batch processing result type
@dataclass
class BatchResult:
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str

# Batch processing parameters
def get_description_batch(
    image_paths: List[str],
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview",
    concurrent_limit: int = 3
) -> List[BatchResult]: ...
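
To illustrate what `concurrent_limit` and the `BatchResult` fields imply, here is a minimal sketch of bounded-concurrency batch processing built on a thread pool. It is not the library's implementation, which this README does not show. `describe_batch` and `describe_one` are hypothetical names, and `BatchResult` is redefined locally as a stand-in with the documented fields.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BatchResult:  # local stand-in mirroring the fields documented above
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str


def describe_batch(
    image_paths: List[str],
    describe_one: Callable[[str], str],
    concurrent_limit: int = 3,
) -> List[BatchResult]:
    """Describe each image with at most `concurrent_limit` calls in flight."""

    def run(path: str) -> BatchResult:
        # Convert exceptions into failed results so one bad image
        # does not abort the rest of the batch.
        try:
            return BatchResult(True, describe_one(path), None, path)
        except Exception as exc:
            return BatchResult(False, None, str(exc), path)

    with ThreadPoolExecutor(max_workers=concurrent_limit) as pool:
        # Executor.map yields results in input order.
        return list(pool.map(run, image_paths))
```

Capturing the error as a string on the result, rather than raising, matches the shape of `BatchResult` and lets the caller decide what to do with partial failures.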

🔍 Error Handling

import textfromimage
from textfromimage.utils import BatchResult

# Single image processing (image_url is defined in the Quick Start)
try:
    description = textfromimage.openai.get_description(image_path=image_url)
except ValueError as e:
    # The image itself could not be processed
    print(f"Image processing error: {e}")
except RuntimeError as e:
    # The API call failed
    print(f"API error: {e}")

# Batch processing error handling
results = textfromimage.openai.get_description_batch(image_paths)
successful = [r for r in results if r.success]
failed = [r for r in results if not r.success]

for result in failed:
    print(f"Failed to process {result.image_path}: {result.error}")
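
Failed items can be retried without reprocessing the successful ones. The following is a hedged sketch of that pattern. `retry_failed` is a hypothetical helper, and it assumes the batch function returns one result per input path, in the same order, which this README does not confirm.

```python
import time
from typing import Callable, List, Sequence


def retry_failed(
    results: Sequence,
    describe_batch: Callable[[List[str]], Sequence],
    max_attempts: int = 2,
    delay_seconds: float = 1.0,
) -> list:
    """Re-run only the failed items, up to `max_attempts` extra rounds.

    Each result needs `success` and `image_path` attributes, as
    BatchResult has. Assumes `describe_batch` preserves input order.
    """
    current = list(results)
    for _ in range(max_attempts):
        failed_positions = [i for i, r in enumerate(current) if not r.success]
        if not failed_positions:
            break
        time.sleep(delay_seconds)  # brief pause before retrying
        retried = describe_batch([current[i].image_path for i in failed_positions])
        for position, new_result in zip(failed_positions, retried):
            current[position] = new_result
    return current


# Possible usage with the results from above:
# results = retry_failed(
#     results,
#     lambda paths: textfromimage.openai.get_description_batch(image_paths=paths),
# )
```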

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.
