Get descriptions of images from OpenAI, Azure OpenAI, and Anthropic Claude models with support for local files and batch processing.
TextFromImage
A Python library for obtaining detailed descriptions of images from AI models, including OpenAI's GPT models, Azure OpenAI, and Anthropic Claude. It suits applications that need image understanding, accessibility features, or content analysis. It accepts both local files and URLs and can process images in batches.
🌟 Key Features
- 🤖 Multiple AI Providers: Support for OpenAI, Azure OpenAI, and Anthropic Claude
- 🌐 Flexible Input: Support for both URLs and local file paths
- 📦 Batch Processing: Process multiple images (up to 20) concurrently
- 🔄 Flexible Integration: Easy-to-use API with multiple initialization options
- 🎯 Custom Prompting: Configurable prompts for targeted descriptions
- 🔑 Secure Authentication: Multiple authentication methods including environment variables
- 🛠️ Model Selection: Support for different model versions and configurations
- 📝 Type Hints: Full typing support for better development experience
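The feature list above mentions a ceiling of 20 images for batch processing. A minimal sketch of splitting a longer list into groups that respect that ceiling is shown below. The helper name `chunk_paths` and the constant `MAX_BATCH_SIZE` are hypothetical and not part of the library; how the library itself reacts to more than 20 paths is not stated here.

```python
from typing import Iterator, List, Sequence

# Assumption: 20 is the per-call ceiling quoted in the feature list.
MAX_BATCH_SIZE = 20


def chunk_paths(paths: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# 45 placeholder paths are split into groups of 20, 20, and 5.
paths = [f"/path/to/local/image{i}.jpg" for i in range(45)]
groups = list(chunk_paths(paths))
print([len(group) for group in groups])  # prints [20, 20, 5]
```

Each group could then be passed to a batch call one at a time.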
📦 Installation
pip install textfromimage
# With Azure support (quotes keep shells such as zsh from expanding the brackets)
pip install "textfromimage[azure]"
# With all optional dependencies
pip install "textfromimage[all]"
🚀 Quick Start
import textfromimage
# Initialize with API key
textfromimage.openai.init(api_key="your-openai-api-key")
# Process single image (URL or local file)
image_url = 'https://example.com/image.jpg'
local_image = '/path/to/local/image.jpg'
# Get description from URL
url_description = textfromimage.openai.get_description(image_path=image_url)
# Get description from local file
local_description = textfromimage.openai.get_description(image_path=local_image)
# Batch processing
image_paths = [
    'https://example.com/image1.jpg',
    '/path/to/local/image2.jpg',
    'https://example.com/image3.jpg'
]
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    concurrent_limit=3  # Process 3 images at a time
)
# Process results
for result in batch_results:
    if result.success:
        print(f"Success for {result.image_path}: {result.description}")
    else:
        print(f"Failed for {result.image_path}: {result.error}")
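The Quick Start passes URLs and local paths through the same `image_path` argument. How the library tells them apart internally is not documented here, so the following is only a sketch of one common approach: treat `http`/`https` strings as URLs and base64-encode anything else after checking that it looks like an image. The function names `is_url` and `encode_local_image` are hypothetical.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Tuple
from urllib.parse import urlparse


def is_url(image_path: str) -> bool:
    """Return True when the string is an http(s) URL with a host part."""
    parsed = urlparse(image_path)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def encode_local_image(image_path: str) -> Tuple[str, str]:
    """Return (media_type, base64_data) for a local image file.

    The media type is guessed from the file extension only; the file
    contents are not inspected.
    """
    media_type, _ = mimetypes.guess_type(image_path)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {image_path}")
    data = Path(image_path).read_bytes()
    return media_type, base64.b64encode(data).decode("ascii")


print(is_url('https://example.com/image.jpg'))  # prints True
print(is_url('/path/to/local/image.jpg'))       # prints False
```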
💡 Advanced Usage
🤖 Multiple Provider Support
# Anthropic Claude Integration
textfromimage.claude.init(api_key="your-anthropic-api-key")
# Single image (URL or local file path)
image_path = 'https://example.com/image.jpg'
claude_description = textfromimage.claude.get_description(
    image_path=image_path,
    model="claude-3-sonnet-20240229"
)
# Batch processing (image_paths as defined in the Quick Start)
claude_results = textfromimage.claude.get_description_batch(
    image_paths=image_paths,
    model="claude-3-sonnet-20240229",
    concurrent_limit=3
)
# Azure OpenAI Integration
textfromimage.azure_openai.init(
    api_key="your-azure-openai-api-key",
    api_base="https://your-azure-endpoint.openai.azure.com/",
    deployment_name="your-deployment-name"
)
# Single image with system prompt
azure_description = textfromimage.azure_openai.get_description(
    image_path=image_path,
    system_prompt="Analyze this image in detail"
)
# Batch processing
azure_results = textfromimage.azure_openai.get_description_batch(
    image_paths=image_paths,
    system_prompt="Analyze each image in detail",
    concurrent_limit=3
)
🔧 Configuration Options
# Environment Variable Configuration
import os
os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-api-key'
os.environ['AZURE_OPENAI_API_KEY'] = 'your-azure-openai-api-key'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://your-azure-endpoint.openai.azure.com/'
os.environ['AZURE_OPENAI_DEPLOYMENT'] = 'your-deployment-name'
# Custom options for batch processing
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    model='gpt-4-vision-preview',
    prompt="Describe the main elements of each image",
    max_tokens=300,
    concurrent_limit=5
)
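When credentials come from environment variables, it can help to check that they are present before making any API call. The sketch below uses the variable names from the configuration example above. The helper `missing_env_vars` is hypothetical and not part of the library, and whether the library performs a similar check itself is not stated here.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the configuration example.
AZURE_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT",
)


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or set to an empty string."""
    return [name for name in names if not env.get(name)]


# A plain dict stands in for os.environ so the example is reproducible.
example_env = {
    "AZURE_OPENAI_API_KEY": "your-azure-openai-api-key",
    "AZURE_OPENAI_ENDPOINT": "",
}
print(missing_env_vars(AZURE_VARS, example_env))
# prints ['AZURE_OPENAI_ENDPOINT', 'AZURE_OPENAI_DEPLOYMENT']
```

Calling `missing_env_vars(AZURE_VARS)` without a second argument checks the real process environment.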
📋 Parameters and Types
from dataclasses import dataclass
from typing import List, Optional

# Single image processing parameters
def get_description(
    image_path: str,
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview"
) -> str: ...

# Batch processing result type
@dataclass
class BatchResult:
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str

# Batch processing parameters
def get_description_batch(
    image_paths: List[str],
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview",
    concurrent_limit: int = 3
) -> List[BatchResult]: ...
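To illustrate what a `concurrent_limit` parameter and a per-image `BatchResult` typically imply, here is a minimal sketch built on a thread pool. It is not the library's implementation, which is not shown in this README. `describe_batch` and `fake_describe` are hypothetical names, and the `BatchResult` class is a local stand-in that copies the documented fields.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BatchResult:  # local stand-in mirroring the documented fields
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str


def describe_batch(
    image_paths: List[str],
    describe_one: Callable[[str], str],
    concurrent_limit: int = 3,
) -> List[BatchResult]:
    """Run describe_one on every path with at most concurrent_limit workers."""

    def run(path: str) -> BatchResult:
        # Capture any exception so one bad image does not abort the batch.
        try:
            return BatchResult(True, describe_one(path), None, path)
        except Exception as exc:
            return BatchResult(False, None, str(exc), path)

    with ThreadPoolExecutor(max_workers=concurrent_limit) as pool:
        # Executor.map returns results in the order of the input paths.
        return list(pool.map(run, image_paths))


def fake_describe(path: str) -> str:
    """Stand-in for a real API call; rejects anything ending in .txt."""
    if path.endswith(".txt"):
        raise ValueError("unsupported image format")
    return f"description of {path}"


results = describe_batch(["a.jpg", "b.txt", "c.png"], fake_describe, concurrent_limit=2)
for r in results:
    print(r.image_path, r.success)
```

Returning a result object per image, rather than raising, is what lets callers filter on `success` afterwards.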
🔍 Error Handling
from textfromimage.utils import BatchResult
# Single image processing
try:
    description = textfromimage.openai.get_description(image_path=image_path)
except ValueError as e:
    print(f"Image processing error: {e}")
except RuntimeError as e:
    print(f"API error: {e}")
# Batch processing error handling
results = textfromimage.openai.get_description_batch(image_paths)
successful = [r for r in results if r.success]
failed = [r for r in results if not r.success]
for result in failed:
    print(f"Failed to process {result.image_path}: {result.error}")
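Because batch failures are reported per image, the failed paths can be collected and submitted again. The sketch below shows one way to do that under a few assumptions: `retry_failed` and `flaky_batch` are hypothetical names, `BatchResult` is a local stand-in with the documented fields, and image paths are assumed to be unique within a batch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class BatchResult:  # local stand-in mirroring the documented fields
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str


def retry_failed(
    results: List[BatchResult],
    run_batch: Callable[[List[str]], List[BatchResult]],
    max_rounds: int = 2,
) -> List[BatchResult]:
    """Re-run failed paths up to max_rounds times, keeping the original order."""
    order = [r.image_path for r in results]
    latest: Dict[str, BatchResult] = {r.image_path: r for r in results}
    for _ in range(max_rounds):
        failed = [p for p in order if not latest[p].success]
        if not failed:
            break
        for r in run_batch(failed):
            latest[r.image_path] = r
    return [latest[p] for p in order]


# Stand-in batch function: each path fails once, then succeeds.
attempts: Dict[str, int] = {}


def flaky_batch(paths: List[str]) -> List[BatchResult]:
    out = []
    for p in paths:
        attempts[p] = attempts.get(p, 0) + 1
        if attempts[p] < 2:
            out.append(BatchResult(False, None, "temporary failure", p))
        else:
            out.append(BatchResult(True, f"description of {p}", None, p))
    return out


first = [
    BatchResult(True, "description of a.jpg", None, "a.jpg"),
    BatchResult(False, None, "temporary failure", "b.jpg"),
]
final = retry_failed(first, flaky_batch)
print([r.success for r in final])  # prints [True, True]
```

In real use, `run_batch` could be `textfromimage.openai.get_description_batch`, which the example above calls with the list of paths as its first argument.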
🤝 Contributing
We welcome contributions! Here's how you can help:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.