Skip to main content

Python module providing unified access to OpenAI's chat, text-to-speech, and transcription APIs

Project description

AI Manager

A general-purpose Python toolkit for commonly performed AI tasks, providing a unified interface for OpenAI and Replicate APIs.

Features

  • Chat Completions - Generate text with OpenAI models, including schema validation
  • Text-to-Speech - Convert text to speech using OpenAI's TTS
  • Speech-to-Text - Transcribe audio using OpenAI's Whisper
  • Image Generation - Create images using FLUX PRO via Replicate
  • Video Generation - Generate videos using Google's VEO-2 via Replicate
  • Music Generation - Create music tracks with continuation and variations
  • Prompt Management - Load and manage prompts from files
  • Schema Validation - Validate AI responses against JSON schemas

Installation

pip install wl-ai-manager

Dependencies

  • wl_config_manager
  • wl_version_manager
  • openai
  • replicate
  • pillow
  • soundfile
  • jsonschema
  • pyyaml
  • requests

Configuration

Create a YAML configuration file with the ai_manager root element:

ai_manager:
  output_dir: "./output"
  temp_dir: "/tmp"
  prompt_folder: "./prompts"
  schema_folder: "./schemas"
  max_validation_retries: 3
  
  openai:
    api_key: "your-openai-api-key"
    organization_id: "your-org-id"
    chat_model: "gpt-4"
    tts_model: "tts-1"
    tts_voice: "nova"
    whisper_model: "whisper-1"
  
  replicate:
    api_key: "your-replicate-api-key"
    image_model: "black-forest-labs/flux-pro"
    video_model: "google/veo-2"
    music_model: "meta/musicgen"
    prompt_upsampling: true
    output_format: "png"
    num_inference_steps: 50
    guidance_scale: 7.5

Usage

Initialize AI Manager

from wl_config_manager import ConfigManager
from wl_ai_manager import AIManager

# Load configuration
config = ConfigManager("config.yaml")
ai_manager = AIManager(config.ai_manager)

Chat Completions

# Simple chat
response = ai_manager.chat(
    prompt_name="analyze_text",
    data={"text": "Hello world"},
    model="gpt-4"  # Optional, uses config default
)

# Chat with schema validation
validated_response = ai_manager.chat(
    prompt_name="extract_entities",
    data={"text": "John lives in New York"},
    validate=True  # Enables automatic retry with schema validation
)

Text-to-Speech

# Generate speech from text
audio_path = ai_manager.generate_speech(
    text="Hello, this is a test",
    voice="nova",  # Optional, uses config default
    output_path="./output/speech.wav"
)

Speech-to-Text

# Transcribe audio file
transcript = ai_manager.transcribe_audio(
    audio_path="./audio/recording.wav"
)

# Transcribe audio data (numpy array or bytes)
transcript = ai_manager.transcribe_audio(
    audio_data=audio_array
)

Image Generation

# Generate image with FLUX PRO
image_path = ai_manager.generate_image(
    prompt="A beautiful sunset over mountains",
    file_name="sunset",
    file_type="png",
    width=1024,
    height=768,
    resize=True,  # Resize to exact dimensions
    crop=False    # Crop to exact dimensions
)

Video Generation

# Generate video from text prompt
video_path = ai_manager.generate_video(
    prompt="A cat playing with a ball of yarn",
    duration=10,  # 5, 10, 15, or 20 seconds
    aspect_ratio="16:9"  # "16:9", "9:16", "1:1", "4:3", "3:4"
)

# Generate video from image (when supported)
video_path = ai_manager.generate_video_from_image(
    image_path="./images/cat.jpg",
    prompt="Make the cat move and play",
    duration=5
)

Music Generation

# Generate single music track
music_path = ai_manager.generate_music(
    prompt="Upbeat electronic dance music with synth",
    duration=30,
    temperature=1.0,
    output_format="wav"
)

# Generate music with continuation
music_path = ai_manager.generate_music(
    prompt="Continue this melody with strings",
    continuation_audio="./music/intro.wav",
    duration=30
)

# Generate music chain (each continues from previous)
music_files = ai_manager.generate_music_chain(
    prompts=[
        "Gentle piano intro",
        "Add strings and build intensity",
        "Climax with full orchestra",
        "Gentle outro"
    ],
    duration=30  # per segment
)

# Generate variations of a theme
variations = ai_manager.generate_music_variations(
    base_prompt="Classical piano melody",
    variations=[
        "in minor key",
        "with jazz influences",
        "as a waltz",
        "with electronic elements"
    ],
    duration=30
)

Prompt Management

Create prompt files in your configured prompt_folder:

Standard Prompt (analyze.txt)

Analyze the following text: {text}

System/User Prompt Pair

summarize.system.txt:

You are a professional summarizer.

summarize.user.txt:

Summarize this document: {document}

Using Prompts

# The prompt name matches the filename without extension
response = ai_manager.chat(
    prompt_name="analyze",
    data={"text": "Some text to analyze"}
)

Schema Validation

Create schema example files in your schema_folder with .schema.txt extension:

extract_entities.schema.txt:

{
  "entities": [
    {
      "name": "string",
      "type": "person|place|organization",
      "confidence": 0.95
    }
  ],
  "relationships": []
}

Using Schema Validation

# Automatic validation with retries
result = ai_manager.chat(
    prompt_name="extract_entities",
    data={"text": "Apple Inc. is located in Cupertino"},
    validate=True
)

# result will be the parsed JSON/YAML data, not raw text
print(result["entities"])  # [{"name": "Apple Inc.", "type": "organization", ...}]

Manual Validation

# Check if schema exists
if ai_manager.has_schema_for_prompt("extract_entities"):
    # Validate arbitrary response
    validation = ai_manager.validate_response_for_prompt(
        response='{"entities": [...]}',
        prompt_name="extract_entities"
    )
    if validation["valid"]:
        data = validation["data"]

Advanced Features

Get Available Prompts and Schemas

# List all loaded prompts
prompts = ai_manager.get_prompts()

# List prompts with schemas
schema_prompts = ai_manager.get_schema_prompts()

# List all available schemas
schemas = ai_manager.get_available_schemas()

Custom Schema Validation

# Add schema programmatically
ai_manager.add_schema("custom_format", {
    "type": "object",
    "properties": {
        "result": {"type": "string"},
        "confidence": {"type": "number"}
    },
    "required": ["result"]
})

# Validate data against custom schema
validation = ai_manager.validate_data(
    data={"result": "success", "confidence": 0.9},
    schema_name="custom_format"
)

Error Handling

All methods return None or error dictionaries on failure:

# Check for failures
response = ai_manager.chat("my_prompt", data={})
if response is None:
    print("Chat generation failed")

# With validation, errors are returned as dict
result = ai_manager.chat("extract", data={}, validate=True)
if isinstance(result, dict) and "error" in result:
    print(f"Validation failed: {result['error']}")
    print(f"After {result['attempts']} attempts")

Best Practices

  1. Configuration: Store API keys in environment variables and load them in your YAML config
  2. Prompts: Use descriptive filenames for prompts (e.g., analyze_sentiment.txt)
  3. Schemas: Provide clear example structures in .schema.txt files
  4. Error Handling: Always check return values for None or error dictionaries
  5. Resource Management: The manager handles file operations and API clients internally

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wl_ai_manager-0.1.32.tar.gz (27.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

wl_ai_manager-0.1.32-py3-none-any.whl (25.8 kB view details)

Uploaded Python 3

File details

Details for the file wl_ai_manager-0.1.32.tar.gz.

File metadata

  • Download URL: wl_ai_manager-0.1.32.tar.gz
  • Upload date:
  • Size: 27.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for wl_ai_manager-0.1.32.tar.gz
Algorithm Hash digest
SHA256 9756cea04540605c75cbc9508c4f9ff437b0755b2efa67dc6d066959a01f6643
MD5 0519c2614cdb739ff50ceaef0660e1b0
BLAKE2b-256 77f8e7a1042873f5771e2498e58ebf77895644fe44f716e18b982df9edad074b

See more details on using hashes here.

File details

Details for the file wl_ai_manager-0.1.32-py3-none-any.whl.

File metadata

  • Download URL: wl_ai_manager-0.1.32-py3-none-any.whl
  • Upload date:
  • Size: 25.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for wl_ai_manager-0.1.32-py3-none-any.whl
Algorithm Hash digest
SHA256 563f11ed3f4d9ad02954fafc629dfab20cfda385c7257b7804e5f9b73689677e
MD5 818f9f01f0844f58f430af535d895563
BLAKE2b-256 20cb97ef7fb5282cd1d8ae87e1dc287e83019275ee40e946b625301d40a2aac2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page