A simple and elegant Python Mini SDK for Google Gemini AI.

Dracula 🧛

A simple, elegant Python library and Mini SDK for Google Gemini with powerful features. Built for developers who want to integrate AI into their projects without dealing with complex API setup.

Installation

pip install dracula-ai

Quick Start

from dracula import Dracula, GeminiModel
from dotenv import load_dotenv
import os

load_dotenv()

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))
response = ai.chat("Hello, who are you?")
print(response)

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `api_key` | `str` | required | Your Google Gemini API key |
| `model` | `GeminiModel` or `str` | `GeminiModel.FLASH` | Gemini model to use (gemini-2.5-flash) |
| `max_messages` | `int` | `10` | Maximum number of messages to remember |
| `prompt` | `str` | `"You are a helpful assistant."` | System prompt |
| `temperature` | `float` | `1.0` | Response creativity (0.0 - 2.0) |
| `max_output_tokens` | `int` | `8192` | Maximum response length |
| `db_path` | `str` | `None` | Custom path to the SQLite memory database |
| `stats_db_path` | `str` | `None` | Custom path to the SQLite stats database |
| `session_id` | `str` | `None` | Isolates this instance's history in a shared database; a UUID is auto-generated if not provided |
| `cache_ttl` | `int` | `0` | Seconds to cache identical responses; `0` disables caching |
| `cache_db_path` | `str` | `None` | Custom path to the SQLite cache database |
| `token_budget` | `int` | `None` | Max cumulative tokens before `BudgetExceededException` is raised |
| `auto_compress` | `bool` | `False` | Automatically summarize old history instead of hard-trimming |
| `compress_ratio` | `float` | `0.8` | Fraction of `max_messages` at which compression triggers |
| `compress_keep_turns` | `int` | `4` | Number of recent messages preserved verbatim after compression |
| `language` | `str` | `"English"` | Language for responses |
| `logging` | `bool` | `False` | Enable or disable logging |
| `log_level` | `str` | `"DEBUG"` | Logging level |
| `log_file` | `str` | `None` | Path to log file |
| `log_max_bytes` | `int` | 5 MB | Maximum log file size |
| `log_backup_count` | `int` | `5` | Number of backup log files |
| `tools` | `list` | `None` | List of tools to register |
| `max_retries` | `int` | `3` | Maximum retry attempts for failed requests |
| `retry_delay` | `float` | `1.0` | Base delay in seconds for exponential backoff |

Features

Note: All examples below assume your environment variables have been loaded with load_dotenv() and are read via os.getenv(), as in the Quick Start above.

💬 Text Chat

The most basic feature of Dracula. Send a message to Gemini and get a response back. Every message you send and every response you receive is automatically stored in memory, so Gemini always knows the context of your conversation.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))
response = ai.chat("What is Python?")
print(response)

🌊 Streaming

Normally, Dracula waits for Gemini to finish generating the full response before returning it. Streaming changes this behavior — instead of waiting, you receive the response word by word as it is being generated, just like ChatGPT does. This is especially useful for long responses or when you want a more interactive feel in your app.

for chunk in ai.stream("Tell me a long story."):
    print(chunk, end="", flush=True)

🧠 SQLite Conversation Memory

Dracula automatically remembers the conversation history so Gemini can refer back to previous messages. Since v0.8.0, all history is persisted in a high-performance SQLite database, preventing memory bloat and ensuring your AI never loses context, even if your program restarts.

ai.chat("My name is Ahmet.")
response = ai.chat("What is my name?")
print(response)  # It remembers! ✅

ai.clear_memory()  # Wipe memory

💾 Save & Load History

Beyond the automatic SQLite persistence, you can export the conversation history to a portable JSON file and reload it later — handy for backups, for sharing a conversation, or for moving history between databases, so your AI can continue right where it left off.

ai.save_history("conversation.json")

# Later, in a new run of your program:
ai.load_history("conversation.json")

📜 Pretty Print History

get_history() returns the raw conversation history as a list of dictionaries, which can be hard to read. print_history() formats the same data into a clean, human-readable layout with clear labels for each message, making it much easier to follow the conversation at a glance.

ai.print_history()
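To see the difference, here is a minimal standalone sketch — the role/content keys below are illustrative, not Dracula's exact dictionary shape:

```python
# Hypothetical history entries, shaped like a typical chat transcript
history = [
    {"role": "user", "content": "My name is Ahmet."},
    {"role": "model", "content": "Nice to meet you, Ahmet!"},
]

# Raw form — roughly what get_history() hands back
print(history)

# Pretty form — the kind of labelled layout print_history() produces
for entry in history:
    print(f"[{entry['role'].upper()}] {entry['content']}")
```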

🎭 System Prompt

The system prompt is a set of instructions you give to Gemini before the conversation starts. It defines the AI's personality, role, and behavior for the entire conversation. For example, you can tell it to act as a pirate, a chef, a formal assistant, or anything else you can imagine. The user will never see this prompt — it works silently in the background.

ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    prompt="You are a pirate who answers everything dramatically."
)

# You can also change it anytime during the conversation:
ai.set_prompt("You are now a formal assistant.")

🌡️ Temperature Control

Temperature controls how creative and random Gemini's responses are. A low temperature (close to 0.0) makes responses more focused, predictable, and factual — great for technical questions. A high temperature (close to 2.0) makes responses more creative, surprising, and varied — great for storytelling or brainstorming. The default value of 1.0 is a balanced middle ground.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), temperature=0.2)  # Focused
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), temperature=1.8)  # Creative

# You can also change it anytime:
ai.set_temperature(0.5)

📏 Max Output Tokens

Tokens are small chunks of text — for English, roughly four characters or about three-quarters of a word each. max_output_tokens caps the length of Gemini's responses. If you want short, concise answers, set it low; if you want long, detailed responses, set it high. The default of 8192 is large enough for most use cases.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), max_output_tokens=256)  # Short
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), max_output_tokens=8192) # Long

# You can also change it anytime:
ai.set_max_output_tokens(512)
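If you just want a rough feel for sizes before calling count_tokens(), a common rule of thumb (an approximation only, not Gemini's actual tokenizer) is about four characters of English text per token:

```python
def rough_token_estimate(text: str) -> int:
    # Rule of thumb: ~4 characters per token for English text.
    # Use ai.count_tokens() for the real number.
    return max(1, len(text) // 4)

print(rough_token_estimate("Hello, Dracula!"))  # 3
print(rough_token_estimate("word " * 100))      # 125 for 100 short words
```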

🌍 Response Language

By default, Gemini responds in whatever language the user writes in. The language option overrides this and forces Gemini to always respond in a specific language, regardless of the input language. Use "Auto" to let Gemini detect the language automatically.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), language="Turkish")
response = ai.chat("Hello!")
print(response)  # Merhaba! ✅

# Auto detect language
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), language="Auto")

# Change it anytime:
ai.set_language("Spanish")

🤖 GeminiModel Enum

Typing model names as raw strings invites typos and breaks when Google renames models, so Dracula provides a GeminiModel enum covering the available models. Strings still work for advanced use cases, but the enum is the recommended approach.

from dracula import Dracula, GeminiModel

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), model=GeminiModel.FLASH)
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), model=GeminiModel.PRO)
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), model=GeminiModel.FLASH_LITE)

# Change model anytime:
ai.set_model(GeminiModel.PRO)

# Discover all available models in real time:
print(ai.list_available_models())

🛠️ Function Calling / Tools

Function calling is one of the most powerful features of Dracula. It lets you give Gemini access to your own Python functions — like checking the weather, searching the web, querying a database, or anything else you can write in Python. Gemini will automatically decide when and how to call your functions based on the user's message. Use auto_call=True to let Dracula handle everything automatically, or auto_call=False to handle tool calls yourself.

from dracula import Dracula, tool, ToolResult, GeminiModel

@tool(description="Get the current weather for a city")
def get_weather(city: str) -> str:
    return f"It's 25°C and sunny in {city}"

@tool(description="Search the web for information")
def search_web(query: str) -> str:
    return f"Search results for: {query}"

ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    tools=[get_weather, search_web]
)

# Auto call — Dracula handles everything automatically
response = ai.chat("What's the weather in Istanbul?")
print(response)
# "The weather in Istanbul is currently 25°C and sunny!"

# Manual call — you handle the tool call yourself
result = ai.chat("What's the weather in Ankara?", auto_call=False)
if result.requires_tool_call:
    print(f"Tool: {result.tool_name}")   # get_weather
    print(f"Args: {result.tool_args}")   # {"city": "Ankara"}

# Add tools after initialization
ai.add_tool(search_web)

# List registered tools
print(ai.list_tools())  # ["get_weather", "search_web"]

🛡️ Auto-Retry & Resilience

Network instability, rate limits (429), and temporary server errors (500, 503) are common when working with cloud AI models. Dracula includes a built-in, smart retry mechanism using exponential backoff. If a request fails due to a temporary issue, Dracula will automatically pause and try again, increasing the wait time between each attempt. This works seamlessly under the hood for both synchronous and asynchronous operations.

# Retries are handled automatically out of the box
ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    max_retries=3,      # Maximum number of retry attempts (default: 3)
    retry_delay=1.0     # Base delay in seconds, doubles each attempt (default: 1.0)
)

# If the Google API is temporarily overloaded, Dracula will smoothly 
# retry the connection in the background before raising an exception.
response = ai.chat("What is the meaning of life?")
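The wait times that produces are easy to work out by hand. Here is a standalone sketch of the schedule (illustrative only — Dracula handles this internally, and `backoff_delays` is our name, not part of the API):

```python
def backoff_delays(max_retries: int = 3, retry_delay: float = 1.0) -> list[float]:
    # Exponential backoff: the base delay doubles on every attempt.
    return [retry_delay * (2 ** attempt) for attempt in range(max_retries)]

print(backoff_delays())        # [1.0, 2.0, 4.0] with the defaults
print(backoff_delays(4, 0.5))  # [0.5, 1.0, 2.0, 4.0]
```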

📊 SQLite Usage Stats

Dracula automatically tracks how many messages you've sent and received and how many characters were exchanged in total. These stats are persisted to a local SQLite database (~/.dracula/stats.db), so they accumulate across runs without risk of data corruption.

print(ai.get_stats())
# {
#   "total_messages": 5,
#   "total_responses": 5,
#   "total_characters_sent": 120,
#   "total_characters_received": 3400
# }

ai.reset_stats()

🛠️ Utility & Advanced Features

🪙 Token Counting

Keep track of your API usage and context limits easily.

# Count tokens in a specific text
tokens = ai.count_tokens("Hello, Dracula!")
print(f"Tokens: {tokens}")

# Count total tokens in current history (including system prompt & tools)
total_tokens = ai.count_history_tokens() # Or await ai.count_history_tokens() for Async
print(f"Total session usage: {total_tokens} tokens")

📤 Export to Markdown

Export your conversation history into a beautifully formatted Markdown file, perfect for documentation or personal notes.

ai.export_to_markdown("my_awesome_chat.md")

🗄️ Custom Database Paths

For portable applications, you can specify exact locations for both memory and stats databases.

ai = Dracula(
    api_key="your-key",
    db_path="./data/my_memory.db",
    stats_db_path="./data/my_stats.db"
)

📝 Logging

Dracula has a built-in logging system that is completely silent by default. When enabled, it reports what Dracula is doing internally — useful for debugging. You can also write logs to a file with automatic rotation: when the file reaches its size limit, it is rolled over and a configurable number of backups is kept.

# Enable terminal logging
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), logging=True)

# Enable logging with a specific level
ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"), logging=True, log_level="WARNING")

# Enable file logging with rotation
ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    logging=True,
    log_file="dracula.log",
    log_max_bytes=1 * 1024 * 1024,  # 1MB per file
    log_backup_count=3               # keep 3 backups
)

# Change log level anytime
ai.set_log_level("ERROR")

🔗 Chainable Methods

Instead of calling each setter method on a separate line, chainable methods let you combine multiple settings into a single, clean line of code.

ai.set_prompt("You are a chef.").set_temperature(0.9).set_language("Turkish")

🧹 Context Manager

A context manager lets you use Dracula with Python's with statement. The benefit is automatic cleanup — when the with block ends, Dracula automatically clears the memory and resets the stats.

with Dracula(api_key=os.getenv("GEMINI_API_KEY")) as ai:
    ai.chat("Hello!")
    ai.print_history()
# Memory and stats automatically reset here ✅

🎭 Role Playing Mode

Dracula comes with a set of built-in personas that you can switch between instantly. Each persona has its own predefined prompt, temperature, and language settings.

print(ai.list_personas())
# ['assistant', 'pirate', 'chef', 'shakespeare', 'scientist', 'comedian']

ai.set_persona("pirate")
print(ai.chat("Hello, who are you?"))
# Arrr, I be a fearsome pirate! 🏴‍☠️

🖥️ Desktop Chat UI

Dracula ships with a ready-made PyQt6 desktop chat UI that you can drop into your desktop apps. It supports dark and light themes, markdown rendering, and syntax highlighting for code blocks. (Requires the UI extra: pip install dracula-ai[ui].) The built-in chat interface now includes a full attachment system:

  1. Click the 📎 Button to select a file.
  2. See the selected file name above the input field.
  3. Click the file name to clear the attachment.
  4. Send both text and files simultaneously.

from dracula import Dracula, launch

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))

launch(ai, title="My AI App", theme="dark")   # Dark theme
launch(ai, title="My AI App", theme="light")  # Light theme

👁️ Multimodal & Vision Support (New in v0.9.0)

Dracula can now "see" and "read" files! You can pass images, PDFs, or other documents directly to the chat.

🐍 Python Usage

from dracula import Dracula

ai = Dracula(os.getenv("GEMINI_API_KEY"))

# Analyze an image
response = ai.chat(
    message="What do you see in this photo?", 
    filepath="landscape.jpg"
)
print(response)

# Analyze a PDF document
summary = ai.chat(
    message="Summarize this document for me.", 
    filepath="report.pdf"
)

🖥️ CLI Tool

Dracula comes with a built-in CLI tool that lets you chat with Gemini directly from the terminal, or launch the desktop UI, without writing any code.

dracula chat "Hello!"
dracula chat "Tell me a joke" --persona comedian
dracula chat "Merhaba" --language Turkish --stream
dracula ui  # Launches the PyQt6 Desktop App
dracula list-personas
dracula stats
dracula --version

⚡ Async Support with AsyncDracula

For async applications like Discord bots, FastAPI, and Telegram bots, use AsyncDracula instead of Dracula. It has all the same features with full, non-blocking async support backed by aiosqlite.

Note: On AsyncDracula, the following methods are async and must be awaited: chat(), stream(), set_prompt(), set_language(), set_model(), set_persona(), add_tool(), print_history(), save_history(), load_history(), export_to_markdown(), clear_memory(), get_history(), get_stats(), reset_stats(), count_tokens(), count_history_tokens().

import asyncio
from dracula import AsyncDracula, tool

@tool(description="Get the weather for a city")
async def get_weather(city: str) -> str:
    return f"25°C and sunny in {city}"

async def main():
    async with AsyncDracula(
        api_key=os.getenv("GEMINI_API_KEY"),
        tools=[get_weather]
    ) as ai:
        response = await ai.chat("What's the weather in Istanbul?")
        print(response)

        async for chunk in ai.stream("Tell me a story."):
            print(chunk, end="", flush=True)

        # Setter and utility methods are also async on AsyncDracula
        await ai.set_language("Turkish")
        await ai.set_persona("pirate")
        await ai.print_history()

asyncio.run(main())

👥 Multi-User Session Isolation

When building bots or web apps you often need one Dracula instance per user. Pass a session_id to keep each user's history fully isolated in a single shared database — no bleeding between conversations.

from dracula import AsyncDracula

# Each user gets their own isolated conversation history
user_a = AsyncDracula(api_key=os.getenv("GEMINI_API_KEY"), session_id="user-001")
user_b = AsyncDracula(api_key=os.getenv("GEMINI_API_KEY"), session_id="user-002")

# Or use a per-request session (e.g. Discord user ID, FastAPI request ID)
async def handle_message(user_id: str, message: str) -> str:
    ai = AsyncDracula(
        api_key=os.getenv("GEMINI_API_KEY"),
        session_id=user_id,          # scopes history to this user
        db_path="./shared.db"        # all users share one database file
    )
    await ai._ensure_initialized()
    return await ai.chat(message)

🤖 Discord Bot Example

Thanks to async support, building an AI-powered Discord bot with Dracula takes just a few lines of code.

import discord
from discord.ext import commands
from dracula import AsyncDracula, tool
from dotenv import load_dotenv
import os

load_dotenv()

bot = commands.Bot(command_prefix="!", intents=discord.Intents.all())
ai = AsyncDracula(api_key=os.getenv("GEMINI_API_KEY"))

@bot.command()
async def chat(ctx, *, message: str):
    response = await ai.chat(message)
    await ctx.send(response)

bot.run(os.getenv("DISCORD_TOKEN"))

🌐 FastAPI Example

Dracula works perfectly with FastAPI thanks to async support.

from fastapi import FastAPI
from dracula import AsyncDracula
from dotenv import load_dotenv
import os

load_dotenv()

app = FastAPI()
ai = AsyncDracula(api_key=os.getenv("GEMINI_API_KEY"))

@app.get("/chat")
async def chat(message: str):
    response = await ai.chat(message)
    return {"response": response}

@app.get("/stream")
async def stream(message: str):
    from fastapi.responses import StreamingResponse
    return StreamingResponse(
        ai.stream(message),
        media_type="text/plain"
    )

⚡ Power Features (New in v1.0.0)

🗜️ Smart Context Compression

By default, when history reaches max_messages, the oldest messages are simply deleted. Smart compression replaces them with a short AI-generated summary instead, so no context is truly lost. Compression fires automatically when history fills to compress_ratio * max_messages (default 80%), keeping compress_keep_turns recent turns verbatim.

ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    max_messages=20,
    auto_compress=True,          # Enable smart compression
    compress_ratio=0.8,          # Compress when 80% full (default)
    compress_keep_turns=4        # Keep last 4 messages verbatim (default)
)

# Have a long conversation — old turns are summarized, not deleted
for i in range(30):
    ai.chat(f"Message {i}")
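As a sanity check on those numbers — a standalone calculation, not Dracula's internal code (the exact rounding rule is an assumption):

```python
def compression_trigger(max_messages: int, compress_ratio: float) -> int:
    # When history reaches this many messages, compression fires.
    return int(max_messages * compress_ratio)

print(compression_trigger(20, 0.8))  # 16: compress once 16 of 20 slots are used
print(compression_trigger(10, 0.8))  # 8: the library defaults work out to 8
```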

📐 Structured Output / JSON Mode

Ask Gemini to return data in a strict schema. Pass a Pydantic BaseModel subclass to get a validated model instance back, or pass a raw JSON-schema dict to get a plain Python dict. If you pass no schema, existing behavior is unchanged.

from pydantic import BaseModel
from dracula import Dracula

class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))

# Returns a validated Recipe instance
recipe = ai.chat("Give me a pasta recipe.", schema=Recipe)
print(recipe.name)
print(recipe.ingredients)

# Or use a raw dict schema
data = ai.chat("What is 2+2?", schema={"type": "object", "properties": {"answer": {"type": "integer"}}})
print(data["answer"])  # 4

🪝 Middleware / Hook System

Register transform functions that run before or after every chat() call. before_chat hooks receive the message and may return a replacement string. after_chat hooks receive (message, reply) and may return a replacement reply. Both sync and async hooks work in AsyncDracula.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))

@ai.before_chat
def add_context(message):
    return f"[FORMAL] {message}"

@ai.after_chat
def log_response(message, reply):
    print(f"[{len(reply)} chars replied]")
    # Return None to keep the reply unchanged, or return a new string to replace it

response = ai.chat("Hello!")
# Gemini receives: "[FORMAL] Hello!"

💾 Response Caching (SQLite-backed, TTL)

Cache identical responses to save API calls and money. Caching is keyed on (session_id, message) — the same question in the same session returns the cached answer instantly without hitting the API. Set cache_ttl=0 (default) to disable caching entirely.

ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    cache_ttl=3600,              # Cache responses for 1 hour
    cache_db_path="./cache.db"  # Optional custom path
)

r1 = ai.chat("What is the capital of France?")  # Calls the API
r2 = ai.chat("What is the capital of France?")  # Returns cached response instantly
assert r1 == r2

💰 Token Budget & Cost Tracking

Cap cumulative token usage to avoid surprise bills. Once total_input_tokens + total_output_tokens >= token_budget, the next chat() or stream() call raises BudgetExceededException. Check your estimated cost at any time with estimated_cost().

from dracula import Dracula, BudgetExceededException

ai = Dracula(
    api_key=os.getenv("GEMINI_API_KEY"),
    token_budget=10_000  # Stop after 10,000 tokens
)

try:
    while True:
        ai.chat("Tell me something interesting.")
except BudgetExceededException as e:
    print(f"Budget reached: {e}")

# Check cost at any time
print(f"Estimated cost: ${ai.estimated_cost():.4f}")

# Stats now include token counts
stats = ai.get_stats()
print(stats["total_input_tokens"])
print(stats["total_output_tokens"])
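The stop condition itself is just arithmetic — here restated as a standalone helper (our own sketch, not the library's code):

```python
def budget_exceeded(total_input_tokens: int, total_output_tokens: int,
                    token_budget: int) -> bool:
    # The next chat()/stream() call raises once cumulative usage hits the budget.
    return total_input_tokens + total_output_tokens >= token_budget

print(budget_exceeded(6_000, 3_500, 10_000))  # False — 9,500 tokens, still under budget
print(budget_exceeded(6_500, 3_500, 10_000))  # True — exactly 10,000, budget reached
```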

🌿 Conversation Branching (Fork)

Create an independent copy of the current conversation at any point. The fork shares the same database file but uses a new session_id, so both branches can evolve in completely different directions without affecting each other — great for exploring "what if" paths.

ai = Dracula(api_key=os.getenv("GEMINI_API_KEY"))
ai.chat("Let's design a web app.")
ai.chat("It should be a social network.")

# Branch off here
branch_a = ai.fork()
branch_b = ai.fork()

# Each branch evolves independently
branch_a.chat("Make it Twitter-like.")
branch_b.chat("Make it LinkedIn-like.")

# Original is untouched
print(len(ai.get_history()))      # 4 messages
print(len(branch_a.get_history())) # 5 messages (4 copied + 1 new)
print(len(branch_b.get_history())) # 5 messages (4 copied + 1 new)

# AsyncDracula.fork() is async
async def branch_example():
    ai = AsyncDracula(api_key=os.getenv("GEMINI_API_KEY"))
    await ai._ensure_initialized()
    branch = await ai.fork(session_id="my-branch")

Error Handling

Dracula provides custom exceptions so you can handle different types of errors separately and give your users clear, meaningful error messages.

from dracula import (
    ValidationException,
    ChatException,
    InvalidAPIKeyException,
    PersonaException,
    ToolException,
    BudgetExceededException,
)

try:
    ai = Dracula(api_key="", temperature=5.0)
    ai.chat("Hello")
except ValidationException as e:
    print(f"Validation error: {e}")
except InvalidAPIKeyException as e:
    print(f"API key error: {e}")
except ChatException as e:
    print(f"Chat error: {e}")
except PersonaException as e:
    print(f"Persona error: {e}")
except ToolException as e:
    print(f"Tool error: {e}")
except BudgetExceededException as e:
    print(f"Budget exceeded: {e}")

Known Issues

AsyncDracula — aiohttp connector warning

When using AsyncDracula, you may see this warning after your program ends:

Exception ignored in: <function BaseApiClient._get_aiohttp_session...>
AttributeError: 'NoneType' object has no attribute 'from_iterable'

This is a known bug in the google-genai library and is not caused by Dracula. It does not affect functionality in any way.

Getting Your API Key

  1. Go to https://aistudio.google.com
  2. Sign in with your Google account
  3. Click "Get API Key"
  4. Store it safely in a .env file:
GEMINI_API_KEY=your-api-key

License

MIT License — feel free to use this in your own projects!

Author

Suleyman Ibis
