DocuMind SDK - lightweight AI document generation for PPTX, DOCX, Markdown, XLSX, and HWPX

DocuMind SDK

Korean: README.ko.md

Lightweight Python SDK for AI document generation.

Install documind when you want to generate native document files directly from Python.

Installation

pip install documind

Optional provider extras:

pip install "documind[anthropic]"
pip install "documind[bedrock,image-bedrock]"
pip install "documind[gemini]"

Quick Start

import asyncio
from documind import generate_document, init

init(
    llm_provider="openai",
    openai_api_key="sk-...",
    default_llm_model="gpt-4o",
    storage_local_path="./outputs",
)

async def main():
    result = await generate_document(
        query="Create an AI document automation proposal.",
        document_type="pptx",
        locale="en",
    )
    print(result.output_path)

asyncio.run(main())

SDK API Reference

Public imports:

from documind import (
    DocuMind,
    GenerationRequest,
    GenerationResult,
    ImageAttachment,
    TemplateInput,
    configure,
    generate_document,
    init,
    stream_document,
)

Engine Configuration

Use init(**config) for process-wide defaults, or pass the same values to DocuMind(...) for one engine instance.

| Option | Type / Example | Description |
| --- | --- | --- |
| llm_provider | "openai" | Provider route. Supported values include openai, anthropic, azure, bedrock, gcp_vertex, gemini, ollama, vllm, custom. |
| use_default_models | True | If true, agents use the default model names below. |
| default_llm_model | "gpt-4o" | Text generation model. |
| default_vlm_model | "gpt-4o" | Accepted for vision-model compatibility. |
| default_image_model | "dall-e-3" | Image asset model. If unavailable, generation continues without image assets. |
| storage_local_path | "./outputs" | Output directory for generated documents, HTML previews, and image assets. |
| log_level / log_file | "INFO" / path | Logging controls. |
| openai_api_key | "sk-..." | OpenAI credential. |
| openai_base_url | URL | OpenAI-compatible endpoint override. |
| anthropic_api_key | string | Anthropic credential. Requires documind[anthropic]. |
| google_api_key | string | Gemini API key. Requires documind[gemini]. |
| gcp_project_id, gcp_location, google_application_credentials | strings | Vertex AI settings. Requires documind[gcp-vertex]. |
| aws_profile, aws_region, aws_access_key_id, aws_secret_access_key, aws_session_token, aws_role_arn | strings | Bedrock settings. Requires documind[bedrock] or documind[image-bedrock]. |
| azure_openai_api_key, azure_openai_endpoint, azure_openai_api_version, azure_openai_deployment | strings | Azure OpenAI settings. Requires documind[azure]. |
| custom_llm_base_url, custom_llm_api_key, custom_llm_model_name | strings | OpenAI-compatible custom provider settings. |
| preload_icons | False | Optional icon asset warm-up. |
| icon_preload_limit | integer | Optional icon warm-up limit. |
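
One way to keep credentials out of source code is to assemble the configuration from environment variables before handing it to init(**config). The sketch below is only an illustration under stated assumptions: the helper config_from_env and the environment variable names (DOCUMIND_LLM_PROVIDER, OPENAI_API_KEY, and so on) are hypothetical and not part of the SDK. Only the option names on the right-hand side of the mapping come from the table above.

```python
import os

# Hypothetical mapping: environment variable -> documented init() option.
# The environment variable names are assumptions, not SDK conventions.
ENV_TO_OPTION = {
    "DOCUMIND_LLM_PROVIDER": "llm_provider",
    "OPENAI_API_KEY": "openai_api_key",
    "DOCUMIND_DEFAULT_LLM_MODEL": "default_llm_model",
    "DOCUMIND_STORAGE_PATH": "storage_local_path",
    "DOCUMIND_LOG_LEVEL": "log_level",
}


def config_from_env(environ=None):
    """Collect the options that are set in the environment into a dict."""
    environ = os.environ if environ is None else environ
    config = {}
    for env_name, option in ENV_TO_OPTION.items():
        value = environ.get(env_name)
        if value:  # skip unset or empty variables
            config[option] = value
    return config


if __name__ == "__main__":
    # Print option names only, so secrets are not echoed to the terminal.
    print(sorted(config_from_env()))
```

With the SDK installed, the resulting dict could then be passed along as init(**config_from_env()) or DocuMind(**config_from_env()).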

Document Generation

generate_document(...) is the simplest one-shot helper. For repeated calls, create DocuMind(...) once and call engine.generate(...).

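import asyncio

from documind import DocuMind, ImageAttachment, TemplateInput

# Create the engine once and reuse it for every call.
engine = DocuMind(
    llm_provider="openai",
    openai_api_key="sk-...",
    storage_local_path="./outputs",
)

async def main():
    # engine.generate is a coroutine, so it must be awaited inside an async function.
    result = await engine.generate(
        query="Create a customer onboarding deck.",
        format="pptx",
        locale="en",
        template=TemplateInput(path="./template.pptx"),
        images=[
            ImageAttachment(path="./product.png", description="Product screenshot"),
        ],
        needs_research=False,
    )
    print(result.output_path)

asyncio.run(main())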

| Argument | Type / Default | Description |
| --- | --- | --- |
| query | required string | Natural-language document goal. |
| format | "pptx" | Output format for engine.generate. One of pptx, docx, xlsx, md, hwp. |
| document_type | optional | Alias for format; useful when adapting external input models. |
| template_id | optional string | Compatibility hook for template identifiers. |
| template | path, bytes, dict, TemplateInput | Optional native/template file. |
| images | list of paths, bytes, dicts, ImageAttachment | Optional image evidence for planning and visual references. |
| session_id | optional string | Caller-provided correlation/session ID. |
| locale | "ko" | Locale hint. Output language is also inferred from query. |
| needs_research | None | True forces research, False skips it, None lets DocuMind infer intent. |
| preload_icons | optional bool | Per-call icon warm-up toggle. |
| **options | dict | Forward-compatible pipeline options. Stable SDK callers should prefer the explicit arguments above. |

Structured Input

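import asyncio

from documind import DocuMind, GenerationRequest

# The engine must exist before a request can be submitted to it.
engine = DocuMind(llm_provider="openai", openai_api_key="sk-...")

generation_input = GenerationRequest(
    query="Create a weekly report.",
    document_type="docx",
    locale="en",
    needs_research=False,
    options={"template_id": "internal-template-id"},
)

async def main():
    result = await engine.generate_from_request(generation_input)
    print(result.output_path)

asyncio.run(main())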

| GenerationRequest field | Type / Default | Description |
| --- | --- | --- |
| query | required string | Natural-language document goal. |
| document_type | "pptx" | pptx, docx, xlsx, md, or hwp. |
| template | None | Path, bytes, dict, or TemplateInput. |
| images | [] | Image evidence list. |
| session_id | None | Correlation/session ID. |
| locale | "ko" | Locale hint. |
| needs_research | None | Research routing toggle. |
| stream | False | Streaming marker for external adapters. Use generate_stream for SDK streaming. |
| options | {} | Additional pipeline options, including template_id. |

| Helper type | Fields |
| --- | --- |
| TemplateInput | path, content, filename |
| ImageAttachment | path, content, filename, mime_type, role="content_reference", description |

GenerationResult exposes success, output_path, output_bytes, document_type, mime_type, fidelity_scores, slide_count, errors, metadata, and to_dict().
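
A caller will usually want to check success before touching the output. The helper below is a minimal sketch of that pattern; save_result is a hypothetical name, and the field types are assumptions (success as a bool, errors as a list of messages, output_bytes as bytes or None, output_path as a path to a file the SDK already wrote). It is exercised with a stand-in object rather than a real GenerationResult.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_result(result, dest):
    """Write a successful result to `dest` and return the path written.

    Assumed, not confirmed by the SDK docs: `success` is a bool, `errors`
    is a list of messages, `output_bytes` is bytes or None, and
    `output_path` points at a file the SDK already wrote.
    """
    if not result.success:
        messages = "; ".join(str(e) for e in (result.errors or []))
        raise RuntimeError("generation failed: " + messages)
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if result.output_bytes is not None:
        dest.write_bytes(result.output_bytes)
    else:
        # Fall back to copying the file referenced by output_path.
        dest.write_bytes(Path(result.output_path).read_bytes())
    return dest


if __name__ == "__main__":
    # Stand-in with the documented field names; not a real GenerationResult.
    fake = SimpleNamespace(success=True, errors=[], output_bytes=b"demo", output_path=None)
    with tempfile.TemporaryDirectory() as tmp:
        print(save_result(fake, Path(tmp) / "demo.md").name)
```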

Supported Formats

| Format | Output |
| --- | --- |
| pptx | Native PowerPoint file |
| docx | Native Word file |
| xlsx | Native Excel workbook |
| md | Markdown document |
| hwp | HWPX document |
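
When the format comes from user input (a CLI flag or a web form, say), it can help to validate it against the five supported values before calling the SDK. The helper below is a hypothetical convenience, not an SDK function; the accepted names are taken from the table above, and the tolerance for upper case, surrounding whitespace, and a leading dot is a design choice of this sketch.

```python
SUPPORTED_FORMATS = ("pptx", "docx", "xlsx", "md", "hwp")


def normalize_format(value):
    """Return the lowercase format name, or raise ValueError if unsupported."""
    name = str(value).strip().lower().lstrip(".")
    if name not in SUPPORTED_FORMATS:
        expected = ", ".join(SUPPORTED_FORMATS)
        raise ValueError(f"unsupported format {value!r}; expected one of {expected}")
    return name


if __name__ == "__main__":
    print(normalize_format(" PPTX "))
```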

Streaming

import asyncio

from documind import DocuMind, GenerationRequest

engine = DocuMind(llm_provider="openai", openai_api_key="sk-...")
generation_input = GenerationRequest(query="Create a weekly report.", document_type="docx")

async def main():
    # async for is only valid inside an async function.
    async for event in engine.generate_stream(generation_input):
        print(event.to_sse())

asyncio.run(main())

Build From Source

cd packages/documind
python scripts/sync_runtime.py
python -m build

Publishing instructions are in PUBLISHING.md and PUBLISHING.ko.md.

License

Apache-2.0
