Databao Context Engine

Semantic context for your LLMs — generated automatically.
No more copying schemas. No manual documentation. Just accurate answers.


What is Databao Context Engine?

Databao Context Engine is a Python library that automatically generates governed semantic context from your databases, BI tools, documents, and spreadsheets.

Use it with any LLM to deliver accurate, context-aware answers — without copying schemas or writing documentation by hand.

You can add Databao Context Engine as a standard Python dependency in your code or via Databao CLI (coming soon).

Your data sources → Context Engine → Unified semantic graph → Any LLM

Why choose Databao Context Engine?

| Feature | What it means for you |
| --- | --- |
| Auto-generated context | Extracts schemas, relationships, and semantics automatically |
| Runs locally | Your data never leaves your environment |
| MCP integration | Works with Claude Desktop, Cursor, and any MCP-compatible tool |
| Multiple sources | Databases, dbt projects, spreadsheets, documents |
| Built-in benchmarks | Measure and improve context quality over time |
| LLM agnostic | OpenAI, Anthropic, Ollama, Gemini: use any model |
| Governed & versioned | Track, version, and share context across your team |
| Dynamic or static | Serve context via MCP server or export as artifact |

Installation

Databao Context Engine is available on PyPI and can be installed with uv, pip, or another package manager.

Using uv

uv add databao-context-engine

Using pip

pip install databao-context-engine
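
To confirm that the install succeeded, you can ask Python which version of the distribution it sees. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is illustrative and not part of Databao Context Engine.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("databao-context-engine")
    print(found if found else "databao-context-engine is not installed")
```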

Supported data sources

  • Athena
  • BigQuery
  • ClickHouse
  • DuckDB
  • MSSQL
  • MySQL
  • PostgreSQL
  • Snowflake
  • SQLite
  • dbt projects
  • PDF files
  • Markdown and text files

Supported LLMs

| Provider | Configuration |
| --- | --- |
| Ollama | languageModel: OLLAMA (runs locally, free) |

Quickstart

1. Create a domain

# Initialize the domain in a temporary directory
from databao_context_engine import init_dce_domain
from pathlib import Path
import tempfile

domain_manager = init_dce_domain(Path(tempfile.mkdtemp()))

# Or use an existing project
from databao_context_engine import DatabaoContextDomainManager

domain_manager = DatabaoContextDomainManager(domain_dir=Path("domain_dir"))
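
Note that `tempfile.mkdtemp()` leaves the directory on disk until you delete it. For throwaway experiments, one option is to wrap the domain in `tempfile.TemporaryDirectory`, which removes the directory when the block exits. The following is a minimal sketch: `with_temporary_domain` is a hypothetical helper, and it assumes the initializer takes a `Path`, as `init_dce_domain` does above. Whether a domain manager tolerates its directory being removed afterwards has not been verified here, so only return plain values from `work`.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

M = TypeVar("M")
R = TypeVar("R")


def with_temporary_domain(
    init_domain: Callable[[Path], M],
    work: Callable[[M], R],
) -> R:
    """Create a domain in a throwaway directory, run `work`, then delete the directory."""
    with tempfile.TemporaryDirectory() as tmp:
        # `init_domain` stands in for an initializer such as init_dce_domain.
        manager = init_domain(Path(tmp))
        return work(manager)
```

A call would pass the initializer and a function that does all the work inside the temporary directory, for example `with_temporary_domain(init_dce_domain, my_experiment)`, where `my_experiment` is your own function.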

2. Configure data sources

from databao_context_engine import (
    CheckDatasourceConnectionResult,
    DatasourceConnectionStatus,
    DatasourceId,
    DatasourceType,
)

# Create a new datasource
postgres_datasource_id = domain_manager.create_datasource_config(
    DatasourceType(full_type="postgres"),
    datasource_name="my_postgres_datasource",
    config_content={
        "connection": {"host": "localhost", "user": "dev", "password": "pass"}
    },
).datasource.id

# Check that the connection to the datasource is valid
check_result: dict[DatasourceId, CheckDatasourceConnectionResult] = domain_manager.check_datasource_connection()

assert len(check_result) == 1
assert check_result[postgres_datasource_id].connection_status == DatasourceConnectionStatus.VALID
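
The example above hardcodes the database password for brevity. Outside a quickstart you will probably want to read credentials from the environment instead. The sketch below builds the same `config_content` dictionary from environment variables. The variable names (`DCE_PG_HOST`, `DCE_PG_USER`, `DCE_PG_PASSWORD`) and the helper name are assumptions made for this illustration; they are not defined by Databao Context Engine.

```python
import os
from typing import Mapping, Optional


def postgres_config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a config_content dict with the same shape as the quickstart example."""
    source = os.environ if env is None else env
    # Hypothetical variable names; pick whatever fits your deployment.
    required = ("DCE_PG_USER", "DCE_PG_PASSWORD")
    missing = [name for name in required if name not in source]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "connection": {
            "host": source.get("DCE_PG_HOST", "localhost"),
            "user": source["DCE_PG_USER"],
            "password": source["DCE_PG_PASSWORD"],
        }
    }
```

The returned dictionary can be passed as `config_content` to `create_datasource_config` in place of the literal shown earlier.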

3. Build context

build_result = domain_manager.build_context()

assert len(build_result) == 1
assert build_result[0].datasource_id == postgres_datasource_id
assert build_result[0].datasource_type == DatasourceType(full_type="postgres")
assert build_result[0].context_file_path.is_file()

4. Use the built contexts

Create a context engine

# Switch to the engine if you're already using a domain_manager
context_engine = domain_manager.get_engine_for_domain()

# Or create a context engine directly from the path to your DCE domain
from databao_context_engine import DatabaoContextEngine

context_engine = DatabaoContextEngine(domain_dir=Path("path/to/project"))

Get all built contexts

# Use the engine to read the contexts that were built
all_built_contexts = context_engine.get_all_contexts()
assert len(all_built_contexts) == 1
assert all_built_contexts[0].datasource_id == postgres_datasource_id

print(all_built_contexts[0].context)
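
If you want a single file to hand to a tool that cannot call the engine, you can concatenate the built contexts yourself. This is a minimal sketch and not the library's own export mechanism. It assumes only what the snippet above shows, namely that each item has `datasource_id` and `context` attributes, and it converts both with `str()` because their exact types are not documented here. `export_contexts` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Any, Iterable


def export_contexts(contexts: Iterable[Any], out_path: Path) -> Path:
    """Write every built context to one text file, one headed section per datasource."""
    sections = [
        f"# {str(item.datasource_id)}\n\n{str(item.context)}"
        for item in contexts
    ]
    out_path.write_text("\n\n".join(sections) + "\n", encoding="utf-8")
    return out_path
```

With the engine from the previous step, a call would look like `export_contexts(context_engine.get_all_contexts(), Path("contexts.txt"))`.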

Search in built contexts

# Run a vector similarity search
results = context_engine.search_context("my search query")

print(f"Found {len(results)} results for query")
print(
    "\n\n".join(
        [f"{str(result.datasource_id)}\n{result.context_result}" for result in results]
    )
)
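
Search results become useful once they are placed in front of a model. The sketch below shows one simple way to turn results into a prompt string that you can send to whichever LLM client you use. It relies only on the `datasource_id` and `context_result` attributes used above; the function name, the prompt wording, and the `max_chars` budget are illustrative choices, not part of Databao Context Engine.

```python
from typing import Any, Iterable


def build_prompt(question: str, results: Iterable[Any], max_chars: int = 4000) -> str:
    """Assemble a prompt from search results, stopping before the character budget is exceeded."""
    parts = []
    used = 0
    for result in results:
        snippet = f"[{str(result.datasource_id)}]\n{result.context_result}"
        if used + len(snippet) > max_chars:
            break  # keep earlier results and drop the rest
        parts.append(snippet)
        used += len(snippet)
    context = "\n\n".join(parts)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )
```

Results are kept in the order they are given, on the assumption that the search already returns them in a useful order. A character budget is a crude stand-in for a token budget, so adjust it to your model's limits.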

Contributing

We’d love your help! Here’s how to get involved:

  • Star this repo — it helps others find us!
  • 🐛 Found a bug? Open an issue
  • 💡 Have an idea? We’re all ears — create a feature request
  • 👍 Upvote issues you care about — helps us prioritize
  • 🔧 Submit a PR
  • 📝 Improve docs — typos, examples, tutorials — everything helps!

New to open source? No worries! We're friendly and happy to help you get started. 🌱

For more details, see CONTRIBUTING.

📄 License

Apache 2.0 — use it however you want. See the LICENSE file for details.


Like Databao Context Engine? Give us a ⭐ — it means a lot!
