# dc43-contracts-app

A FastAPI portal for browsing and editing DC43 contracts and datasets.
A FastAPI application that surfaces the dc43 governance experience. It relies on shared service clients to interact with contract, governance, and data product backends, and bundles HTML templates plus static assets for local demos and packaged deployments.
## Features
- Browse and edit contracts, datasets, and data products backed by any dc43 service implementation.
- Inspect governance metrics alongside dataset and contract records with interactive trend charts so operators can review historical observations without leaving the UI.
- Export integration helper bundles to bootstrap Spark or Delta pipelines.
- Embed a documentation-driven chat assistant powered by LangChain and Gradio so teams can query the Markdown guides that ship with dc43.
## Documentation chat assistant
The docs chat surface reuses off-the-shelf components—LangChain for retrieval augmented generation and Gradio for the UI—so the repository does not have to maintain bespoke chat widgets. To enable it:
- Install the optional extra:

  ```bash
  pip install "dc43-contracts-app[docs-chat]"
  ```

  Working from a source checkout? The root `demo` extra already pulls the assistant stack, so run `pip install --no-cache-dir -e ".[demo]"` once and skip a separate `dc43-contracts-app[docs-chat]` install. Mixing both commands in the same environment (for example running the editable install and then invoking the PyPI extra) causes pip to report conflicting requirements because they reference the same local package.

- Provide an API key via the configured environment variable (defaults to `OPENAI_API_KEY`) or set an inline secret with `docs_chat.api_key` in a private TOML file. When you rely on `embedding_provider = "huggingface"` the same OpenAI key is still used for chat completions, but embeddings no longer require an external service.

- Toggle the feature in `contracts-app.toml`:

  ```toml
  [docs_chat]
  enabled = true
  provider = "openai"
  model = "gpt-4o-mini"
  embedding_provider = "huggingface"  # Opt into "openai" to reuse hosted embeddings.
  embedding_model = ""                # Defaults to sentence-transformers/all-MiniLM-L6-v2.
  api_key_env = "OPENAI_API_KEY"
  # api_key = "sk-your-api-key"                # optional inline secret stored outside git
  # code_paths = ["~/project/sibling-module"]  # optional extra source directories
  # reasoning_effort = "medium"                # use with OpenAI `o4-*` models
  ```

  `api_key_env` records the name of the variable that stores your API key. Load the secret separately (for example via `direnv`, `dotenv`, a `.env` file passed to `dc43-demo --env-file`, or a shell `export OPENAI_API_KEY=...`).

- Pass the config path to the demo launcher so the loader picks up your changes:

  ```bash
  dc43-demo --config /path/to/contracts-app.toml
  ```

  The legacy `export DC43_CONTRACTS_APP_CONFIG=/path/to/contracts-app.toml` workflow still works when you prefer a global environment variable.

- Restart the dc43 app. The assistant indexes Markdown under `docs/` and the source trees in `src/` and `packages/` from your dc43 checkout by default, and ignores parent directories (for example `~/src`). Override `docs_chat.docs_path`, `docs_chat.code_paths`, or `docs_chat.index_path` when the repository lives elsewhere.

  Embeddings are requested in small batches during the initial index build, so the default Hugging Face workflow runs locally without tripping OpenAI's token limits and you can point the assistant at large documentation or source directories without juggling manual chunk sizes. Prefer managed embeddings? Set `embedding_provider = "openai"` and specify a compatible `embedding_model`. The docs-chat extra already installs `langchain-huggingface` and `sentence-transformers`, so leaving `embedding_model` empty keeps the `sentence-transformers/all-MiniLM-L6-v2` default.

  The FastAPI application kicks off the documentation index warm-up as it loads the configuration, so the one-off downloads and FAISS persistence happen in the background while the UI comes online. Cached manifests are reused across restarts until the docs or code change, and if a prompt arrives mid warm-up the chat surface explains that it is waiting for the cached index to finish building before continuing.

  While users wait for an answer the chat UI streams progress updates (loading documentation, embedding batches, and querying OpenAI) before presenting the final, cited response. Programmatic callers receive the same step list in the JSON payload under a `steps` field.

  For deployments that want the index ready before the app starts, run `dc43-docs-chat-index --config /path/to/contracts-app.toml` as part of your build pipeline. The CLI shares the same configuration loader and writes the FAISS cache to `docs_chat.index_path` (or the workspace default), so runtime warm-ups immediately reuse the persisted manifest.
The contracts setup wizard mirrors these settings via the Documentation assistant module. Pick the Gradio option under the User experience group to populate `[docs_chat]` in the exported `dc43-contracts-app.toml` and surface the assistant alongside other deployment assets.
The assistant is purposefully constrained to dc43 setup and usage. When a prompt strays outside that scope the response reminds the requester to come back with a dc43-specific question so the chat surface stays focused on project guidance.
Programmatic callers can POST to `/api/docs-chat/messages` with a JSON payload (`{"message": "...", "history": [...]}`) and receive answers plus cited sources. The embedded Gradio UI is mounted at `/docs-chat/assistant` and the HTML entry point lives at `/docs-chat`.
## Environment variables
| Variable | Purpose |
|---|---|
| `DC43_CONTRACTS_APP_BACKEND_URL` | Remote backend URL when not running in embedded mode. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_ENABLED` | Override the `docs_chat.enabled` flag (`1`, `true`, etc.). |
| `DC43_CONTRACTS_APP_DOCS_CHAT_PROVIDER` | Provider identifier (currently `openai`). |
| `DC43_CONTRACTS_APP_DOCS_CHAT_MODEL` | Chat completion model to request. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_EMBEDDING_PROVIDER` | Embedding backend used to build the FAISS index (`openai` or `huggingface`). |
| `DC43_CONTRACTS_APP_DOCS_CHAT_EMBEDDING_MODEL` | Embedding model used to build the vector index. Leave empty when relying on the default Hugging Face model. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_API_KEY_ENV` | Name of the environment variable that stores the provider API key. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_API_KEY` | Inline provider API key used when you prefer not to rely on environment variables. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_PATH` | Override the directory that contains Markdown documentation. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_INDEX` | Directory where the LangChain/FAISS index is stored. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_CODE_PATHS` | Comma- or colon-separated list of extra source directories to index. |
| `DC43_CONTRACTS_APP_DOCS_CHAT_REASONING_EFFORT` | Reasoning hint (`low`/`medium`/`high`) for OpenAI `o4`/`o1` models. |
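The flag and path-list overrides follow common conventions. The helpers below are hypothetical (they are not dc43's actual parsing code) and just illustrate how such values are typically interpreted; note that splitting on `:` would clash with Windows drive letters, so commas are the safer separator there:

```python
import os
import re

_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret a boolean override such as DC43_CONTRACTS_APP_DOCS_CHAT_ENABLED."""
    raw = os.environ.get(name)
    return default if raw is None else raw.strip().lower() in _TRUTHY


def env_path_list(name: str) -> list[str]:
    """Split a comma- or colon-separated override such as ..._CODE_PATHS."""
    return [p for p in re.split(r"[,:]", os.environ.get(name, "")) if p]
```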
Combine these overrides with existing workspace and backend settings to tailor the dc43 app to your deployment environment.
## File details

### dc43_contracts_app-0.41.0.0.tar.gz

- Size: 409.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `253113f412ac1e95c86a53866abf1c60b0260984d266264177474bfc16b3c4db` |
| MD5 | `9dcba2b45fc39722b594a719054d289b` |
| BLAKE2b-256 | `6d44067a93646ef909f2dbe081db2c7eff9383941da44d41a60e097303927ffa` |
### dc43_contracts_app-0.41.0.0-py3-none-any.whl

- Size: 406.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `b1fc229ebac68168c70b3fbb1f7b800c889b78c2e12574916d9f0296f408ae89` |
| MD5 | `e8d58c3857d5b96f738eb970d3e3782e` |
| BLAKE2b-256 | `3c9d2eb313946ecb797687fce641ab8b2b4e228b682238a9883561e404de2ea3` |
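The published digests can be checked locally after downloading a release artifact. A small sketch using the standard library's `hashlib` (the helper names are illustrative):

```python
import hashlib


def file_sha256(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """Compare the computed digest against the value published on PyPI."""
    return file_sha256(path) == expected_hex.lower()
```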