Repository-grounded KMDS helper that analyzes project artifacts and builds a KMDS knowledge graph
KMDS Data Helper: Repo Architect Framework
A modular, multi-persona framework for analyzing data science repositories. Uses local LLMs (via Ollama) to synthesize insights from documentation, data schemas, and Jupyter notebooks.
📂 Project Structure
KMDS-Helper follows a strict modular architecture to separate concerns:
- src/kmds_data_helper/: Core logic modules (Config, Processing, LLM, Engine).
- documents/: Project documentation (.pdf, .txt).
- data/: Physical data assets (CSVs), isolated from output.
- notebooks/: Experimental code (.ipynb).
- output/: Isolated directory for generated reports.
🛠️ Installation & Setup
- Environment: Ensure you are using the local virtual environment.
source .venv/bin/activate
- LLM Engine: Requires Ollama running locally with the qwen2.5-coder:7b model.
- Dependencies:
pip install rich ollama dataprofiler pymupdf4llm nbformat pyyaml
⚙️ Configuration
The framework is controlled by kmds_config.yaml in the root directory. You can toggle persona behaviors (Scientist, Tech Lead, Architect) and pathing without changing Python code.
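As an illustration of how persona toggles could be resolved once the YAML file has been parsed, here is a minimal sketch. The key names (`personas`, `scientist`, `tech_lead`, `architect`) and the helper `enabled_personas` are assumptions made for this example; the actual schema of kmds_config.yaml is not documented here and may differ.

```python
# Hypothetical defaults: the real kmds_config.yaml schema may use other keys.
DEFAULT_PERSONAS = {"scientist": True, "tech_lead": True, "architect": True}


def enabled_personas(config):
    """Return the names of personas that are switched on.

    `config` is assumed to be the dict produced by parsing the YAML file
    (for example with yaml.safe_load). Missing toggles fall back to defaults.
    """
    toggles = {**DEFAULT_PERSONAS, **config.get("personas", {})}
    return [name for name, is_on in toggles.items() if is_on]


if __name__ == "__main__":
    # Example: a config that disables only the Tech Lead persona.
    print(enabled_personas({"personas": {"tech_lead": False}}))
```

Merging user values over defaults keeps a partially filled config file valid, which fits the goal of changing behavior without editing Python code.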
🚀 Usage
Run the main orchestrator from the project root:
uv run uvicorn api:app --reload
🧪 Tests
uv run pytest tests/test_personas.py
📦 Packaged Usage (v1)
This first version assumes a fixed repository structure. A user can install the package, run the knowledge-graph aggregator in a cloned repo, and produce a KMDS knowledge graph.
Required folders in the cloned repo
- documents/
- notebooks/
- data_dictionary/
- output/
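To check a cloned repo against this layout before running the aggregator, a small pre-flight script can help. The following is a sketch only: the function name `missing_folders` is hypothetical and this is not the package's own validation code.

```python
from pathlib import Path

# Folder names taken from the required-folders list in this README.
REQUIRED_FOLDERS = ("documents", "notebooks", "data_dictionary", "output")


def missing_folders(workspace):
    """Return the required folder names that are not directories under workspace."""
    root = Path(workspace)
    return [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(".")
    if missing:
        print("Missing folder(s):", ", ".join(missing))
    else:
        print("All required folders are present.")
```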
Expected helper output artifacts
At least one of these files should exist in output/:
- full_service_report.json
- kmds_summary.json
- kmds_strategic_summary.json
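A quick way to see which of these artifacts are present, and that each parses as JSON, is sketched below. The helper `load_helper_outputs` is a hypothetical name for this example; nothing is assumed about the contents of the files beyond their being valid JSON.

```python
import json
from pathlib import Path

# File names taken from the expected-artifacts list in this README.
ARTIFACT_NAMES = (
    "full_service_report.json",
    "kmds_summary.json",
    "kmds_strategic_summary.json",
)


def load_helper_outputs(workspace):
    """Return {file name: parsed JSON} for each expected artifact found in output/."""
    output_dir = Path(workspace) / "output"
    found = {}
    for name in ARTIFACT_NAMES:
        path = output_dir / name
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                found[name] = json.load(handle)
    return found


if __name__ == "__main__":
    outputs = load_helper_outputs(".")
    if outputs:
        print("Found:", ", ".join(outputs))
    else:
        print("No helper output files found")
```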
Install
From the project root:
pip install -e .
Generate knowledge graph from helper outputs
kmds-kb --workspace . --project-file project_knowledge_graph.xml --mode auto
The command validates the required folders, ingests the helper output artifacts, and writes:
project_knowledge_graph.xml
Adapter command (direct use)
You can also run the output adapter directly for a single file:
kmds-analyze --input output/full_service_report.json --project-file project_knowledge_graph.xml --create-project --workflow-name kmds_project_workflow --mode auto
Backward-compatible template script
If you are using the template script path, this remains supported:
python kb_aggregator.py --workspace . --project-file project_knowledge_graph.xml --mode auto
Common failure messages
- Missing folder(s): one or more required directories are absent.
- No helper output files found: none of the expected JSON artifacts are present in output/.
- Project file already exists in create mode: rerun with update mode or choose a new target path.
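The create-mode failure can be pictured with a short sketch of how a mode might be resolved against the target file. This is an assumption-laden illustration, not the package's logic: the function `resolve_mode` is hypothetical, and the rule that `auto` picks update when the file exists and create otherwise is a guess based on the mode names used in this README.

```python
from pathlib import Path


def resolve_mode(project_file, mode="auto"):
    """Return "create" or "update" for the requested mode and target file.

    Assumed behavior: "auto" updates an existing file and creates a new one
    otherwise; "create" refuses to overwrite an existing file.
    """
    if mode not in ("auto", "create", "update"):
        raise ValueError(f"Unknown mode: {mode}")
    exists = Path(project_file).exists()
    if mode == "auto":
        return "update" if exists else "create"
    if mode == "create" and exists:
        raise FileExistsError(
            "Project file already exists in create mode: "
            "rerun with update mode or choose a new target path."
        )
    return mode


if __name__ == "__main__":
    print(resolve_mode("project_knowledge_graph.xml", "auto"))
```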