Repository-grounded KMDS helper that analyzes project artifacts and builds a KMDS knowledge graph

KMDS Data Helper: Repo Architect Framework

A modular, multi-persona framework for analyzing data science repositories. It uses local LLMs (via Ollama) to synthesize insights from documentation, data schemas, and Jupyter notebooks.

📂 Project Structure

KMDS-Helper follows a strict modular architecture to separate concerns:

  • src/kmds_data_helper/: Core logic modules (Config, Processing, LLM, Engine).
  • documents/: Project documentation (.pdf, .txt).
  • data/: Data assets (CSVs), kept separate from generated output.
  • notebooks/: Experimental code (.ipynb).
  • output/: Isolated directory for generated reports.

🛠️ Installation & Setup

  1. Environment: Activate the local virtual environment before installing or running anything.
    source .venv/bin/activate
    
  2. LLM Engine: Ollama must be running locally with the qwen2.5-coder:7b model available.
  3. Dependencies:
    pip install rich ollama dataprofiler pymupdf4llm nbformat pyyaml
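
To confirm that the dependencies above are importable from the active environment, a small check like the following may help. It is a minimal sketch, not part of the package: the helper name missing_modules is hypothetical, and the only subtlety is that the pyyaml distribution is imported as yaml.

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above.
# The pyyaml distribution is imported under the name "yaml".
REQUIRED_MODULES = ["rich", "ollama", "dataprofiler", "pymupdf4llm", "nbformat", "yaml"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located. It does not verify that Ollama is running or that the model has been pulled.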
    

⚙️ Configuration

The framework is controlled by kmds_config.yaml in the root directory. You can toggle persona behaviors (Scientist, Tech Lead, Architect) and change paths without editing Python code.
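
The schema of kmds_config.yaml is not documented here, so the following is only an illustration of how persona toggles could be read once the file has been parsed (for example with yaml.safe_load from pyyaml). The keys personas, enabled, and paths, and the function enabled_personas, are assumptions made for this sketch and may not match the real file.

```python
# Hypothetical shape of a parsed kmds_config.yaml; the real keys may differ.
EXAMPLE_CONFIG = {
    "personas": {
        "scientist": {"enabled": True},
        "tech_lead": {"enabled": False},
        "architect": {"enabled": True},
    },
    "paths": {"documents": "documents", "output": "output"},
}


def enabled_personas(config):
    """Return the names of personas whose 'enabled' flag is true, in file order."""
    personas = config.get("personas", {})
    return [name for name, opts in personas.items() if opts.get("enabled", False)]


if __name__ == "__main__":
    print(enabled_personas(EXAMPLE_CONFIG))
```

Keeping toggles in a data file like this means a persona can be switched off by editing one value rather than touching the orchestrator code.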

🚀 Usage

Run the main orchestrator from the project root:

python3 main.py

📦 Packaged Usage (v1)

This first version assumes a fixed repository structure. A user can install the package, run the knowledge-graph aggregator in a cloned repo, and produce a KMDS knowledge graph.

Required folders in the cloned repo

  • documents/
  • notebooks/
  • data_dictionary/
  • output/
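
Before running the aggregator, the folder layout above can be checked with a few lines of Python. This is a minimal sketch that mirrors the list of required folders; the function name missing_folders is hypothetical and is not the package's own validation code.

```python
from pathlib import Path

# Folder names taken from the list of required folders above.
REQUIRED_FOLDERS = ("documents", "notebooks", "data_dictionary", "output")


def missing_folders(workspace):
    """Return the required folder names that are not directories under workspace."""
    root = Path(workspace)
    return [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(".")
    print("Missing folders:", ", ".join(missing) if missing else "none")
```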

Expected helper output artifacts

At least one of these files must exist in output/:

  • full_service_report.json
  • kmds_summary.json
  • kmds_strategic_summary.json
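
The "at least one file" rule can be checked ahead of time with a sketch like the one below. The file names come from the list above; the function name find_helper_outputs is hypothetical and only approximates what the command itself may do.

```python
from pathlib import Path

# Artifact names taken from the list above.
EXPECTED_ARTIFACTS = (
    "full_service_report.json",
    "kmds_summary.json",
    "kmds_strategic_summary.json",
)


def find_helper_outputs(output_dir):
    """Return the paths of expected artifacts that exist as files in output_dir."""
    out = Path(output_dir)
    return [out / name for name in EXPECTED_ARTIFACTS if (out / name).is_file()]


if __name__ == "__main__":
    found = find_helper_outputs("output")
    print("Found:", [p.name for p in found] if found else "no helper output files")
```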

Install

From the project root:

pip install -e .

Generate knowledge graph from helper outputs

kmds-kb --workspace . --project-file project_knowledge_graph.xml --mode auto

The command validates the required folders, ingests the helper output artifacts, and writes:

  • project_knowledge_graph.xml
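
To get a quick look at the generated file without assuming anything about its schema (which is not described here), the element tags can simply be tallied with the standard library. The function name tag_counts is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(root):
    """Count how often each element tag occurs under root, including root itself."""
    return Counter(element.tag for element in root.iter())


if __name__ == "__main__":
    # A tiny stand-in document; the real graph's tags are not assumed here.
    sample = ET.fromstring("<graph><node/><node/><edge/></graph>")
    for tag, count in sorted(tag_counts(sample).items()):
        print(tag, count)
```

For the real file, pass ET.parse("project_knowledge_graph.xml").getroot() to tag_counts.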

Adapter command (direct use)

You can also run the output adapter directly for a single file:

kmds-analyze --input output/full_service_report.json --project-file project_knowledge_graph.xml --create-project --workflow-name kmds_project_workflow --mode auto

Backward-compatible template script

If you are using the template script path, this remains supported:

python kb_aggregator.py --workspace . --project-file project_knowledge_graph.xml --mode auto

Common failure messages

  • Missing folder(s): one or more required directories are absent.
  • No helper output files found: none of the expected JSON artifacts are present in output/.
  • Project file already exists in create mode: rerun with update mode or choose a new target path.
