General-purpose knowledge graph extraction framework

k-extract

Extract knowledge graphs from any codebase or documentation. Point it at your repos, describe what you're trying to understand, and get a graph out.

Quick Start

1. Define what to extract

uvx k-extract init ./my-repo ./another-repo

This walks you through:

  • Describing your problem ("I need to understand my testing inventory and coverage gaps")
  • Reviewing a proposed ontology (entity types, relationship types)
  • Refining until you're satisfied

This produces extraction.yaml, your complete extraction config.

2. Run the extraction

uvx k-extract run --config extraction.yaml

Outputs graph.jsonl. Press Ctrl-C at any time; re-run the same command to resume where you left off.
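
Since graph.jsonl is JSONL, each non-empty line should be one standalone JSON value, so it can be inspected with the standard library alone. The sketch below is a minimal reader under that assumption. The function name iter_records is hypothetical, and the fields inside each record are not documented here, so the code makes no assumptions about them.

```python
import json
from typing import Any, Iterator


def iter_records(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)
```

For a quick sanity check after a run, `sum(1 for _ in iter_records("graph.jsonl"))` counts the records written so far.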

3. Load into kartograph

The output is kartograph-compatible JSONL. Send it to kartograph's mutation endpoint to load the graph, then query it from there.

Requirements

  • uv (or Python 3.12+ with pip install k-extract)
  • An Anthropic API key (or Vertex AI credentials) — set via environment variables
  • Model configured via environment (e.g., ANTHROPIC_MODEL=claude-sonnet-4-6)
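
A small preflight check can catch missing environment variables before a long run starts. This is a sketch, not part of k-extract. ANTHROPIC_MODEL comes from the list above. ANTHROPIC_API_KEY is an assumption: it is the variable the Anthropic SDK reads by default, but this README does not name the variable k-extract expects, and Vertex AI setups will use different credentials.

```python
import os

# ANTHROPIC_MODEL is named in the requirements above.
# ANTHROPIC_API_KEY is an assumption (the Anthropic SDK's default variable);
# adjust this tuple if you authenticate through Vertex AI instead.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "ANTHROPIC_MODEL")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment and returns an empty list when everything is set.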

Configuration

extraction.yaml is human-readable and fully editable. It contains:

  • problem_statement — what you're trying to understand
  • data_sources — paths to your repos/data
  • ontology — entity and relationship types to extract
  • prompts — the exact instructions agents receive (generated, but editable)
  • output — where results go (graph.jsonl, extraction.db)

Edit any field and re-run. Changing the config invalidates previous results; use --force to start fresh.
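
After hand-editing the file, it can help to confirm that the five top-level fields listed above are still present. The sketch below checks an already-parsed config dict. The key names come from this section. The nested values in example_config are purely hypothetical placeholders, since the shape of each field is not documented here.

```python
# Top-level keys listed in the Configuration section.
EXPECTED_KEYS = ("problem_statement", "data_sources", "ontology", "prompts", "output")


def missing_config_keys(config: dict) -> list[str]:
    """Return the expected top-level keys that are absent from `config`."""
    return [key for key in EXPECTED_KEYS if key not in config]


# Hypothetical example: only the top-level key names are taken from the README;
# the values are placeholders, not the real schema.
example_config = {
    "problem_statement": "I need to understand my testing inventory and coverage gaps",
    "data_sources": ["./my-repo", "./another-repo"],
    "ontology": {},
    "prompts": {},
    "output": {},
}
```

To apply this to the real file, parse extraction.yaml first (for example with PyYAML's `yaml.safe_load`) and pass the resulting dict to `missing_config_keys`.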

CLI Reference

uvx k-extract init <path> [<path> ...]           # Interactive ontology design
uvx k-extract run --config <yaml>                # Run extraction (resumes by default)
uvx k-extract run --config <yaml> --force        # Discard previous results, start fresh
uvx k-extract jobs --config <yaml>               # Inspect job state
uvx k-extract jobs --config <yaml> --status failed  # See failed jobs
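
When driving these commands from a script, building the argument list in one place avoids quoting mistakes. The helpers below are a hypothetical wrapper, not part of k-extract. They use only the subcommands and flags shown above, and they make no assumption about what exit codes the CLI returns beyond passing the code through.

```python
import subprocess
from typing import Optional


def build_run_command(config_path: str, force: bool = False) -> list[str]:
    """Argument list for `uvx k-extract run`, optionally with --force."""
    cmd = ["uvx", "k-extract", "run", "--config", config_path]
    if force:
        cmd.append("--force")
    return cmd


def build_jobs_command(config_path: str, status: Optional[str] = None) -> list[str]:
    """Argument list for `uvx k-extract jobs`, optionally filtered by --status."""
    cmd = ["uvx", "k-extract", "jobs", "--config", config_path]
    if status is not None:
        cmd += ["--status", status]
    return cmd


def run_extraction(config_path: str, force: bool = False) -> int:
    """Run the extraction and return the process exit code unchanged."""
    return subprocess.run(build_run_command(config_path, force), check=False).returncode
```

For example, `build_jobs_command("extraction.yaml", status="failed")` yields the same invocation as the last line of the reference above.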

How It Works

  1. init scans your data, proposes an ontology based on your problem statement, and generates agent prompts
  2. run batches files into jobs sized to the model's context window, then launches parallel agents
  3. Each agent reads source files, extracts entities/relationships via tool calls, and commits to a shared store
  4. Results stream to graph.jsonl as jobs complete
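
To make step 2 concrete, here is one way files could be grouped into jobs under a size budget. This is an illustrative greedy packing, not k-extract's actual batching logic, which is not described here. The function name, the per-file size estimates, and the budget value are all assumptions.

```python
def batch_by_budget(file_sizes: dict[str, int], budget: int) -> list[list[str]]:
    """Greedily pack files into batches whose summed size stays within `budget`.

    Files are taken in the order given. A single file larger than the budget
    is placed in a batch of its own rather than being dropped.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, size in file_sizes.items():
        # Close the current batch if adding this file would exceed the budget.
        if current and used + size > budget:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches
```

In this sketch the sizes would be token estimates per file and the budget some fraction of the model's context window; each resulting batch corresponds to one job handed to an agent.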

License

Apache-2.0

Download files

Source Distribution

k_extract-0.2.0.tar.gz (283.3 kB)

Uploaded Source

Built Distribution

k_extract-0.2.0-py3-none-any.whl (70.5 kB)

Uploaded Python 3

File details

Details for the file k_extract-0.2.0.tar.gz.

File metadata

  • Download URL: k_extract-0.2.0.tar.gz
  • Size: 283.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for k_extract-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       b2fa9fb047bec0df30f135062200ee7f89e1192c764e5f65d0879222a4a494ed
MD5          6930a74ec6997a73db695b1b9a004b81
BLAKE2b-256  62b183ea88a802c0bb11d31d1d1ae72b18e41fe21cc48de609f0be81c1468bf8

Provenance

The following attestation bundles were made for k_extract-0.2.0.tar.gz:

Publisher: release.yml on jsell-rh/k-extract

File details

Details for the file k_extract-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: k_extract-0.2.0-py3-none-any.whl
  • Size: 70.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for k_extract-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       bc81cc0713cf4f9c8ce49a5a336c0d48615bbfb6e594a1db0a487d253235b40a
MD5          6cd8fef763ff2140c611ffb57c3b6302
BLAKE2b-256  33a1d39b266a8a4f2904d9a4b3c494d1e941597c07eee4436a36d0e92c568d25

Provenance

The following attestation bundles were made for k_extract-0.2.0-py3-none-any.whl:

Publisher: release.yml on jsell-rh/k-extract
