Core Source / Extractor protocols and concrete implementations for the digestkit ecosystem

digestkit-core

License: Apache 2.0

Neutral core library providing Source / Extractor protocols and reusable concrete implementations shared by digestkit (human-facing 1:1 digest pipeline) and rag-ingest (machine-facing 1:N ingestion pipeline).

digestkit-core is deliberately kept free of LLM, vector-store, and notification dependencies so it stays usable across both consumers.

What's inside

| Module | Provides |
| --- | --- |
| digestkit_core.protocols | Source, Extractor (runtime_checkable Protocols) |
| digestkit_core.types | Item, Digest, DigestkitError, FailureInfo, ... |
| digestkit_core.sources.local_directory | LocalDirectorySource (filesystem glob) |
| digestkit_core.sources.notion_database | NotionDatabaseSource (Notion DB query + ack callbacks) |
| digestkit_core.extractors.pdf | PDFExtractor + ExtractionError |
| digestkit_core.extractors.webpage | WebPageExtractor (httpx + trafilatura) |
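
To illustrate what "runtime_checkable Protocols" means in practice, here is a minimal, self-contained sketch. It does not import digestkit_core: the method names `fetch` and `extract`, the `uri` field on `Item`, and the two example classes are all hypothetical stand-ins, since the real signatures are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol, runtime_checkable


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for digestkit_core.types.Item; the field is assumed."""

    uri: str


@runtime_checkable
class Source(Protocol):
    """Assumed shape: something that yields items to process."""

    def fetch(self) -> Iterable[Item]: ...


@runtime_checkable
class Extractor(Protocol):
    """Assumed shape: something that turns one item into plain text."""

    def extract(self, item: Item) -> str: ...


class InMemorySource:
    """Satisfies Source structurally, without inheriting from it."""

    def __init__(self, uris: list[str]) -> None:
        self._uris = uris

    def fetch(self) -> Iterator[Item]:
        for uri in self._uris:
            yield Item(uri=uri)


class UpperCaseExtractor:
    """Toy extractor that upper-cases the item's URI."""

    def extract(self, item: Item) -> str:
        return item.uri.upper()


source = InMemorySource(["notes/a.txt", "notes/b.txt"])
extractor = UpperCaseExtractor()

# isinstance() works because of @runtime_checkable. It only checks that the
# named methods exist, not that their signatures or return types match.
assert isinstance(source, Source)
assert isinstance(extractor, Extractor)
assert not isinstance(source, Extractor)

texts = [extractor.extract(item) for item in source.fetch()]
```

Structural typing of this kind lets both consumers plug in their own sources and extractors without subclassing anything from the core package.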

Installation

Note: The instructions below install digestkit-core from the umbrella repository's main branch using a git URL. They were written before the first PyPI release; a 0.1.0 source distribution and wheel are listed under Download files.

pip install "digestkit-core @ git+https://github.com/koki-nakamura22/inboxkit.git@main#subdirectory=packages/digestkit-core"

For uv projects:

[project]
dependencies = ["digestkit-core>=0.1,<0.2"]

[tool.uv.sources]
digestkit-core = { git = "https://github.com/koki-nakamura22/inboxkit.git", subdirectory = "packages/digestkit-core", branch = "main" }

End users will typically not depend on digestkit-core directly. Installing digestkit or rag-ingest pulls it in automatically.
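
Because the package usually arrives as a transitive dependency, it can be handy to check which version ended up installed. The sketch below uses only the standard library; the helper names `installed_version` and `in_range` are illustrative, and the range check is deliberately naive (it assumes a plain numeric major.minor prefix and ignores pre-release tags).

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def in_range(ver: str, lower: tuple[int, int], upper: tuple[int, int]) -> bool:
    """Naive check that lower <= (major, minor) < upper."""
    major, minor = (int(part) for part in ver.split(".")[:2])
    return lower <= (major, minor) < upper


found = installed_version("digestkit-core")
if found is None:
    print("digestkit-core is not installed in this environment")
else:
    # Mirrors the ">=0.1,<0.2" constraint from the uv example above.
    print(found, "within range:", in_range(found, (0, 1), (0, 2)))
```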

Neutrality contract

digestkit-core must not depend on:

  • LLM clients (litellm, provider SDKs)
  • Vector stores (sqlite-vec, ...)
  • Notification systems (SMTP, Slack SDK)
  • digestkit or rag-ingest themselves (reverse-direction dependency)

The contract is enforced in CI by .github/workflows/digestkit-core-inspection.yml; ADR-0003 documents the rationale.
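
One way such a contract could be checked is by scanning source files for forbidden imports. The following is only a sketch of that idea, not the contents of the actual workflow: the `FORBIDDEN_ROOTS` list is an assumed mapping from the categories above to top-level import names, and `forbidden_imports` is a hypothetical helper.

```python
from __future__ import annotations

import ast

# Assumed top-level import names for the forbidden categories listed above.
# The real CI check may use a different list or a different mechanism.
FORBIDDEN_ROOTS = frozenset(
    {"litellm", "sqlite_vec", "smtplib", "slack_sdk", "digestkit", "rag_ingest"}
)


def forbidden_imports(source_code: str) -> set[str]:
    """Return the forbidden top-level modules imported by the given source text."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source_code)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            roots = [node.module.split(".")[0]]
        else:
            continue
        found.update(root for root in roots if root in FORBIDDEN_ROOTS)
    return found


sample = (
    "import httpx\n"
    "from digestkit_core.types import Item\n"
    "from litellm import completion\n"
    "import slack_sdk.web\n"
)
violations = forbidden_imports(sample)
```

The code is parsed rather than executed, so the check works even when the forbidden packages are not installed. Note that `digestkit_core` itself is not flagged, because only the exact top-level name `digestkit` is on the list.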

Contributing

See the umbrella CONTRIBUTING.md for development setup, lint / format / typecheck targets, and the pre-commit hook.

Download files

Source Distribution

digestkit_core-0.1.0.tar.gz (16.8 kB)

Uploaded Source

Built Distribution

digestkit_core-0.1.0-py3-none-any.whl (16.1 kB)

Uploaded Python 3

File details

Details for the file digestkit_core-0.1.0.tar.gz.

File metadata

  • Download URL: digestkit_core-0.1.0.tar.gz
  • Upload date:
  • Size: 16.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for digestkit_core-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 123bc887f759df1712380399e2bddc8a41968b28f336f59e726e0342d1d43bf7 |
| MD5 | 018cc0642695e25e3834e9bb9ee5f9ae |
| BLAKE2b-256 | 561482dbdb2d363b49989cb21d52c6d1d72a5c770626ace755020102d80ebf18 |
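
A downloaded file can be checked against the SHA256 value listed above with a few lines of standard-library Python. This is a minimal sketch; the helper names `sha256_of` and `matches_sha256` are illustrative, and the local path you pass in depends on where you saved the file.

```python
from __future__ import annotations

import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

For example, `matches_sha256(Path("digestkit_core-0.1.0.tar.gz"), "<SHA256 from the table>")` should return True for an intact copy of the source distribution. pip can perform the same check itself when requirements are pinned with `--hash` entries and installed with `--require-hashes`.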

Provenance

The following attestation bundles were made for digestkit_core-0.1.0.tar.gz:

Publisher: publish.yml on koki-nakamura22/inboxkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file digestkit_core-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: digestkit_core-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 16.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for digestkit_core-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7cb500c9ce3d15760bf8b1820d5a9047e30b2b9569b3d10dc7ed991736384f41 |
| MD5 | 74384392ba2bfbec7c3a630f48d5042b |
| BLAKE2b-256 | 1dd39f47a94738606775974b0a70269d838938ff57fc5d8e91641a5a74282b5d |

Provenance

The following attestation bundles were made for digestkit_core-0.1.0-py3-none-any.whl:

Publisher: publish.yml on koki-nakamura22/inboxkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
