Little Heta

Little Heta is a local CLI knowledge infrastructure for personal documents, agent memory, and document intelligence. It turns PDFs, Office files, images, audio, code, HTML, Markdown, and notes into a stable Markdown wiki, adds semantic vector retrieval, and lets agents reuse distilled knowledge through a memory layer.

Install

Install from PyPI:

pip install little-heta

From a local checkout:

pip install -e .

For development:

pip install -e ".[dev]"

The package installs the heta command:

heta --help

Initialize

Run the first-time setup:

heta init

You need to prepare:

  • An LLM API key for one provider: Qwen, ChatGPT, or Gemini.
  • Optional: MinerU access, needed only if you plan to insert PDF or Office files. Apply or learn more at MinerU.

heta init writes config and workspace data under:

~/.heta/

It also installs the Little Heta agent skill automatically into:

~/.codex/skills/heta
~/.claude/skills/heta

Use with Codex and Claude Code

After heta init, Codex and Claude Code can discover the Little Heta skill globally. The skill tells the agent when to use:

heta ask "..."
heta query "..."
heta recall "..."
heta remember "..."

You can refresh or reinstall the skill at any time:

heta skill

For other agent frameworks, copy these two files:

~/.heta/skills/heta/SKILL.md
~/.heta/skills/heta/COMMANDS.md
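
The copy step can be scripted. The sketch below is a minimal example that copies the two skill files into another directory. The source paths come from the list above; the destination `~/.my-agent/skills/heta` and the helper name `copy_skill` are hypothetical, so substitute the directory your framework actually loads skills from.

```python
import shutil
from pathlib import Path

# Source location and file names as listed in this README.
SKILL_DIR = Path.home() / ".heta" / "skills" / "heta"
SKILL_FILES = ("SKILL.md", "COMMANDS.md")


def copy_skill(dest_dir: Path, src_dir: Path = SKILL_DIR) -> list[Path]:
    """Copy the two skill files into dest_dir and return the copied paths."""
    missing = [name for name in SKILL_FILES if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"missing in {src_dir}: {', '.join(missing)} (try running `heta skill`)"
        )
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src_dir / name, dest_dir / name)) for name in SKILL_FILES]


if __name__ == "__main__":
    # Hypothetical destination; not a path that Little Heta defines.
    target = Path.home() / ".my-agent" / "skills" / "heta"
    try:
        for path in copy_skill(target):
            print(f"copied {path}")
    except FileNotFoundError as exc:
        print(exc)
```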

What You Get

Most personal knowledge bases eventually become a /raw folder: papers, slides, screenshots, audio clips, code files, notes, and half-finished drafts all pile up together. A normal agent can read those files directly, but every question pays the same cost again: open the index, guess which page matters, read long pages, and spend tokens rediscovering context it already found before.

Little Heta turns that pile into a persistent agent workspace:

  • Wiki foundation: raw files are compiled into stable Markdown pages with numeric page ids, clean [[Wiki Links]], and Git history.
  • Vector Wiki: each page is chunked by Markdown structure, so heta query can jump to the right section instead of relying only on sparse index.md summaries.
  • Memory-first retrieval: heta ask stores distilled KB insights after expensive lookups, allowing later questions to reuse prior KB understanding instead of repeating the same deep wiki traversal.
  • Synchronized memory + KB management: memory stays tied to the evolving wiki. When KB content changes, related memories can be invalidated to prevent stale cached insights from drifting away from the source of truth.
  • Agent reuse: larger teams and multi-agent workflows benefit because useful KB discoveries can be reused across later questions, sessions, and agents.
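
To make "chunked by Markdown structure" concrete, here is a small illustrative sketch of heading-based chunking. It is not Little Heta's implementation: the function name `chunk_markdown`, the chunk fields, and the splitting rules are assumptions chosen only to show the general idea of keeping each section together with its heading trail.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into one chunk per heading, keeping the heading trail."""
    chunks: list[dict] = []
    trail: list[str] = []  # current heading path, e.g. ["Setup", "Install"]
    body: list[str] = []
    in_fence = False

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:
            chunks.append({"path": " > ".join(trail), "text": content})
        body.clear()

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines starting with '#' inside a code fence are not headings.
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            del trail[level - 1:]
            trail.append(match.group(2))
            continue
        body.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    page = "# Setup\nIntro.\n## Install\npip install little-heta\n## Init\nheta init\n"
    for chunk in chunk_markdown(page):
        print(chunk["path"], "->", chunk["text"])
```

Carrying the heading trail with each chunk is one way a retriever can land on a specific section rather than a whole page.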

Retrieval quality depends heavily on corpus structure. In corpora where important details are buried deep inside long wiki pages and poorly represented by summaries, index-only wiki navigation can suffer severe retrieval collapse. In our initial stress scenarios, Vector Wiki and memory-backed retrieval improved answer accuracy by roughly 1.25x to more than 5x, with some cases recovering from 0% to 100% accuracy.

Even in a small-file setting, memory-backed reuse used 82.1% fewer tokens than an index-only wiki query and answered 1.14x faster. This gap is expected to grow in larger or messier workspaces, because the cost of index-only wiki navigation scales with the number and length of pages an agent may need to inspect, while memory-backed reuse resolves repeated questions from previously distilled insights. The main extra cost is the first pass that creates the reusable insight.
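
The trade-off described above (an expensive first pass, cheap repeats, and invalidation when the source changes) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `MemoryFirstKB`, its word-overlap "deep lookup", and the exact-match memory key are all hypothetical and are not how Little Heta stores or matches memories.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryFirstKB:
    """Toy memory-first lookup: reuse a stored insight before searching pages."""

    pages: dict[int, str]  # page id -> page text
    # normalized question -> (insight, id of the page it came from)
    memory: dict[str, tuple[str, int]] = field(default_factory=dict)
    deep_lookups: int = 0  # counts the expensive path

    def _deep_lookup(self, question: str) -> tuple[int, str]:
        # Stand-in for an expensive wiki traversal: pick the page that
        # shares the most words with the question.
        self.deep_lookups += 1
        words = set(question.split())
        page_id = max(
            self.pages,
            key=lambda pid: len(words & set(self.pages[pid].lower().split())),
        )
        return page_id, self.pages[page_id]

    def ask(self, question: str) -> str:
        key = question.strip().lower()
        if key in self.memory:  # cheap path: reuse the distilled insight
            return self.memory[key][0]
        page_id, insight = self._deep_lookup(key)
        self.memory[key] = (insight, page_id)
        return insight

    def update_page(self, page_id: int, text: str) -> None:
        # Changing a page invalidates every memory derived from it.
        self.pages[page_id] = text
        self.memory = {q: m for q, m in self.memory.items() if m[1] != page_id}


if __name__ == "__main__":
    kb = MemoryFirstKB({1: "heta init writes config under ~/.heta/", 2: "pytest runs the tests"})
    kb.ask("where does heta init write config")
    kb.ask("where does heta init write config")
    print("deep lookups after two identical questions:", kb.deep_lookups)
```

Only the first question pays for the deep lookup. After `update_page`, the next identical question pays again, which is the cost of keeping memory tied to the source of truth.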

Core CLI

The main commands are:

  • heta init: set up providers, workspace, and agent skills.
  • heta status: show provider, MinerU, wiki, memory, and space status.
  • heta insert: add files or folders to the knowledge base.
  • heta query: ask a read-only question against inserted documents.
  • heta ask: answer using memory and the document KB together.
  • heta remember: save a fact, decision, or preference.
  • heta recall: retrieve saved memory.
  • heta clean: remove generated wiki pages and vector DB while keeping raw files.
  • heta vector: turn document vector indexing on, off, or show status.
  • heta insert-planning: turn smart insert planning on, off, or show status.
  • heta mem-show: inspect stored KB memories.
  • heta mem-clean: erase memory data.
  • heta skill: install or refresh agent skills.
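
If you want to call these commands from a script rather than a shell, a thin wrapper is enough. The sketch below uses only the subcommand names listed above and passes no flags, since none are documented here. The helper names `heta_argv` and `run_heta` are hypothetical, and the assumption that useful output arrives as plain text on stdout is untested.

```python
import shutil
import subprocess
from typing import Optional

# Subcommands listed in this README; anything else is rejected early.
HETA_COMMANDS = {
    "init", "status", "insert", "query", "ask", "remember", "recall",
    "clean", "vector", "insert-planning", "mem-show", "mem-clean", "skill",
}


def heta_argv(command: str, *args: str) -> list[str]:
    """Build the argument list for a heta subcommand."""
    if command not in HETA_COMMANDS:
        raise ValueError(f"unknown heta command: {command!r}")
    return ["heta", command, *args]


def run_heta(command: str, *args: str) -> Optional[str]:
    """Run heta and return its stdout, or None if heta is not on PATH."""
    if shutil.which("heta") is None:
        return None
    result = subprocess.run(
        heta_argv(command, *args), capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    # Passing arguments as a list avoids shell quoting problems in questions.
    print(heta_argv("query", "What did the design notes say about caching?"))
```

For example, `run_heta("status")` returns whatever `heta status` prints, or `None` when the command is not installed.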

Supported Files

Little Heta can insert:

  • Markdown and text: .md, .markdown, .txt
  • PDF and Office: .pdf, .doc, .docx, .ppt, .pptx, .xls, .xlsx
  • Images: .png, .jpg, .jpeg, .webp, .gif, .bmp
  • Audio and video (inserted as transcripts): .mp3, .wav, .m4a, .flac, .ogg, .mp4
  • Code and config files: .py, .js, .ts, .tsx, .jsx, .java, .go, .rs, .cpp, .c, .h, .hpp, .sh, .sql, .yaml, .yml, .json, .toml
  • HTML: .html, .htm

PDF and Office parsing requires MinerU. Images and audio/video require a multimodal or transcription-capable LLM provider.
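
Before inserting a large folder, it can help to check which files fall into which category and what each one needs. The lookup below is a minimal sketch built directly from the extension list above; the category labels and the function name `classify` are illustrative choices, not identifiers from Little Heta.

```python
from pathlib import Path
from typing import Optional

# Extension groups copied from the list in this README.
SUPPORTED = {
    "markdown/text": {".md", ".markdown", ".txt"},
    "pdf/office": {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"},
    "image": {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"},
    "audio/video": {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4"},
    "code/config": {
        ".py", ".js", ".ts", ".tsx", ".jsx", ".java", ".go", ".rs", ".cpp",
        ".c", ".h", ".hpp", ".sh", ".sql", ".yaml", ".yml", ".json", ".toml",
    },
    "html": {".html", ".htm"},
}

# Extra requirement per category, as stated in this README.
REQUIRES = {
    "pdf/office": "MinerU",
    "image": "multimodal or transcription-capable LLM provider",
    "audio/video": "multimodal or transcription-capable LLM provider",
}


def classify(path: str) -> tuple[Optional[str], Optional[str]]:
    """Return (category, requirement) for a file, or (None, None) if unsupported."""
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category, REQUIRES.get(category)
    return None, None


if __name__ == "__main__":
    for name in ("paper.PDF", "notes.md", "talk.mp4", "setup.exe"):
        print(name, classify(name))
```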

Workspace

Runtime data lives under:

~/.heta/

Important paths:

~/.heta/heta.yaml                              config
~/.heta/workspace/kb/raw                       archived source files
~/.heta/workspace/kb/wiki/index.md            wiki entry index
~/.heta/workspace/kb/wiki/pages/              generated Markdown wiki pages
~/.heta/workspace/kb/wiki/log.md              wiki operation log
~/.heta/workspace/kb/db/wiki_vectors.sqlite3  local wiki vector database
~/.heta/workspace/mem/mem.sqlite3             local memory database
~/.heta/skills/heta/                          portable Little Heta agent skill
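
A quick way to see which of these paths exist on your machine is a small check script. The sketch below only tests for existence under `~/.heta/`; the paths are taken from the list above, while the function name `workspace_report` is hypothetical. Which paths exist right after `heta init`, as opposed to after the first insert, is not stated here, so treat a missing entry as information rather than an error.

```python
from pathlib import Path

# Paths relative to ~/.heta/, with the descriptions given in this README.
WORKSPACE_PATHS = {
    "heta.yaml": "config",
    "workspace/kb/raw": "archived source files",
    "workspace/kb/wiki/index.md": "wiki entry index",
    "workspace/kb/wiki/pages": "generated Markdown wiki pages",
    "workspace/kb/wiki/log.md": "wiki operation log",
    "workspace/kb/db/wiki_vectors.sqlite3": "local wiki vector database",
    "workspace/mem/mem.sqlite3": "local memory database",
    "skills/heta": "portable Little Heta agent skill",
}


def workspace_report(root: Path = Path.home() / ".heta") -> dict[str, bool]:
    """Map each known workspace path to whether it currently exists."""
    return {rel: (root / rel).exists() for rel in WORKSPACE_PATHS}


if __name__ == "__main__":
    for rel, present in workspace_report().items():
        mark = "ok     " if present else "missing"
        print(f"{mark}  ~/.heta/{rel}  ({WORKSPACE_PATHS[rel]})")
```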

Development

Run tests:

pytest

Project layout:

src/heta/          CLI, config, assistants, memory, and KB implementation
docs/              user and technical documentation
tests/             unit tests
pyproject.toml     package metadata and dependencies

Community

If Little Heta is useful to you, please consider giving the project a star. If you run into bugs, rough edges, or missing workflows, open an issue and tell us what happened.

License

Little Heta is released under the MIT License. See LICENSE.
