Little Heta first-time initialization CLI
Little Heta
Little Heta is a local CLI knowledge infrastructure for personal documents, agent memory, and document intelligence. It turns PDFs, Office files, images, audio, code, HTML, Markdown, and notes into a stable Markdown wiki, adds semantic vector retrieval, and lets agents reuse distilled knowledge through a memory layer.
Install
Install from PyPI:
pip install little-heta
From a local checkout:
pip install -e .
For development:
pip install -e ".[dev]"
The package installs the heta command:
heta --help
Initialize
Run the first-time setup:
heta init
Have the following ready:
- An LLM API key for one provider: Qwen, ChatGPT, or Gemini.
- Optionally, MinerU access for PDF and Office parsing. Apply or learn more at MinerU.
heta init writes config and workspace data under:
~/.heta/
It also installs the Little Heta agent skill automatically into:
~/.codex/skills/heta
~/.claude/skills/heta
Use with Codex and Claude Code
After heta init, Codex and Claude Code can discover the Little Heta skill
globally. The skill tells the agent when to use:
heta ask "..."
heta query "..."
heta recall "..."
heta remember "..."
You can refresh or reinstall the skill at any time:
heta skill
For other agent frameworks, copy these two files:
~/.heta/skills/heta/SKILL.md
~/.heta/skills/heta/COMMANDS.md
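The copy step above can also be scripted. The following is a minimal sketch, not part of Little Heta: the function name copy_skill and the target directory in the usage note are hypothetical, and only the two source file names come from this README.

```python
import shutil
from pathlib import Path

# The two portable skill files named in this README.
SKILL_FILES = ("SKILL.md", "COMMANDS.md")


def copy_skill(source_dir: Path, target_dir: Path) -> list[Path]:
    """Copy the skill files from source_dir into target_dir.

    Hypothetical helper: creates target_dir if needed and returns the
    paths of the copied files. Raises FileNotFoundError if a source
    file is missing (for example, when heta init has not been run).
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in SKILL_FILES:
        src = source_dir / name
        if not src.is_file():
            raise FileNotFoundError(f"missing skill file: {src}")
        copied.append(Path(shutil.copy2(src, target_dir / name)))
    return copied
```

For example, copy_skill(Path.home() / ".heta" / "skills" / "heta", Path.home() / ".my-agent" / "skills" / "heta") would copy both files, where ~/.my-agent is a placeholder for wherever your agent framework looks for skills.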
What You Get
Most personal knowledge bases eventually become a /raw folder: papers,
slides, screenshots, audio clips, code files, notes, and half-finished drafts
all pile up together. A normal agent can read those files directly, but every
question pays the same cost again: open the index, guess which page matters,
read long pages, and spend tokens rediscovering context it already found before.
Little Heta turns that pile into a persistent agent workspace:
- Wiki foundation: raw files are compiled into stable Markdown pages with numeric page ids, clean [[Wiki Links]], and Git history.
- Vector Wiki: each page is chunked by Markdown structure, so heta query can jump to the right section instead of relying only on sparse index.md summaries.
- Memory-first retrieval: heta ask stores distilled KB insights after expensive lookups, allowing later questions to reuse prior KB understanding instead of repeating the same deep wiki traversal.
- Synchronized memory + KB management: memory stays tied to the evolving wiki. When KB content changes, related memories can be invalidated to prevent stale cached insights from drifting away from the source of truth.
- Agent reuse: larger teams and multi-agent workflows benefit because useful KB discoveries can be reused across later questions, sessions, and agents.
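To make "chunked by Markdown structure" concrete, here is a minimal sketch of heading-based chunking. It is an illustration under stated assumptions, not Little Heta's actual chunker: the Chunk type and chunk_markdown function are hypothetical, and the sketch ignores edge cases such as "#" lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass

# ATX headings only ("# Title" through "###### Title").
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    heading_path: tuple[str, ...]  # headings from the page root to this section
    text: str                      # body text directly under the last heading


def chunk_markdown(page: str) -> list[Chunk]:
    """Split a Markdown page into one chunk per heading section."""
    chunks: list[Chunk] = []
    path: list[str] = []    # current heading titles, outermost first
    levels: list[int] = []  # heading levels matching entries in path
    body: list[str] = []    # lines collected for the current section

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(tuple(path), text))
        body.clear()

    for line in page.splitlines():
        match = HEADING.match(line)
        if match is None:
            body.append(line)
            continue
        flush()  # close the previous section before changing the path
        level = len(match.group(1))
        # Leave any sections at the same or a deeper level.
        while levels and levels[-1] >= level:
            levels.pop()
            path.pop()
        levels.append(level)
        path.append(match.group(2).strip())
    flush()
    return chunks
```

Keeping the heading path with each chunk is one way a retriever could point at a specific section of a page; how Little Heta stores and embeds its chunks is not described here.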
Retrieval quality depends heavily on corpus structure. In corpora where important details are buried deep inside long wiki pages and poorly represented by summaries, index-only wiki navigation can suffer severe retrieval collapse. In our initial stress scenarios, Vector Wiki and memory-backed retrieval improved answer accuracy by roughly 1.25x to more than 5x, with some cases recovering from 0% to 100% accuracy.
Memory-backed reuse used 82.1% fewer tokens than index-only wiki query and answered 1.14x faster even in a small-file setting. This gap is expected to grow in larger or messier workspaces, because index-only wiki navigation scales with the number and length of pages an agent may need to inspect, while memory-backed reuse resolves repeated questions from previously distilled insights. The main extra cost is the first pass that creates the reusable insight.
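The trade-off described above (an expensive first pass, cheap repeats, and invalidation when the wiki changes) can be sketched as a small memory-first store. This is an illustration only: MemoryFirstStore, Insight, and the deep_lookup callback are hypothetical names, and matching questions by normalized exact text is a simplifying assumption rather than how Little Heta matches memories.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Insight:
    answer: str
    source_pages: frozenset[int]  # wiki page ids the answer was distilled from


@dataclass
class MemoryFirstStore:
    # Hypothetical expensive lookup: returns (answer, ids of pages it read).
    deep_lookup: Callable[[str], tuple[str, set[int]]]
    _insights: dict[str, Insight] = field(default_factory=dict)

    def ask(self, question: str) -> tuple[str, bool]:
        """Return (answer, served_from_memory)."""
        key = question.strip().lower()
        cached = self._insights.get(key)
        if cached is not None:
            return cached.answer, True
        # First pass: pay for the deep lookup once, then remember the result.
        answer, pages = self.deep_lookup(question)
        self._insights[key] = Insight(answer, frozenset(pages))
        return answer, False

    def invalidate_page(self, page_id: int) -> int:
        """Drop every insight derived from page_id; return how many were dropped."""
        stale = [k for k, v in self._insights.items() if page_id in v.source_pages]
        for key in stale:
            del self._insights[key]
        return len(stale)
```

Recording which pages each insight came from is what makes targeted invalidation possible: when a page changes, only the insights that depended on it are discarded, and the next matching question pays the deep lookup cost again.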
Core CLI
The main commands are:
- heta init: set up providers, workspace, and agent skills.
- heta status: show provider, MinerU, wiki, memory, and space status.
- heta insert: add files or folders to the knowledge base.
- heta query: ask a read-only question against inserted documents.
- heta ask: answer using memory and the document KB together.
- heta remember: save a fact, decision, or preference.
- heta recall: retrieve saved memory.
- heta clean: remove generated wiki pages and vector DB while keeping raw files.
- heta vector: turn document vector indexing on, off, or show status.
- heta insert-planning: turn smart insert planning on, off, or show status.
- heta mem-show: inspect stored KB memories.
- heta mem-clean: erase memory data.
- heta skill: install or refresh agent skills.
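If you want to call the CLI from a script rather than through an agent skill, a thin wrapper is one option. The sketch below is an assumption-laden illustration: build_heta_command and run_heta are hypothetical helpers, the only invocation form used is the one shown in this README (heta ask "...", heta query "...", heta recall "...", heta remember "..."), and reading the answer from standard output is an assumption.

```python
import shutil
import subprocess

# Subcommands this README shows taking a single quoted text argument.
PROMPT_COMMANDS = {"ask", "query", "recall", "remember"}


def build_heta_command(subcommand: str, text: str) -> list[str]:
    """Build the argument list for, e.g., heta ask "..."."""
    if subcommand not in PROMPT_COMMANDS:
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    # Passing a list (no shell) avoids quoting problems in the text.
    return ["heta", subcommand, text]


def run_heta(subcommand: str, text: str) -> str:
    """Run the command and return its standard output (assumed to hold the reply)."""
    if shutil.which("heta") is None:
        raise RuntimeError("heta is not on PATH; install little-heta first")
    result = subprocess.run(
        build_heta_command(subcommand, text),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Run heta --help to confirm the options your installed version actually accepts before relying on a wrapper like this.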
Supported Files
Little Heta can insert:
- Markdown and text: .md, .markdown, .txt
- PDF and Office: .pdf, .doc, .docx, .ppt, .pptx, .xls, .xlsx
- Images: .png, .jpg, .jpeg, .webp, .gif, .bmp
- Audio and video transcripts: .mp3, .wav, .m4a, .flac, .ogg, .mp4
- Code and config files: .py, .js, .ts, .tsx, .jsx, .java, .go, .rs, .cpp, .c, .h, .hpp, .sh, .sql, .yaml, .yml, .json, .toml
- HTML: .html, .htm
PDF and Office parsing require MinerU. Images and audio/video require a multimodal or transcription-capable LLM provider.
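Before inserting a large folder, it can help to check which files fall into which category and which ones need extra setup. The following sketch encodes the extension lists and requirements stated above; the category names and the classify and requirement functions are hypothetical and are not part of the heta CLI.

```python
from pathlib import Path
from typing import Optional

# Extension lists copied from this README; category keys are illustrative.
SUPPORTED = {
    "markdown_text": {".md", ".markdown", ".txt"},
    "pdf_office": {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"},
    "image": {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"},
    "audio_video": {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4"},
    "code_config": {
        ".py", ".js", ".ts", ".tsx", ".jsx", ".java", ".go", ".rs", ".cpp",
        ".c", ".h", ".hpp", ".sh", ".sql", ".yaml", ".yml", ".json", ".toml",
    },
    "html": {".html", ".htm"},
}

# Extra requirements per category, as stated in this README.
REQUIREMENTS = {
    "pdf_office": "MinerU",
    "image": "multimodal or transcription-capable LLM provider",
    "audio_video": "multimodal or transcription-capable LLM provider",
}


def classify(path: str) -> Optional[str]:
    """Return the category for a file path, or None if the extension is not listed."""
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None


def requirement(path: str) -> Optional[str]:
    """Return the extra requirement for a file, or None if there is none."""
    category = classify(path)
    return REQUIREMENTS.get(category) if category else None
```

Matching on the lowercased final suffix is a design choice of this sketch; whether heta insert treats uppercase extensions the same way is not stated here.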
Workspace
Runtime data lives under:
~/.heta/
Important paths:
~/.heta/heta.yaml config
~/.heta/workspace/kb/raw archived source files
~/.heta/workspace/kb/wiki/index.md wiki entry index
~/.heta/workspace/kb/wiki/pages/ generated Markdown wiki pages
~/.heta/workspace/kb/wiki/log.md wiki operation log
~/.heta/workspace/kb/db/wiki_vectors.sqlite3 local wiki vector database
~/.heta/workspace/mem/mem.sqlite3 local memory database
~/.heta/skills/heta/ portable Little Heta agent skill
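The layout above is easy to check from a script, for example to confirm that heta init has run. This is a minimal sketch built only from the paths listed in this section; the function names and dictionary keys are hypothetical.

```python
from pathlib import Path


def workspace_paths(root: Path) -> dict[str, Path]:
    """Map short illustrative names to the paths listed in this README."""
    kb = root / "workspace" / "kb"
    return {
        "config": root / "heta.yaml",
        "raw": kb / "raw",
        "wiki_index": kb / "wiki" / "index.md",
        "wiki_pages": kb / "wiki" / "pages",
        "wiki_log": kb / "wiki" / "log.md",
        "vector_db": kb / "db" / "wiki_vectors.sqlite3",
        "memory_db": root / "workspace" / "mem" / "mem.sqlite3",
        "skill": root / "skills" / "heta",
    }


def missing_paths(root: Path) -> list[str]:
    """Return the sorted names of expected paths that do not exist under root."""
    return sorted(
        name for name, path in workspace_paths(root).items() if not path.exists()
    )
```

Calling missing_paths(Path.home() / ".heta") returns an empty list only when every listed path exists. Some paths may legitimately be absent, for instance if nothing has been inserted yet or vector indexing is off, so treat the result as a hint rather than an error.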
Development
Run tests:
pytest
Project layout:
src/heta/ CLI, config, assistants, memory, and KB implementation
docs/ user and technical documentation
tests/ unit tests
pyproject.toml package metadata and dependencies
Community
If Little Heta is useful to you, please consider giving the project a star. If you run into bugs, rough edges, or missing workflows, open an issue and tell us what happened.
License
Little Heta is released under the MIT License. See LICENSE.