Zero-Click AEO toolkit: crawl -> ingest -> index -> search, composing markitdown + turbovec + headroom.
aeo-kit
Zero-Click AEO toolkit — turn any document, file, or website into LLM-readable
Markdown + an llms.txt outline, then index and search it locally. A thin,
clean composition of best-in-class open-source parts; it does not reinvent
conversion, compression, or vector indexing.
crawl (site) → ingest (md + llms.txt) → index (turbovec) → search (top-k)
Why
"AI Engine Optimization" (AEO/GEO) starts with getting messy real-world content
into a form a model can use. aeo-kit composes:
- markitdown (MIT) — PDF/DOCX/PPTX/XLSX/HTML/CSV → Markdown
- headroom-ai (Apache-2.0, optional) — compress verbose tool/RAG context 60–95%
- turbovec — air-gapped quantized vector index
- a tiny zero-dependency crawler (no AGPL, no hosted API key)
Everything runs locally. No API keys required.
Install
pip install aeo-kit # core
pip install "aeo-kit[compress]" # + headroom-ai for context compression
CLI
# 1) crawl a site (local interlinked html or a live URL) -> markdown
aeo-crawl ./site/index.html --out build/site
aeo-crawl https://example.com --max-pages 5 --out build/site
# 2) ingest any file / folder / URL -> per-doc md + site-level llms.txt
aeo-ingest ./company_docs --out build/aeo --compress
# 3) local retrieval over the ingested docs (TF-IDF -> turbovec)
aeo-search build/site "consumption tax filing" --k 3
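To make the lexical retrieval step concrete, here is a minimal TF-IDF sketch in pure Python. It is not aeo-kit's implementation and does not use turbovec; the names `tokenize`, `build_index`, and `search` are hypothetical, and the smoothed IDF formula is an assumption chosen for illustration. It only shows the general idea of scoring documents against a query by cosine similarity over TF-IDF weights and returning the top k.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build L2-normalized TF-IDF vectors from a dict of name -> text."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    # Smoothed IDF (an assumption; other weightings work too).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        vec = {t: count * idf[t] for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors[name] = {t: w / norm for t, w in vec.items()}
    return vectors, idf


def search(query, vectors, idf, k=3):
    """Return up to k (name, score) pairs ranked by cosine similarity."""
    tf = Counter(t for t in tokenize(query) if t in idf)
    q = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in q.values())) or 1.0
    scored = []
    for name, vec in vectors.items():
        score = sum((w / norm) * vec.get(t, 0.0) for t, w in q.items())
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties broken by name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]
```

Documents that share no term with the query score zero and are dropped, which is the main limitation of purely lexical retrieval compared with embedding-based search.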
Library
from adapters import markitdown_aeo as mk
conv = mk.convert("page.html")
print(mk.aeo_extract(conv)) # llms.txt seed from the heading structure
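As a rough illustration of what an outline seeded from heading structure could look like, the sketch below walks converted Markdown and turns its headings into a nested list. This is an assumption about the general technique, not the behavior of `aeo_extract`; the function name `outline_from_markdown` is hypothetical, and only ATX-style (`#`) headings are handled.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def outline_from_markdown(markdown):
    """Return a nested outline built from the Markdown headings."""
    lines = []
    in_fence = False
    for raw in markdown.splitlines():
        # Skip fenced code so '# comment' lines are not read as headings.
        if raw.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(raw)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if level == 1:
            lines.append(f"# {title}")
        else:
            # Level 2 is a top-level bullet; deeper levels indent further.
            lines.append("  " * (level - 2) + f"- {title}")
    return "\n".join(lines)
```

For a page with an H1 title, two H2 sections, and one H3 subsection, this yields a title line followed by a two-level bullet list.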
Notes
- Crawler politeness: HTTP mode is same-domain only, respects robots.txt, rate-limits, and is bounded by --max-pages/--max-depth.
- Search scope: aeo-search uses TF-IDF (lexical) retrieval over turbovec, so no embedding model or network access is required. Swap in a sentence-transformer for semantic search; the turbovec layer is unchanged.
- Compression: headroom-ai protects user messages and compresses tool/log/RAG content; clean short prose may compress little, by design.
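The crawler politeness rules in the notes above can be sketched with the standard library alone. This is a hedged illustration, not aeo-crawl's code: the class `PoliteGate`, its parameters, and the user-agent string `aeo-kit-sketch` are hypothetical, and the robots.txt rules are passed in as lines so the example runs without network access.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


class PoliteGate:
    """Decide whether a URL may be fetched, and pace requests."""

    def __init__(self, start_url, robots_lines, max_pages=5, max_depth=2,
                 delay=1.0, agent="aeo-kit-sketch"):
        self.host = urlparse(start_url).netloc
        self.robots = RobotFileParser()
        self.robots.parse(robots_lines)  # robots.txt content, one rule per line
        self.max_pages = max_pages
        self.max_depth = max_depth
        self.delay = delay
        self.agent = agent
        self.fetched = 0
        self._last = None

    def allowed(self, url, depth):
        """Apply page/depth bounds, the same-domain rule, then robots.txt."""
        if self.fetched >= self.max_pages or depth > self.max_depth:
            return False
        if urlparse(url).netloc != self.host:
            return False
        return self.robots.can_fetch(self.agent, url)

    def wait_turn(self):
        """Sleep until at least `delay` seconds have passed since the last fetch."""
        now = time.monotonic()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
        self.fetched += 1
```

A crawl loop would call `allowed(url, depth)` before queuing a link and `wait_turn()` immediately before each request, so the bounds and the rate limit are enforced in one place.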
Develop
pip install -e .
python experiment.py # end-to-end real run -> poc/out/
python audit.py # deterministic checks (exit 0 = all pass)
License
MIT (this toolkit). Bundled dependencies are installed separately and retain
their own licenses — see THIRD_PARTY_NOTICES.md.
File details
Details for the file aeo_kit-0.1.0.tar.gz.
File metadata
- Download URL: aeo_kit-0.1.0.tar.gz
- Upload date:
- Size: 15.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dde6691ae15a6fb9573818057856c3ee8541cf27a76ca3dc1b44efe4a04c03e4 |
| MD5 | 0cbf023afe97cf21148b841af697482c |
| BLAKE2b-256 | 0e56675ac32c15a90f73c484403b0d537a862392039e0ecdd57f8fe008fc179d |
File details
Details for the file aeo_kit-0.1.0-py3-none-any.whl.
File metadata
- Download URL: aeo_kit-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6f19dd75a174b984f6f9a6a5ee416e7a685f3bbf544f1f046495506f343c9345 |
| MD5 | f2b48ea89319ad54051e54acb05d8625 |
| BLAKE2b-256 | e96f394a3d6944c81146caf5116279e0c16875e007b90f616894d2d3cbb5e5e4 |
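A downloaded file can be checked against the SHA256 digests listed above with the standard library. The helper names `sha256_of` and `matches` are hypothetical; this is a minimal sketch that assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("aeo_kit-0.1.0.tar.gz", "dde6691ae15a6fb9573818057856c3ee8541cf27a76ca3dc1b44efe4a04c03e4")` should return True for an intact copy of the source distribution.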