studyengine

Retrieval-augmented study engine. Document parsing, structure-aware chunking, hybrid retrieval, LLM backends, and generated study tools (summaries, quizzes, flashcards, close reading).

Install

pip install studyengine

Requires Python 3.11 or later.

Usage

Point storage at an app-owned directory once, then ingest and retrieve.

import studyengine

# Set the storage root once, before ingesting or retrieving.
studyengine.configure("./storage")  # creates chroma/, embed_cache.sqlite, covers/, markdown/

from studyengine.parser import parse
from studyengine.chunker import chunk_document
from studyengine import vectorstore
from studyengine.retriever import retrieve

# Ingest: parse the file, chunk the parsed document, and index the chunks under a document ID.
doc = parse("paper.pdf")
chunks = chunk_document(doc.document)
vectorstore.add_chunks(doc_id="paper", chunks=chunks)

# Retrieve: restrict the search to the documents listed in doc_ids.
hits = retrieve("what is the main claim?", doc_ids=["paper"])

Modules

Module         Purpose
parser         PDF/DOCX/text parsing via docling
sections       Heading-aware section detection
chunker        Structure-aware, token-budgeted chunking
embedder       Sentence-transformer embeddings with on-disk cache
vectorstore    Chroma-backed storage and hybrid query
retriever      Dense + BM25 retrieval fused with RRF
composer       Prompt assembly within a context budget
summarizer     Per-section summaries
quiz           Question generation and grading
close_reading  Scoped chat, comprehension, and "go deeper" streams
llm            Anthropic and Ollama backends behind a common interface
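
The retriever entry above mentions fusing dense and BM25 results with RRF (reciprocal rank fusion). As a rough illustration of the general technique, not of studyengine's actual code, the sketch below fuses two ranked lists of chunk IDs. The function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) in every list it appears in (rank starts
    at 1), and the per-list scores are summed. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical chunk IDs as ranked by a dense retriever and by BM25.
dense_ranking = ["c3", "c1", "c7"]
bm25_ranking = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that rank well in both lists (c1, c3) rise above chunks that appear in only one, which is the usual reason for fusing by rank rather than by raw scores that are not comparable across retrievers.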

Configuration

configure(root) sets the storage layout. Backend and model choices are read from environment variables (LLM_BACKEND, ANTHROPIC_MODEL, EMBED_MODEL, OLLAMA_BASE_URL, and others defined in studyengine.config).
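
A minimal sketch of setting these variables from Python rather than from the shell. The variable names come from the list above; the values are assumptions chosen for illustration (the accepted values for LLM_BACKEND and EMBED_MODEL are not documented here, and the URL is Ollama's usual local default). Since this page does not say when the variables are read, the sketch sets them before studyengine is imported.

```python
import os

# Illustrative values only; check studyengine.config for what is accepted.
settings = {
    "LLM_BACKEND": "ollama",                      # assumed value
    "OLLAMA_BASE_URL": "http://localhost:11434",  # Ollama's usual local default
    "EMBED_MODEL": "all-MiniLM-L6-v2",            # example sentence-transformers model
}
os.environ.update(settings)

# With the environment in place, import and configure as in the usage example:
# import studyengine
# studyengine.configure("./storage")
```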

Download files

Source Distribution

studyengine-0.1.0.tar.gz (30.9 kB)

Uploaded Source

Built Distribution

studyengine-0.1.0-py3-none-any.whl (40.1 kB)

Uploaded Python 3

File details

Details for the file studyengine-0.1.0.tar.gz.

File metadata

  • Download URL: studyengine-0.1.0.tar.gz
  • Upload date:
  • Size: 30.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for studyengine-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       ab6e3b69538fe38fc6a0211a475f9bbb4eb7799db8ad30b5f31808661a3b8bf2
MD5          98b401f925e809bd3bbe3f4849c32a81
BLAKE2b-256  03010278ac69658a376f486f3ad704994b3b04e46dd8a2e048fcc1c725f29e70

File details

Details for the file studyengine-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: studyengine-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 40.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for studyengine-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       4594b9167d51bc5b2e46237c055b0c96f5af6d2827b3f1c9be336b308e236a36
MD5          7e95a457bb82ec03ba0d070ffd188b72
BLAKE2b-256  b3c93d955b4de87febf83245ccd0759c2b2d6f87b61175f9cac471f323d3e270
