EduRAG

An LLM-based Retrieval-Augmented Generation (RAG) component for education, supporting problem solving, academic literature analysis, and knowledge Q&A.

Features

  • Two RAG Modes: SimpleRAG (simple & efficient) and AgenticRAG (intelligent reasoning)
  • Multi-LLM Support: OpenAI GPT series, Google Gemini, Ollama local models
  • Multiple Document Formats: PDF, DOCX, DOC, TXT, Markdown
  • Teacher Persona Customization: Custom teacher name, subject, grade level, teaching style
  • Multi-turn Conversation: Context-aware continuous Q&A
  • Vector Store Persistence: Saves embeddings to disk, so the knowledge base loads quickly without re-embedding

Installation

pip install edurag

Install optional dependencies:

# Use AgenticRAG (based on LangGraph)
# (the quotes keep shells such as zsh from expanding the square brackets)
pip install "edurag[agentic]"

# Use Google Gemini
pip install "edurag[gemini]"

# Use Ollama local models
pip install "edurag[ollama]"

# Install all optional dependencies
pip install "edurag[all]"

Quick Start

Basic Usage

from edurag import SimpleRAG

# Initialize
rag = SimpleRAG(api_key="your-openai-api-key")

# Load documents
rag.load_documents("textbook.pdf")

# Ask questions
answer = rag.ask("What is the main content of this document?")
print(answer)
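
The example above passes the key as a string literal. A minimal sketch of reading it from an environment variable instead is shown here. The helper name `read_api_key` and the variable name `OPENAI_API_KEY` are assumptions for illustration and are not part of edurag.

```python
import os


def read_api_key(var_name="OPENAI_API_KEY", env=None):
    """Return the key stored in an environment variable, or raise if it is unset.

    `read_api_key` is a hypothetical helper, not an edurag function.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage with the class shown above (requires edurag to be installed):
# rag = SimpleRAG(api_key=read_api_key())
```

This keeps the key out of source control; the rest of the Quick Start works unchanged.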

Custom Teacher Persona

from edurag import SimpleRAG, TeacherProfile

# Create teacher profile
teacher = TeacherProfile(
    name="Mr. Wang",
    subject="High School Physics",
    grade_level="Grade 12",
    teaching_style="Focus on concept understanding, good at explaining abstract principles with real-life examples",
    introduction="20 years of teaching experience, physics competition coach"
)

# Initialize RAG
rag = SimpleRAG(
    api_key="your-openai-api-key",
    teacher_profile=teacher
)

# Load textbooks
rag.load_documents([
    "physics_chapter1.pdf",
    "mechanics_topics.docx"
])

# Ask - AI will respond as Mr. Wang with his teaching style
answer = rag.ask("Why is the acceleration of free fall constant?")

Using Different LLMs

from edurag import SimpleRAG

# OpenAI
rag = SimpleRAG(
    api_key="sk-xxx",
    llm_provider="openai",
    llm_model="gpt-4o"
)

# Google Gemini
rag = SimpleRAG(
    api_key="your-google-key",
    llm_provider="gemini",
    llm_model="gemini-pro"
)

# Ollama local model (no API Key required)
rag = SimpleRAG(
    llm_provider="ollama",
    llm_model="llama3"
)
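
When the provider is chosen at runtime, the three calls above can be folded into one. The sketch below only builds the keyword arguments, using the parameter names and provider strings from the examples above; `build_llm_kwargs` is a hypothetical helper, not an edurag API.

```python
# Provider names are taken from the examples above.
PROVIDERS_NEEDING_KEY = {"openai", "gemini"}
LOCAL_PROVIDERS = {"ollama"}


def build_llm_kwargs(provider, model, api_key=None):
    """Build keyword arguments for SimpleRAG (hypothetical helper)."""
    provider = provider.lower()
    if provider not in PROVIDERS_NEEDING_KEY | LOCAL_PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")
    kwargs = {"llm_provider": provider, "llm_model": model}
    if provider in PROVIDERS_NEEDING_KEY:
        if not api_key:
            raise ValueError(f"{provider} requires an API key")
        kwargs["api_key"] = api_key
    return kwargs


print(build_llm_kwargs("ollama", "llama3"))

# With edurag installed:
# rag = SimpleRAG(**build_llm_kwargs("openai", "gpt-4o", api_key="sk-xxx"))
```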

Persistent Vector Store

from edurag import SimpleRAG

# First time: auto-save vector store
rag = SimpleRAG(
    api_key="sk-xxx",
    vectorstore_path="./my_knowledge_base"
)
rag.load_documents("./documents/")

# Later: load existing store (skip embedding)
rag = SimpleRAG.from_vectorstore(
    vectorstore_path="./my_knowledge_base",
    api_key="sk-xxx"
)
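
A script that runs repeatedly needs to pick between the two branches above. The sketch below assumes the store is saved as a non-empty directory at `vectorstore_path`; the README does not describe the on-disk layout, so treat that check, the helper `has_saved_store`, and the file name `index.bin` as assumptions.

```python
import tempfile
from pathlib import Path


def has_saved_store(vectorstore_path):
    """True if the path is a directory containing at least one entry.

    Assumption: a saved store is a non-empty directory.
    """
    path = Path(vectorstore_path)
    return path.is_dir() and any(path.iterdir())


# Small demonstration with a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    print(has_saved_store(tmp))                 # False: directory is empty
    (Path(tmp) / "index.bin").write_bytes(b"")  # hypothetical file name
    print(has_saved_store(tmp))                 # True: directory has content

# With edurag installed:
# if has_saved_store("./my_knowledge_base"):
#     rag = SimpleRAG.from_vectorstore(vectorstore_path="./my_knowledge_base", api_key="sk-xxx")
# else:
#     rag = SimpleRAG(api_key="sk-xxx", vectorstore_path="./my_knowledge_base")
#     rag.load_documents("./documents/")
```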

AgenticRAG (Intelligent Reasoning)

Suitable for complex problems. The agent decides on its own whether retrieval is needed and can reason over multiple steps.

from edurag import AgenticRAG, TeacherProfile

teacher = TeacherProfile(
    name="Mr. Wang",
    subject="High School Physics",
    grade_level="Grade 12",
    teaching_style="Good at analogies, explains with real-life examples"
)

rag = AgenticRAG(
    api_key="sk-xxx",
    teacher_profile=teacher
)

rag.load_documents(["physics_textbook.pdf"])

# Agent automatically decides whether to retrieve, supports multi-step reasoning
answer = rag.ask("Compare the similarities and differences of Newton's three laws")

# View reasoning process
result = rag.ask_with_steps("Explain the law of conservation of momentum")
print(result["steps"])  # View Agent's reasoning steps
print(result["answer"]) # Final answer

SimpleRAG vs AgenticRAG:

| Feature | SimpleRAG | AgenticRAG |
|---|---|---|
| Retrieval | Always retrieves | Agent decides |
| Reasoning | Single-step | Multi-step |
| Speed | Faster | Slower |
| Cost | Lower | Higher |
| Use Case | Simple Q&A | Complex analysis |
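
The "always retrieves" versus "agent decides" distinction can be illustrated with a toy control flow. This is not edurag's implementation (AgenticRAG is based on LangGraph); the knowledge base, the keyword-matching policy, and all function names below are hypothetical stand-ins.

```python
from typing import List

# Toy knowledge base: topic keyword -> stored passage.
KB = {"momentum": "p = m * v", "newton": "F = m * a"}


def retrieve(query: str) -> List[str]:
    """Return passages whose topic keyword appears in the query."""
    q = query.lower()
    return [text for topic, text in KB.items() if topic in q]


def needs_retrieval(question: str, context: List[str]) -> bool:
    """Toy policy: retrieve only if a known topic is mentioned and nothing was fetched yet."""
    return not context and any(topic in question.lower() for topic in KB)


def simple_flow(question: str) -> dict:
    """Always retrieve once, then answer."""
    return {"steps": ["retrieve", "answer"], "context": retrieve(question)}


def agentic_flow(question: str, max_steps: int = 3) -> dict:
    """Ask the policy before each retrieval, for up to max_steps rounds."""
    steps, context = [], []
    for _ in range(max_steps):
        if not needs_retrieval(question, context):
            break
        steps.append("retrieve")
        context.extend(retrieve(question))
    steps.append("answer")
    return {"steps": steps, "context": context}


print(simple_flow("Hello there")["steps"])        # retrieval happens regardless
print(agentic_flow("Hello there")["steps"])       # retrieval is skipped
print(agentic_flow("Explain momentum")["steps"])  # retrieval runs once
```

In a real agent the policy would be an LLM call rather than keyword matching, which is where the extra latency and cost in the table come from.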

API Reference

SimpleRAG

Main RAG class providing document loading and Q&A functionality.

Initialization Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | LLM API key |
| llm_provider | str | "openai" | LLM provider: openai/gemini/ollama |
| llm_model | str | "gpt-4o" | Model name |
| teacher_profile | TeacherProfile | None | Teacher persona configuration |
| config | EduRAGConfig | None | Full configuration object |

Methods

  • load_documents(sources): Load documents (a file path, a directory, or a list of paths) into the knowledge base
  • ask(question): Ask a question and return the answer
  • ask_with_sources(question): Ask a question and return the answer together with its source documents
  • search(query, top_k): Search the knowledge base directly for relevant documents
  • clear_history(): Clear the conversation history
  • save_vectorstore(path): Save the vector store to the given path
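
To make the idea behind `search(query, top_k)` concrete, here is a self-contained sketch of top-k similarity search. It is not edurag's implementation: it substitutes a toy bag-of-words vector for a real embedding model, and the function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_search(chunks, query, top_k=4):
    """Return the top_k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


chunks = [
    "force equals mass times acceleration",
    "momentum is mass times velocity",
    "the mitochondria is the powerhouse of the cell",
]
print(top_k_search(chunks, "what is momentum", top_k=1))
```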

TeacherProfile

Teacher persona configuration class.

from edurag import TeacherProfile

teacher = TeacherProfile(
    name="Mr. Li",              # Teacher name
    subject="Mathematics",       # Teaching subject
    grade_level="Middle School", # Grade level
    teaching_style="...",        # Teaching style
    introduction="...",          # Introduction (optional)
    language="English"           # Response language
)
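
One plausible way such a persona reaches the model is as a system prompt. The README does not show how edurag builds its prompts, so the sketch below is an assumption: `Persona` is a hypothetical stand-in that mirrors the TeacherProfile fields above, and the prompt wording is illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical stand-in with the same fields as TeacherProfile."""
    name: str
    subject: str
    grade_level: str
    teaching_style: str
    introduction: str = ""
    language: str = "English"


def render_system_prompt(p):
    """Turn the persona fields into a system prompt (illustrative wording)."""
    lines = [
        f"You are {p.name}, a {p.subject} teacher for {p.grade_level} students.",
        f"Teaching style: {p.teaching_style}",
    ]
    if p.introduction:
        lines.append(f"Background: {p.introduction}")
    lines.append(f"Always answer in {p.language}.")
    return "\n".join(lines)


print(render_system_prompt(Persona(
    name="Mr. Li",
    subject="Mathematics",
    grade_level="Middle School",
    teaching_style="Step-by-step worked examples",
)))
```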

EduRAGConfig

Full configuration class for advanced customization.

from edurag import EduRAGConfig

config = EduRAGConfig(
    llm_provider="openai",
    llm_model="gpt-4o",
    api_key="sk-xxx",
    temperature=0.7,           # Generation temperature
    chunk_size=1000,           # Document chunk size
    chunk_overlap=200,         # Chunk overlap
    retrieval_top_k=4,         # Number of retrieval results
    vectorstore_path=None      # Vector store path
)
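
The interaction between `chunk_size` and `chunk_overlap` is easiest to see in a small sliding-window splitter. This sketch counts characters and is an assumption about the general technique; the README does not say which splitter or which unit edurag uses.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']

# With the values from the config above, a 2500-character text yields 3 chunks.
print(len(chunk_text("x" * 2500, chunk_size=1000, chunk_overlap=200)))
# → 3
```

A larger overlap reduces the chance that a sentence is split across chunks, at the cost of more chunks to embed.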

Preset Teacher Templates

from edurag.prompt.teacher_profile import PRESET_TEACHERS

# Available presets
teacher = PRESET_TEACHERS["physics_senior"]    # High School Physics
teacher = PRESET_TEACHERS["math_college"]      # College Mathematics
teacher = PRESET_TEACHERS["english_junior"]    # Middle School English
teacher = PRESET_TEACHERS["chemistry_senior"]  # High School Chemistry

License

MIT
