Offline Development Assistant powered by Ollama and ChromaDB

GangDan (纲担)

LLM-powered knowledge management and teaching assistant with offline support.

GangDan (纲担) — Principled and Accountable.

Overview

GangDan is a local-first, offline programming assistant powered by Ollama and ChromaDB. It combines RAG-based knowledge management with teaching assistance tools, all running entirely on your machine — no cloud APIs required.

Features

Knowledge Management

Unified Literature Search — Search arXiv, bioRxiv, medRxiv, Semantic Scholar, CrossRef, OpenAlex, DBLP, PubMed, and GitHub in one interface. AI-powered query refinement with automatic translation and synonym expansion.
Batch Operations — Multi-select, select-all, batch convert (PDF/HTML/TeX to Markdown with image and formula preservation), batch add to knowledge base. Sort by relevance, date, or title.
Smart Renaming — Downloaded papers automatically renamed to citation format: Author et al. (Year) - Title.pdf
LLM-Generated Wiki — Build structured wiki pages from knowledge base content with cross-KB concept linking. Like Wikipedia for your documents.
Image Gallery — Browse and search images stored in knowledge bases with context and source attribution.
Document Manager — One-click download and indexing of 30+ library docs (Python, Rust, Go, JS, CUDA, Docker, etc.). Upload custom docs, batch operations, GitHub repo search, web search to KB.
Custom Knowledge Base Upload — Upload your own Markdown (.md) and plain text (.txt) documents to create named knowledge bases with automatic indexing.
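The Smart Renaming format above (Author et al. (Year) - Title.pdf) can be illustrated with a small helper. This is a minimal sketch, not GangDan's actual implementation: the function name `citation_filename`, the single-author rule, the character sanitizing, and the title length cap are all assumptions made for illustration.

```python
import re


def citation_filename(authors, year, title, ext=".pdf", max_title_len=120):
    """Build a name like 'Doe et al. (2024) - Title.pdf' from paper metadata.

    Hypothetical helper: the rules below are assumptions, not GangDan's code.
    """
    parts = authors[0].split() if authors else []
    # Assume the family name is the last whitespace-separated token.
    family = parts[-1] if parts else "Unknown"
    author_part = family if len(authors) <= 1 else f"{family} et al."
    # Replace characters that are invalid in file names on common platforms.
    clean = re.sub(r'[\\/:*?"<>|]', " ", title)
    # Collapse runs of whitespace and cap the title length.
    clean = re.sub(r"\s+", " ", clean).strip()[:max_title_len].rstrip()
    return f"{author_part} ({year}) - {clean}{ext}"


if __name__ == "__main__":
    print(citation_filename(["Jane Doe", "John Roe"], 2024, "An Example Title"))
```

With the placeholder metadata above, the helper returns `Doe et al. (2024) - An Example Title.pdf`.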

Teaching Assistant

Question Generator — MCQ, short answer, fill-in-the-blank, true/false from KB content.
Guided Learning — Auto-extract knowledge points, generate interactive lessons with Q&A.
Deep Research — Multi-phase research pipeline: topic decomposition → RAG research → comprehensive reports.
Lecture Maker — Generate structured lecture content from KB materials.
Exam Generator — Create complete exam papers with answer keys from KB content.
Literature Review & Paper Writer — Generate academic reviews and papers from KB content.

Core Features

RAG Chat — Streaming chat with knowledge base retrieval and web search. Strict KB mode ensures grounded answers.
Cross-Lingual Search — Automatically detects query and document languages, enabling cross-lingual RAG (e.g., query English documents in Chinese).
Citation References — Each response automatically includes source document references for verification.
AI Command Assistant — Natural language → shell commands, draggable to terminal.
Built-in Terminal — Run commands with stdout/stderr display directly in the browser.
Conversation Save/Load — JSON export/import for session continuity.
10-Language UI — Chinese, English, Japanese, French, Russian, German, Italian, Spanish, Portuguese, Korean.
Dark/Light Theme — Full theme support with CSS variables.
Offline by Design — Runs entirely on your machine. No cloud APIs required.
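As an illustration of the Conversation Save/Load feature listed above, the sketch below round-trips a chat history through a JSON file. The schema (a `version` field plus a list of `role`/`content` messages) and the function names are assumptions; the README does not specify GangDan's actual export format.

```python
import json
from pathlib import Path


def save_conversation(messages, path):
    """Write a chat history to a JSON file (hypothetical schema)."""
    payload = {"version": 1, "messages": messages}
    # ensure_ascii=False keeps non-English text readable in the file.
    Path(path).write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_conversation(path):
    """Read a chat history back and check that each message is well formed."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    messages = payload.get("messages", [])
    for message in messages:
        if not {"role", "content"} <= set(message):
            raise ValueError("each message needs 'role' and 'content' keys")
    return messages
```

Validating on load means a hand-edited or truncated file fails early instead of producing a broken session.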

Multi-Provider LLM Support

GangDan supports a separated mode: local Ollama for chat/embedding/reranking, with optional external LLM providers for deep research and paper writing.

Provider System

Provider	API Type	Use Case
Ollama (local)	ollama	Chat, Embedding, Reranking
DashScope	OpenAI-compatible	Deep Research, Paper Writing
MiniMax	OpenAI-compatible	Deep Research
Bailian Coding	Anthropic-compatible	Deep Research
OpenAI / DeepSeek / Moonshot	OpenAI-compatible	Deep Research
Custom	OpenAI-compatible	Any compatible API
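One way to picture the separated mode is a lookup from task to provider. The sketch below encodes four rows of the table; the `Provider` class, the task names, and the "first match wins unless a preferred provider is named" rule are all assumptions for illustration, not GangDan's configuration model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    api_type: str
    use_cases: frozenset


# Rows taken from the provider table; task identifiers are hypothetical.
PROVIDERS = (
    Provider("Ollama (local)", "ollama",
             frozenset({"chat", "embedding", "reranking"})),
    Provider("DashScope", "openai-compatible",
             frozenset({"deep_research", "paper_writing"})),
    Provider("MiniMax", "openai-compatible", frozenset({"deep_research"})),
    Provider("Bailian Coding", "anthropic-compatible",
             frozenset({"deep_research"})),
)


def pick_provider(task, preferred=None, providers=PROVIDERS):
    """Return the preferred provider if it supports the task, else the first match."""
    candidates = [p for p in providers if task in p.use_cases]
    if not candidates:
        raise ValueError(f"no provider configured for task {task!r}")
    for provider in candidates:
        if provider.name == preferred:
            return provider
    return candidates[0]
```

Under these assumptions, chat, embedding, and reranking always resolve to local Ollama, while deep research can be pointed at any external provider that lists it.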

CLI

Streaming chat (gangdan chat "question"), interactive REPL (gangdan cli)
KB operations, doc management, config, conversation persistence
AI command generation, shell execution with safety checks
Rich terminal output with formatted tables and syntax highlighting
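The README does not describe how the shell safety checks work. As a rough sketch of one possible approach, the code below tokenizes a command line with the standard `shlex` module and refuses it if any command position holds a blocked program. The blocklist and the function names are hypothetical, and a denylist like this is a convenience guard, not a security boundary.

```python
import os
import shlex

# Hypothetical denylist; a real tool would likely make this configurable.
BLOCKED_COMMANDS = {"rm", "mkfs", "dd", "shutdown", "reboot"}
PUNCTUATION = set("();<>|&")


def command_names(command_line):
    """Return the program name at each command position in the line."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    names, expect_command = [], True
    for token in lexer:
        if all(ch in PUNCTUATION for ch in token):
            # Operators such as ';', '&&' or '|' start a new command.
            # Redirections are treated the same way, which is conservative.
            expect_command = True
        elif expect_command:
            names.append(os.path.basename(token))
            expect_command = False
    return names


def is_command_allowed(command_line):
    """Reject empty, unparsable, or blocklisted command lines."""
    try:
        names = command_names(command_line)
    except ValueError:  # for example, an unterminated quote
        return False
    return bool(names) and not any(n in BLOCKED_COMMANDS for n in names)
```

Checking every command position, not only the first token, catches chained input such as `ls; rm -rf /`.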

Screenshots

Screens shown: Chat, Terminal, Documentation, Settings, Upload Documents, KB Scope Selection.

Strict KB Chat with Citations

The above screenshot demonstrates Strict KB Mode in action: after selecting a specific knowledge base, the system retrieves content only from that KB and automatically appends a reference list at the end of each response, citing the source documents.

Load Conversation	Conversation Loaded

Save your chat as a JSON file and load it anytime to continue the conversation.

RAG Pipeline

The complete pipeline from document ingestion to retrieval:

Document Ingestion — Download from GitHub repositories or upload custom files (.rst, .py, .html, .cpp, .md)
Format Conversion — Automatic conversion to unified Markdown format
Sliding Window Chunking — Fixed-size segmentation with configurable overlap (default: 800 chars, 150 overlap)
Vector Embedding — nomic-embed-text model via Ollama API (768-dim vectors, 500-char truncation)
Vector Storage — ChromaDB with HNSW indexing and cosine similarity
Query Retrieval — Top-K search with distance filtering (threshold 1.5), deduplication, and context construction
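The last step (distance filtering, deduplication, and context construction) can be sketched in plain Python. This is an illustrative sketch only: the hit format (dicts with `text`, `distance`, and `source` keys), the whitespace-insensitive duplicate rule, and the numbered context layout are assumptions; only the 1.5 threshold comes from the description above.

```python
def build_context(hits, max_distance=1.5, top_k=5):
    """Filter, deduplicate, and format retrieved chunks for the prompt.

    `hits` is assumed to be a list of dicts with 'text', 'distance',
    and 'source' keys, where a smaller distance means a closer match.
    """
    seen, kept = set(), []
    for hit in sorted(hits, key=lambda h: h["distance"]):
        if hit["distance"] > max_distance:
            break  # hits are sorted, so everything after this is too far
        # Treat chunks that differ only in case or spacing as duplicates.
        key = " ".join(hit["text"].split()).lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(hit)
        if len(kept) == top_k:
            break
    # Number the chunks so the answer can cite them as [1], [2], ...
    context = "\n\n".join(f"[{i}] {h['text']}" for i, h in enumerate(kept, 1))
    sources = [h["source"] for h in kept]
    return context, sources
```

Returning the source list alongside the context is what makes a reference list at the end of each response possible.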

Chunking Strategy

The sliding window approach ensures contextual continuity across chunk boundaries. Key parameters:

Parameter	Default	Range	Description
CHUNK_SIZE	800 chars	100-2000	Characters per chunk
CHUNK_OVERLAP	150 chars	N/A	Overlap between consecutive chunks
MIN_CHUNK	50 chars	N/A	Minimum chunk length threshold
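With the defaults in the table, each new chunk starts 800 - 150 = 650 characters after the previous one. The sketch below shows fixed-size sliding-window chunking under those defaults. It assumes that chunks shorter than MIN_CHUNK are dropped, which the table implies but does not state, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=800, overlap=150, min_chunk=50):
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # 650 with the defaults
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        # Assumption: fragments below the minimum length are discarded.
        if len(piece) >= min_chunk:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

For a 2000-character document, this yields three chunks of 800, 800, and 700 characters, and the last 150 characters of each chunk repeat at the start of the next.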

Requirements

Python 3.10+
Ollama running locally (default http://localhost:11434)
Chat model (e.g. ollama pull qwen2.5)
Embedding model (e.g. ollama pull nomic-embed-text)

Installation

Method 1: Install from PyPI (Recommended)

pip install gangdan
gangdan                    # Web GUI
gangdan cli                # Interactive CLI
gangdan --port 8080        # Custom port

Method 2: Install from Source

git clone https://github.com/cycleuser/GangDan.git
cd GangDan
pip install -e .
gangdan

Open http://127.0.0.1:5000 in your browser.

Ollama Setup

ollama serve
ollama pull qwen2.5
ollama pull nomic-embed-text
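Once both models are pulled, a quick way to confirm that embedding works is to call the Ollama HTTP API directly. The sketch below uses only the standard library and Ollama's `/api/embeddings` endpoint. The 500-character truncation mirrors the RAG Pipeline description; whether GangDan calls this exact endpoint is an assumption, and the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from Requirements


def build_embedding_request(text, model="nomic-embed-text",
                            base_url=OLLAMA_URL, max_chars=500):
    """Prepare a POST request for Ollama's /api/embeddings endpoint."""
    # Truncate the input as the pipeline description does (500 characters).
    payload = {"model": model, "prompt": text[:max_chars]}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, **kwargs):
    """Send the request and return the embedding vector as a list of floats."""
    request = build_embedding_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["embedding"]
```

With `ollama serve` running, `len(embed("hello"))` should match the 768 dimensions quoted in the RAG Pipeline section.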

Project Structure

GangDan/
├── pyproject.toml
├── README.md / README_CN.md
├── gangdan/
│   ├── __init__.py / __main__.py
│   ├── cli.py / cli_app.py          # CLI entry + REPL
│   ├── app.py                       # Flask backend
│   ├── learning_routes.py           # Learning module blueprint
│   ├── preprint_routes.py           # Preprint search + convert
│   ├── research_routes.py           # Paper search
│   ├── kb_routes.py                 # Custom KB management
│   ├── export_routes.py             # Export API
│   ├── core/                        # Shared modules
│   │   ├── config.py                # Config, i18n, translations
│   │   ├── ollama_client.py         # Ollama API
│   │   ├── chroma_manager.py        # ChromaDB
│   │   ├── vector_db.py             # Multi-backend vector DB
│   │   ├── kb_manager.py            # Custom KB CRUD
│   │   ├── conversation.py          # Chat history
│   │   ├── doc_manager.py           # Doc download/index
│   │   ├── wiki_builder.py          # LLM wiki generation
│   │   ├── preprint_fetcher.py      # Preprint search
│   │   ├── preprint_converter.py    # HTML/TeX/PDF → MD
│   │   ├── pdf_converter.py         # PDF → MD (marker/mineru/docling)
│   │   ├── export_manager.py        # Batch convert/export
│   │   ├── web_searcher.py          # Web search
│   │   └── ...
│   ├── templates/index.html         # Main SPA template
│   └── static/{css,js}/             # Frontend assets
├── tests/                           # Test suite
├── images/                          # Screenshots
├── diagrams/                        # Architecture diagrams (SVG)
└── removed/                         # Deprecated files

Configuration

All settings are managed through the Settings tab: Ollama URL, chat/embedding/reranker models, proxy, context length, output language, vector DB type, LLM provider selection, and API keys.

Testing

pip install pytest pytest-cov
pytest tests/ -v
pytest tests/ --cov=gangdan

Academic Paper

For a detailed empirical study of the RAG pipeline and chunking strategies, see Article.md / Article_CN.md.

License

GPL-3.0-or-later. See LICENSE for details.

Release history

Version	Release date
1.0.40	May 19, 2026
1.0.39	May 19, 2026
1.0.38	May 18, 2026
1.0.37	May 18, 2026
1.0.36	May 13, 2026
1.0.35	May 12, 2026
1.0.34	May 12, 2026
1.0.33	May 11, 2026
1.0.32	May 11, 2026
1.0.31	May 8, 2026
1.0.30	May 8, 2026
1.0.29	May 8, 2026
1.0.28	May 7, 2026 (this version)
1.0.27	May 7, 2026
1.0.26	May 7, 2026
1.0.25	May 2, 2026
1.0.24	May 2, 2026
1.0.23	May 1, 2026
1.0.22	Apr 29, 2026
1.0.21	Apr 26, 2026
1.0.20	Apr 26, 2026
1.0.19	Apr 26, 2026
1.0.18	Apr 26, 2026
1.0.17	Apr 26, 2026
1.0.16	Apr 25, 2026
1.0.15	Apr 25, 2026
1.0.14	Mar 27, 2026
1.0.13	Mar 26, 2026
1.0.12	Mar 20, 2026
1.0.11	Mar 18, 2026
1.0.10	Mar 12, 2026
1.0.9	Mar 12, 2026
1.0.8	Mar 11, 2026
1.0.7	Mar 9, 2026
1.0.6	Mar 7, 2026
1.0.5	Mar 7, 2026
1.0.3	Feb 21, 2026
1.0.2	Feb 21, 2026
1.0.1	Feb 21, 2026
1.0.0	Feb 21, 2026

Download files

Source Distribution

gangdan-1.0.28.tar.gz (407.1 kB view details)

Uploaded May 7, 2026 Source

Built Distribution

gangdan-1.0.28-py3-none-any.whl (408.2 kB view details)

Uploaded May 7, 2026 Python 3

File details

Details for the file gangdan-1.0.28.tar.gz.

File metadata

Download URL: gangdan-1.0.28.tar.gz
Upload date: May 7, 2026
Size: 407.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for gangdan-1.0.28.tar.gz
Algorithm	Hash digest
SHA256	`36abe07f551d92aa90fa2646f9757517cba48fc55d2b380c4b8a15e0f80b1958`
MD5	`3310301944b81ee1b36e2c266ed595cc`
BLAKE2b-256	`acd2c97d70f1e0ce19b9af86592bd4e66311e1d802f4a353fa71de99cf5d71a7`

File details

Details for the file gangdan-1.0.28-py3-none-any.whl.

File metadata

Download URL: gangdan-1.0.28-py3-none-any.whl
Upload date: May 7, 2026
Size: 408.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for gangdan-1.0.28-py3-none-any.whl
Algorithm	Hash digest
SHA256	`63dbdaf772de03534476400cd8a6744a344d6ac1bfa0f65b1a45b769bea85024`
MD5	`023c44c93c98a5c9e6d6150d795a5269`
BLAKE2b-256	`ef4009955df42a99cd4262d4b150e5a380ef122b9ce5b09662f433b79add169e`
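The SHA256 digests listed above can be checked against a downloaded file with the standard `hashlib` module. The sketch below is a minimal example: the function names are hypothetical, and it assumes the files keep their original names in the download directory.

```python
import hashlib
import os

# SHA256 digests copied from the File hashes tables above.
EXPECTED_SHA256 = {
    "gangdan-1.0.28.tar.gz":
        "36abe07f551d92aa90fa2646f9757517cba48fc55d2b380c4b8a15e0f80b1958",
    "gangdan-1.0.28-py3-none-any.whl":
        "63dbdaf772de03534476400cd8a6744a344d6ac1bfa0f65b1a45b769bea85024",
}


def sha256_of_file(path, block_size=1 << 20):
    """Hash a file in blocks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 matches the digest listed for its name."""
    name = os.path.basename(path)
    return sha256_of_file(path) == expected[name].lower()
```

For example, `verify_download("gangdan-1.0.28.tar.gz")` returns True only if the local file matches the published digest.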
