CLI utility that summarizes single files into teaching briefs using DSPy
Project description
dspyteach – DSPy File Teaching Analyzer
A DSPy-powered CLI that analyzes source files (one or many) and produces teaching briefs.
Each run captures:
- an overview of the file and its major sections
- key teaching points, workflows, and pitfalls highlighted in the material
- a polished markdown brief suitable for sharing with learners
The implementation mirrors the multi-file tutorial (tutorials/multi-llmtxt_generator) but focuses on per-file inference. The program is split into:
- dspy_file/signatures.py – DSPy signatures that define inputs/outputs for each step
- dspy_file/file_analyzer.py – the main DSPy module that orchestrates overview, teaching extraction, and report composition. It now wraps the final report stage with dspy.Refine, pushing for 450–650+ word briefs.
- dspy_file/file_helpers.py – utilities for loading files and rendering the markdown brief
- dspy_file/analyze_file_cli.py – command line entry point that configures the local model and prints results. It can walk directories, apply glob filters, and batch-generate briefs.
Requirements
- Python 3.10 or newer (tested through 3.12)
- DSPy installed in the environment
- A language-model backend. You can choose between:
  - Ollama (default): run it locally with the model hf.co/unsloth/Qwen3-4B-Instruct-2507-GGUF:Q6_K_XL pulled.
  - LM Studio (OpenAI-compatible): start the LM Studio server (lms server start) and download a model such as qwen3-4b-instruct-2507@q6_k_xl.
  - Any other OpenAI-compatible endpoint: point the CLI at a hosted provider by supplying an API base URL and key (the model defaults to gpt-5).
- (Optional) A .env file for DSPy configuration. dotenv loads variables such as DSPYTEACH_PROVIDER, DSPYTEACH_MODEL, DSPYTEACH_API_BASE, DSPYTEACH_API_KEY, and OPENAI_API_KEY.
Example output
[example-data after running a few passes]
Install the Python dependencies if you have not already. Pick the workflow that fits your setup; you do not need every command below:

```shell
uv init
uv venv -p 3.12
source .venv/bin/activate
uv pip install dspy python-dotenv
uv sync
```

Pull the default Ollama model:

```shell
ollama pull hf.co/unsloth/Qwen3-4B-Instruct-2507-GGUF:Q6_K_XL
```

Or install the packaged CLI directly:

```shell
uv pip install dspyteach
```
Configure the language model
The CLI now supports configurable OpenAI-compatible providers in addition to the default Ollama runtime. You can override the backend via CLI options or environment variables:
```shell
# Use LM Studio's OpenAI-compatible server with its default port
dspyteach path/to/project \
  --provider lmstudio \
  --model qwen3-4b-instruct-2507@q6_k_xl \
  --api-base http://localhost:1234/v1
```

```shell
# Environment variable alternative (e.g. inside .env)
export DSPYTEACH_PROVIDER=lmstudio
export DSPYTEACH_MODEL=qwen3-4b-instruct-2507@q6_k_xl
export DSPYTEACH_API_BASE=http://localhost:1234/v1
dspyteach path/to/project
```
LM Studio must expose its local server before you run the CLI. Start it from the Developer tab inside the LM Studio app or via lms server start (see docs/lm-studio-provider.md for details); otherwise the CLI will exit early with a connection warning.
WSL note: When LM Studio runs on Windows but dspyteach runs from WSL, toggle Serve on local network in LM Studio's Developer settings so the API binds to 0.0.0.0. Then point --api-base at the Windows host IP (for example http://<host-ip>:1234/v1) instead of localhost.
For hosted OpenAI-compatible services, set --provider openai, supply --api-base if needed, and pass an API key either through --api-key, DSPYTEACH_API_KEY, or the standard OPENAI_API_KEY. To keep a local Ollama model running after the CLI finishes, add --keep-provider-alive.
Usage
Run the CLI to extract a teaching brief from a single file:
```shell
dspyteach path/to/your_file
```
You can also point the CLI at a directory. The tool will recurse by default:
```shell
dspyteach path/to/project --glob "**/*.py" --glob "**/*.md"
```
Use --non-recursive to stay in the top-level directory, add --glob repeatedly to narrow the target set, and pass --raw to print the raw DSPy prediction object instead of the formatted report.
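A minimal sketch of how directory walking with glob patterns and a non-recursive switch could behave, using only pathlib. This is illustrative, not the CLI's actual resolution code, and the `resolve_files` helper name is an assumption.

```python
from pathlib import Path

def resolve_files(root: str, globs: list[str], recursive: bool = True) -> list[Path]:
    """Collect matching files under root; a plain file is returned as-is."""
    base = Path(root)
    if base.is_file():
        return [base]
    # With no globs, match everything (recursively or top-level only).
    patterns = globs or (["**/*"] if recursive else ["*"])
    matches: set[Path] = set()
    for pattern in patterns:
        if not recursive:
            pattern = pattern.replace("**/", "")  # stay in the top-level directory
        matches.update(p for p in base.glob(pattern) if p.is_file())
    return sorted(matches)
```

Quoting globs on the command line matters for the same reason it does here: the patterns must reach the program unexpanded so pathlib (not the shell) interprets them.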
Command examples
- Personal example:

  ```shell
  dt --provider lmstudio -m refactor ./dspy_file/ -ed "prompts/, data/"
  ```

- Single file (default settings):

  ```shell
  dspyteach docs/example.md
  ```

- Directory with multiple glob filters – quote globs so the shell does not expand them:

  ```shell
  dspyteach ./course-notes --glob "**/*.py" --glob "**/*.md"
  ```

- Skip subdirectories entirely – combine with other flags as needed:

  ```shell
  dspyteach ./repo --non-recursive --glob "*.md"
  ```

- Exclude generated folders – pass one --exclude-dirs per path or provide a comma-separated list with no extra spaces:

  ```shell
  dspyteach ./dspy_file --exclude-dirs prompts/ --exclude-dirs data/
  dspyteach ./dspy_file --exclude-dirs "prompts/,data/"
  ```

  ❌ dspyteach ./dspy_file -ed prompts/, data/ fails with unrecognized arguments: data/ because the second path is not attached to -ed.

- Refactor template generation – switch modes and optionally choose a bundled prompt by name:

  ```shell
  dspyteach ./repo --mode refactor --prompt refactor_prompt_template
  ```

- Custom prompt file – works only in refactor mode; ignored otherwise:

  ```shell
  dspyteach ./repo --mode refactor --prompt ./my_prompts/api-hardening.md
  ```

- Silent raw output for scripting – useful when piping into other tools:

  ```shell
  dspyteach src/module.py --raw > /tmp/module.teaching.txt
  ```

- WSL to LM Studio on Windows – pair the earlier WSL note with a concrete host example:

  ```shell
  dspyteach ./notes \
    --provider lmstudio \
    --api-base http://<windows-host-ip>:1234/v1 \
    --model qwen3-4b-instruct-2507@q6_k_xl
  ```
Need to double-check files before the model runs? Add --confirm-each (alias --interactive) to prompt before every file, accepting with Enter or skipping with n.
To omit specific subdirectories entirely, pass one or more --exclude-dirs options. Each value can list comma-separated relative paths (for example --exclude-dirs "build/,venv/" --exclude-dirs data/raw). The analyzer ignores any files whose path begins with the provided prefixes.
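The prefix-matching behaviour described above can be sketched as follows. This is an illustration of the documented semantics, not the analyzer's actual code, and the helper names are assumptions.

```python
from pathlib import PurePosixPath

def parse_exclude_dirs(values: list[str]) -> list[str]:
    """Split each --exclude-dirs value on commas and drop trailing slashes."""
    prefixes = []
    for value in values:
        for part in value.split(","):
            part = part.strip().rstrip("/")
            if part:
                prefixes.append(part)
    return prefixes

def is_excluded(relative_path: str, prefixes: list[str]) -> bool:
    """True when the path (relative to the scanned root) starts with any prefix."""
    parts = PurePosixPath(relative_path).parts
    return any(
        parts[: len(PurePosixPath(p).parts)] == PurePosixPath(p).parts
        for p in prefixes
    )
```

Comparing path components rather than raw strings avoids false positives such as `build-tools/` matching the `build/` prefix.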
Prefer short flags? The common options include -r (--raw), -m (--mode), -nr (--non-recursive), -g (--glob), -i (--confirm-each), -ed (--exclude-dirs), and -o (--output-dir). Mix and match them as needed.
Refactor files/dirs
Want to scaffold refactor prompt templates instead of teaching briefs? Switch the mode:
dspyteach path/to/project --mode refactor --glob "**/*.md"
Additional Information
The CLI reuses the same file resolution pipeline but feeds each document through the bundled dspy-file_refactor-prompt_template.md instructions (packaged under dspy_file/prompts/), saving .refactor.md files alongside the teaching reports. Teaching briefs remain the default (--mode teach), so existing workflows continue to work unchanged.
When multiple templates live in dspy_file/prompts/, the refactor mode surfaces a picker so you can choose which one to use. You can also point at a specific template explicitly with -p/--prompt, passing either a bundled name (-p refactor_prompt_template) or an absolute path to your own Markdown prompt.
Each run only executes the analyzer for the chosen mode. When you pass --mode refactor the teaching inference pipeline stays idle, and you can alias the command (for example alias dspyrefactor='dspyteach --mode refactor') if you prefer refactor templates to be the default in your shell.
To change where reports land, supply --output-dir /path/to/reports. When omitted the CLI writes to dspy_file/data/ next to the module. Every run prints the active model name and the resolved output directory before analysis begins so you can confirm the environment at a glance. For backwards compatibility the installer also registers dspy-file-teaching as an alias.
Each analyzed file is saved under the chosen directory with a slugged name (e.g. src__main.teaching.md or src__main.refactor.md). If a file already exists, the CLI appends a numeric suffix to avoid overwriting previous runs.
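The naming scheme above can be sketched like this. It is a hypothetical reconstruction of the described behaviour (the real helpers live in dspy_file/file_helpers.py and may differ in detail, e.g. in where the numeric suffix is placed):

```python
from pathlib import Path

def slugged_name(source: Path, suffix: str = "teaching") -> str:
    """src/main.py -> src__main.teaching.md"""
    stem = "__".join(source.with_suffix("").parts)
    return f"{stem}.{suffix}.md"

def unique_path(directory: Path, name: str) -> Path:
    """Append -1, -2, ... before the extension until the name is free."""
    candidate = directory / name
    counter = 1
    while candidate.exists():
        base, ext = name.split(".", 1)
        candidate = directory / f"{base}-{counter}.{ext}"
        counter += 1
    return candidate
```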
The generated brief is markdown that mirrors the source material:
- Overview paragraphs for quick orientation
- Section-by-section bullets capturing the narrative
- Key concepts, workflows, pitfalls, and references learners should review
- A dspy.Refine wrapper keeps retrying until the report clears a length reward (defaults scale to ~50% of the source word count, with min/max clamps), so the content tends to be substantially longer than a single LM call.
- If a model cannot honour DSPy's structured-output schema, the CLI prints a "Structured output fallback" notice and heuristically parses the textual response so you still get usable bullets.
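A word-count reward in the spirit of the dspy.Refine setup described above might look like the sketch below. The ratio and clamp values mirror the README's description, but the exact thresholds and scoring in dspyteach may differ.

```python
def length_reward(source_text: str, report_text: str,
                  ratio: float = 0.5, floor: int = 450, ceiling: int = 650) -> float:
    """Return 1.0 once the report clears the target length, scaled linearly below it."""
    # Target ~ratio of the source word count, clamped to [floor, ceiling].
    target = len(source_text.split()) * ratio
    target = max(floor, min(ceiling, target))
    words = len(report_text.split())
    return min(1.0, words / target)
```

dspy.Refine calls a reward function like this after each attempt and keeps the best-scoring output, which is why briefs tend to land well above a single unconstrained generation.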
Behind the scenes the CLI:
- Loads environment variables via python-dotenv.
- Configures DSPy with the provider selected via CLI or environment variables (Ollama by default).
- Resolves all requested files, reads contents, runs the DSPy FileTeachingAnalyzer module, and prints a human-friendly report for each.
- Persists each report to the configured output directory so results are easy to revisit.
- Stops the Ollama model when appropriate so local resources are returned to the pool.
Extending
- Adjust the TeachingReport signature or add new chains in dspy_file/file_analyzer.py to capture additional teaching metadata.
- Customize the render logic in dspy_file.file_helpers.render_prediction if you want richer CLI output or structured JSON.
- Tune TeachingConfig inside file_analyzer.py to raise max_tokens, adjust the Refine word-count reward, or add extra LM kwargs.
- Add more signatures and module stages to capture additional metadata (e.g., security checks) and wire them into FileAnalyzer.
Releasing
Maintainer release steps live in docs/RELEASING.md.
Troubleshooting
- If the program cannot connect to Ollama, verify that the server is running on http://localhost:11434 and the requested model has been pulled.
- When you see ollama command not found, ensure the ollama binary is on your PATH.
- For encoding errors, the helper already falls back to latin-1, but you can add more fallbacks in file_helpers.read_file_content if needed.
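An encoding-fallback reader in the style of the troubleshooting note above could be sketched as follows; this is illustrative, and the real file_helpers.read_file_content may differ. Extending the tuple is how you would add more fallbacks.

```python
from pathlib import Path

def read_file_content(path: Path,
                      encodings: tuple[str, ...] = ("utf-8", "latin-1")) -> str:
    """Try each encoding in order and return the first successful decode."""
    for encoding in encodings:
        try:
            return path.read_text(encoding=encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError(f"Could not decode {path} with any of {encodings}")
```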
Project details
File details
Details for the file dspyteach-0.1.4b1.tar.gz.
File metadata
- Download URL: dspyteach-0.1.4b1.tar.gz
- Upload date:
- Size: 26.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64ac0ff1270912ecde4907a6c623bcc3909454ef7a635444a201cf4f50320b6d |
| MD5 | b7421c66649485a86c15e650dccb967e |
| BLAKE2b-256 | 3b59f73382d803bc3223f69fcde843b739a0b4cc3bddbb61c0488315f04f6bfd |
File details
Details for the file dspyteach-0.1.4b1-py3-none-any.whl.
File metadata
- Download URL: dspyteach-0.1.4b1-py3-none-any.whl
- Upload date:
- Size: 24.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd79a027bb3bf6ea2ef34580e61e64782fa966c83820f4a107186b1df70d266d |
| MD5 | e5944d3e00c1765dac7abfc0c63130a5 |
| BLAKE2b-256 | fc536b18a80bd2cfa7dc5b4582f52ad2f6f82af6800e72faa282249a2e2ff19f |