
CLI utility that summarizes single files into teaching briefs using DSPy


dspyteach – DSPy File Teaching Analyzer


This folder contains a DSPy-powered CLI that analyzes source files (one or many) and produces teaching briefs. Each run captures:

  • an overview of the file and its major sections
  • key teaching points, workflows, and pitfalls highlighted in the material
  • a polished markdown brief suitable for sharing with learners

The implementation mirrors the multi-file tutorial (tutorials/multi-llmtxt_generator) but focuses on per-file inference. The program is split into:

  • dspy_file/signatures.py – DSPy signatures that define inputs/outputs for each step
  • dspy_file/file_analyzer.py – the main DSPy module that orchestrates overview, teaching extraction, and report composition. It now wraps the final report stage with dspy.Refine, pushing for 450–650+ word briefs.
  • dspy_file/file_helpers.py – utilities for loading files and rendering the markdown brief
  • dspy_file/analyze_file_cli.py – command line entry point that configures the local model and prints results. It can walk directories, apply glob filters, and batch-generate briefs.

Requirements

  • Python 3.12+
  • DSPy installed in the environment
  • Ollama running locally with the model hf.co/Mungert/osmosis-mcp-4b-GGUF:Q4_K_M available
  • (Optional) .env file for any additional DSPy configuration; dotenv is loaded automatically

Install the Python dependencies if you have not already. You do not need every command below; pick the ones that match your setup:

# create a project and a Python 3.12 virtual environment
uv init
uv venv -p 3.12
source .venv/bin/activate

# install the dependencies directly, or let uv sync resolve them from pyproject.toml
uv pip install dspy python-dotenv
uv sync

# pull the local model the CLI expects
ollama pull hf.co/Mungert/osmosis-mcp-4b-GGUF:Q4_K_M

# install the package from PyPI, or locally (editable)
uv pip install dspyteach
uv pip install -e .

Usage

Run the CLI to extract a teaching brief from a single file:

dspyteach path/to/your_file

You can also point the CLI at a directory. The tool will recurse by default:

dspyteach path/to/project --glob "**/*.py" --glob "**/*.md"

Use --non-recursive to stay in the top-level directory, add --glob repeatedly to narrow the target set, and pass --raw to print the raw DSPy prediction object instead of the formatted report.
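The file-resolution behaviour described above can be sketched in plain Python. This is a hypothetical helper, not the code in dspy_file/analyze_file_cli.py; it assumes recursion is expressed through `**` glob patterns:

```python
from pathlib import Path

def resolve_files(root, patterns=None, recursive=True):
    """Collect files under `root`, optionally filtered by glob patterns.

    Illustrative sketch of the --glob / --non-recursive options: a single
    file is returned as-is, a directory is searched with each pattern.
    """
    root = Path(root)
    if root.is_file():
        return [root]
    # With no patterns, match everything at the chosen depth.
    patterns = patterns or (["**/*"] if recursive else ["*"])
    seen = set()
    for pattern in patterns:
        # Path.glob encodes recursion in the pattern itself, so for the
        # non-recursive case we strip the "**/" prefix before matching.
        if not recursive:
            pattern = pattern.replace("**/", "")
        for path in root.glob(pattern):
            if path.is_file():
                seen.add(path)
    return sorted(seen)
```

Repeating --glob simply adds more patterns to the list; the set keeps a file that matches several patterns from being analyzed twice.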

Need to double-check files before the model runs? Add --confirm-each (alias --interactive) to prompt before every file, accepting with Enter or skipping with n.

To omit specific subdirectories entirely, pass one or more --exclude-dirs options. Each value can list comma-separated relative paths (for example --exclude-dirs "build/,venv/" --exclude-dirs data/raw). The analyzer ignores any files whose path begins with the provided prefixes.
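The comma-splitting and prefix matching described above amount to a few lines of Python. Function names here are illustrative, not the actual implementation:

```python
from pathlib import Path

def split_exclude_options(values):
    """Expand repeated, comma-separated --exclude-dirs values into one flat list."""
    return [part.strip() for value in values for part in value.split(",") if part.strip()]

def is_excluded(path, root, exclude_prefixes):
    """Return True if `path`, relative to `root`, falls under any excluded prefix."""
    rel = Path(path).relative_to(root).as_posix()
    for raw in exclude_prefixes:
        prefix = raw.strip().rstrip("/")
        # Match the directory itself or anything nested beneath it.
        if rel == prefix or rel.startswith(prefix + "/"):
            return True
    return False
```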

To change where reports land, supply --output-dir /path/to/reports. When omitted, the CLI writes to dspy_file/data/ next to the module. Every run prints the active model name and the resolved output directory before analysis begins, so you can confirm the environment at a glance. For backwards compatibility the installer also registers dspy-file-teaching as an alias.

Each analyzed file is saved under the chosen directory with a slugged name (e.g. src__main.teaching.md). If a file already exists, the CLI appends a numeric suffix to avoid overwriting previous runs.
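The slug-and-suffix scheme can be sketched as follows. The exact suffix format is an assumption (the README only says "a numeric suffix"); the real logic lives in the CLI module:

```python
from pathlib import Path

def report_path(output_dir, source, root):
    """Build a slugged output name like src__main.teaching.md, adding a
    numeric suffix when the file already exists (suffix format assumed)."""
    rel = Path(source).relative_to(root)
    # Join path components with "__" and drop the original extension.
    slug = "__".join(rel.with_suffix("").parts)
    candidate = Path(output_dir) / f"{slug}.teaching.md"
    counter = 1
    while candidate.exists():
        candidate = Path(output_dir) / f"{slug}.teaching-{counter}.md"
        counter += 1
    return candidate
```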

The generated brief is markdown that mirrors the source material:

  • Overview paragraphs for quick orientation
  • Section-by-section bullets capturing the narrative
  • Key concepts, workflows, pitfalls, and references learners should review
  • A dspy.Refine wrapper keeps retrying until the report clears a length reward (the target scales to roughly 50% of the source word count, with min/max clamps), so briefs tend to be substantially longer than a single LM call would produce.
  • If a model cannot honour DSPy's structured-output schema, the CLI prints a Structured output fallback notice and heuristically parses the textual response so you still get usable bullets.
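The length reward mentioned in the bullets can be sketched as a small scoring function. The 450/650 clamps mirror the "450–650+ word" goal stated earlier, but the exact values and the linear scaling are assumptions; the real reward sits inside the dspy.Refine wrapper in file_analyzer.py:

```python
def target_word_count(source_words, ratio=0.5, floor=450, ceiling=650):
    """Target ~50% of the source word count, clamped to [floor, ceiling]."""
    return max(floor, min(int(source_words * ratio), ceiling))

def length_reward(report_text, source_words):
    """Score 1.0 once the report reaches the target length, else scale linearly.

    dspy.Refine retries the report stage until a reward like this clears
    its threshold, which is why briefs tend to come out long.
    """
    words = len(report_text.split())
    return min(words / target_word_count(source_words), 1.0)
```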

Behind the scenes the CLI:

  1. Loads environment variables via python-dotenv.
  2. Configures DSPy with the same local Ollama model used in the tutorial.
  3. Resolves all requested files, reads contents, runs the DSPy FileTeachingAnalyzer module, and prints a human-friendly report for each.
  4. Persists each report to the configured output directory so results are easy to revisit.
  5. Attempts to stop the Ollama model when finished, mirroring the fail-safe logic from the tutorial.
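The five steps above can be sketched as a pure-Python skeleton. The DSPy call is replaced by a stub here, since the real FileTeachingAnalyzer needs a running Ollama server; names and the prediction shape are illustrative:

```python
from pathlib import Path

def analyze_stub(content):
    """Stand-in for FileTeachingAnalyzer; the real module runs DSPy chains."""
    return {"overview": content[:80], "report": f"# Teaching brief\n\n{content[:200]}"}

def run_pipeline(paths, output_dir, analyzer=analyze_stub):
    """Skeleton of the CLI flow: read each file, analyze, and persist the report."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in map(Path, paths):
        content = path.read_text(encoding="utf-8")
        prediction = analyzer(content)
        destination = output_dir / f"{path.stem}.teaching.md"
        destination.write_text(prediction["report"], encoding="utf-8")
        written.append(destination)
    return written
```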

Extending

  • Adjust the TeachingReport signature or add new chains in dspy_file/file_analyzer.py to capture additional teaching metadata.
  • Customize the render logic in dspy_file.file_helpers.render_prediction if you want richer CLI output or structured JSON.
  • Tune TeachingConfig inside file_analyzer.py to raise max_tokens, adjust the Refine word-count reward, or add extra LM kwargs.
  • Add more signatures and module stages to capture additional metadata (e.g., security checks) and wire them into FileAnalyzer.

Packaging & Publishing

The repository is configured for standard Python packaging via pyproject.toml and the setuptools backend. A typical release flow with uv looks like:

# (optional) bump the version before you publish
uv version --bump patch

# build the source distribution and wheel; artifacts land in dist/
uv build --no-sources

# publish to PyPI (or TestPyPI) once you have an API token
UV_PUBLISH_TOKEN=... uv publish

If you want to stage a release first, run uv publish --index testpypi to target the alternate index configured in pyproject.toml.

To install the package from a freshly built artifact:

pip install dist/dspyteach-0.1.1-py3-none-any.whl

Once the project is on PyPI, users can install it directly:

pip install dspyteach

After installation, the dspyteach console script (plus the legacy dspy-file-teaching alias) is available in any environment so you can run analyses outside of this repository or integrate the tool into CI jobs.

CI Publishing

GitHub Actions users can trigger .github/workflows/publish-testpypi.yml to build and push the current checkout to TestPyPI. The workflow:

  • Checks out the repository (ensuring pyproject.toml is present as required by uv publish).
  • Installs uv with Python 3.12.
  • Runs uv build --no-sources from the repository root.
  • Publishes with uv publish --index testpypi dist/* using the TEST_PYPI_TOKEN secret.

See the uv publishing guide for the official note about requiring a checkout when using --index.
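The bullets above translate into a short workflow file. This is a minimal sketch only; action versions and step layout are assumptions, not copied from the repository:

```yaml
name: Publish to TestPyPI
on: workflow_dispatch

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      # uv publish needs pyproject.toml, so the checkout is required
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
        with:
          python-version: "3.12"
      - run: uv build --no-sources
      - run: uv publish --index testpypi
        env:
          UV_PUBLISH_TOKEN: ${{ secrets.TEST_PYPI_TOKEN }}
```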

Troubleshooting

  • If the program cannot connect to Ollama, verify that the server is running on http://localhost:11434 and the requested model has been pulled.
  • When you see ollama command not found, ensure the ollama binary is on your PATH.
  • For encoding errors, the helper already falls back to latin-1, but you can add more fallbacks in file_helpers.read_file_content if needed.
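The fallback described in the last bullet can be sketched as a hypothetical helper (the real one is file_helpers.read_file_content); extending the encoding tuple is how you would add more fallbacks:

```python
from pathlib import Path

def read_file_content(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in order. latin-1 accepts every byte value,
    so keeping it last makes the chain total for any input."""
    data = Path(path).read_bytes()
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Unreachable while latin-1 remains in the list.
    raise ValueError(f"could not decode {path}")
```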
