
Dorgy


dorgy is an AI-assisted command line toolkit that keeps growing collections of files tidy. The project already ships ingestion, classification, organization, watch, search, and undo workflows while we continue to flesh out the roadmap captured in SPEC.md. For a deeper dive into how the components fit together, see the architecture overview.

Why Dorgy?

  • Hands-off organization – classify, rename, and relocate files using DSPy-backed language models plus fast heuristic fallbacks.
  • Continuous monitoring – watch directories, batch changes, and export machine-readable summaries for downstream automation.
  • Rich undo and audit history – track every operation in .dorgy/ so reorganizations remain reversible.
  • Per-collection search stores – Chromadb indexes live under <collection>/.dorgy/chroma, keeping semantic search portable without touching global config.
  • Extensible foundation – configuration is declarative, tests are automated via uv, and the roadmap is public.

Installation

PyPI (recommended)

# Using pip
pip install dorgy

# Using uv
uv pip install dorgy

From source

Clone the repository when you plan to contribute or track the bleeding edge:

# Clone the repository
git clone https://github.com/bryaneburr/dorgy.git
cd dorgy

# Sync dependencies (includes dev extras)
uv sync

# Optional: install an editable build
uv pip install -e .

Quickstart

# Inspect available commands
uv run dorgy --help

# Organize a directory in place (dry run first)
uv run dorgy org ./documents --dry-run
uv run dorgy org ./documents

# Monitor a directory and emit JSON batches
uv run dorgy watch ./inbox --json --once

# Undo the latest plan
uv run dorgy undo ./documents --dry-run
uv run dorgy status ./documents --json

CLI Highlights

  • dorgy org – batch ingest files, classify them, and apply structured moves with progress bars, summary/quiet toggles, and JSON payloads.
  • dorgy watch – reuse the same pipeline in a long-running service; guard destructive deletions behind --allow-deletions.
  • dorgy mv – move or rename tracked files while preserving state history.
  • dorgy status / dorgy undo – inspect prior plans, audit history, and restore collections when needed.
  • Per-run search toggles – dorgy org/dorgy watch support --with-search/--without-search to control Chromadb indexing so collections stay portable and automation stays in sync.
  • dorgy search – query collection metadata and Chromadb-backed document content. Use --search for semantic similarity (requires the Chromadb index), --contains for substring matches, --init-store to rebuild .dorgy/chroma without re-running org, and --drop-store to disable indexing while keeping state lookups available. JSON output now includes document_id, optional scores, and snippets for automation consumers.
  • Search-aware moves – dorgy mv keeps Chromadb metadata aligned with renamed files, so semantic search results stay accurate after refactors.
  • Configuration commands – dorgy config view|set|edit expose the full settings model.

All commands accept --json for machine-readable output and share standardized error payloads so automation can script around them.


Search Workflow

Chromadb indexes live beside collection state under <collection>/.dorgy/chroma with a manifest at <collection>/.dorgy/search.json. Dorgy now builds search indexes automatically during dorgy org and watch batches; pass --without-search if you need to skip indexing for a run. Query results with:

  • dorgy search <path> --contains "phrase" – performs substring filtering against stored document text while still honouring tag/category/date filters.
  • dorgy search <path> --search "phrase" – issues a semantic similarity lookup via Chromadb embeddings (requires the local index to be initialised).
  • dorgy search <path> --init-store [--contains ...] – regenerates the Chromadb store from existing files/descriptors without re-running org, emitting notes for missing previews.
  • dorgy search <path> --reindex – drops any existing Chromadb data and rebuilds the store in-place using current collection files.
  • dorgy search <path> --drop-store – removes .dorgy/chroma, disables search in state metadata, and falls back to state-only filtering.

Both human-readable and JSON modes surface persistent document_ids, optional similarity scores, and snippets sourced from the Chromadb payload. When the index is unavailable, the CLI emits actionable errors guiding operators to initialize or rebuild the store.
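As an illustrative sketch of how automation might consume that output: the top-level "results" envelope below is an assumption (the real payload shape may differ), while document_id, score, and snippet are the fields documented above.

```shell
# Hypothetical `dorgy search <path> --search "phrase" --json` payload saved to
# a file; the "results" key is an assumption, while document_id, score, and
# snippet are the documented fields.
cat <<'EOF' > /tmp/dorgy-search.json
{"results": [
  {"document_id": "doc-1", "score": 0.87, "snippet": "quarterly revenue summary"},
  {"document_id": "doc-2", "score": 0.74, "snippet": "annual planning notes"}
]}
EOF

# Extract the persistent document IDs for downstream automation.
python3 -c "import json; print(' '.join(r['document_id'] for r in json.load(open('/tmp/dorgy-search.json'))['results']))"
```

Because document_ids are persistent, a pipeline built this way keeps working across reorganizations and renames.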


Configuration Essentials

  • The primary config file lives at ~/.dorgy/config.yaml; environment variables follow DORGY__SECTION__KEY.
  • processing governs ingestion behaviour (batch sizes, captioning, concurrency, size limits). processing.process_images is enabled by default to capture multimodal captions stored in .dorgy/vision.json.
  • organization controls renaming and conflict strategies (append number, timestamp, skip) and timestamp preservation. Automatic renaming is disabled by default (organization.rename_files: false) so classification runs remain non-destructive unless you opt in.
  • cli toggles defaults for quiet/summary modes, Rich progress indicators, and move conflict handling (legacy configs may still surface cli.search_default_limit, but new installs should use the search block instead).
  • search governs Chromadb-backed indexing (default result limits, whether org/watch auto-maintain the store, and optional embedding function overrides). Stores live beside state.json under <collection>/.dorgy/chroma; indexing is enabled by default, but you can pass --without-search (or set config values) to skip runs, use dorgy search --init-store to rebuild indexes, and dorgy search --drop-store to remove them safely.
  • Chromadb telemetry is disabled by default because the client is instantiated with anonymized_telemetry=False. If you intentionally want to opt into telemetry, override the setting (for example by exporting CHROMADB_TELEMETRY_ENABLED=1 before running commands).
  • Watch services share the organization pipeline and respect processing.watch.allow_deletions unless --allow-deletions is passed.
  • LLM models are configured through the llm block. The default target is openai/gpt-5; provide any LiteLLM-compatible identifier (for example openai/gpt-4o-mini or openrouter/gpt-4o-mini:free) via llm.model, supply llm.api_key/llm.api_base_url when required, and set DORGY_USE_FALLBACKS=1 only when explicitly exercising heuristic classifiers in development.
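Pulling the settings above together, a minimal ~/.dorgy/config.yaml might look like the sketch below. Values are illustrative and the exact key names within each section are assumptions; run dorgy config view for the authoritative schema.

```yaml
processing:
  process_images: true        # default: capture multimodal captions in .dorgy/vision.json
organization:
  rename_files: false         # default: keep classification runs non-destructive
search:
  enabled: true               # assumption: exact key may differ; see `dorgy config view`
llm:
  model: openai/gpt-4o-mini   # any LiteLLM-compatible identifier
  api_key: sk-...
```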

LLM Model Configuration

Configure language models through the llm block using uv run dorgy config set llm.<field> <value> or by editing ~/.dorgy/config.yaml. Supply the exact LiteLLM/DSPy model string (<provider>/<model>[:variant]) via llm.model. The CLI also respects environment variables such as DORGY__LLM__MODEL, DORGY__LLM__API_KEY, and DORGY__LLM__API_BASE_URL.
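For example, the same settings can be supplied through the environment; the values below are placeholders.

```shell
# DORGY__SECTION__KEY maps onto the nested config model, so these exports
# mirror `dorgy config set llm.<field> <value>` (placeholder values).
export DORGY__LLM__MODEL="openai/gpt-4o-mini"
export DORGY__LLM__API_KEY="sk-placeholder"
export DORGY__LLM__API_BASE_URL="http://localhost:11434/v1"
echo "$DORGY__LLM__MODEL"
```

Environment overrides are handy in CI and containers, where editing ~/.dorgy/config.yaml is awkward.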

Common configurations (substitute your own model identifiers and credentials as needed):

  • OpenAI

    uv run dorgy config set llm.model openai/gpt-4o
    uv run dorgy config set llm.api_key "$OPENAI_API_KEY"
    

    YAML equivalent:

    llm:
      model: openai/gpt-4o
      api_key: sk-...
    
  • Anthropic

    uv run dorgy config set llm.model anthropic/claude-3-5-sonnet-20240620
    uv run dorgy config set llm.api_key "$ANTHROPIC_API_KEY"
    
  • xAI (Grok) via OpenRouter

    uv run dorgy config set llm.model openrouter/grok-1
    uv run dorgy config set llm.api_key "$OPENROUTER_API_KEY"
    
  • Google Gemini

    uv run dorgy config set llm.model google/gemini-1.5-pro
    uv run dorgy config set llm.api_key "$GOOGLE_API_KEY"
    
  • Local / Custom Gateway

    uv run dorgy config set llm.model ollama/llama3
    uv run dorgy config set llm.api_base_url http://localhost:11434/v1
    

    When llm.api_base_url is set (e.g., Ollama, LM Studio, vLLM, or self-hosted gateways), dorgy sends requests directly to that endpoint and skips API-key enforcement.


Automation & Release Tasks

We ship an Invoke task collection that wraps the uv toolchain so day-to-day automation stays consistent:

  • uv run invoke sync – install dependencies (dev extras by default).
  • uv run invoke tests / uv run invoke lint / uv run invoke ci – mirror the CI workflow locally.
  • uv run invoke release – bump the version, commit pyproject.toml/uv.lock, rebuild artifacts, publish, and tag.
  • uv run invoke release --dry-run --push-tag – preview the full release plan without modifying anything.
  • uv run invoke tag-version – create (and optionally push) an annotated git tag.

Release Workflow

  1. Ensure the working tree is clean and CI passes locally:
    uv run invoke ci
    
  2. Perform a dry run when validating credentials or reviewing the plan:
    uv run invoke release --dry-run --push-tag --token "$TEST_PYPI_TOKEN" \
        --index-url https://test.pypi.org/legacy/ --skip-existing
    
  3. Publish to PyPI (commits the version bump, pushes the tag when requested):
    export PYPI_TOKEN="pypi-AgEN..."
    uv run invoke release --push-tag --token "$PYPI_TOKEN"
    
    Use --index-url/--skip-existing for TestPyPI dry runs, or --tag-prefix "" if you prefer unprefixed tags.
  4. Update SPEC.md/notes/STATUS.md with release notes, open a PR from feature/release-prep, and merge once GitHub Actions succeeds.

Roadmap

  • SPEC.md tracks implementation phases and current status (Phase 9 – Distribution & Release Prep is underway; Phase 5.8 – Vision-Enriched Classification recently wrapped).
  • notes/STATUS.md logs day-to-day progress, blockers, and next actions.
  • Module-specific coordination details live in src/dorgy/**/AGENTS.md.

Upcoming milestones include the image-only OCR follow-up for vision pipelines, CLI ergonomics work in Phase 6, and continued distribution/release automation.


Contributing

We welcome issues and pull requests while the project matures. A few guidelines keep things predictable:

  • Environment – install dependencies with uv sync and run commands via uv run ....
  • Pre-commit – install hooks (uv run pre-commit install) and run uv run pre-commit run --all-files before pushing.
  • Branching – create feature branches named feature/<scope> and keep them rebased until ready for review.
  • Testing – the default pre-commit stack runs Ruff (lint/format/imports), MyPy, and uv run pytest.
  • Documentation – follow Google-style docstrings and update relevant AGENTS.md files when adding automation-facing behaviours or integrations.
  • Coordination – flag changes that impact the CLI contract, watch automation, or external integrations directly in the associated module AGENTS.md.

For release-specific work, use the branch/review workflow documented above and ensure TestPyPI validation is complete before tagging.


Community & Support

  • File issues and feature requests at github.com/bryaneburr/dorgy/issues.
  • Join the discussion via GitHub Discussions (coming soon) or reach out through issues for contributor onboarding.
  • If you build automations on top of dorgy, let us know—roadmap priorities are community driven.

Authors

  • Codex (ChatGPT-5 based agent) – primary implementation and tactical design across ingestion, classification, organization, and tooling.
  • Bryan E. Burr (@bryaneburr) – supervisor, editor, and maintainer steering project direction and release planning.

License

Released under the MIT License. See LICENSE for details.
