A tool for summarizing documents and code using AI

DocDog

Overview

DocDog is an AI-powered tool that automatically generates comprehensive README documentation for software projects. By analyzing the project's source code, configuration files, and existing documentation, DocDog can create a well-structured README file covering installation, usage, API documentation, examples, and more.

The tool aims to streamline the documentation process for developers, saving time and effort while ensuring accurate and up-to-date documentation that reflects the project's current state. With DocDog, you can focus on writing code while keeping your project's documentation in sync.

Features

  • Automatic README Generation: DocDog analyzes your project's codebase, configuration files, and existing documentation to generate a comprehensive README file.
  • Structured Documentation: The generated README follows a standardized structure, including sections for installation, usage, API documentation, examples, troubleshooting, and more.
  • Code Analysis: DocDog examines your code to extract relevant information, such as function signatures, docstrings, and code comments, to include in the documentation.
  • Configuration Options: Customize the documentation generation process by specifying configuration options, such as allowed file extensions, output directory, and more.
  • Parallel Processing: Leverage parallel processing for efficient chunking and analysis of large codebases.
  • Template Support: Use built-in or custom templates to control the structure and formatting of the generated README.
  • Reasoning Documentation: Optionally include the reasoning behind the generated content in a separate file (reasoning.md) for transparency and understanding the AI's decision-making process.

Installation

# Clone the repository
git clone https://github.com/duriantaco/docdog.git
cd docdog

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install DocDog
pip install .

Quick Start Guide

To generate a README for your project, navigate to your project's root directory and run:

docdog

This will analyze your project's files and generate a README.md file in the current directory.

Usage

usage: docdog [-h] [-o OUTPUT] [-m MODEL] [--reasoning] [-p PROMPT_TEMPLATE] [--max-iterations MAX_ITERATIONS] [--workers WORKERS] [--cache-size CACHE_SIZE]

DocDog - AI Document & Code Summarizer

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output file path for the generated README (default: README.md)
  -m MODEL, --model MODEL
                        AI model to use for documentation generation (default: gpt-4o-mini)
  --reasoning           Include reasoning behind the generated content
  -p PROMPT_TEMPLATE, --prompt-template PROMPT_TEMPLATE
                        Path to a custom prompt template file
  --max-iterations MAX_ITERATIONS
                        Maximum number of iterations for the AI model (default: 15)
  --workers WORKERS, -w WORKERS
                        Number of worker threads (default: auto)
  --cache-size CACHE_SIZE
                        Size of the LRU cache (default: 128)

API Documentation

MCPTools

The MCPTools class provides a set of tools for interacting with the project's codebase, such as listing files, reading file contents, and batch reading multiple files. The class supports caching for improved performance and parallel processing for batch operations.

__init__(project_root, max_workers=None, cache_size=128)

Initializes the MCPTools instance.

  • project_root (str): The root directory of the project.
  • max_workers (int, optional): The maximum number of worker threads for parallel processing. If None, the number of workers is determined automatically.
  • cache_size (int, optional): The size of the LRU cache for caching file reads and listings. Default is 128.

list_files(directory)

Lists files in the specified directory within the project root, excluding ignored patterns.

  • directory (str): The directory path relative to the project root.
  • Returns: A string containing the list of files, with one file path per line.

read_file(file_path)

Reads the content of a file within the project root.

  • file_path (str): The file path relative to the project root.
  • Returns: A string containing the file content. For Python files, it includes the content, docstrings, and comments.

batch_read_files(file_paths)

Reads the contents of multiple files within the project root in parallel.

  • file_paths (list): A list of file paths relative to the project root.
  • Returns: A JSON string containing a list of dictionaries, where each dictionary represents a file with its content or error message.
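
The batch-read behavior described above (an LRU cache plus a thread pool) can be sketched as follows. This is a minimal illustration of the pattern, not DocDog's actual implementation; the helper names are invented:

```python
import json
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=128)  # mirrors the cache_size option: repeated reads hit the cache
def _read_one(path_str):
    return Path(path_str).read_text(encoding="utf-8")

def batch_read(file_paths, max_workers=4):
    """Read many files in parallel, collecting content or an error per file."""
    def worker(path):
        try:
            return {"path": path, "content": _read_one(path)}
        except OSError as exc:
            return {"path": path, "error": str(exc)}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order, so results line up with file_paths
        return json.dumps(list(pool.map(worker, file_paths)))
```

Returning one dictionary per requested path, with either a content or an error key, lets the caller handle partial failures without aborting the whole batch.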

handle_tool_call(tool_name, tool_input)

Handles tool calls from the AI assistant, dispatching to the appropriate tool based on the tool_name.

  • tool_name (str): The name of the tool to execute.
  • tool_input (dict): The input parameters for the tool.
  • Returns: The result of the tool execution.
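
A method like this is essentially a dispatch table from tool names to callables. A hypothetical sketch of the pattern (make_handler and the tool map are invented for illustration):

```python
def make_handler(tools):
    """Build a handle_tool_call-style dispatcher from a name -> callable map."""
    def handle_tool_call(tool_name, tool_input):
        tool = tools.get(tool_name)
        if tool is None:
            # surface a readable error rather than raising into the AI loop
            return f"Error: unknown tool '{tool_name}'"
        return tool(**tool_input)
    return handle_tool_call
```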

Chunking

The chunking module provides functionality for splitting the project's files into chunks for efficient processing by the AI assistant.

chunk_project(project_root, output_dir="chunks", config=None)

Splits the project's files into smaller chunk files based on an approximate token count, optionally processing files in parallel.

  • project_root (str): The root directory of the project.
  • output_dir (str, optional): The directory to store the chunked files. Default is "chunks".
  • config (dict, optional): A configuration dictionary containing chunking options. If None, default options are used.
  • Returns: A list of file paths for the created chunk files.
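
The chunking step can be sketched roughly as below. The chunk_text and chunk_files helpers are illustrative only: they approximate tokens by whitespace splitting, whereas the real tool presumably counts model tokens:

```python
from pathlib import Path

def chunk_text(text, max_tokens=5000):
    """Split text into pieces of at most max_tokens whitespace-separated tokens."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

def chunk_files(files, output_dir="chunks", max_tokens=5000):
    """Write each file's chunks to output_dir and return the chunk file paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths, n = [], 0
    for f in files:
        for piece in chunk_text(Path(f).read_text(encoding="utf-8"), max_tokens):
            chunk_path = out / f"chunk_{n}.txt"
            chunk_path.write_text(piece, encoding="utf-8")
            paths.append(str(chunk_path))
            n += 1
    return paths
```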

Other Modules

  • sanitize_prompt: A utility function for sanitizing prompts to prevent Unicode obfuscation and prompt injection attacks.
  • templates: Contains template files for the initial prompt, validation prompt, and reasoning instructions.
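
DocDog's actual sanitize_prompt implementation isn't shown on this page, but one common approach to Unicode obfuscation is to strip format-category (Cf) characters such as zero-width spaces and bidi overrides. A sketch of that idea, not the library's code:

```python
import unicodedata

def sanitize(text):
    """Remove Unicode format characters (zero-width spaces, bidi overrides,
    etc.) that can hide injected instructions inside a prompt."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
```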

Configuration

DocDog can be configured using environment variables, command-line arguments, and a configuration file.

Environment Variables

  • ANTHROPIC_API_KEY: Your Anthropic API key. Required for DocDog to function.

Command-line Arguments

  • --output: Specify the output file path for the generated README (default: README.md).
  • --model: Set the AI model to use for documentation generation (default: gpt-4o-mini).
  • --reasoning: Include the reasoning behind the generated content in a separate file (reasoning.md).
  • --prompt-template: Path to a custom prompt template file.
  • --max-iterations: Set the maximum number of iterations for the AI model (default: 15).
  • --workers: Specify the number of worker threads for parallel processing (default: auto-detected).
  • --cache-size: Set the size of the LRU cache for caching file reads and listings (default: 128).

Configuration File

DocDog supports a configuration file (config.json) for additional settings. The default configuration is:

{
    "num_chunks": 5,
    "model": "gpt-4o-mini",
    "max_tokens": 5000,
    "temperature": 0.7,
    "verbose": false,
    "allowed_extensions": [
        ".txt", ".md", ".py", ".pdf", ".sh", ".json", ".yaml", ".ipynb",
        ".js", ".tsx", ".ts", ".jsx", ".html", ".css", ".csv", ".xml",
        ".yml", ".sql", ".java", ".php", ".rb", ".c", ".cpp", ".h",
        ".hpp", ".cs", ".go", ".rs", ".swift", ".kt", ".m", ".pl",
        ".r", ".lua", ".bash", ".zsh", ".ps1", ".psm1", ".psd1",
        ".ps1xml", ".pssc", ".psc1", ".pss1", ".pssm", ".pss"
    ]
}

You can create a config.json file in your project's root directory to override these settings.
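
Assuming the overrides are shallow-merged onto the defaults (an assumption; the docs don't specify the merge behavior), loading could look like this:

```python
import json
from pathlib import Path

# subset of the documented defaults, for illustration
DEFAULTS = {
    "num_chunks": 5,
    "model": "gpt-4o-mini",
    "max_tokens": 5000,
    "temperature": 0.7,
    "verbose": False,
}

def load_config(project_root="."):
    """Start from the defaults and shallow-merge any config.json on top."""
    config = dict(DEFAULTS)
    path = Path(project_root) / "config.json"
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

With this scheme, a config.json containing only {"temperature": 0.2} changes the temperature while every other setting keeps its default.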

Examples and Use Cases

Basic Usage

docdog

This will generate a README.md file in the current directory, analyzing all files in the project with the default configuration.

Specifying an Output File

docdog --output docs/PROJECT_README.md

This will generate the README file as docs/PROJECT_README.md instead of the default README.md.

Including Reasoning

docdog --reasoning

This will generate a reasoning.md file alongside the README.md, explaining the reasoning behind the generated content.

Using a Custom Prompt Template

docdog --prompt-template custom_prompt.txt

This will use the custom_prompt.txt file as the prompt template for the AI model, allowing you to customize the structure and content of the generated README.

Adjusting Configuration

You can create a config.json file in your project's root directory to adjust settings like the number of chunks, AI model, temperature, and allowed file extensions.

{
    "num_chunks": 10,
    "model": "gpt-4",
    "max_tokens": 6000,
    "temperature": 0.8,
    "allowed_extensions": [".py", ".md", ".txt", ".js"]
}

This configuration will create 10 chunks, use the gpt-4 model with a temperature of 0.8, limit the token count to 6000, and only analyze Python, Markdown, text, and JavaScript files.

Troubleshooting/FAQ

Error: ANTHROPIC_API_KEY not found in environment variables

Make sure you have set the ANTHROPIC_API_KEY environment variable with your valid Anthropic API key. You can set it temporarily in your shell session or add it to your shell configuration file (e.g., .bashrc, .zshrc).

export ANTHROPIC_API_KEY=your_api_key_here

Incomplete or Missing Information

If the generated README is missing important information or sections, it's likely due to the tool being unable to find relevant information in your project's files. Double-check that your source code and configuration files are up-to-date and well-documented (e.g., using docstrings, comments, and descriptive variable/function names).

Unsatisfactory README Quality

If the generated README quality is not satisfactory, you can try the following:

  • Increase the --max-iterations value to give the AI model more passes to refine the output.
  • Use a more capable AI model (e.g., gpt-4) by setting the --model option.
  • Adjust the temperature setting in the config.json file to control the randomness and creativity of the generated text.
  • Provide a custom prompt template with more specific instructions tailored to your project.

Contributing

Contributions are welcome! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository. See the CONTRIBUTING.md file for more details.

License

DocDog is released under the Apache 2.0 License.


Generated by DocDog on 2025-03-25

