This project helps you create documentation for your projects.

Project description

Executive Navigation Tree

Installation is handled by the install.ps1 and install.sh scripts. On Windows, run irm https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex in PowerShell; on Linux‑based systems, run curl -sSL https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash. To use the GitHub Action, you must also add a secret variable named GROCK_API_KEY to your repository, containing your API key from the Grock docs (https://grockdocs.com).
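A minimal GitHub Actions workflow for this setup might look like the sketch below. Only the install URL and the GROCK_API_KEY secret name come from the instructions above; the workflow layout and the final run command are assumptions and may differ from the real Action.

```yaml
# Hypothetical workflow sketch – job names and the final command are assumptions.
name: generate-docs
on: [push]

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install ADG
        run: curl -sSL https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash
      - name: Generate documentation
        env:
          GROCK_API_KEY: ${{ secrets.GROCK_API_KEY }}   # repository secret
        run: adg   # actual CLI entry point may differ
```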

How to use the Manager class

The Manager class is instantiated in the autodocgenerator/auto_runner/run_file.py script with the following parameters:

| Parameter | Type | Description (inferred from usage) |
| --- | --- | --- |
| project_path | str | Path to the root of the project you want to document. |
| project_settings | ProjectSettings | Holds project‑specific metadata (name, additional info, etc.). |
| sync_model | GPTModel | Synchronous GPT model used for generating documentation. |
| async_model | AsyncGPTModel | Asynchronous GPT model (optional, can be used for async generation). |
| ignore_files | list[str] | List of file patterns that should be ignored during processing. |
| progress_bar | BaseProgress (e.g., ConsoleGtiHubProgress) | Progress‑bar implementation that displays generation status. |
| language | str | Language code for the generated docs (e.g., "en"). |

Full example of usage

# example_usage.py
from autodocgenerator.manage import Manager
from autodocgenerator.engine.models.gpt_model import GPTModel, AsyncGPTModel
from autodocgenerator.preprocessor.settings import ProjectSettings
from autodocgenerator.ui.progress_base import ConsoleGtiHubProgress

# 1. Prepare required objects
project_path = "."                     # current directory (or any other path)
project_settings = ProjectSettings("MyProject")  # initialise with project name
# (add any additional info to `project_settings` if needed)

# 2. Initialise GPT models (API_KEY is defined in autodocgenerator.engine.config.config)
from autodocgenerator.engine.config.config import API_KEY
sync_model = GPTModel(API_KEY)
async_model = AsyncGPTModel(API_KEY)

# 3. Define ignore patterns (can be extended)
ignore_list = [
    "*.pyo", "*.pyd", "*.pdb", "*.pkl", "*.log", "*.sqlite3", "*.db",
    "data", "venv", "env", ".venv", ".env", ".vscode", ".idea", "*.iml",
    ".gitignore", ".ruff_cache", ".auto_doc_cache", "*.pyc", "__pycache__",
    ".git", ".coverage", "htmlcov", "migrations", "*.md", "static",
    "staticfiles", ".mypy_cache"
]

# 4. Choose a progress bar implementation
progress = ConsoleGtiHubProgress()

# 5. Create the Manager instance
manager = Manager(
    project_path,
    project_settings,
    sync_model=sync_model,
    async_model=async_model,
    ignore_files=ignore_list,
    progress_bar=progress,
    language="en"
)

# 6. Run the documentation generation workflow
manager.generate_code_file()
manager.generate_global_info_file(use_async=False, max_symbols=8000)
manager.generete_doc_parts(use_async=False, max_symbols=5000)

# 7. Generate the final documentation using factories
# (doc_factory and intro_factory are obtained from autodocgenerator.auto_runner.config_reader)
from autodocgenerator.auto_runner.config_reader import read_config, Config
with open("autodocconfig.yml", "r", encoding="utf-8") as f:
    cfg_data = f.read()
cfg: Config = read_config(cfg_data)
doc_factory, intro_factory = cfg.get_doc_factory()

manager.factory_generate_doc(doc_factory)
manager.factory_generate_doc(intro_factory)

# 8. Retrieve the generated documentation
output = manager.read_file_by_file_key("output_doc")
print(output)   # or write it to README.md, etc.

Key points

  • All required parameters are supplied when constructing Manager.
  • After creation, invoke the sequence of methods shown above to generate code snippets, global info, documentation parts, and finally assemble the full document.
  • The example mirrors the exact flow used in autodocgenerator/auto_runner/run_file.py.

**autodocconfig.yml – available options**

The file is a plain YAML document that can contain the following top‑level keys, which are read by autodocgenerator.auto_runner.config_reader.read_config:

| Key | Type | Description | Example |
| --- | --- | --- | --- |
| ignore_files | list of strings | File‑name patterns that the generator will skip while scanning the project. If omitted, the default list from Config.__init__ is used. | `ignore_files: ["*.log", "venv", ".git"]` |
| language | string | Language code for the generated documentation (default: "en"). | `language: "ru"` |
| project_name | string | Name of the project – used in the intro section and for overall context. | `project_name: "My Awesome Library"` |
| project_additional_info | mapping (key → string) | Arbitrary key‑value pairs that are added to ProjectSettings. They can be referenced by custom modules. | `author: "John Doe"`, `license: "MIT"` |
| custom_descriptions | list of strings | Each string becomes a CustomModule that will be processed by the documentation engine. Use them to request specific sections, explanations, or any custom text. | `- "explain how to install the library"` |

Minimal example

project_name: "My Project"
language: "en"

project_additional_info:
  description: "A short summary of the project."
  version: "0.1.0"

custom_descriptions:
  - "Explain the installation steps."
  - "Show an example of using the Manager class."

# optional, overrides the built‑in ignore list
ignore_files:
  - "*.tmp"
  - "build"

Only the keys you need have to be present; any missing key falls back to the default defined in Config.
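The fallback behaviour can be pictured with a small sketch. The real parsing lives in autodocgenerator.auto_runner.config_reader.read_config; the DEFAULTS mapping below is an illustrative assumption, not the library's actual default list.

```python
# Illustrative sketch of how missing config keys fall back to defaults.
# DEFAULTS is an assumption – the real values live in Config.__init__.
DEFAULTS = {
    "language": "en",
    "project_name": "",
    "ignore_files": [".git", "venv"],
}

def apply_defaults(parsed: dict) -> dict:
    """Return a config dict where every absent key takes its default value."""
    return {key: parsed.get(key, default) for key, default in DEFAULTS.items()}

cfg = apply_defaults({"project_name": "My Project"})
# cfg["language"] falls back to the default "en"
```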

Package Initializer (autodocgenerator/__init__.py)

Responsibility
The __init__.py file marks the autodocgenerator directory as a Python package and executes a single side‑effect: it prints the literal string ADG to standard output whenever the package is imported.

Interactions

  • Importers – Any module that performs import autodocgenerator (directly or indirectly via sub‑modules such as autodocgenerator.auto_runner.run_file) will trigger the print.
  • No external dependencies – The file contains no imports, configuration reads, or runtime logic, so it does not rely on or affect other components (engine, factory, UI, etc.).

Key Logic Flow

  1. Python evaluates the file during package import.
  2. Executes print("ADG").
  3. Returns control to the importer; the package’s sub‑modules become available.

Assumptions & Side Effects

  • Assumption – The package is imported in a context where writing to stdout is harmless (e.g., CLI tools, CI runs).
  • Side Effect – Unconditional console output may clutter logs or interfere with programs that capture stdout; it does not affect functional behavior.

Typical Usage

import autodocgenerator   # Triggers the "ADG" banner
from autodocgenerator.auto_runner import run_file
# Normal operation proceeds after the banner is printed

Recommendation
For library consumers, consider removing the print statement or guarding it behind a debug flag to avoid unwanted output in production environments.
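One way to guard the banner, as a sketch: gate the print behind an opt-in environment variable. The variable name ADG_DEBUG is an assumption, not something the package currently reads.

```python
# Hypothetical replacement for the unconditional print in __init__.py:
# emit the banner only when an opt-in environment variable is set.
import os

def print_banner() -> None:
    """Print 'ADG' only when explicitly enabled via ADG_DEBUG (assumed name)."""
    if os.getenv("ADG_DEBUG"):
        print("ADG")

print_banner()   # silent unless ADG_DEBUG is set in the environment
```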

autodocgenerator.auto_runner.config_reader – Configuration Loader

Responsibility
Parses a YAML‑style configuration file and builds a Config object that centralises all runtime settings required by the auto‑doc generation pipeline.

Interactions

  • Consumed by autodocgenerator.auto_runner.run_file (via read_config).
  • Supplies objects to the factory (DocFactory) and pre‑processor (ProjectSettings).
  • Does not touch the engine, UI or external services.

Key API

| Member | Purpose |
| --- | --- |
| Config | Holds mutable defaults: ignore_files, language, project_name, project_additional_info, custom_modules. |
| Config.set_language / set_project_name | Fluent setters used while building the config. |
| Config.add_ignore_file | Extends the default ignore pattern list. |
| Config.add_custom_module | Registers a CustomModule (user‑provided description). |
| Config.get_project_settings() | Returns a ProjectSettings instance populated with the project name and any extra key/value info. |
| Config.get_doc_factory() | Creates two DocFactory instances – one for custom modules, another for built‑in intro modules (IntroLinks, optionally IntroText). |
| read_config(file_data: str) -> Config | Core parser: yaml.safe_load → fills Config fields, handling optional keys (ignore_files, language, project_name, project_additional_info, custom_descriptions). |

Assumptions & Side Effects

  • Input YAML is well‑formed; missing keys fall back to sensible defaults (e.g., "en" for language, empty project name).
  • No I/O or network calls – pure data transformation.

autodocgenerator.auto_runner.run_file – Entry Point for Documentation Generation

Responsibility
Orchestrates the full documentation generation flow: loads configuration, instantiates models, creates a Manager, runs all generation steps, and returns the final assembled document.

Interactions

  • Imports Config and read_config from the sibling config_reader.
  • Instantiates GPTModel / AsyncGPTModel (engine).
  • Builds a Manager (core orchestration) with a ConsoleGtiHubProgress UI component.
  • Calls manager methods that rely on factories (DocFactory) and settings (ProjectSettings).

Key Function

def gen_doc(project_settings, ignore_list, project_path,
            doc_factory, intro_factory) -> str:
    """
    Executes the complete doc‑generation pipeline and returns the final
    markdown/text output.
    """
  • Creates sync/async LLM wrappers using the global API_KEY.
  • Constructs Manager with all required collaborators.
  • Sequentially triggers:
    1. generate_code_file()
    2. generate_global_info_file(use_async=False, max_symbols=8000)
    3. generete_doc_parts(use_async=False, max_symbols=5000)
    4. factory_generate_doc for both the custom and intro factories.
  • Returns manager.read_file_by_file_key("output_doc").

CLI Guard
When run as a script (python -m autodocgenerator.auto_runner.run_file) it reads autodocconfig.yml, builds the config, and prints the generated document.

Assumptions & Side Effects

  • API_KEY is available and valid; otherwise LLM calls will fail.
  • The progress UI writes to stdout/stderr, which is acceptable for interactive runs.
  • All file I/O is limited to the project directory (project_path).

autodocgenerator.engine.__init__

Responsibility
Package marker; currently empty, serving only to make autodocgenerator.engine an importable Python package. No runtime behavior is defined here.

Configuration constants & prompt templates

Responsibility – Provides the static textual prompts that drive the LLM agents used throughout the AutoDoc system.
Interaction – All higher‑level modules import these strings (e.g., BASE_SYSTEM_TEXT, BASE_PART_COMPLITE_TEXT, BASE_INTRODACTION_CREATE_TEXT, BASE_INTRO_CREATE, BASE_SETTINGS_PROMPT) and feed them to the language model when constructing system or user messages.
Key data – Multi‑line strings describing how snippets are analyzed, how documentation parts are generated, how navigation trees are built, and how project settings are memorised.


Environment loading & API key validation

import os
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv("API_KEY")
if API_KEY is None:
    raise Exception("API_KEY is not set in environment variables.")

Loads .env files, extracts API_KEY, and aborts early if missing.
Assumption – The runtime environment supplies a valid OpenAI (or compatible) API key; otherwise any LLM call will fail. No side effects besides environment variable access.


Supported model identifiers

MODELS_NAME = [
    "openai/gpt-oss-120b",
    "llama-3.3-70b-versatile",
    "openai/gpt-oss-safeguard-20b",
]

A hard‑coded list of model names the engine may select for generation. Other components (e.g., engine.models) reference this list to instantiate the appropriate LLM wrapper.


get_BASE_COMPRESS_TEXT(start, power) – Prompt generator for large snippets

Purpose – Returns a formatted instruction prompting the model to summarise a large code fragment and provide a strict usage example.
Parameters

  • start (int): Approximate maximum character count of the incoming snippet.
  • power (int): Divisor controlling the allowed summary length (~ start/power chars).

Returned value – A multi‑line string containing three sections: analysis request, length‑limited summary, and a precise Python usage example template.

Interaction – Called by the compression stage of the pipeline (e.g., when a file exceeds token limits) to produce a custom system prompt for the LLM.

Assumptions & side effects – Pure function; no I/O, only string interpolation.
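The shape of such a prompt generator can be sketched as below. The exact wording of the real template in autodocgenerator.engine.config.config is not reproduced here; the text is an assumption that only mirrors the described structure (analysis request, length‑limited summary, usage example).

```python
# Sketch of a prompt generator in the spirit of get_BASE_COMPRESS_TEXT.
# The wording is an assumption; only the start/power mechanics follow the docs.
def get_compress_prompt(start: int, power: int) -> str:
    """Build a system prompt asking for a summary of roughly start/power chars."""
    limit = start // power   # allowed summary length
    return (
        f"Analyse the following code snippet (up to {start} characters).\n"
        f"Summarise it in at most {limit} characters.\n"
        "Finish with a short, copy-pasteable Python usage example."
    )

prompt = get_compress_prompt(8000, 4)
```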


Exceptions – ModelExhaustedException

ModelExhaustedException is raised when the shuffled list regen_models_name becomes empty, i.e. no fallback model is left. It inherits directly from Exception and carries a short doc‑string; no side‑effects.

Model hierarchy (model.py)

  • History – stores the conversation as a list of {role, content} dicts. The constructor injects the system prompt (BASE_SYSTEM_TEXT) unless None.
  • ParentModel – base for both sync and async models. It keeps the API key, a History instance, a shuffled copy of MODELS_NAME (regen_models_name) and an index (current_model_index) used for round‑robin fallback.
  • Model (sync) – implements:
    • generate_answer – abstract placeholder overridden in concrete models.
    • get_answer_without_history – forwards a raw message list to generate_answer.
    • get_answer – records the user prompt, calls generate_answer, records the assistant reply, and returns it.
  • AsyncModel – async counterparts of the above methods.

Assumptions: MODELS_NAME is a non‑empty list; History can be shared safely because it contains only in‑memory data.

Concrete GPT models (gpt_model.py)

  • AsyncGPTModel (AsyncModel subclass) – creates an AsyncGroq client.
    • generate_answer builds the message payload from history or a raw prompt, then loops over regen_models_name attempting client.chat.completions.create. On failure it prints the exception, advances current_model_index, and retries until a response is obtained or the list is exhausted (raising ModelExhaustedException). Returns the first choice’s content.
  • GPTModel – same logic but synchronous, using Groq.

Interaction: factories inject a Model (or AsyncModel) instance into modules; modules call model.get_answer… which internally uses the above generation logic.
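The fallback loop can be sketched in isolation. ModelExhaustedException mirrors the exception described above; call_llm is a hypothetical stand‑in for client.chat.completions.create, not the Groq client's real signature.

```python
# Minimal sketch of the round-robin model fallback described above.
import random

class ModelExhaustedException(Exception):
    """Raised when every fallback model has failed."""

def generate_with_fallback(models: list[str], call_llm) -> str:
    regen = models.copy()
    random.shuffle(regen)                 # shuffled copy, as in ParentModel
    for name in regen:
        try:
            return call_llm(name)         # first successful model wins
        except Exception as exc:
            print(f"{name} failed: {exc}")  # log and advance to the next model
    raise ModelExhaustedException("no fallback model left")
```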

Documentation factory core (base_factory.py & general_modules.py)

  • BaseModule (ABC) – contract for pluggable documentation generators; must implement generate(info, model).

  • DocFactory – aggregates BaseModule instances. generate_doc(info, model, progress):

    1. Starts a sub‑task in BaseProgress.
    2. Calls each module’s generate, concatenates results with double newlines.
    3. Updates progress after every module and removes the sub‑task. Returns the final markdown string.
  • CustomModule (in general_modules.py) – a concrete BaseModule that:

    • Splits the mixed code (info["code_mix"]) into ≤ 7000‑symbol chunks via split_data.
    • Calls generete_custom_discription (typo intentional) with the chunks, the supplied model, a custom description string, and the target language.
    • Returns the generated text.

Side‑effects – only console output on errors; all other state changes are confined to the History object and progress tracker.
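The module/factory contract can be condensed into a runnable sketch. Progress handling is omitted for brevity, and the two example modules are hypothetical, not shipped with the package.

```python
# Sketch of the BaseModule / DocFactory contract described above.
from abc import ABC, abstractmethod

class BaseModule(ABC):
    @abstractmethod
    def generate(self, info: dict, model) -> str: ...

class TitleModule(BaseModule):        # hypothetical example module
    def generate(self, info, model):
        return f"# {info['name']}"

class SummaryModule(BaseModule):      # hypothetical example module
    def generate(self, info, model):
        return info["summary"]

class DocFactory:
    def __init__(self, *modules: BaseModule):
        self.modules = modules

    def generate_doc(self, info: dict, model=None) -> str:
        # each module's output is concatenated with double newlines
        return "\n\n".join(m.generate(info, model) for m in self.modules)

doc = DocFactory(TitleModule(), SummaryModule()).generate_doc(
    {"name": "ADG", "summary": "Generates docs."}
)
```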

Intro Modules – Generating the Documentation Introduction

The intro package supplies the final step of the documentation pipeline – creating the opening section that appears at the top of each generated page. It consists of two concrete BaseModule implementations that are invoked by the project‑and‑progress orchestrator after the core content has been collected.

IntroLinks – Link extraction & model‑driven phrasing

class IntroLinks(BaseModule):
    def generate(self, info: dict, model: Model):
        links = get_all_html_links(info.get("full_data"))
        print(links)                     # debugging aid
        intro_links = get_links_intro(links, model, info.get("language"))
        return intro_links
  • Responsibility – Pull every <a href> from the raw HTML (full_data), then ask the language model (model) to compose a short introductory paragraph that references those links in the target language.
  • Inputs – info["full_data"] (raw HTML string), info["language"] (ISO code), and a configured Model instance.
  • Outputs – A string (or markup) ready for insertion into the final document.
  • Side‑effects – Emits the extracted link list to stdout (useful during development).

IntroText – High‑level project summary

class IntroText(BaseModule):
    def generate(self, info: dict, model: Model):
        intro = get_introdaction(info.get("global_data"), model, info.get("language"))
        return intro
  • Responsibility – Ask the model to write a concise project overview based on the aggregated global_data (e.g., project name, goals, scope).
  • Inputs – info["global_data"] (structured summary dict), info["language"], and the same Model.
  • Outputs – A ready‑to‑display introductory text block.

Integration Flow

  1. The orchestrator gathers info from previous modules (pre‑processor, extractor).
  2. It instantiates IntroLinks and IntroText, feeding them the shared info and the active Model.
  3. Their generate methods return the two pieces that are later concatenated and placed at the very top of the final documentation page, just before the progress tracker.

Both classes rely on the postprocess helpers (get_all_html_links, get_links_intro, get_introdaction) to keep the generation logic isolated from the underlying LLM calls. This design makes the intro stage easily replaceable or extendable without touching the rest of the pipeline.

Manager – Orchestrator of the ADG Pipeline

The Manager class is the high‑level coordinator that ties together every preprocessing, LLM‑generation, and post‑processing component of the Auto Doc Generator (ADG). It lives in autodocgenerator/manage.py and is the entry point used by the CLI script (the if __name__ == "__main__" block).


Responsibility

  • Prepare a cache directory (.auto_doc_cache) inside the target project.
  • Run each pipeline stage in order – code mixing, global‑info extraction, documentation chunk generation, and optional factory‑based enrichment (e.g., intro links).
  • Persist intermediate artefacts (code_mix.txt, global_info.md, output_doc.md) so later stages can be re‑run without re‑processing the whole repository.
  • Update the UI progress bar (BaseProgress / LibProgress) after every stage.

Interaction with Other Parts

| Component | Interaction Point |
| --- | --- |
| CodeMix (preprocessor/code_mix.py) | generate_code_file() – builds a flat text dump of the repo. |
| Split/compress utilities (spliter.py, compressor.py) | generate_global_info_file() (currently stubbed) would split the mix and compress it with the selected LLM. |
| Doc‑generation helpers (spliter.gen_doc_parts, spliter.async_gen_doc_parts) | generete_doc_parts() – creates the main documentation body. |
| Factory modules (factory/base_factory.py, factory/modules/*) | factory_generate_doc() – injects custom modules (e.g., IntroLinks, CustomModule). |
| LLM models (engine/models/*) | Passed to the above helpers as sync_model or async_model. |
| UI (ui/progress_base.py) | progress_bar.update_task() is called after each step. |

Key Methods & Logic Flow

| Method | Purpose | Important Parameters | Output / Side‑Effect |
| --- | --- | --- | --- |
| __init__(project_directory, project_settings, sync_model, async_model, ignore_files, language, progress_bar) | Initialise paths, store settings, create cache folder. | project_directory, ignore_files, language | Creates the CACHE_FOLDER_NAME directory. |
| read_file_by_file_key(file_key) | Convenience wrapper to read a cached artefact. | file_key ("code_mix", "global_info", "output_doc") | Returns file contents as str. |
| get_file_path(file_key) | Builds the absolute path for a cached file. | Same as above | Returns the path as str. |
| generate_code_file() | Calls CodeMix.build_repo_content → writes code_mix.txt. | None | Cached code‑mix file + progress update. |
| generate_global_info_file(max_symbols, use_async) | (Stub) would split code_mix, compress it with the LLM, and write global_info.md. | max_symbols limits chunk size; use_async selects the model | Currently writes the placeholder "ss"; progress update. |
| generete_doc_parts(max_symbols, use_async) | Reads global_info & code_mix, then calls gen_doc_parts (sync) or async_gen_doc_parts (async) to produce the main markdown body. | Same as above | Writes output_doc.md; progress update. |
| factory_generate_doc(doc_factory) | Supplies all artefacts to a DocFactory, receives additional markdown (e.g., intro links), and prepends it to the existing output_doc.md. | doc_factory – a DocFactory instance with one or more modules | Overwrites output_doc.md with enriched content; progress update. |

Assumptions, Inputs & Outputs

  • Assumptions – The repository is accessible and the ignore list correctly filters unwanted files. The LLM models provided implement the Model / AsyncModel interfaces.
  • Inputs – Project root path, ProjectSettings (global description), optional LLM models, language code, ignore patterns.
  • Outputs – Three cached files in .auto_doc_cache and a final documentation markdown (output_doc.md). No external side‑effects beyond file I/O and optional LLM API calls.

Typical Usage (as shown in __main__)

manager = Manager(
    project_directory=r"C:\Path\To\Repo",
    project_settings=ProjectSettings("Auto Doc Generator")
        .add_info("global idea", "This project helps developers generate docs."),
    sync_model=GPTModel(API_KEY),
    async_model=AsyncGPTModel(API_KEY),
    ignore_files=ignore_list,
    progress_bar=LibProgress(progress),
    language="en"
)

# Run selected stages (uncomment as needed)
# manager.generate_code_file()
# manager.generate_global_info_file(use_async=True, max_symbols=5_000)
# manager.generete_doc_parts(use_async=True, max_symbols=4_000)

# Add an introductory links block via the factory
manager.factory_generate_doc(
    DocFactory(IntroLinks())
)

The manager can be extended by adding more modules to the DocFactory (e.g., CustomModule) to tailor the final documentation.

CodeMix – Repository‑wide source collector

The CodeMix class lives in autodocgenerator/preprocessor/code_mix.py.
Its sole responsibility is to traverse a project directory, filter out unwanted paths, and produce a single text artefact that contains:

  1. A tree‑like listing of the repository structure.
  2. The raw contents of every non‑ignored source file wrapped in <file path="…"> tags.

Interaction with the system

Manager.generate_code_file() creates a CodeMix instance (passing the project root and the global ignore_list) and calls build_repo_content().
The resulting file (code_mix.txt) becomes the first cached artefact that downstream stages (global‑info extraction, doc‑part generation) read via Manager.read_file_by_file_key.

Key API

| Method | Purpose / Important details |
| --- | --- |
| __init__(root_dir=".", ignore_patterns=None) | Stores the absolute project root and the list of glob patterns used to skip files/folders. |
| should_ignore(path: Path) -> bool | Returns True if the relative path matches any ignore pattern (full path, basename, or any path component). Uses fnmatch for Unix‑style globbing. |
| build_repo_content(output_file="repomix-output.txt") | Writes two sections to output_file: *Repository Structure* – an indented tree built from Path.rglob("*") respecting the ignore rules – followed by the file payloads: for each kept file, a <file path="…"> header and the file text (UTF‑8, errors ignored). Read errors are logged inline. |

Assumptions, inputs & outputs

  • Assumptions – The supplied root_dir exists and is readable; ignore patterns correctly describe files that should not appear in the documentation.
  • Inputsroot_dir (project path), ignore_patterns (list of glob strings).
  • Outputs – A single UTF‑8 text file (output_file) placed in the working directory; no side‑effects besides file I/O and console prints in the __main__ demo.
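The ignore test described above can be sketched in a few lines: a path is skipped when its full relative path, its basename, or any single component matches one of the glob patterns. This is a simplified re‑implementation for illustration, not the class's actual code.

```python
# Sketch of the should_ignore behaviour: match full path, basename,
# or any path component against the glob patterns via fnmatch.
from fnmatch import fnmatch
from pathlib import PurePosixPath

def should_ignore(rel_path: str, patterns: list[str]) -> bool:
    p = PurePosixPath(rel_path)
    candidates = [str(p), p.name, *p.parts]
    return any(fnmatch(c, pat) for c in candidates for pat in patterns)
```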

Usage excerpt (as used by the manager)

code_mix = CodeMix(root_dir=project_dir, ignore_patterns=ignore_list)
code_mix.build_repo_content("code_mix.txt")   # → cached artefact for later stages

The generated code_mix.txt is later consumed by the LLM‑driven pipeline to derive a high‑level overview and the final documentation.

Compressor – Core Pre‑processor

The compressor module reduces raw source‑code strings into concise summaries that can be fed to the LLM‑driven documentation pipeline. It works together with:

  • engine.models.gpt_model – provides synchronous (Model) and asynchronous (AsyncModel) wrappers around the LLM.
  • engine.config.config.get_BASE_COMPRESS_TEXT – returns a system‑prompt fragment that instructs the model how aggressively to compress (parameter compress_power).
  • ui.progress_base.BaseProgress – visualises work in the console.
  • settings.ProjectSettings – supplies the project‑specific system prompt (project_settings.prompt).

All functions return plain UTF‑8 strings or lists of strings; side‑effects are limited to progress‑bar updates and the final file write performed by the caller.


compress(data, project_settings, model, compress_power) → str

  • Purpose – Sends a single code block to the LLM with a compression prompt and returns the model’s answer.
  • Inputs
    • data – raw code text.
    • project_settings – contains prompt (system instruction).
    • model – an instance of Model (synchronous).
    • compress_power – integer controlling summary length.
  • Output – compressed text string.

compress_and_compare(data, model, project_settings, compress_power=4, progress_bar=BaseProgress()) → List[str]

  • Splits data (list of file texts) into chunks of size compress_power.
  • Calls compress for each element, concatenating results per chunk.
  • Returns a list whose length is ceil(len(data)/compress_power).
  • Updates progress_bar for each file processed.
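The chunking arithmetic can be sketched as follows; summarise() is a hypothetical stand‑in for the LLM call made by compress(), and the result length equals ceil(len(data) / compress_power) as stated above.

```python
# Sketch of compress_and_compare's chunk-and-merge behaviour.
def compress_and_compare(data: list[str], compress_power: int = 4) -> list[str]:
    def summarise(text: str) -> str:      # placeholder for the real LLM call
        return text.upper()
    # split the file list into groups of compress_power items
    chunks = [data[i:i + compress_power] for i in range(0, len(data), compress_power)]
    # concatenate the per-file summaries inside each chunk
    return ["".join(summarise(item) for item in chunk) for chunk in chunks]

result = compress_and_compare(["a", "b", "c", "d", "e"], compress_power=4)
```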

Async variants (async_compress, async_compress_and_compare)

  • Mirrors the synchronous flow but runs compression calls concurrently, limited by an asyncio.Semaphore(4).
  • Accepts an AsyncModel and returns the same structures as their sync counterparts.
  • Progress updates happen inside the semaphore‑protected region.
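The concurrency pattern can be sketched with stdlib asyncio: at most four coroutines run at once under a semaphore, and asyncio.gather preserves input order. The reversing "compression" is a dummy stand‑in for the async LLM call.

```python
# Sketch of the semaphore-limited async compression described above.
import asyncio

async def compress_all(items: list[str], limit: int = 4) -> list[str]:
    sem = asyncio.Semaphore(limit)        # at most `limit` concurrent calls

    async def one(text: str) -> str:
        async with sem:                   # progress updates would happen here
            await asyncio.sleep(0)        # stands in for the async LLM call
            return text[::-1]

    # gather preserves the order of the inputs
    return list(await asyncio.gather(*(one(t) for t in items)))

summaries = asyncio.run(compress_all(["abc", "def"]))
```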

compress_to_one(data, model, project_settings, compress_power=4, use_async=False, progress_bar=BaseProgress()) → str

  • Repeatedly compresses the list until a single aggregated summary remains.
  • Dynamically reduces compress_power to 2 when the list becomes small.
  • Chooses the async or sync pipeline based on use_async.
  • Returns the final consolidated description.

generate_describtions_for_code(data, model, project_settings, progress_bar=BaseProgress()) → List[str]

  • For each compressed code chunk, builds a detailed LLM prompt that asks for:
    1. Main components,
    2. Their purpose,
    3. Parameters & types,
    4. A copy‑pasteable usage example.
  • Sends the prompt via model.get_answer_without_history.
  • Returns a list of the generated documentation snippets.

Exceptions (preprocessor/exceptions.py)

The file is currently empty; the module reserves a namespace for future custom exception types (e.g., CompressionError, RateLimitExceeded). Adding specific exceptions will allow callers to distinguish LLM‑related failures from I/O issues.

Documentation – autodocgenerator.preprocessor (post‑processing & helper utilities)

generate_markdown_anchor(header: str) → str

Creates a GitHub‑style markdown anchor from a heading.

  • Normalises Unicode, lower‑cases, replaces spaces with “‑”, strips disallowed characters and collapses duplicate hyphens.
  • Returns the anchor prefixed with “#”.
  • Side‑effects: none – pure function.
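A sketch matching the described behaviour (the real implementation may differ in edge cases such as punctuation handling):

```python
# Sketch of a GitHub-style markdown anchor generator.
import re
import unicodedata

def generate_markdown_anchor(header: str) -> str:
    text = unicodedata.normalize("NFKD", header).lower()
    text = text.replace(" ", "-")
    text = re.sub(r"[^a-z0-9\-_]", "", text)   # strip disallowed characters
    text = re.sub(r"-{2,}", "-", text)         # collapse duplicate hyphens
    return "#" + text
```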

get_all_topics(data: str) → tuple[list[str], list[str]]

Scans a generated markdown document for top‑level sections (## …).

  • Returns a tuple: (topics, links) where links are the anchors produced by generate_markdown_anchor.
  • Used by the final formatter to build a table‑of‑contents.

get_all_html_links(data: str) → list[str]

Extracts legacy HTML anchors (<a name="…">) from the document.

  • Ignores anchors longer than 25 characters (treated as noise).
  • Returns a list of markdown links (#anchor).

get_links_intro(links: list[str], model: Model, language: str = "en") → str

Builds a system‑prompt that asks the LLM to write an introductory paragraph for a list of section links.

  • Sends the prompt via model.get_answer_without_history.
  • Returns the raw LLM text.

get_introdaction(global_data: str, model: Model, language: str = "en") → str

Similar to get_links_intro but operates on the whole document text (global_data).

  • Uses the constant BASE_INTRO_CREATE as the system instruction.

generete_custom_discription(splited_data: str, model: Model, custom_description: str, language: str = "en") → str

Iterates over pre‑split code/document fragments until the LLM can produce a non‑empty, qualified answer for a user‑supplied custom_description.

  • Prompt enforces strict “use only the provided context” rules and asks for a title + <a name='…'> anchor.
  • If the LLM returns “!noinfo” or “No information found”, the loop continues; otherwise the result is returned.
  • Returns an empty string when no fragment yields information.

ProjectSettings (in settings.py)

Container for per‑project metadata that is injected into LLM system prompts.

| Member | Description |
| --- | --- |
| project_name (str) | Human‑readable project identifier. |
| info (dict) | Arbitrary key/value pairs added via add_info. |
| prompt (property) | Concatenates BASE_SETTINGS_PROMPT with the project name and all info entries, producing the final system‑prompt string. |

No side‑effects – the class only stores data.


### split_data(data: str, max_symbols: int) → list[str]
Chunk a large markdown source into pieces that fit the LLM token budget.

  • Splits on file‑level markers, then repeatedly breaks any chunk > 1.5 × max_symbols into two halves.
  • Re‑assembles pieces while keeping each ≤ 1.25 × max_symbols.
  • Returns a list of strings ready for LLM consumption.
  • Side‑effects: none – pure function.
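The split-then-merge strategy can be sketched as below. This is a simplified re‑implementation under the stated thresholds (1.5 × and 1.25 × max_symbols), not the library's actual file‑boundary‑aware code.

```python
# Sketch of the splitting strategy: halve oversized chunks, then
# greedily merge neighbours while staying within the size budget.
def split_data(data: str, max_symbols: int) -> list[str]:
    chunks = [data]
    # repeatedly break any chunk larger than 1.5 x max_symbols in half
    while any(len(c) > 1.5 * max_symbols for c in chunks):
        nxt = []
        for c in chunks:
            if len(c) > 1.5 * max_symbols:
                mid = len(c) // 2
                nxt.extend([c[:mid], c[mid:]])
            else:
                nxt.append(c)
        chunks = nxt
    # re-assemble small neighbours while keeping each piece <= 1.25 x max_symbols
    merged: list[str] = []
    for c in chunks:
        if merged and len(merged[-1]) + len(c) <= 1.25 * max_symbols:
            merged[-1] += c
        else:
            merged.append(c)
    return merged
```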

### write_docs_by_parts(part: str, model: Model, global_info: str, prev_info: str | None = None, language: str = "en") → str
Builds a prompt (system‑language hint + BASE_PART_COMPLITE_TEXT + optional previous output) and calls model.get_answer_without_history.

  • Strips surrounding markdown fences (```).
  • Returns the raw LLM‑generated documentation for the supplied code fragment.

### async_write_docs_by_parts(...) → str
Async counterpart of write_docs_by_parts.

  • Executes the same prompt inside an asyncio.Semaphore to limit concurrency.
  • Calls async_model.get_answer_without_history and optionally fires update_progress.

### gen_doc_parts(full_code_mix, global_info, max_symbols, model, language, progress_bar)

  • Splits the whole source via split_data.
  • Iterates over chunks, invoking write_docs_by_parts sequentially, feeding the last 3000 chars of the previous answer as context (prev_info).
  • Updates a BaseProgress sub‑task after each chunk and concatenates all parts into the final markdown document.

### async_gen_doc_parts(...)

  • Mirrors gen_doc_parts but launches async_write_docs_by_parts for all chunks concurrently (default 4‑worker semaphore).
  • Aggregates results preserving order, updates progress via callbacks, and returns the combined documentation.

Interaction flow – split_data → (sync/async) write_docs_by_parts → gen_doc_parts / async_gen_doc_parts → final markdown. All functions are pure apart from the LLM calls and progress updates.

Progress handling utilities – autodocgenerator/ui/progress_base.py

### BaseProgress (interface)
Abstract contract used by the documentation pipeline to report incremental work.

  • Methods
    • create_new_subtask(name: str, total_len: int): allocate a sub‑task that will receive total_len update calls.
    • update_task(): advance the currently active task by one step.
    • remove_subtask(): discard the active sub‑task, causing subsequent calls to affect the parent task.
  • Assumptions – concrete subclasses implement the three methods; the class itself does nothing.
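Under those assumptions, the interface can be reconstructed roughly as below; the no-op bodies follow the note that the base class itself does nothing, but the exact signatures are inferred from the description.

```python
class BaseProgress:
    """Abstract progress contract (reconstructed sketch): subclasses report
    incremental work to the documentation pipeline."""

    def create_new_subtask(self, name: str, total_len: int) -> None:
        """Allocate a sub-task expected to receive total_len update calls."""

    def update_task(self) -> None:
        """Advance the currently active task by one step."""

    def remove_subtask(self) -> None:
        """Discard the active sub-task; later updates affect the parent task."""
```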

### LibProgress – Rich‑based visualizer

  • Constructor __init__(self, progress: Progress, total: int = 4)
    • Receives a Rich Progress instance (shared UI object).
    • Creates a base task “General progress” with total steps; stores its ID in _base_task.
  • create_new_subtask – registers a new Rich task and stores its ID in _cur_sub_task.
  • update_task – if a sub‑task exists, updates it; otherwise advances the base task.
  • remove_subtask – clears the stored sub‑task reference.
  • Side‑effects – updates the Rich live‑rendered progress bar shown to the user.

### ConsoleGtiHubProgress – fallback for CI / non‑TTY runs

  • Uses the lightweight ConsoleTask helper to emit plain‑text progress lines.
  • Keeps a single general task (gen_task) and an optional current sub‑task (curr_task).
  • create_new_subtask → spawns a new ConsoleTask.
  • update_task → calls progress() on the active task, falling back to the general one.
  • remove_subtask → discards the sub‑task reference.
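A minimal plain-text implementation of the same three-method contract might look like this. It is a sketch only: the real ConsoleGtiHubProgress delegates to a ConsoleTask helper whose API is not shown here, and the class name and output format below are invented.

```python
class ConsoleProgressSketch:
    """Hypothetical plain-text progress reporter with the same API:
    one general task plus an optional current sub-task."""

    def __init__(self, total: int = 4):
        self.gen_done, self.gen_total = 0, total
        self.curr = None  # [name, done, total] of the active sub-task

    def create_new_subtask(self, name: str, total_len: int) -> None:
        self.curr = [name, 0, total_len]

    def update_task(self) -> None:
        if self.curr is not None:       # advance the sub-task if one exists
            self.curr[1] += 1
            print(f"{self.curr[0]}: {self.curr[1]}/{self.curr[2]}")
        else:                           # otherwise fall back to the general task
            self.gen_done += 1
            print(f"General progress: {self.gen_done}/{self.gen_total}")

    def remove_subtask(self) -> None:
        self.curr = None
```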

Interaction with the rest of the system
Both progress classes are injected into the doc‑assembly functions (gen_doc_parts, async_gen_doc_parts). After each chunk is processed they call update_task() to move the visual indicator forward and remove_subtask() when a chunk finishes. The rest of the pipeline treats them as pure side‑effect objects; no return values are expected.

Typical usage

from rich.progress import Progress

with Progress() as progress:   # start Rich's live display
    pbar = LibProgress(progress, total=len(chunks))

    for chunk in chunks:
        pbar.create_new_subtask("Chunk", total_len=len(chunk))
        # … generate docs for the chunk …
        pbar.update_task()
        pbar.remove_subtask()

The console implementation follows the same API, enabling the same pipeline to run in headless CI environments.


Download files

Download the file for your platform.

Source Distribution

  • autodocgenerator-0.7.4.tar.gz (39.8 kB)

Built Distribution

  • autodocgenerator-0.7.4-py3-none-any.whl (33.9 kB)

File details

Details for the file autodocgenerator-0.7.4.tar.gz.

File metadata

  • Download URL: autodocgenerator-0.7.4.tar.gz
  • Upload date:
  • Size: 39.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for autodocgenerator-0.7.4.tar.gz

  • SHA256: fb0210d95b1a104f126b27c28723eeb87673644f0cd0c76468041111ec708497
  • MD5: f69912402a12f0f01227a4cbd33a073a
  • BLAKE2b-256: 95b750cfcf39cbf67f9dbec65a3245d19777700ecb32e768d371ce5da629f266


File details

Details for the file autodocgenerator-0.7.4-py3-none-any.whl.

File metadata

  • Download URL: autodocgenerator-0.7.4-py3-none-any.whl
  • Upload date:
  • Size: 33.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for autodocgenerator-0.7.4-py3-none-any.whl

  • SHA256: f753700cc98fb1743bdb876de3d544183527e4dd65d72ad0d43f811bb4b21c92
  • MD5: 6b0f83b94cc41f939866a1cd2a745ebd
  • BLAKE2b-256: bdb4f0d48fe0eff6cb6e379f0a674c26672185d655a3e7e3479a53e271bc9cc9

