This project helps you create documentation for your projects.
Auto‑Doc Generator
A layered, factory‑based, LLM‑driven Markdown documentation pipeline for any codebase.
1. Project Title
Auto‑Doc Generator – Layered + Factory + LLM‑Driven
2. Project Goal
To automatically produce a complete, readable README (or other Markdown artifacts) from the source code of a repository.
The tool parses the code, chunks it to stay within token limits, sends those fragments to a large‑language model (Groq or OpenAI), formats the generated text with reusable modules, and stitches the result into a single cohesive document. The solution is CI‑friendly and can be invoked from a local CLI or a GitHub Action.
3. Core Logic & Principles
3.1 Pipeline Overview
| Phase | Action | Key Components |
|---|---|---|
| Configuration | Read `autodocconfig.yml` → `Config`, `StructureSettings`, custom‑module lists | `auto_runner/config_reader.py` |
| Entry Point | `run_file.__main__` calls `gen_doc(project_path, …)` | `auto_runner/run_file.py` |
| Repository Walk | Scan files, split code into manageable chunks | `manage.py` (preprocessor `spliter.py`, `compressor.py`) |
| LLM Interaction | Submit chunks to `GPTModel` (rotating keys, history, logging) | `engine/models/gpt_model.py`, `engine/config/config.py` |
| Doc Construction | Each `BaseModule` (e.g., `IntroText`, `CustomModule`) processes LLM output → Markdown section | `factory/base_factory.py`, `factory/modules/*` |
| Post‑processing | Optional re‑ordering, anchor extraction, intro sections, cache clearance | `postprocessor/*` |
| Persistence | Write `README.md`, logs, and cache | `Manager.save()` in `manage.py` |
The pipeline is fully layered – each stage exposes a small, single‑purpose interface – and uses a factory pattern to iterate over a configurable list of doc modules.
3.2 LLM Wrapper
GPTModel (synchronous) / AsyncGPTModel (async) manage a pool of API keys and models.
- Model rotation – `ModelExhaustedException` is raised only when all configured keys/models are exhausted.
- History tracking – keeps the last 3 000 characters of context for subsequent prompts.
- Prompt assembly – pulls constants such as `BASE_SYSTEM_TEXT`, `BASE_INTRO_CREATE`, etc., from `engine/config/config.py`.
3.3 Pre‑processing
- Splitting – `split_data` respects a user‑defined `max_symbols` threshold and applies heuristics to stay below token limits.
- Compression – `compressor.py` can reduce large files into concise prompts before they hit the LLM.
- Discovery – `settings.py` controls file patterns to ignore, language, and metadata extraction.
3.4 Post‑processing
- Sorting & Ordering – `postprocessor.sorting` places sections in a logical sequence.
- Anchor Extraction – creates internal links for easy navigation.
- Intro Generation – optional introductory text or global sections.
3.5 Logging & UI
- Singleton Logger – `BaseLogger` funnels all logs, with optional file output via `FileLoggerTemplate`.
- Progress Feedback – `ConsoleGitHubProgress` shows real‑time status during CI runs, while `LibProgress` (Rich) can be used locally.
4. Key Features
- Zero‑setup Documentation – a single `autodocconfig.yml` configures patterns, languages, and module order.
- Modular Architecture – add or replace `BaseModule` implementations without touching core logic.
- LLM Flexibility – supports Groq, OpenAI, or any future LLM via the `gpt_model.py` abstraction.
- Token‑Aware Chunking – automatically splits files to stay within token limits while preserving context.
- Post‑processing Pipeline – reorder, add anchors, and create global intros automatically.
- CI‑Ready – bundled GitHub workflows, progress output for GitHub Actions.
- Cache & History Management – avoids redundant API calls and keeps conversational context.
- Extensible Prompt System – all AI instructions live in `engine/config/config.py`; modify tone or formatting with minimal code changes.
- Exception Handling – graceful fallback when API limits or key exhaustion occurs.
5. Dependencies
| Library / Tool | Purpose |
|---|---|
| Python 3.11+ | Core runtime (per `requires-python` in `pyproject.toml`) |
| rich | Optional console progress UI |
| pydantic | Schema validation (DocContent, DocInfoSchema, etc.) |
| groq / openai SDK | LLM client (client choice determined by GPTModel implementation) |
| PyYAML | Read autodocconfig.yml |
| GitHub Actions | CI workflows (see .github/workflows/*) |
| logging | Standard Python logger (wrapped by BaseLogger) |
All third‑party dependencies are listed in `requirements.txt` and are installable via `pip install -r requirements.txt`.
Auto‑Doc Generator delivers a maintainable, plug‑and‑play solution that turns raw source into polished, AI‑generated documentation while keeping the developer in full control of prompts, module composition, and CI integration.
Executive Navigation Tree
- 📖 Introduction
- 🔧 Utilities
- 📦 Modules
- ⚙️ Configuration
- 📦 Manager
- 🔩 Components
- 📄 Documentation Generation
  - get-all-html-links
  - data-contract
  - data-contract-gptmodel
  - code-mix-class
  - code-mix-generation
  - code-splitting
  - compressor-functions
  - compressor-module
  - compression-flow
  - reassembly
  - CONTENT_DESCRIPTION
  - doc-factory
  - doc-schemas
  - generete-custom-discription
  - generete-custom-discription-without
  - gen-doc-parts
  - global-info-generation
  - parts-generation
  - write-docs
  - factory-generate
- 🔗 Cross‑module Interaction
- 📁 Anchor & Path
- 🛠️ Exception & Cache
- 📑 Summary & Misc
- 📚 General
get_all_html_links(data: str) → list[str]
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | `str` | Markdown source | Input document to search for `<a name>` anchors. |
| `links` | `list[str]` | Result | Returns `#anchor` for each anchor with >5 chars. |
| `logger` | `BaseLogger` | Logger | Logs extraction steps. |
| `pattern` | `re.Pattern` | Anchor regex | Matches `<a name="...">` or `<a name='...'>`. |
Logic Flow
- Compile the regex `r'<a name=["\']?(.*?)["\']?>'`.
- Iterate over all matches; extract `anchor_name`.
- If `len(anchor_name) > 5`, prepend `#` and append to `links`.
- Log the number of links and the list.
Result – Returns a list of markdown anchor links that will be used in table‑of‑contents sections.
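The flow above is small enough to sketch directly; this is a reconstruction from the described behaviour, not the project's source.

```python
import re

def get_all_html_links(data: str) -> list[str]:
    """Collect '#anchor' links for every <a name=...> tag longer than 5 chars."""
    pattern = re.compile(r'<a name=["\']?(.*?)["\']?>')
    links: list[str] = []
    for anchor_name in pattern.findall(data):
        if len(anchor_name) > 5:  # skip short/noise anchors, per the rule above
            links.append(f"#{anchor_name}")
    return links
```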
get_introdaction(global_data: str, model: Model, language: str = "en") → str
| Entity | Type | Role | Notes |
|---|---|---|---|
| `global_data` | `str` | Raw project description | Supplied by the calling pipeline. |
| `model` | `Model` | LLM wrapper | Calls `get_answer_without_history`. |
| `language` | `str` | Target language | Included in system instruction. |
Logic Flow
- Compose a 3‑message prompt using `BASE_INTRO_CREATE`.
- Pass it to the LLM and return the resulting intro text.
Result – The top‑level introduction for the README.
get_links_intro(links: list[str], model: Model, language: str = "en") → str
| Entity | Type | Role | Notes |
|---|---|---|---|
| `links` | `list[str]` | Anchor list | Obtained from `get_all_html_links`. |
| `model` | `Model` | LLM wrapper | Calls `get_answer_without_history`. |
| `language` | `str` | Target language | Sent as a system instruction. |
| `prompt` | `list[dict]` | LLM prompt | Contains system messages and `BASE_INTRODACTION_CREATE_LINKS`. |
Logic Flow
- Build a 3‑message prompt: language instruction, `BASE_INTRODACTION_CREATE_LINKS`, and the `links` string.
- Invoke `model.get_answer_without_history`.
- Return the generated introductory section.
Result – A Markdown fragment that introduces the generated documentation with clickable links.
Intro Modules – Automated Introduction Sections
```python
class IntroLinks(BaseModule):
    def generate(self, info: dict, model: Model):
        links = get_all_html_links(info.get("full_data"))
        return get_links_intro(links, model, info.get("language"))


class IntroText(BaseModule):
    def generate(self, info: dict, model: Model):
        return get_introdaction(info.get("global_info"), model, info.get("language"))
```
- Purpose – Build introductory material from the repository's metadata. `IntroLinks` collects all internal anchor links and asks the model to format a "Links" section. `IntroText` synthesizes a high‑level introduction based on `global_info`.
- Shared Mechanism – Both rely on `postprocessor.custom_intro` functions and the same `model`.
| Module | Input Key | Prompt Type | Output |
|---|---|---|---|
| `IntroLinks` | `full_data` | Link list | Markdown list of links |
| `IntroText` | `global_info` | Overview prompt | Markdown intro |
Note – The modules are pure functions; no side‑effects beyond returning strings.
Error Handling – Any exception in the underlying LLM call propagates to `DocFactory`; `ModelExhaustedException` is surfaced at the top level.
Welcome Message Display
The module executes a single helper, _print_welcome, that renders an ASCII banner and a status line when the package is imported.
```python
def _print_welcome():
    ...
    print(ascii_logo)
    print(f"{CYAN}ADG Library{RESET} | {BOLD}Status:{RESET} Ready to work V0.0.1")
    print(f"{'—' * 35}\n")
```
The routine uses ANSI escape sequences to colour the output. It is immediately invoked at import time, so any consumer of the library will see the banner in the console.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `BLUE` | `str` | Colour escape for banner | `"\033[94m"` |
| `BOLD` | `str` | Bold text escape | `"\033[1m"` |
| `CYAN` | `str` | Colour escape for library name | `"\033[96m"` |
| `RESET` | `str` | Reset formatting | `"\033[0m"` |
| `ascii_logo` | `str` | Multi‑line ASCII art | Displayed on import |
| `_print_welcome()` | function | Side‑effect: prints banner | Executed automatically |
Notice: The function is not exported; it is a private helper for visual feedback only.
Logging Strategy
- `BaseLogger` is instantiated locally in each function; it is a singleton under the hood, guaranteeing a single output stream.
- `InfoLog` objects carry a `level` integer; higher levels produce quieter logs.
- All major stages emit a message, including lengths of generated text.
Logging Component – autodocgenerator/ui/logging.py
| Entity | Type | Role | Notes |
|---|---|---|---|
| `BaseLog` | Abstract class | Base for log objects | Stores message, level; generates timestamped prefix |
| `ErrorLog`, `WarningLog`, `InfoLog` | Sub‑classes | Emit formatted strings with severity tags | Override `format()` |
| `BaseLoggerTemplate` | Logger abstraction | Holds `log_level`; routes messages through `global_log` | `log()` writes directly (e.g., console) |
| `FileLoggerTemplate` | Concrete template | Appends formatted logs to a file | Uses `file_path` |
| `BaseLogger` | Singleton | Central logger instance | `set_logger()` attaches a concrete template; `log()` delegates to the template |
Implementation Detail – `BaseLogger.__new__` guarantees a single shared instance across the project.
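A minimal sketch of that guarantee, assuming a conventional `__new__`-based singleton (the `_instance` attribute name is an assumption):

```python
class BaseLogger:
    _instance = None  # assumed storage for the shared instance

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
```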
Logger Initialization
After printing the banner, the module sets up the global logger that the rest of the package relies on.
```python
from .ui.logging import BaseLogger, BaseLoggerTemplate, InfoLog, ErrorLog, WarningLog

logger = BaseLogger()
logger.set_logger(BaseLoggerTemplate())
```
The BaseLogger is a singleton that aggregates log handlers. BaseLoggerTemplate defines the output format and default level. The logger instance becomes available to any sub‑module that imports it from autodocgenerator.
| Symbol | Scope | Effect |
|---|---|---|
| `BaseLogger` | Module | Singleton logger class |
| `BaseLoggerTemplate` | Module | Log handler configuration |
| `logger` | Module | Global logger instance |
Assumption: The `ui.logging` module implements a standard logger that accepts a template via `set_logger`. No other configuration is performed in this file.
BaseModule – Contract for Documentation Builders
```python
class BaseModule(ABC):
    @abstractmethod
    def generate(self, info: dict, model: Model):
        ...
```
- Role – Declares the interface a concrete module must implement.
- Parameters – `info` is a dictionary of repo‑wide data; `model` is a `Model` instance that talks to Groq/ChatGPT.
- Output – A Markdown string produced by the module.
CustomModule – Context‑Rich Description
```python
class CustomModule(BaseModule):
    def generate(self, info: dict, model: Model):
        return generete_custom_discription(
            split_data(info.get("code_mix"), max_symbols=5000),
            model,
            self.discription,
            info.get("language"))
```
- Goal – Produce a documentation segment that incorporates a code snippet limited to 5 000 symbols.
- Dependencies – `split_data` (preprocessor), `generete_custom_discription` (postprocessor).
- Data Flow
  - `info["code_mix"]` → split to a manageable chunk.
  - Chunk + user `discription` + language sent to the model.
  - Result returned as Markdown.
CustomModuleWithOutContext – Self‑Contained Descriptions
```python
class CustomModuleWithOutContext(BaseModule):
    def generate(self, info: dict, model: Model):
        return generete_custom_discription_without(
            model, self.discription, info.get("language"))
```
- Use‑case – Generates a section that does not depend on any source fragment.
- Inputs – Only `model`, a static `discription`, and the language.
- Output – Plain Markdown paragraph.
Install PowerShell Script – install.ps1
| Step | Action | Outcome |
|---|---|---|
| Create `.github/workflows` dir | `New-Item -ItemType Directory -Force` | Directory ready for workflow file |
| Write workflow YAML | Here‑string `@' … '@` piped to `Out-File` | Generates `autodoc.yml` that calls the reusable workflow |
| Generate `autodocconfig.yml` | Here‑string with current folder name and settings | Provides ignore patterns, build and structure flags |
Notice – The script uses PowerShell variable interpolation and `Out-File -Encoding utf8` to ensure proper Unicode handling.
Installer Shell Script
Purpose
Creates the CI workflow and initial configuration for the Auto‑Doc Generator.
The script is run in the project root; it writes a GitHub Actions workflow
(.github/workflows/autodoc.yml) that re‑uses a shared reusable workflow
and an autodocconfig.yml file that stores project‑specific settings.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `mkdir -p .github/workflows` | command | Ensure the workflow directory exists | No side‑effects beyond directory creation |
| `autodoc.yml` | YAML file | Defines the GitHub Actions job | Uses `GROCK_API_KEY` secret |
| `autodocconfig.yml` | YAML file | Holds project metadata and ignore patterns | Generated using shell `basename` and a heredoc |
| `echo "✅ Done!"` | console output | Feedback to the user | Non‑blocking |
Processing Flow
1. Create Workflow Directory – `mkdir -p .github/workflows` ensures the directory for GitHub Actions files exists.
2. Write `autodoc.yml` – The file declares a dispatchable workflow that invokes a reusable workflow stored at `Drag-GameStudio/ADG/.github/workflows/reuseble_agd.yml@main`. It passes the `GROCK_API_KEY` secret and grants write permission for repository contents.
3. Write `autodocconfig.yml` – Populates project metadata (`project_name`, `language`) and several sections: `ignore_files` (glob patterns for files to exclude), `build_settings` (logging options), and `structure_settings` (toggles for generated sections). The content is output via a here‑document, with the first `$` escaped (`\$`) so the key can be rendered correctly in GitHub secrets.
4. Final Notification – `echo "✅ Done! …"` confirms success.
Critical Assumption
The script expects to be executed in the repository root, where a GitHub Actions workflow can be committed and an `autodocconfig.yml` can be placed.
To set up the application you can run one of two bootstrap scripts directly from the repository.
- On Windows with PowerShell execute:
  `irm https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex`
  The script pulls the latest installation package, configures required environment variables, and starts the service.
- On Linux‑based systems run the shell script:
  `curl -sSL https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash`
  This performs the same sequence of download, setup, and launch steps in a POSIX environment.
If you intend to run the installer within a CI pipeline such as GitHub Actions, create a secret named GROCK_API_KEY in the repository’s secret store. The value must be the API key obtained from the Grock interface at https://grockdocs.com. The installer will automatically consume this secret to authenticate with the Grock service during deployment.
Manager – Repository‑level Orchestration
Purpose –
The Manager class is the central coordination hub for the Auto‑Doc Generator pipeline.
It loads and persists intermediate artefacts, triggers LLM‑based transformations, and accumulates final Markdown output.
Manager class
| Method | Purpose |
|---|---|
| `generate_code_file()` | Scans the project folder and creates a cache of Python source files for later use. |
| `generate_global_info(compress_power: int)` | Builds a compressed global information file if requested. |
| `generete_doc_parts(max_symbols: int, with_global_file: bool)` | Splits the cached code into document parts respecting the maximum symbol count. |
| `factory_generate_doc(factory, to_start: bool = False, with_splited: bool = True)` | Uses a `DocFactory` instance to create documentation pieces, optionally inserting them at the beginning and controlling whether the result is split into multiple sections. |
| `order_doc()` | Reorders the generated documentation sections according to a predefined sequence. |
| `clear_cache()` | Deletes temporary files and data used during the generation cycle. |
| `save()` | Persists the final documentation object to disk. |
| `doc_info.doc.get_full_doc()` | Retrieves the complete assembled documentation text. |
Typical usage
```python
from autodocgenerator.manage import Manager
from autodocgenerator.factory.base_factory import DocFactory
from autodocgenerator.factory.modules.general_modules import CustomModule
# Assumed import path for the intro modules used below:
from autodocgenerator.factory.modules.intro_modules import IntroText, IntroLinks
from autodocgenerator.ui.progress_base import ConsoleGtiHubProgress
from autodocgenerator.auto_runner.config_reader import read_config
from autodocgenerator.engine.models.gpt_model import GPTModel
from autodocgenerator.engine.config.config import API_KEYS

# Load configuration data from a file (context provided in the project)
with open("autodocconfig.yml", "r", encoding="utf-8") as f:
    cfg_data = f.read()
config_obj, custom_mods, struct_opts = read_config(cfg_data)

# Prepare the language model and progress indicator
llm = GPTModel(API_KEYS, use_random=False)

# Create the Manager
mgr = Manager(
    project_path=".",  # root of the target project
    config=config_obj,
    llm_model=llm,
    progress_bar=ConsoleGtiHubProgress()
)

# Execute the documentation pipeline
mgr.generate_code_file()
if struct_opts.use_global_file:
    mgr.generate_global_info(compress_power=4)
mgr.generete_doc_parts(
    max_symbols=struct_opts.max_doc_part_size,
    with_global_file=struct_opts.use_global_file
)
mgr.factory_generate_doc(DocFactory(*custom_mods))
if struct_opts.include_order:
    mgr.order_doc()

additional_modules = []
if struct_opts.include_intro_text:
    additional_modules.append(IntroText())
if struct_opts.include_intro_links:
    additional_modules.append(IntroLinks())
mgr.factory_generate_doc(
    DocFactory(*additional_modules, with_splited=False),
    to_start=True
)
mgr.clear_cache()
mgr.save()

# Retrieve the finished documentation
full_text = mgr.doc_info.doc.get_full_doc()
```
This sequence demonstrates how to instantiate the manager, run all generation steps, and obtain the final documentation string.
__init__ – Construction & Cache Preparation
| Parameter | Type | Role | Notes |
|---|---|---|---|
| `project_directory` | `str` | Root of target repository | Path used for all cache files |
| `config` | `Config` | Parsed `autodocconfig.yml` | Provides `pbc.log_level`, `ignore_files`, `language`, etc. |
| `llm_model` | `Model` | LLM client | Handles key rotation, request history |
| `progress_bar` | `BaseProgress` | UI progress | Default instance if not supplied |
Steps
- Initialise a new `DocInfoSchema` container.
- Store `config`, `project_directory`, `llm_model`, `progress_bar`.
- Initialise a singleton `BaseLogger` and attach a `FileLoggerTemplate` to a log file under the cache folder.
- Create a `.auto_doc_cache` folder if it does not exist.
Note – No network traffic is performed during construction.
Related Configuration
autodocgenerator.engine.exceptions.ModelExhaustedException is the only exception that propagates outside this module, signaling to Manager that the pipeline must terminate gracefully.
The API_KEYS list is sourced from autodocgenerator.config.config and typically contains Groq API keys.
The documentation above is a self‑contained, factual representation of the gpt_model.py and model.py fragments, aligned with the Auto‑Doc Generator’s pipeline and strictly based on the provided source.
Config Reader – Settings Loader
The autodocgenerator.auto_runner.config_reader module is responsible for translating the YAML configuration file (autodocconfig.yml) into runtime objects that drive the documentation pipeline.
StructureSettings
| Property | Type | Default | Notes |
|---|---|---|---|
| `include_intro_links` | `bool` | `True` | Whether to inject the `IntroLinks` module during the doc build. |
| `include_order` | `bool` | `True` | Enables post‑processing re‑ordering. |
| `use_global_file` | `bool` | `True` | Controls generation of a global‑information file. |
| `max_doc_part_size` | `int` | `5_000` | Maximum symbol (character) count per chunk. |
| `include_intro_text` | `bool` | `True` | Controls injection of a descriptive intro section. |
StructureSettings exposes a load_settings(dict) method that dynamically overwrites defaults from a user‑supplied dictionary.
Assumption: The module does not expose any public API beyond the `read_config` function and the `StructureSettings` class.
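A hedged sketch of how `load_settings` could overwrite the defaults listed above; the `setattr` loop and attribute storage are assumptions, only the defaults and the method name come from the source.

```python
class StructureSettings:
    def __init__(self):
        # defaults per the table above
        self.include_intro_links = True
        self.include_order = True
        self.use_global_file = True
        self.max_doc_part_size = 5_000
        self.include_intro_text = True

    def load_settings(self, data: dict) -> None:
        # overwrite only keys that already exist as attributes (assumed)
        for key, value in data.items():
            if hasattr(self, key):
                setattr(self, key, value)
```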
read_config
| Entity | Type | Role | Notes |
|---|---|---|---|
| `file_data` | `str` | YAML source | Raw file contents. |
| `Config` | `autodocgenerator.config.config.Config` | Holds project‑wide settings. | Instantiated and populated. |
| `ProjectBuildConfig` | `ProjectBuildConfig` | Holds build‑specific toggles. | Loaded from `build_settings`. |
| `CustomModule` / `CustomModuleWithOutContext` | `BaseModule` subclasses | Custom LLM prompts. | Created from `custom_descriptions`. |
| `StructureSettings` | `StructureSettings` | Layout flags. | Instantiated and overwritten. |
The function returns a tuple of:
`(config: Config, custom_modules: list[BaseModule], structure_settings: StructureSettings)`
Internally it:
- Parses `file_data` with `yaml.safe_load`.
- Constructs a `Config` and populates ignore patterns, language, and project metadata.
- Builds a `ProjectBuildConfig` from `build_settings`.
- Creates `CustomModule` instances based on the `%` marker logic.
- Loads any supplied `structure_settings`.
Missing: No public functions are exposed beyond `read_config`.
Module Summary
- Executes a banner on import.
- Instantiates and configures a global logger for the project.
- Exposes the `logger` instance for downstream components.
Information not present in the provided fragment: there are no public functions or classes beyond the internal banner routine; the module does not expose any API beyond the logger.
Engine Exceptions – LLM Availability Guard
autodocgenerator.engine.exceptions.ModelExhaustedException signals that all configured Groq/ChatGPT models have been exhausted, and no further requests can be made. This exception propagates up to the caller, typically resulting in a graceful termination of the documentation pipeline.
Summary
These modules collectively translate a YAML configuration into runtime objects, orchestrate the document generation pipeline via Manager and DocFactory, and expose a clean API for the rest of the system to consume. The design relies on explicit data contracts and avoids hidden state, ensuring that each component can be unit‑tested in isolation.
Supporting Module – model.py
History Class
```python
class History:
    def __init__(self, system_prompt: str = BASE_SYSTEM_TEXT):
        self.history: list[dict[str, str]] = []
        if system_prompt is not None:
            self.add_to_history("system", system_prompt)
    ...
```
- Initializes with the default system prompt.
- Provides `add_to_history(role, content)` for appending messages (a completed sketch follows).
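Completed as a hedged sketch; only `__init__` and `add_to_history` are documented, and the import path is assumed from the docs above.

```python
from autodocgenerator.engine.config.config import BASE_SYSTEM_TEXT  # assumed path

class History:
    def __init__(self, system_prompt: str = BASE_SYSTEM_TEXT):
        self.history: list[dict[str, str]] = []
        if system_prompt is not None:
            self.add_to_history("system", system_prompt)

    def add_to_history(self, role: str, content: str) -> None:
        # messages follow the {role, content} shape used throughout the docs
        self.history.append({"role": role, "content": content})
```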
Abstract Base
ParentModel abstracts key rotation and history management.
- `self.api_keys` holds the list of API keys.
- `self.regen_models_name` holds the shuffled list of model identifiers.
- `generate_answer`, `get_answer_without_history`, and `get_answer` are abstract and implemented by concrete subclasses.
Synchronous Implementation
GPTModel implements the actual HTTP request to Groq:
```python
chat_completion = self.client.chat.completions.create(
    messages=messages,
    model=model_name,
)
```
- The request is wrapped in a `while True` loop that continues until a success or all models are exhausted.
- On failure, it logs a warning, updates indices, and re‑instantiates the Groq client with a new key.
- `ModelExhaustedException` is raised when the model pool is empty.
Side Effect – All logs (`InfoLog`, `WarningLog`, `ErrorLog`) are routed through `BaseLogger`, ensuring uniform traceability across the system.
Run File – Orchestration Entry Point
autodocgenerator.auto_runner.run_file contains the primary driver that ties together all layers of the Auto‑Doc Generator. The core public method is gen_doc.
gen_doc
| Parameter | Type | Role | Notes |
|---|---|---|---|
| `project_path` | `str` | Root of the repository | Target for content discovery. |
| `config` | `Config` | Project configuration | From `config_reader`. |
| `custom_modules` | `list[BaseModule]` | Custom sections to inject | Provided by `read_config`. |
| `structure_settings` | `StructureSettings` | Layout toggles | Also from `read_config`. |
Flow
1. LLM Preparation

```python
sync_model = GPTModel(API_KEYS, use_random=False)
```

2. Manager Instantiation

```python
manager = Manager(
    project_path,
    config=config,
    llm_model=sync_model,
    progress_bar=ConsoleGtiHubProgress(),
)
```

3. Repository Walk – `manager.generate_code_file()` splits the codebase into manageable chunks.

4. Global Section (Optional)

```python
if structure_settings.use_global_file:
    manager.generate_global_info(compress_power=4)
```

5. Document Parts Generation

```python
manager.generete_doc_parts(
    max_symbols=structure_settings.max_doc_part_size,
    with_global_file=structure_settings.use_global_file
)
```

6. Custom Module Injection – `manager.factory_generate_doc(DocFactory(*custom_modules))`

7. Re‑ordering (Optional)

```python
if structure_settings.include_order:
    manager.order_doc()
```

8. Intro Modules (conditioned on flags)

```python
additionals_modules = []
if structure_settings.include_intro_text:
    additionals_modules.append(IntroText())
if structure_settings.include_intro_links:
    additionals_modules.append(IntroLinks())
manager.factory_generate_doc(
    DocFactory(*additionals_modules, with_splited=False),
    to_start=True
)
```

9. Cleanup & Persist – `manager.clear_cache()` followed by `manager.save()`.

10. Return Value – the assembled Markdown string:

```python
return manager.doc_info.doc.get_full_doc()
```

Return: `str` – the complete README content.
The __main__ block simply loads autodocconfig.yml, parses it with read_config, and calls gen_doc on the current directory.
Key Interactions
| Component | Interaction | Outcome |
|---|---|---|
| `Manager` | `generate_code_file` → `generate_global_info` | Pre‑processing pipeline that creates a cached, compressed representation of the code. |
| `DocFactory` | `factory_generate_doc` | Instantiates and processes each `BaseModule`, which in turn calls `GPTModel.generate_answer`. |
| `GPTModel` | LLM requests | Generates Markdown for each code chunk or module. |
| `ConsoleGtiHubProgress` | Progress callbacks | UI feedback during long operations. |
| `CustomModule` | Prompt injection | Allows users to embed arbitrary LLM prompts. |
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_path` | `str` | Input | Directory to scan. |
| `config.ignore_files` | `list[str]` | Filters | Files/directories excluded from the scan. |
| `config.language` | `str` | LLM context | Determines language of prompts. |
| `config.project_name` | `str` | Metadata | Populated into global information. |
| `structure_settings.max_doc_part_size` | `int` | Chunk size limit | Symbol cap for each LLM call. |
| `manager.doc_info.doc` | `DocHeadSchema` | Resulting document | Exposed via `get_full_doc()`. |
Project Metadata (pyproject.toml)
Purpose
Defines packaging metadata, runtime dependencies, and the build system
configuration for the Auto‑Doc Generator.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `[project]` | section | PEP 621 metadata | Includes name, version, authors, license, readme, requires‑python |
| `dependencies` | list | Runtime requirements | Uses pinned versions for stability |
| `[build-system]` | section | Build backend | Uses poetry‑core to build a wheel |
Key Parameters
| Parameter | Value | Effect |
|---|---|---|
| `name` | `autodocgenerator` | Package name used on PyPI |
| `version` | `1.0.3.3` | Semantic versioning tag |
| `requires-python` | `>=3.11,<4.0` | Ensures compatibility with Python 3.11+ |
| `license.text` | `MIT` | Open‑source license |
| `readme` | `README.md` | Primary long‑description source |
| `dependencies` | extensive list | Includes rich, pyyaml, pydantic, groq, etc. |
| `build-system.requires` | `poetry-core>=2.0.0` | Specifies the backend required to build the project |
Side Effect
The `pyproject.toml` is consumed by Poetry (or compatible tools) during installation or packaging, automatically pulling the specified dependencies and ensuring the runtime environment matches the configuration.
ProjectSettings (preprocessor.settings)
Core Responsibility
Holds a per‑project prompt template used by compression and other LLM interactions. Allows arbitrary key/value metadata to be inserted into the prompt.
| Method | Role | Notes |
|---|---|---|
| `__init__(project_name)` | Initializes with the project name; starts with an empty info dict. | |
| `add_info(key, value)` | Stores custom metadata. | |
| `prompt` (property) | Builds a composite prompt string: base template + project name + all info key/value pairs. | Uses the `BASE_SETTINGS_PROMPT` constant from `engine.config.config`. |
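A hedged sketch of the class per the table above; the exact layout of the composed prompt string is an assumption.

```python
from autodocgenerator.engine.config.config import BASE_SETTINGS_PROMPT  # assumed path

class ProjectSettings:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self.info: dict[str, str] = {}  # starts empty, per the table

    def add_info(self, key: str, value: str) -> None:
        self.info[key] = value

    @property
    def prompt(self) -> str:
        # base template + project name + all key/value pairs (layout assumed)
        lines = [BASE_SETTINGS_PROMPT, f"Project name: {self.project_name}"]
        lines += [f"{key}: {value}" for key, value in self.info.items()]
        return "\n".join(lines)
```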
GPTModel: LLM Request Handler
Role
Acts as the bridge between the Auto‑Doc Generator pipeline and an external Groq‑powered language model.
- Manages a rotating pool of API keys and model names.
- Provides a synchronous API (`generate_answer`) used by `DocFactory` and higher‑level orchestrators.
- Emits detailed logs via `BaseLogger`.
Key Interactions
| Called By | Outcome |
|---|---|
| `DocFactory.factory_generate_doc` | LLM generates Markdown for a code chunk or module. |
| `Manager.generate_global_info` | Requests auxiliary documentation pieces (e.g., project overview). |
| `ConsoleGitHubProgress` | Uses the logs generated by `GPTModel` for UI feedback (implicit through `BaseLogger`). |
Model Hierarchy & History Context
| Class | Base | Purpose |
|---|---|---|
| `History` | – | Stores a list of `{role, content}` objects representing the conversation. |
| `ParentModel` | `ABC` | Holds common state: `api_keys`, `history`, shuffling of `models_list`. |
| `Model` | `ParentModel` | Synchronous implementation of the LLM interface. |
| `AsyncModel` | `ParentModel` | Asynchronous counterpart (currently unimplemented). |
| `AsyncGPTModel` | `AsyncModel` | Stub for future async support. |
Critical Logic
- Constructor – Shuffles `models_list` if `use_random=True` and initializes indices.
- `generate_answer` – Attempts to call `client.chat.completions.create`.
- Error Recovery – On exception, rotates to the next API key and/or model until the pool is exhausted (a sketch of this loop follows).
- Result Handling – Extracts `choices[0].message.content`, logs success, and returns an empty string if `None`.
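The documented rotation strategy can be sketched as follows (shown as a standalone method; the index bookkeeping and client re-creation details are assumptions, only the overall strategy comes from the source):

```python
from groq import Groq

from autodocgenerator.engine.exceptions import ModelExhaustedException

def generate_answer(self, messages: list[dict[str, str]]) -> str:
    while True:
        model_name = self.regen_models_name[self.current_model_index]
        try:
            chat_completion = self.client.chat.completions.create(
                messages=messages,
                model=model_name,
            )
            content = chat_completion.choices[0].message.content
            return content if content is not None else ""
        except Exception:
            # drop the failing model; when the pool empties, move to the next key
            self.regen_models_name.pop(self.current_model_index)
            if not self.regen_models_name:
                self.current_key_index += 1
                if self.current_key_index >= len(self.api_keys):
                    raise ModelExhaustedException()
                self.client = Groq(api_key=self.api_keys[self.current_key_index])
                self.regen_models_name = list(self.models_list)
            self.current_model_index = 0
```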
Progress Interface – autodocgenerator/ui/progress_base.py
| Entity | Type | Role | Notes |
|---|---|---|---|
| `BaseProgress` | Interface | Defines the progress API | Methods are no‑ops or placeholders |
| `LibProgress` | Rich‑based implementation | Uses `rich.progress.Progress` to show a main task and optional subtasks | `create_new_subtask()`, `update_task()`, `remove_subtask()` |
| `ConsoleTask` | Simple console helper | Prints the percentage of a single task | Not thread‑safe, used by GitHub progress |
| `ConsoleGtiHubProgress` | GitHub‑friendly wrapper | Falls back to console output when Rich is absent | Delegates to `ConsoleTask` instances |
Key Logic Flow – `update_task()` checks for an active subtask; if none, it advances the base task, otherwise it advances the current sub‑task.
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` (in `compress`) | `str` | Input text to be compressed. | Raw Markdown, code snippets, or other free text. |
| `project_settings` | `ProjectSettings` | Provides contextual prompt information. | Includes `BASE_SETTINGS_PROMPT` and user‑defined metadata. |
| `model` | `Model` | LLM wrapper exposing `get_answer_without_history`. | Must be an instance of `engine.models.gpt_model.GPTModel` or compatible. |
| `compress_power` | `int` | Compression granularity hint. | Influences prompt construction and bucket size in higher‑level functions. |
| `progress_bar` | `BaseProgress` | UI feedback. | Default‑constructed instance if omitted. |
Data Contract for GPTModel
| Entity | Type | Role | Notes |
|---|---|---|---|
| `api_key` | `list[str]` | Credentials for Groq client | Default: `API_KEYS` from `config.config`. |
| `history` | `History` | Conversation history | Contains the system prompt on initialization. |
| `models_list` | `list[str]` | Candidate LLM models | Defaults include `gpt-oss-120b`, `llama-3.3-70b-versatile`, `gpt-oss-safeguard-20b`. |
| `use_random` | `bool` | Shuffle model list | `True` by default. |
| `client` | `Groq` | Active Groq client instance | Re‑instantiated when key changes. |
| `messages` | `list[dict[str, str]]` | Chat messages | Either `history.history` or a provided prompt. |
| `regen_models_name` | `list[str]` | Remaining models to try | Updated during error handling. |
| `current_model_index` | `int` | Index in `regen_models_name` | Rotated after a failed request. |
| `current_key_index` | `int` | Index in `api_keys` | Rotated after a failed request. |
| `result` | `str` | LLM response | Returned to caller; logged at level 2. |
Repository Content Packing – CodeMix
| Entity | Type | Role | Notes |
|---|---|---|---|
| `root_dir` | `Path` | Base directory for traversal | Default `.` resolved to an absolute path. |
| `ignore_patterns` | `list[str]` | Patterns to skip | Used by `should_ignore`. |
| `logger` | `BaseLogger` | Logging helper | Emits ignored‑file messages. |
| `should_ignore(path)` | method | Determines if a file/directory should be excluded | Uses `fnmatch` against path parts and the basename. |
| `build_repo_content()` | method | Generates a Markdown representation of the repository | Returns a single string. |
Logic Flow
- Append a header "Repository Structure:".
- Walk the file tree (`rglob("*")`).
- For each path:
  - Skip if `should_ignore(path)` is `True` (log at level 1).
  - Calculate depth and indentation; add a line with either `<dir_name>/` or the file name.
- Append a separator of equal signs.
- Walk again, this time adding file contents:
  - For each file not ignored, write a `<file path="relative_path">` marker, the file text (UTF‑8, ignore errors), then a newline.
  - Catch read errors and include an error message line.
- Return the joined string.
Result – A consolidated Markdown block describing the repository layout and all source file contents.
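The ignore check from the table can be sketched with `fnmatch`, matching both individual path parts and the basename (the exact matching semantics are assumed):

```python
from fnmatch import fnmatch
from pathlib import Path

def should_ignore(path: Path, ignore_patterns: list[str]) -> bool:
    for pattern in ignore_patterns:
        # match any component of the path (e.g. '__pycache__', '.git')
        if any(fnmatch(part, pattern) for part in path.parts):
            return True
        # match the basename (e.g. '*.pyc')
        if fnmatch(path.name, pattern):
            return True
    return False
```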
generate_code_file
| Action | Description | Dependencies |
|---|---|---|
| Calls `CodeMix(project_directory, config.ignore_files)` | Builds a flattened source string | `preprocessor.code_mix.CodeMix` |
| Stores the result in `self.doc_info.code_mix` | Centralised repository of raw code | `DocInfoSchema.code_mix` |
| Updates progress | Signals step completion | `BaseProgress.update_task()` |
Side‑effect – Emits an `InfoLog` entry for start/finish.
Code Splitting & Chunking Logic
```python
while True:
    have_to_change = False
    for i, el in enumerate(splited_by_files):
        if len(el) > max_symbols * 1.5:
            # bisect the oversized fragment in place
            splited_by_files.insert(i + 1, el[int(max_symbols / 2):])
            splited_by_files[i] = el[:int(max_symbols / 2)]
            have_to_change = True
    if not have_to_change:
        break
```
- Purpose – Iteratively bisects any source fragment that exceeds 1.5 × the maximum allowed symbol count (`max_symbols`).
- Behaviour – The loop terminates once every element in `splited_by_files` is below the threshold.
- Side‑effects – Mutates `splited_by_files` in‑place and records progress via `BaseLogger`.
- Edge – If an element is exactly at the threshold it is not split, ensuring minimal churn.
Key Functions
| Function | Purpose | Parameters | Returns | Notes |
|---|---|---|---|---|
| `compress(data: str, project_settings: ProjectSettings, model: Model, compress_power) -> str` | Sends a single text block to an LLM for compression using a dynamic prompt. | `data`: text to compress; `project_settings`: contextual prompts; `model`: LLM interface (`Model`); `compress_power`: numeric hint for prompt length | Compressed text string returned by the LLM | Uses `get_BASE_COMPRESS_TEXT(len(data), compress_power)` to prepend token‑limit instructions. |
| `compress_and_compare(data: list, model: Model, project_settings: ProjectSettings, compress_power: int = 4, progress_bar: BaseProgress = BaseProgress()) -> list` | Aggregates multiple items into compressed buckets of size `compress_power`. | `data`: list of strings to compress; `model`, `project_settings`: as above; `compress_power`: how many items per bucket; `progress_bar`: progress feedback | List of compressed strings; each element represents a bucket | Logs progress via `BaseProgress`. |
| `compress_to_one(data: list, model: Model, project_settings: ProjectSettings, compress_power: int = 4, progress_bar: BaseProgress = BaseProgress()) -> str` | Iteratively merges the list returned by `compress_and_compare` until a single string remains. | Same as `compress_and_compare` | Final single compressed document | Uses a loop that halves the list size; `new_compress_power` is reduced to 2 when the remaining list is smaller than `compress_power + 1`. |
Compressor Module – autodocgenerator.preprocessor.compressor
Core Responsibility
The compressor module condenses raw source‑code or documentation fragments into a smaller representation suitable for LLM‑based processing. It orchestrates one‑to‑many compression passes, progressively merging chunks until a single compressed document remains.
Compression Logic Flow
1. Prompt Construction – System messages: the project‑specific prompt (`project_settings.prompt`) and a size‑aware compression directive (`get_BASE_COMPRESS_TEXT`). User message: the raw `data` string.
2. LLM Invocation – `model.get_answer_without_history(prompt)` is called synchronously; the response is a compressed string.
3. Batch Compression – `compress_and_compare` groups incoming items (the `data` list) into buckets of `compress_power`. Each bucket's aggregated content is compressed via `compress` and appended with a newline.
4. Recursive Reduction – `compress_to_one` repeatedly calls `compress_and_compare`, reducing the list size until a single string is produced. When the remaining list length is below `compress_power + 1`, the bucket size is lowered to `2` to ensure convergence (see the sketch after this list).
5. Result – A single Markdown string that encapsulates the entire repository or documentation section, ready for further post‑processing or file output.
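A minimal sketch of the reduction loop, assuming the signatures from the Key Functions table; `compress_and_compare` is the documented helper, and the simplified `progress_bar=None` default is an assumption.

```python
def compress_to_one(data: list, model, project_settings,
                    compress_power: int = 4, progress_bar=None) -> str:
    while len(data) > 1:
        new_compress_power = compress_power
        if len(data) < compress_power + 1:
            new_compress_power = 2  # smaller buckets guarantee convergence
        data = compress_and_compare(
            data, model, project_settings,
            compress_power=new_compress_power,
            progress_bar=progress_bar,
        )
    return data[0]
```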
Re‑assembly into Fixed‑Size Parts
```python
curr_index = 0
for el in splited_by_files:
    # open a new accumulator slot when needed
    if len(split_objects) - 1 < curr_index:
        split_objects.append("")
    # start a fresh chunk once the current one would exceed the budget
    if len(split_objects[curr_index]) + len(el) > max_symbols * 1.25:
        curr_index += 1
        split_objects.append(el)
        continue
    split_objects[curr_index] += "\n" + el
```
- Goal – Concatenate split fragments into `split_objects` such that each accumulated string stays within 1.25 × `max_symbols`.
- Result – Returns a list of strings, each a "clean" chunk ready for LLM consumption.
- Logging – `BaseLogger` reports the final chunk count.

Anchor‑Tagged Description Output (fragment)
- The prompt requires that generated section titles start with an `<a name>` tag and contain no file paths, extensions, or generic terms.
- Send to the LLM.
- Return the raw answer.

Result – A formatted, tag‑prefixed description suitable for insertion into documentation sections.
DocFactory – Orchestrating Module Execution
```python
class DocFactory:
    def generate_doc(self, info: dict, model: Model, progress: BaseProgress) -> DocHeadSchema:
        ...
```
- Purpose – Sequentially runs each `BaseModule`, splits the result on anchor markers, and aggregates a `DocHeadSchema`.
- Workflow
  - Create a sub‑task counter in `progress`.
  - For every module, call `module.generate(info, model)`.
  - If `with_splited` is `True`, split the returned string using `split_text_by_anchors` and add each fragment to `doc_head` with its key.
  - Log at level 1 (module finished) and level 2 (raw output).
  - Increment progress; remove the sub‑task after all modules finish.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Repository metadata | Provided by `Manager`; keys: `code_mix`, `full_data`, `global_info`, `language`, … |
| `model` | `Model` | LLM wrapper | Handles key rotation, history, and HTTP calls. |
| `progress` | `BaseProgress` | UI progress | Tracks sub‑task count and updates. |
| `doc_head` | `DocHeadSchema` | Result container | Holds named `DocContent` entries. |
Documentation Data Structures
```python
class DocContent(BaseModel):
    content: str
```

- Holds a raw markdown fragment.

```python
class DocHeadSchema(BaseModel):
    content_orders: list[str] = []
    parts: dict[str, DocContent] = {}

    def add_parts(self, name, content: DocContent):
        ...

    def get_full_doc(self, split_el: str = "\n") -> str:
        ...

    def __add__(self, other: "DocHeadSchema") -> "DocHeadSchema":
        ...
```

- Ordering – `content_orders` preserves insertion order for deterministic rendering.
- Merging – `__add__` concatenates two schemas, ensuring no key clashes by renaming.

```python
class DocInfoSchema(BaseModel):
    global_info: str = ""
    code_mix: str = ""
    doc: DocHeadSchema = Field(default_factory=DocHeadSchema)
```

- Aggregates the global metadata, source mix, and generated documentation.

Warning – All schemas derive from pydantic and are serialisable via `dict()`. No custom validation beyond field types.
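A hedged completion of the elided bodies, matching the two documented behaviours (ordered rendering and clash‑free merging); the renaming scheme on clashes is an assumption.

```python
from pydantic import BaseModel

class DocContent(BaseModel):
    content: str

class DocHeadSchema(BaseModel):
    content_orders: list[str] = []
    parts: dict[str, DocContent] = {}

    def add_parts(self, name: str, content: DocContent) -> None:
        self.content_orders.append(name)
        self.parts[name] = content

    def get_full_doc(self, split_el: str = "\n") -> str:
        # render parts in insertion order for deterministic output
        return split_el.join(self.parts[n].content for n in self.content_orders)

    def __add__(self, other: "DocHeadSchema") -> "DocHeadSchema":
        merged = DocHeadSchema()
        for n in self.content_orders:
            merged.add_parts(n, self.parts[n])
        for n in other.content_orders:
            new_name = n if n not in merged.parts else f"{n}_copy"  # assumed rename
            merged.add_parts(new_name, other.parts[n])
        return merged
```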
generete_custom_discription(splited_data: str, model: Model, custom_description: str, language: str = "en") → str
| Entity | Type | Role | Notes |
|---|---|---|---|
| `splited_data` | `str` | Iterable of code snippets | The function iterates over it, treating each element as a chunk (likely intended to be a list). |
| `model` | `Model` | LLM wrapper | Calls `get_answer_without_history`. |
| `custom_description` | `str` | Prompt text | Task description for the LLM. |
| `language` | `str` | Target language | System instruction. |
| `result` | `str` | Accumulated answer | Returned when a valid response is found. |
Logic Flow
- Loop over `splited_data`.
- For each `sp_data`, build a prompt with context, `BASE_CUSTOM_DISCRIPTIONS`, and the task.
- Query the LLM.
- If the answer does not contain `!noinfo` or "No information found", or if `!noinfo` occurs past 30 chars, break the loop.
- Return the first satisfactory `result`.
Result – A concise description generated for the supplied custom task, or an empty string if no info is available.
generete_custom_discription_without(model: Model, custom_description: str, language: str = "en") → str
| Entity | Type | Role | Notes |
|---|---|---|---|
| `model` | `Model` | LLM wrapper | Calls `get_answer_without_history`. |
| `custom_description` | `str` | Prompt text | Task description for the LLM. |
| `language` | `str` | Target language | System instruction. |
Logic Flow
- Build a prompt from the language instruction and the task description, query the LLM once (no repository context), and return the raw answer.
gen_doc_parts – Orchestrator
| Entity | Type | Role | Notes |
|---|---|---|---|
| `full_code_mix` | `str` | All source text | Input to be split. |
| `max_symbols` | `int` | Size threshold | Determines chunk boundaries. |
| `model` | `Model` | LLM wrapper | Same contract as above. |
| `project_settings` | `ProjectSettings` | Configuration holder | Provides `prompt`. |
| `language` | `str` | Target language | |
| `progress_bar` | `BaseProgress` | UI feedback | Can be a concrete implementation. |
| `global_info` | `str \| None` | Repository‑wide metadata | |
Workflow
- Split – `split_data(full_code_mix, max_symbols)` produces a list of chunks.
- Progress – `create_new_subtask` tracks the number of chunks.
- Iterate – For every chunk:
  - Call `write_docs_by_parts`.
  - Append the result to `all_result`.
  - Keep only the last 3 000 characters of the current result for context in the next call (`result = result[len(result) - 3000:]`).
  - Update progress.
- Finalize – Remove the subtask, log the total length, and return `all_result`.
Edge Cases
generate_global_info
| Parameter | Default | Role |
|---|---|---|
| `compress_power` | `4` | Compression aggressiveness |
| `max_symbols` | `10000` | Symbol size of the initial chunks |
Flow
- `split_data(full_code_mix, max_symbols)` – chunk the code mix.
- `compress_to_one` – feeds chunks to `llm_model` with project settings; returns a single‑string global doc fragment.
- Persist the fragment to `global_info.md`.
- Store it in `self.doc_info.global_info`.
- Log and advance progress.

Note – `compress_power` controls how many top‑level sections the compressor keeps.
generete_doc_parts
| Parameter | Default | Role |
|---|---|---|
| `max_symbols` | `5_000` | Max chunk size for part generation |
| `with_global_file` | `False` | Whether to prepend global content |
Sequence
- Read the cached `global_info` file (ignores the passed `with_global_file`).
- Call `gen_doc_parts` with:
  - `full_code_mix`
  - `max_symbols`
  - `llm_model`
  - project settings from `config`
  - language from `config`
  - the progress bar
  - the `global_info` payload.
- Persist raw output to `output_doc.md`.
- Split the output into sections by anchors (`split_text_by_anchors`).
- Store each section into `self.doc_info.doc` (`DocContent`).
Output – The fully stitched Markdown string, later refined by post‑processors.
write_docs_by_parts – One‑Chunk LLM Pass
| Entity | Type | Role | Notes |
|---|---|---|---|
| `part` | `str` | Raw code fragment | Must be a single chunk from `gen_doc_parts`. |
| `model` | `Model` | LLM wrapper | Exposes `get_answer_without_history`. |
| `project_settings` | `ProjectSettings` | Configuration holder | Supplies `prompt` and other constants. |
| `prev_info` | `str \| None` | Last LLM output | |
| `language` | `str` | Target language | e.g., `"en"`. |
| `global_info` | `str \| None` | Repository‑wide metadata | |
Prompt Assembly
- System messages – three entries:
  - Language directive (`"For the following task use language {language}"`).
  - Global project metadata (`project_settings.prompt`).
  - Pre‑defined completion template (`BASE_PART_COMPLITE_TEXT`).
- Optional system messages – appended when `global_info` or `prev_info` exist.
- User message – the actual `part` of source code.

Important – The LLM receives no history; each chunk is independent except for `prev_info`, which is supplied as a system prompt (the assembly is sketched below).
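The assembly can be sketched as a helper; the directive strings are paraphrased, the real ones live in `engine/config/config.py`, and `build_part_prompt` is a hypothetical name.

```python
def build_part_prompt(part: str, project_settings, language: str,
                      prev_info: str | None, global_info: str | None,
                      base_template: str) -> list[dict[str, str]]:
    prompt = [
        {"role": "system", "content": f"For the following task use language {language}"},
        {"role": "system", "content": project_settings.prompt},
        {"role": "system", "content": base_template},  # BASE_PART_COMPLITE_TEXT
    ]
    if global_info is not None:
        prompt.append({"role": "system", "content": global_info})
    if prev_info is not None:
        prompt.append({"role": "system", "content": prev_info})
    prompt.append({"role": "user", "content": part})
    return prompt
```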
Response Handling
```python
answer: str = model.get_answer_without_history(prompt=prompt)
temp_answer = answer.removeprefix("```")
```

- Removes leading triple backticks that some LLMs prepend.
- If the response is identical to `temp_answer`, the function returns it directly.
- Otherwise, trailing backtick fences are stripped and the cleaned string is returned.
factory_generate_doc
| Parameter | Type | Role |
|---|---|---|
| `doc_factory` | `DocFactory` | Provides a list of `BaseModule` instances |
| `to_start` | `bool` | Determines whether to prepend or append generated content |
Workflow
- Compose a context dictionary:

```python
info = {
    "language": config.language,
    "full_data": curr_doc,  # existing markdown
    "code_mix": self.doc_info.code_mix,
    "global_info": self.doc_info.global_info
}
```

- Log the module names and input keys.
- Invoke `doc_factory.generate_doc(info, llm_model, progress_bar)`.
- Merge the `result` with `self.doc_info.doc`, either at the start or at the end.
- Increment progress.
Key point – `DocFactory` internally iterates over its `BaseModule` children, each of which calls the LLM via `generete_custom_discription` or similar functions.
order_doc
| Action | Description |
|---|---|
| Calls `get_order` | Reorders `self.doc_info.doc.content_orders` according to LLM‑generated suggestions. |
Result – The final section ordering is stored in `doc_info.doc`.
Cross‑Module Interactions
- `sorting.py` imports `Model` from `engine.models.model` and logging classes from `ui.logging`.
- `extract_links_from_start` and `split_text_by_anchors` work together to convert a raw Markdown document into an anchor‑based mapping.
- `get_order` relies on a `Model` implementation that provides `get_answer_without_history`; no internal state is mutated.
- `CodeMix` uses `fnmatch` for ignore logic and logs via `BaseLogger`; it is independent of any LLM components.
- All modules expose purely functional logic; any persistence or higher‑level orchestration occurs elsewhere in the Auto‑Doc Generator pipeline.
Cross‑Module Interactions
| Component | Interaction |
|---|---|
| `compress` → `engine.models.model.Model` | Calls `get_answer_without_history(prompt)` to obtain LLM output. |
| `compress` → `engine.config.config.get_BASE_COMPRESS_TEXT` | Generates a system instruction based on input length and compression power. |
| `compress_and_compare` → `BaseProgress` | Creates a sub‑task and updates it per iteration. |
| `compress_to_one` → `compress_and_compare` | Re‑uses it to progressively merge data. |
| `ProjectSettings` → `compress` / `compress_and_compare` | Supplies `project_settings.prompt` for system messages. |
Note – All imports are explicit; no implicit external library usage beyond those declared.
Cross‑Module Interactions
- Uses `BASE_INTRODACTION_CREATE_LINKS`, `BASE_INTRO_CREATE`, and `BASE_CUSTOM_DISCRIPTIONS` from `engine.config.config`.
- Relies on `GPTModel` (a subclass of `Model`) to perform LLM calls.
- Logs via `BaseLogger` / `InfoLog`; no error handling is performed, so exceptions bubble to the caller.
- Functions are pure helpers; no state is held within the module.
Observations
- `generete_custom_discription` iterates over a `str`, which is likely an unintended bug if a list of strings is expected.
- All functions return raw LLM responses; downstream code is responsible for formatting and integration.
- Logging verbosity can be adjusted through the `InfoLog` level parameter.
Default Ignore List
| Pattern | Effect |
|---|---|
| `*.pyo`, `*.pyd`, `*.pdb`, `*.pkl`, `*.log`, `*.sqlite3`, `*.db` | Binary and log files |
| `venv`, `env`, `.venv`, `.env`, `.vscode`, `.idea`, `*.iml`, `.gitignore`, `.ruff_cache` | Virtual environments, IDE files, caching |
| `*.pyc`, `__pycache__`, `.git`, `.coverage`, `htmlcov`, `migrations`, `*.md`, `static`, `staticfiles`, `.mypy_cache` | Compiled Python, CI artifacts, markdown, static assets, type‑check cache |

Usage – Passed to the `CodeMix` constructor to exclude unwanted files from the generated content.
Semantic Ordering via LLM
| Entity | Type | Role | Notes |
|---|---|---|---|
| `model` | `Model` | LLM wrapper | Must expose `get_answer_without_history`. |
| `chanks` | `list[str]` | Section titles to reorder | Passed directly into the prompt. |
| `logger` | `BaseLogger` | Logging helper | Records start/end of ordering. |
| `prompt` | `list[dict]` | Messages sent to the LLM | Contains a user‑role prompt that instructs the model to return a comma‑separated list. |
| `result` | `str` | Raw LLM output | Returned by `get_answer_without_history`. |
| `new_result` | `list[str]` | Cleaned, ordered titles | Result of splitting and trimming `result`. |
Logic Flow
- Log the start of ordering and the titles to process.
- Build a single user message asking the LLM to sort the titles semantically, keeping `#` prefixes and not adding explanatory text.
- Call `model.get_answer_without_history(prompt)`.
- Split the returned string on commas, strip whitespace, and store the result in `new_result`.
- Log the final list and return it.
Result – A list of titles in a LLM‑determined semantic order.
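A sketch of the documented call; the prompt wording is paraphrased, not copied from the source.

```python
def get_order(chanks: list[str], model) -> list[str]:
    prompt = [{
        "role": "user",
        "content": (
            "Sort these section titles semantically. Leave the # in each "
            "title and return only a comma-separated list, no explanations:\n"
            + ", ".join(chanks)
        ),
    }]
    result = model.get_answer_without_history(prompt=prompt)
    return [title.strip() for title in result.split(",")]
```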
Anchor Extraction Logic
| Entity | Type | Role | Notes |
|---|---|---|---|
| `extract_links_from_start(chunks)` | function | Parses a list of Markdown‑section strings for a leading `<a name="..."></a>` tag. | Returns a list of anchor links (prefixed with `#`) and a boolean indicating whether the first chunk should be discarded. |
| `links` | `list[str]` | Collected anchor names | Each anchor name longer than 5 characters is considered valid. |
| `have_to_del_first` | `bool` | Flag for removal of the first chunk | If any chunk lacks a valid anchor, the first chunk is marked for deletion. |
Logic Flow
- Iterate over `chunks`.
- For each `chunk`, strip whitespace and search for the regex pattern `^<a name=["\']?(.*?)["\']?>`.
- If a match is found and the captured name exceeds five characters, append `#<name>` to `links`.
- If no match exists for a chunk, set `have_to_del_first` to `True`.
- Return `(links, have_to_del_first)`.
Result – A tuple used by `split_text_by_anchors` to identify and manage anchor boundaries.
Text Splitting by Anchors
| Entity | Type | Role | Notes |
|---|---|---|---|
| `text` | `str` | Raw README content | Expected to contain `<a name="..."></a>` anchors. |
| `chunks` | `list[str]` | Sub‑strings separated by the anchor regex | Derived via `re.split`. |
| `result_chanks` | `list[str]` | Cleaned, non‑empty chunks | Trims whitespace. |
| `all_links`, `have_to_del_first` | tuple | Result from `extract_links_from_start` | Determines if the first chunk must be removed. |
| `result` | `dict[str, str]` | Mapping of anchor link → section text | Returned by the function. |
Logic Flow
- Split `text` at every anchor point using `(?=<a name=...)` (look‑ahead).
- Trim whitespace from each resulting chunk.
- Invoke `extract_links_from_start` on the cleaned chunks.
- Detect whether the overall file starts with an anchor or the first chunk should be dropped; if so, `pop(0)`.
- Verify that the number of links matches the number of remaining chunks; otherwise raise an exception.
- Build a dictionary mapping each `#anchor` to its corresponding section text and return it.
Result – A deterministic mapping of anchors to their associated Markdown sections.
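A sketch of the documented split‑and‑map flow, assuming the `extract_links_from_start` helper described above; the exception message is an assumption.

```python
import re

def split_text_by_anchors(text: str) -> dict[str, str]:
    # split at every anchor using a look-ahead so the tag stays in its chunk
    chunks = re.split(r'(?=<a name=)', text)
    result_chanks = [c.strip() for c in chunks if c.strip()]
    all_links, have_to_del_first = extract_links_from_start(result_chanks)
    if have_to_del_first:
        result_chanks.pop(0)  # drop a leading chunk with no anchor
    if len(all_links) != len(result_chanks):
        raise Exception("anchor count does not match chunk count")
    return dict(zip(all_links, result_chanks))
```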
File Path Helpers
| Method | Purpose | Notes |
|---|---|---|
| `get_file_path(file_key)` | Resolve the absolute cache file path | Uses the `FILE_NAMES` mapping |
| `read_file_by_file_key(file_key)` | Load cached content | Returns `None` on failure |
clear_cache
If the configuration flag pbc.save_logs is False, the method deletes the cached log file.
save
Writes the fully assembled documentation (self.doc_info.doc.get_full_doc()) to output_doc.md in the cache directory.
Error Handling
- `read_file_by_file_key` swallows all exceptions and returns `None`.
- All LLM calls inside other methods propagate `ModelExhaustedException` if no key remains.
- No explicit `try/except` around network or file operations beyond the minimal wrapper, keeping responsibility at the caller level.
Summary of Manager Responsibilities
- Cache Management – Creates and cleans `.auto_doc_cache`.
- Source Aggregation – Produces a single `code_mix` string from the repo.
- Global Compression – Condenses the entire code mix into one Markdown snippet.
- Chunked Generation – Breaks the mix into manageable parts, queries the LLM, and stitches the results.
- Factory‑based Expansion – Allows plug‑in modules to add or modify sections.
- Ordering & Persistence – Orders sections and writes the final `README.md`.
All interactions are strictly local or via the provided Model and DocFactory interfaces; no external services are invoked outside of the LLM wrapper.
Edge Cases & Error Handling
| Scenario | Current Behavior | Potential Issue |
|---|---|---|
| `data` contains more than `compress_power` items | Processed in multiple passes | None |
| LLM returns an empty string | `compress` returns an empty string | Missing documentation content |
| `progress_bar` is the base class (no UI) | Operations run silently | UI feedback unavailable |
| `model.get_answer_without_history` raises an exception | Propagates upwards | No retry logic implemented |
autodocgenerator.postprocessor.custom_intro – Module Overview
This module provides helper utilities for enriching the generated Markdown with hyperlinks, introductory text, and custom‑section descriptions. All LLM interactions are delegated to a Model instance passed in as an argument. Logging is performed via the singleton BaseLogger.
Summary
This fragment delivers a logging singleton and progress reporting utilities, together with a PowerShell bootstrapper that scaffolds GitHub Actions and configuration files for the Auto‑Doc Generator. All classes are lightweight, rely only on the standard library (plus rich for CLI progress), and expose a consistent API for the rest of the pipeline.
Summary
The provided fragment implements the chunk‑based documentation pipeline:
- Chunking – Splits raw source into size‑bounded parts, ensuring no individual chunk exceeds a token‑like limit.
- LLM Pass – Each part is sent to a configured GPT model (`Model`) with a rich system prompt derived from `ProjectSettings` and optional contextual messages.
- Aggregation – Results are concatenated, trimmed, and returned as a single markdown string.
- Schema – Generated fragments are wrapped in `DocHeadSchema` / `DocInfoSchema` for later assembly.
All interactions are pure except for logger and progress updates, keeping the core logic deterministic and testable.
Example Call Sequence
```python
# Inside Manager.generate_doc_parts
gpt = GPTModel()
answer = gpt.generate_answer(
    with_history=True,
    prompt=[{"role": "user", "content": "Explain this function"}]
)
```

- `generate_answer` pulls the current conversation history.
- It iterates over `regen_models_name` and `api_keys`.
- On success, it returns a Markdown string; on total exhaustion, it propagates `ModelExhaustedException`.
Constraints & Observations
- Async Support – `AsyncGPTModel` is a placeholder; the asynchronous logic remains unimplemented.
- Error Handling – A generic `Exception` is caught; specific Groq errors are not distinguished.
- Logging Level – `Answer: {result}` is logged at level 2; consumers can tune verbosity via `BaseLogger`.
Observations & Edge Cases
- `extract_links_from_start` assumes the anchor appears at the start of a chunk; any deviation may lead to `have_to_del_first` being `True`.
- `split_text_by_anchors` raises a generic `Exception` if anchor–chunk counts mismatch. No recovery strategy is included.
- `get_order` expects the LLM to honor the instruction "leave # in title"; malformed output will be included as‑is.
- `CodeMix.build_repo_content` writes a literal `"\n\n"` after each file block; if a file contains this pattern, duplication may occur.
- All logging levels are set to `level=1` or default; higher granularity is not provided in the snippets.
The file defines several top‑level keys:
- project_name – the title of the documentation set.
- language – the language to use for generated text.
- ignore_files – a list of glob patterns that will be skipped by the generator. Typical values include cache directories, byte‑code, virtual‑env folders, database files, logs, git artefacts, IDE folders and markdown files.
- build_settings – controls the build process:
  - save_logs – Boolean to keep or discard the log file.
  - log_level – numeric verbosity (e.g., 2).
- structure_settings – governs the layout of the output:
  - include_intro_links – add hyperlinks to the introduction.
  - include_intro_text – add explanatory introductory text.
  - include_order – maintain a defined order for sections.
  - use_global_file – whether to pull content from a shared file.
  - max_doc_part_size – maximum character count per document chunk (here 5000).
- project_additional_info – free‑form fields, such as a project "global idea" description.
- custom_descriptions – a list of template strings that can contain placeholders and are inserted into the final documentation. The example items illustrate installing the generator, describing configuration options, and using the Manager class. An illustrative file is sketched below.
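An illustrative configuration matching the keys above; the values are examples, not the project's defaults, and the key names inside `project_additional_info` and `custom_descriptions` are assumptions.

```python
import yaml

EXAMPLE_CONFIG = """
project_name: MyProject
language: en
ignore_files:
  - "*.pyc"
  - "__pycache__"
  - ".git"
build_settings:
  save_logs: true
  log_level: 2
structure_settings:
  include_intro_links: true
  include_intro_text: true
  include_order: true
  use_global_file: true
  max_doc_part_size: 5000
project_additional_info:
  global_idea: Short description of what the project does
custom_descriptions:
  - "Describe how to install the generator"
"""

config = yaml.safe_load(EXAMPLE_CONFIG)
print(config["structure_settings"]["max_doc_part_size"])  # -> 5000
```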