autodocgenerator

This Project helps you to create docs for your projects

These details have not been verified by PyPI

Project description

Executive Navigation Tree

📦 Installation & Setup
- Install Workflow Powershell Unix
- Install Scripts Behaviour
📄 Documentation Generation
🛠️ Module & Manager Architecture
📊 Metadata & Configuration
🔧 Processing & Compression
🤖 AI Model Integration

To set up the automated installation for both Windows PowerShell and Unix‑like shells, follow these steps in your GitHub Actions workflow:

1. PowerShell bootstrap (Windows agents)

Use Invoke‑RestMethod (irm) to download the PowerShell bootstrap script directly from the repository and execute it in the same pipeline:

- name: Run PowerShell installer
  if: runner.os == 'Windows'
  shell: pwsh
  run: |
    irm raw.githubusercontent.com/Drag-GameStudio/ADG/main/installer.ps1 | iex

irm fetches the script content.
The pipeline pipes the content to iex (Invoke‑Expression) so the script runs immediately without writing a temporary file.

2. Shell bootstrap (Linux/macOS agents)

Use curl to fetch the Unix shell bootstrap script and pipe it to bash:

- name: Run Unix installer
  if: runner.os != 'Windows'
  run: |
    curl -sSL raw.githubusercontent.com/Drag-GameStudio/ADG/main/installer.sh | bash

-sSL makes curl silent, follows redirects, and ensures TLS is used.
The script is streamed directly into bash for execution.

3. Add required secret for the installer

The installer expects an API key to be provided via a secret environment variable. Define a secret named GROCK_API_KEY in the repository settings (Settings → Secrets → Actions) with the key you obtained from the Grock documentation.

env:
  GROCK_API_KEY: ${{ secrets.GROCK_API_KEY }}

4. Complete workflow example

name: Install ADG

on:
  push:
    branches: [ main ]

jobs:
  install:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ windows-latest, ubuntu-latest, macos-latest ]

    steps:
      - uses: actions/checkout@v3

      # Provide the API key to all steps
      - name: Set API key
        env:
          GROCK_API_KEY: ${{ secrets.GROCK_API_KEY }}

      # Windows PowerShell installer
      - name: Run PowerShell installer
        if: runner.os == 'Windows'
        shell: pwsh
        run: |
          irm raw.githubusercontent.com/Drag-GameStudio/ADG/main/installer.ps1 | iex

      # Linux/macOS shell installer
      - name: Run Unix installer
        if: runner.os != 'Windows'
        run: |
          curl -sSL raw.githubusercontent.com/Drag-GameStudio/ADG/main/installer.sh | bash

Key points to remember

Use the appropriate command (irm + iex for PowerShell, curl + bash for Unix) to pull and execute the remote bootstrap script in one step.
Protect the API key by storing it as a secret (GROCK_API_KEY) and referencing it via ${{ secrets.GROCK_API_KEY }}.
The workflow runs on all major runner OSes, selecting the correct installer automatically.

Installation Scripts (`install.ps1` / `install.sh`)

Script	Action	Result
`install.ps1`	Creates `.github/workflows` directory; writes a GitHub Actions workflow YAML (`autodoc.yml`); writes `autodocconfig.yml` containing current folder name and language `"en"`	PowerShell‑based setup for Windows
`install.sh`	Same as above using Bash; uses `mkdir -p`; writes the workflow YAML with escaped `${{…}}`; creates `autodocconfig.yml` with `project_name` derived from `pwd`	Cross‑platform (Linux/macOS) setup

Both scripts end with a success message printed to the console. The autodoc configuration file is a YAML document with the following top‑level keys and sub‑options:

project_name – a string that defines the name of the project.

language – a string indicating the language for generated documentation (e.g., “en”).

build_settings – a map containing:

save_logs – boolean, whether to keep generation logs.
log_level – integer, level of detail for logging.

structure_settings – a map that controls documentation layout:

include_intro_links – boolean, include introductory hyperlinks.
include_order – boolean, preserve the order of sections.
max_doc_part_size – integer, maximum size (in characters) for each documentation part.

project_additional_info – a map for extra project metadata, for example:

global idea – a descriptive string about the project’s purpose.

custom_descriptions – a list of strings, each providing custom prompts or explanations that will be incorporated into the generated documentation.

When writing the file, use proper YAML indentation, include the keys above, and set values according to the desired behavior. No other options are shown in the provided context.

`DocFactory` – Orchestrator of Module Pipeline

Responsibility – Executes a sequence of BaseModule instances, concatenates their outputs, logs progress, and updates a BaseProgress sub‑task.

Logic Flow

__init__(*modules) stores modules list and creates BaseLogger.
generate_doc(info, model, progress)
- Calls progress.create_new_subtask with module count.
- Iterates over self.modules:
  • module.generate(info, model) → module_result.
  • Appends module_result and two newlines to output.
  • Logs module completion (InfoLog).
  • Logs raw module output at level 2.
  • Calls progress.update_task().
- After loop, progress.remove_subtask().
- Returns the aggregated output.

Data Contract

Entity	Type	Role	Notes
`modules`	`list[BaseModule]`	Generation steps	Provided at construction
`logger`	`BaseLogger`	Central logger	Uses `InfoLog`/`ErrorLog` etc.
`info`	`dict`	Shared context for all modules	Passed unchanged
`model`	`Model`	LLM interface for modules	Same instance reused
`progress`	`BaseProgress`	Progress UI controller	Must support sub‑task API
Return	`str`	Full documentation string	Newlines separate parts

`gen_doc` – Orchestrator of Documentation Generation

Responsibility – Creates the AI models, configures a Manager, drives the full generation cycle, and returns the final markdown document.

Logic Flow

Instantiate GPTModel (sync) and AsyncGPTModel (async) with API_KEY.
Initialise Manager with project_path, the Config, both models, and a ConsoleGtiHubProgress bar.
Sequential calls on manager:
- generate_code_file() – extracts source files.
- generete_doc_parts(max_symbols=structure_settings.max_doc_part_size) – splits docs.
- factory_generate_doc(DocFactory(*custom_modules)) – applies custom modules.
- Conditional order_doc() if include_order is true.
- Conditional factory_generate_doc(DocFactory(IntroLinks())) if include_intro_links is true.
- clear_cache() – removes temporary artefacts.
Retrieve the assembled output via manager.read_file_by_file_key("output_doc").

Data Contract

Entity	Type	Role	Notes
`project_path`	`str`	Filesystem root of the target project	Passed unchanged to `Manager`.
`config`	`Config`	Global configuration (language, ignores, etc.)	Produced by `read_config`.
`custom_modules`	`list[CustomModule]`	Extension points for description generation	May be empty.
`structure_settings`	`StructureSettings`	Controls chunk size & optional sections	Influences `generete_doc_parts` and ordering.
Return value	`str`	Final assembled documentation	Read from manager’s internal cache.

Warning – The function assumes API_KEY is defined in autodocgenerator.engine.config.config; missing or invalid keys will raise at model construction time.

write_docs_by_parts Interaction

Entity	Type	Role	Notes
`part`	`str`	Code fragment to document	Supplied by caller
`model`	`Model`	LLM interface	Provides `get_answer_without_history`
`project_settings`	`ProjectSettings`	Holds `prompt` for global context	Inserted into system messages
`prev_info`	`str	None`	Prior generated doc snippet
`language`	`str`	Desired output language	Defaults `"en"`
Return	`str`	Generated documentation snippet	Stripped of surrounding markdown fences if present

The function builds a prompt (system messages + optional previous info + user part), calls the model, trims leading/trailing ``````` fences, logs the result, and returns the cleaned answer.

async_write_docs_by_parts Async Generation

Mirrors write_docs_by_parts but awaits async_model.get_answer_without_history within a semaphore‑protected block. It also accepts an optional update_progress callback invoked after the LLM call.

gen_doc_parts Orchestration

Entity	Type	Role
`full_code_mix`	`str`	Whole source to split
`max_symbols`	`int`	Size limit for chunks
`model`	`Model`	Synchronous LLM
`project_settings`	`ProjectSettings`	Global prompt
`language`	`str`	Output language
`progress_bar`	`BaseProgress`	Visual progress tracking
Return	`str`	Concatenated documentation

Calls split_data, iterates over each chunk, invokes write_docs_by_parts, appends results, trims the tail to keep a 3000‑character context window, updates the progress bar, and finally returns the aggregated documentation.

async_gen_doc_parts Asynchronous Orchestration

Functions identically to gen_doc_parts but creates tasks that run async_write_docs_by_parts concurrently (semaphore limit = 4) and gathers their results. Progress updates are performed via the supplied update_progress lambda. The final concatenated documentation is returned.

`generate_discribtions_for_code` – LLM‑Driven Doc Generation

Role – Sends each code snippet to a model with a strict “no‑hallucination” prompt, collects the responses.

Entity	Type	Role	Notes
`data`	`list[str]`	Input	Code snippets.
`model`	`Model`	Input	Must support `get_answer_without_history`.
`project_settings`	`ProjectSettings`	—	Not used directly.
`progress_bar`	`BaseProgress`	Internal	Per‑snippet progress.
`describtions`	`list[str]`	Output	Model‑generated documentation strings.

Flow

Create sub‑task sized to len(data).
For each snippet, build a prompt that forces the model to produce a self‑contained description.
Call model.get_answer_without_history.
Append the answer to describtions.
Update progress, remove sub‑task, return the list.

`Manager` – Core Coordination Component

Responsibility – Orchestrates the end‑to‑end documentation pipeline: prepares cache, reads/writes intermediate files, drives code‑mix creation, part‑wise generation, factory‑based enrichment, and final ordering.

Visible Interactions

Imports preprocessing utilities (split_data, gen_doc_parts, CodeMix), post‑processing helpers (get_introdaction, get_all_html_links, get_links_intro), model abstractions (Model, AsyncModel), and UI logging/progress classes.
Persists files under self.CACHE_FOLDER_NAME using paths from self.FILE_NAMES.
Delegates heavy work to external functions (gen_doc_parts, DocFactory.generate_doc, split_text_by_anchors, get_order).

Logic Flow

__init__(project_directory, config, sync_model=None, async_model=None, progress_bar=BaseProgress())
- Stores arguments, creates a logger with file target logs.
- Ensures cache directory exists (.auto_doc_cache).
read_file_by_file_key(file_key) → opens self.get_file_path(file_key) and returns its UTF‑8 content.
get_file_path(file_key) → builds os.path.join(project_directory, CACHE_FOLDER_NAME, FILE_NAMES[file_key]).
generate_code_file() →
- Instantiates CodeMix with project_directory and config.ignore_files.
- Calls cm.build_repo_content to write the code mix to code_mix.txt.
generete_doc_parts(max_symbols=5_000) →
- Reads full code mix, logs start, invokes gen_doc_parts(full_code_mix, max_symbols, sync_model, config.get_project_settings(), config.language, progress_bar).
- Writes the returned markdown to output_doc.md.
factory_generate_doc(doc_factory) →
- Reads current doc and code mix, builds info dict (language, full_data, code_mix).
- Calls doc_factory.generate_doc(info, sync_model, progress_bar).
- Prepends the new fragment to existing doc and writes back.
order_doc() →
- Splits current doc by anchors via split_text_by_anchors, orders sections with get_order, and rewrites the file.
clear_cache() → deletes logs file when config.pbc.save_logs is False.

Data Contract

Method	Input(s)	Output	Side Effects
`__init__`	`project_directory: str`, `config: Config`, optional models, `progress_bar`	`Manager` instance	Creates cache folder, sets up logger
`read_file_by_file_key`	`file_key: str` (key in `FILE_NAMES`)	`str` file content	None
`get_file_path`	`file_key: str`	`str` absolute path	None
`generate_code_file`	–	–	Writes `code_mix.txt`
`generete_doc_parts`	`max_symbols: int`	–	Writes `output_doc.md`
`factory_generate_doc`	`doc_factory: DocFactory`	–	Updates `output_doc.md`
`order_doc`	–	–	Rewrites `output_doc.md`
`clear_cache`	–	–	May delete `report.txt`

Warning – All file operations assume the referenced keys exist in FILE_NAMES; missing keys raise KeyError.

Package Initializer – Logger Bootstrap

Functional role
The autodocgenerator/__init__.py module bootstraps the AutoDocGenerator package by:

Emitting a simple runtime banner (print("ADG")).
Importing the core logging classes from autodocgenerator.ui.logging.
Instantiating a global logger object (BaseLogger).
Attaching a concrete logging template (BaseLoggerTemplate) to the logger.

This makes a ready‑to‑use logger available to every submodule that imports autodocgenerator.

`BaseModule` – Abstract Generation Unit

Responsibility – Serves as the contract for all documentation modules; subclasses must implement generate(info: dict, model: Model).

Logic Flow

Inherits from ABC.
Defines abstract generate.
Provides a no‑op __init__.

Data Contract

Entity	Type	Role	Notes
`info`	`dict`	Input data for generation	Expected keys are module‑specific
`model`	`Model`	LLM wrapper used by modules	Imported from `engine.models.model`
Return	`str`	Generated documentation fragment	Must be a string

`CustomModule` – Context‑Aware Description Generator

Responsibility – Produces a custom description using the code mix (split to ≤ 5000 symbols) and language.

Logic Flow

Stores discription (sic) on init.
generate(info, model) →
- Calls split_data(info.get("code_mix"), max_symbols=5000).
- Passes result, model, stored discription, and info.get("language") to generete_custom_discription.
- Returns the obtained string.

Data Contract

Entity	Type	Role	Notes
`discription`	`str`	Template text for generation	Set via constructor
`info["code_mix"]`	`str`	Source code fragment	May be absent → `None`
`info["language"]`	`str`	Target language for output	Optional
Return	`str`	Generated custom description	From `generete_custom_discription`

`CustomModuleWithOutContext` – Context‑Free Description Generator

Responsibility – Generates a description without processing any code context.

Logic Flow

Stores discription on init.
generate(info, model) → calls generete_custom_discription_without(model, self.discription, info.get("language")) and returns its result.

Data Contract

Entity	Type	Role	Notes
`discription`	`str`	Fixed template	Constructor argument
`info["language"]`	`str`	Desired language	Optional
Return	`str`	Description string	From `generete_custom_discription_without`

Custom Intro Processor Functions

Responsibility – Generates enriched introductory sections and HTML link lists for the documentation pipeline. It extracts anchors, prompts a Model to create link‑based intros, full introductions, and optional custom descriptions.

`CodeMix` – Repository Content Packager

Responsibility – Walks a directory tree, writes a textual representation of its structure and the raw content of each non‑ignored file to an output file.

Entity	Type	Role	Notes
`root_dir`	`Path`	Input	Base directory (resolved).
`ignore_patterns`	`list[str]`	Input	Glob patterns for files/dirs to skip.
`logger`	`BaseLogger`	Internal	Logs actions at configurable levels.
`should_ignore(path)`	`bool`	Output	True if `path` matches any ignore pattern.
`build_repo_content(output_file)`	`None`	Side‑effect	Writes structure + file bodies to output_file.

Logic Flow

should_ignore computes a relative path, then checks glob matches against the full path, basename, and any path component.
build_repo_content opens output_file.
First pass: writes “Repository Structure” with indentation proportional to depth, skipping ignored entries.
Writes a separator line.
Second pass: for each non‑ignored file, writes <file path="…"> header, the file’s text (read with UTF‑8, errors ignored), and a blank line; errors are recorded in the output.

Interaction – Relies solely on the standard library (os, pathlib, fnmatch) and the internal BaseLogger. No external services are invoked.

`logging.py` Classes and Flow

Entity	Type	Role	Notes
`BaseLog`	class	Holds raw `message` and numeric `level`; provides default `format()`	`level` defaults 0
`ErrorLog` / `WarningLog` / `InfoLog`	subclasses of `BaseLog`	Override `format()` to prepend timestamp & severity	Use `_log_prefix` property
`BaseLoggerTemplate`	class	Routes a `BaseLog` to an output; respects `log_level` filter	`global_log()` checks filter
`FileLoggerTemplate`	subclass of `BaseLoggerTemplate`	Writes formatted logs to a file path	Overrides `log()`
`BaseLogger`	singleton class	Central façade exposing `set_logger()` and `log()`	Stores a single `logger_template` instance

Logic flow

A log object (e.g., ErrorLog("msg")) is instantiated → _log_prefix builds "[YYYY‑MM‑DD HH:MM:SS]".
BaseLogger().set_logger(logger_template) injects a concrete template (BaseLoggerTemplate or FileLoggerTemplate).
BaseLogger().log(log_obj) forwards to logger_template.global_log.
global_log checks self.log_level; if -1 or log_level >= log.level, it calls log() of the template, which either prints or appends to file.

`progress_base.py` Classes and Flow

Entity	Type	Role	Notes
`BaseProgress`	abstract class	Defines progress‑related API (`create_new_subtask`, `update_task`, `remove_subtask`)	Methods are placeholders
`LibProgress`	subclass of `BaseProgress`	Wraps rich `Progress`; tracks a base task and optional sub‑task	`update_task()` advances the appropriate task
`ConsoleTask`	helper class	Prints simple textual progress updates	`progress()` increments and prints percent
`ConsoleGtiHubProgress`	subclass of `BaseProgress`	Uses `ConsoleTask` for both general and sub‑tasks	`gen_task` created for overall progress

Logic flow

LibProgress.__init__ creates a rich base task (total defaults 4).
create_new_subtask(name, total_len) registers a sub‑task and stores its ID.
update_task() advances the sub‑task if present; otherwise advances the base task.
remove_subtask() clears the sub‑task reference.
ConsoleGtiHubProgress mirrors this behavior with ConsoleTask objects, printing textual percentages.

`IntroLinks` – HTML Link Intro Builder

Responsibility – Extracts HTML links from full_data and formats an introductory block using the model.

Logic Flow

generate(info, model) →
- links = get_all_html_links(info.get("full_data")).
- intro_links = get_links_intro(links, model, info.get("language")).
- Returns intro_links.

Data Contract

Entity	Type	Role	Notes
`info["full_data"]`	`Any`	Raw data containing HTML links	Passed to `get_all_html_links`
`info["language"]`	`str`	Output language	Optional
Return	`str`	Intro text with links	From `get_links_intro`

`IntroText` – General Introduction Generator

Responsibility – Generates a textual introduction from global_data via the model.

Logic Flow

generate(info, model) →
- intro = get_introdaction(info.get("global_data"), model, info.get("language")).
- Returns intro.

Data Contract

Entity	Type	Role	Notes
`info["global_data"]`	`Any`	Source content for intro	Consumed by `get_introdaction`
`info["language"]`	`str`	Desired language	Optional
Return	`str`	Introduction paragraph	From `get_introdaction`

Warning – All modules assume required keys exist in info; missing keys will cause None to be passed to downstream functions, which may raise errors if not handled.

External Interactions

Imports GPTModel, Model (model abstractions).
Uses UI logging classes BaseLogger, InfoLog for progress reporting.
Relies on configuration constants BASE_INTRODACTION_CREATE_LINKS, BASE_INTRO_CREATE, BASE_CUSTOM_DISCRIPTIONS.
No file I/O is performed here; all persistence is handled upstream/downstream.

Execution Flow

get_all_html_links(data) – Scans data with regex <a name=["']?(.*?)["']?</a>; collects anchors longer than 5 chars, prefixes “#”. Logs start/completion.
get_links_intro(links, model, language) – Builds a system‑prompt using BASE_INTRODACTION_CREATE_LINKS, adds the links list, calls model.get_answer_without_history, returns the generated intro.
get_introdaction(global_data, model, language) – Similar prompt flow with BASE_INTRO_CREATE; returns plain introduction text.
generete_custom_discription(splited_data, model, custom_description, language) – Iterates over splited_data, constructs a detailed system prompt (technical analyst + BASE_CUSTOM_DISCRIPTIONS), stops when the model returns a substantive answer (absence of “!noinfo”/“No information found”). Returns the first valid result.
generete_custom_discription_without(model, custom_description, language) – Sends a single prompt that enforces a mandatory `

Logic Flow

Build tool opens pyproject.toml.
Parses [project] section → extracts metadata fields.
Parses [build-system] → ensures poetry-core is available.
Emits a ProjectMetadata object; downstream steps use it to construct distribution files.

Functional Role

The fragment declares static package metadata and dependency constraints for the Auto Doc Generator project. It serves as the single source of truth for packaging, installation, and build‑time resolution of required libraries.

Visible Interactions

Build backend (poetry.core.masonry.api) reads the file to generate pyproject‑metadata and wheel/sdist artifacts.
Dependency resolvers (pip, poetry) consume the dependencies list to install required packages.
IDE tooling may display the metadata (name, version, authors) for user reference.

Data Contract

Entity	Type	Role	Notes
Input file	toml	Source of configuration	Must be present at project root
Output artifact	wheel / sdist	Packaged distribution	Contains metadata derived from this fragment
Side effects	none at runtime	Build‑time only	No code execution, only static analysis

This documentation isolates the pyproject.toml fragment, adhering strictly to the provided content without external inference.

`get_order` – Semantic Title Sorting

Responsibility – Sends the list of anchor titles to a Model for semantic ordering, then assembles the ordered markdown.

Entity	Type	Role	Notes
`model`	`Model`	Input	Must implement `get_answer_without_history`.
`chanks`	`dict[str, str]`	Input	Anchor → content mapping from `split_text_by_anchors`.
`result`	`str`	Internal	Raw comma‑separated answer from model.
`new_result`	`list[str]`	Internal	Stripped title list.
`order_output`	`str`	Output	Concatenated content in the model’s order.

Logic Flow

Log start and inputs.
Build prompt requesting a comma‑separated list of titles (anchors kept).
Call model.get_answer_without_history(prompt).
- If the model lacks this method, an AttributeError is raised.
Split and strip the answer → new_result.
Iterate new_result, append chanks[el] to order_output with a trailing newline, logging each addition.
Return the assembled string.

Data Contract

Function	Input(s)	Output	Side Effects
`get_all_html_links`	`data: str`	`list[str]` of “#anchor” links	Logs extraction
`get_links_intro`	`links: list[str]`, `model: Model`, `language: str`	`str` intro with links	Logs generation
`get_introdaction`	`global_data: str`, `model: Model`, `language: str`	`str` full intro	None
`generete_custom_discription`	`splited_data: str`, `model: Model`, `custom_description: str`, `language: str`	`str` description (or empty)	Logs each attempt
`generete_custom_discription_without`	`model: Model`, `custom_description: str`, `language: str`	`str` description with mandatory tag	None

⚠️ Warning – All prompts assume the provided Model implements get_answer_without_history. If the model is absent or mis‑configured, the functions will raise an AttributeError.

`StructureSettings` – Runtime Documentation Layout Settings

Entity	Type	Role	Notes
`include_intro_links`	`bool`	Toggle insertion of intro‑links section	Defaults `True`
`include_order`	`bool`	Toggle automatic ordering of generated parts	Defaults `True`
`max_doc_part_size`	`int`	Maximum symbols per documentation chunk	Default `5 000`
`load_settings`	method	Populates attributes from a dict	Assumption – keys match attribute names exactly

Assumption – No validation is performed; unknown keys become new attributes.

`ProjectSettings` – Project‑Specific Prompt Builder

Role – Holds a project name and arbitrary key/value pairs, exposing a prompt property that concatenates the base settings prompt with those values.

Entity	Type	Role	Notes
`project_name`	`str`	Input	Identifier for the project.
`info`	`dict`	Internal	Arbitrary metadata added via `add_info`.
`add_info(key, value)`	`method`	Mutator	Stores `key: value` in `info`.
`prompt`	`property`	Output	`BASE_SETTINGS_PR` + project name + all `info` entries.

Flow

add_info inserts a key/value pair.
prompt starts with BASE_SETTINGS_PROMPT, appends “Project Name: …”, then each stored pair as key: value.

Note: All components rely solely on the standard library and the internal engine and ui packages; no external services are invoked beyond the supplied Model implementations.

`pyproject.toml` Project Metadata

Entity	Type	Role	Notes
`name`	string	Package identifier	`"autodocgenerator"`
`version`	string	Semantic version	`"0.9.0.1"`
`description`	string	One‑line summary	Helps create docs for projects
`authors`	list of tables	Author contact info	`{name, email}`
`license`	table	SPDX‑compatible licensing	`text = "MIT"`
`readme`	string	Path to long description	`"README.md"`
`requires-python`	string	Python version constraint	`">=3.11,<4.0"`
`dependencies`	list of strings	Runtime requirements	44 packages listed
`build-system.requires`	list of strings	Build‑time requirements	`["poetry-core>=2.0.0"]`
`build-system.build-backend`	string	Build backend entry point	`"poetry.core.masonry.api"`

Assumption: The file is processed by PEP 621‑compatible tools (e.g., poetry). No other runtime logic is present. ` tag and strict lexical rules; returns the model’s raw answer.

`read_config` – YAML Configuration Loader

Responsibility – Parses a YAML string, builds a Config object, instantiates custom description modules, and prepares a StructureSettings instance for the generation pipeline.

Logic Flow

yaml.safe_load → data dict.
Extract ignore_files, language, project_name, project_additional_info, build_settings, custom_descriptions, structure_settings.
Initialise Config() → pcs = ProjectBuildConfig(); pcs.load_settings(build_settings).
Chain setters: set_language, set_project_name, set_pcs.
Append each ignore pattern via config.add_ignore_file.
Populate additional info with config.add_project_additional_info.
Build custom_modules list: entries starting with % become CustomModuleWithOutContext(custom_desc[1:]); otherwise CustomModule(custom_desc).
Load structure settings into a fresh StructureSettings object.

Data Contract

Entity	Type	Role	Notes
`file_data`	`str`	Raw YAML content	Must be UTF‑8 valid
`config`	`Config`	Central project configuration	Populated via setters
`custom_modules`	`list[CustomModule]`	User‑supplied description hooks	Context‑aware vs. context‑less
`structure_settings_object`	`StructureSettings`	Controls doc chunking & ordering	Loaded from `structure_settings` dict
Return value	`tuple[Config, list[CustomModule], StructureSettings]`	Combined runtime configuration	Order fixed as shown

`extract_links_from_start` – Anchor Extraction

Responsibility – Scans a list of markdown chunks and returns a list of anchor links (#anchor) whose name length > 5.

Entity	Type	Role	Notes
`chunks`	`list[str]`	Input	Raw markdown sections.
`links`	`list[str]`	Output	Collected `#`‑prefixed anchors.

Assumption – Each chunk may start with an HTML <a name="…"></a> tag.

`split_text_by_anchors` – Chunk Partitioning

Responsibility – Splits a markdown text at anchor tags, validates a 1:1 mapping, and builds a dict {anchor: chunk}.

Entity	Type	Role	Notes
`text`	`str`	Input	Full markdown document.
`result_chanks`	`list[str]`	Internal	Trimmed, non‑empty pieces.
`all_links`	`list[str]`	Internal	Output of `extract_links_from_start`.
`result`	`dict[str, str]`	Output	Mapping anchor → chunk; `None` if lengths differ.

Logic Flow

Regex (?=<a name=…) creates split points.
Whitespace‑trim each piece.
Call extract_links_from_start → all_links.
If len(all_links) != len(result_chanks) → return None.
Populate result by index.

`split_data` – Input Chunking for Symbol‑Limited Processing

Role – Divides a raw documentation string into smaller fragments that do not exceed a given symbol count, preparing the text for downstream model calls that have size limits.

Entity	Type	Role	Notes
`data`	`str`	Input	Complete documentation text to be partitioned.
`max_symbols`	`int`	Input	Maximum number of characters allowed per output fragment.
`split_objects`	`list[str]`	Internal	Accumulates the generated fragments; initially empty.
`splited_by_files`	`list[str]`	Internal	Result of `data.split(...)`; the delimiter is not visible in the fragment.
`BASE_PART_COMPLITE_TEXT`	`str` (import)	External	Constant from the engine config; purpose not evident in the shown code.
`ProjectSettings`, `BaseProgress`, `BaseLogger`, …	imports	External	Imported but not referenced in the visible fragment.

Note: The delimiter used in data.split(...) and any subsequent processing steps are omitted from the provided snippet; therefore, the full transformation logic cannot be described.

Flow

Initialize split_objects as an empty list.
Split the input string data using an unspecified delimiter (data.split("…")) and store the resulting list in splited_by_files.
(Further processing – e.g., chunk size enforcement, aggregation into split_objects – is not present in the supplied fragment and therefore not documented.)

Visible Interactions

Relies solely on Python's built‑in str.split method; no external APIs or custom classes are invoked within the visible portion.
Imports from the engine and UI layers are present but unused in the excerpt, indicating that additional functionality may be added later in the function.

Critical Assumption: Without the remainder of the function, the exact mechanism for respecting max_symbols and how the fragments are returned remains undefined.

End of split_data documentation.

split_data Loop Adjustments

Entity	Type	Role	Notes
`splited_by_files`	`list[str]`	Source fragments after initial split	Populated before this fragment
`max_symbols`	`int`	Size threshold	Used with multipliers `1.5` and `1.25`
`have_to_change`	`bool`	Loop‑control flag	Starts `False`, set `True` when a slice is re‑partitioned
`i`, `el`	`int`, `str`	Iterator values	`enumerate(splited_by_files)`

Logic flow

Enter an indefinite while loop.
Iterate over splited_by_files.
If a segment length exceeds max_symbols * 1.5, insert a new slice at i+1 containing the right half (el[i][int(max_symbols/2):]) and truncate the original to the left half (el[i][:int(max_symbols/2)]).
Set have_to_change = True.
After the for loop, break when no segment required further splitting.

Assumption: el[i] indexing is safe because el is a string; the code relies on Python slicing semantics.

split_data Chunk Assembly

Entity	Type	Role	Notes
`split_objects`	`list[str]`	Accumulator for final parts	Initialized earlier as `[]`
`curr_index`	`int`	Current output bucket index	Starts `0`
`el` (loop variable)	`str`	Current piece from `splited_by_files`	Processed sequentially

Logic flow

Ensure split_objects has an entry for curr_index; append empty string if missing.
If adding el would make the current bucket exceed max_symbols * 1.25, increment curr_index, start a new bucket with el, and continue.
Otherwise, append "\n"+el to the existing bucket.

The loop yields split_objects, a list of strings each respecting the size heuristic.

`compress` – Text Chunk Compression

Role – Sends a single text fragment to a language model with a project‑specific system prompt and a base compression prompt, then returns the model’s raw answer.

Entity	Type	Role	Notes
`data`	`str`	Input	Text to compress.
`project_settings`	`ProjectSettings`	Input	Provides the system prompt via its `prompt` property.
`model`	`Model`	Input	Must implement `get_answer_without_history`.
`compress_power`	`int`	Input	Determines the compression strength.
`prompt`	`list[dict]`	Internal	System and user messages built for the model.
`answer`	`str`	Output	Raw response from `model.get_answer_without_history`.

Flow

Assemble prompt with three messages: project settings, base‑compression text (get_BASE_COMPRESS_TEXT(10000, compress_power)), and the target data.
Call model.get_answer_without_history(prompt=prompt).
Return the obtained answer.

`compress_and_compare` – Batch Compression & Comparison

Role – Groups input strings into bundles of size compress_power, compresses each via compress, and returns a list where each element contains the concatenated compressed results of a bundle.

Entity	Type	Role	Notes
`data`	`list[str]`	Input	Source strings to compress.
`model`	`Model`	Input	Used by `compress`.
`project_settings`	`ProjectSettings`	Input	Passed to `compress`.
`compress_power`	`int`	Input	Bundle size (default 4).
`progress_bar`	`BaseProgress`	Internal	Updates progress per item.
`compress_and_compare_data`	`list[str]`	Output	List sized ⌈len(data)/compress_power⌉.

Flow

Allocate a result list sized for the number of bundles.
Create a sub‑task for progress.
Iterate over data; compute curr_index = i // compress_power.
Append compress(el, …) plus newline to the corresponding slot.
Update progress after each element.
Remove the sub‑task and return the list.

`async_compress` – Async Single‑Item Compression

Role – Async counterpart of compress; respects a semaphore to limit concurrent model calls and reports progress.

Entity	Type	Role	Notes
`data`	`str`	Input	Text to compress.
`project_settings`	`ProjectSettings`	Input	Provides system prompt.
`model`	`AsyncModel`	Input	Async `get_answer_without_history`.
`compress_power`	`int`	Input	Compression strength.
`semaphore`	`asyncio.Semaphore`	Input	Limits parallel calls.
`progress_bar`	`BaseProgress`	Input	Updates after each call.
`answer`	`str`	Output	Model’s response.

Flow

Acquire semaphore.
Build the same three‑message prompt as compress.
Await model.get_answer_without_history(prompt=prompt).
Increment progress, release semaphore, return answer.

`async_compress_and_compare` – Parallel Batch Compression

Role – Dispatches async_compress for each element, runs up to four concurrent tasks, and merges results into bundles of size compress_power.

Entity	Type	Role	Notes
`data`	`list[str]`	Input	Items to compress.
`model`	`AsyncModel`	Input	Async model.
`project_settings`	`ProjectSettings`	Input	System prompt source.
`compress_power`	`int`	Input	Bundle size (default 4).
`progress_bar`	`BaseProgress`	Internal	Tracks overall progress.
`compressed_elements`	`list[str]`	Internal	Individual async results.
`final_data`	`list[str]`	Output	Concatenated bundles.

Flow

Create a semaphore of 4 permits.
Queue a task for each element via async_compress.
Await asyncio.gather(*tasks).
Slice the flat list into chunks of compress_power and join with newlines.

`compress_to_one` – Recursive Full‑Project Compression

Role – Repeatedly compresses a list until a single string remains; optionally uses async processing.

Entity	Type	Role	Notes
`data`	`list[str]`	Input	Initial fragments.
`model`	`Model`	Input	Model for sync or async calls.
`project_settings`	`ProjectSettings`	Input	Prompt source.
`compress_power`	`int`	Input	Max items per iteration.
`use_async`	`bool`	Input	Selects async path.
`progress_bar`	`BaseProgress`	Internal	Progress for each iteration.
`count_of_iter`	`int`	Internal	Iteration counter.
`data` (final)	`str`	Output	Final compressed document.

Flow
Loop while len(data) > 1:
  • Adjust compress_power to 2 when list is smaller than compress_power+1.
  • Call either async_compress_and_compare (via asyncio.run) or compress_and_compare.
  • Increment iteration counter.
Return the sole remaining element.

`AsyncGPTModel` – Asynchronous LLM Wrapper

Responsibility – Implements AsyncModel for Groq’s async client, handling model rotation, history management, and logging.

Logic Flow

__init__ → calls ParentModel.__init__ → shuffles MODELS_NAME into regen_models_name.
Instantiates AsyncGroq client with api_key.
generate_answer logs start, selects messages from self.history.history (or a single prompt).
Loops until a model succeeds:
- Picks model_name from regen_models_name[self.current_model_index].
- Calls await self.client.chat.completions.create(messages=messages, model=model_name).
- On exception logs a warning and advances current_model_index (wrap‑around).
On success extracts chat_completion.choices[0].message.content, logs result, returns it.

Data Contract

Entity	Type	Role	Notes
`api_key`	`str`	Authentication token	Defaults to `API_KEY` from config
`history`	`History`	Message buffer for context	Provided to `ParentModel`
`use_random`	`bool`	Enables random model order	Shuffles `MODELS_NAME`
`messages`	`list[dict]`	Payload sent to Groq	Derived from `history` or `prompt`
`model_name`	`str`	Current LLM identifier	Rotated on failure
`chat_completion`	`AsyncGroq` response	Raw API result	Must contain `choices[0].message.content`
Return	`str`	Generated answer	Logged at level 2

Warning – If regen_models_name becomes empty, ModelExhaustedException is raised; ensure at least one model is configured.

`GPTModel` – Synchronous LLM Wrapper

Responsibility – Mirrors AsyncGPTModel for the synchronous Groq client, providing the same rotation and logging logic.

Logic Flow

__init__ → inherits ParentModel setup, creates Groq client.
generate_answer follows the identical loop as the async version, using self.client.chat.completions.create.
Logs success and returns the extracted content.

Data Contract – Same as AsyncGPTModel with the only difference being a synchronous client (Groq).

`Model` / `AsyncModel` – Shared Foundations

Responsibility – Provide history handling (get_answer, get_answer_without_history) and expose generate_answer placeholders overridden by subclasses.

Visible Interactions

Both subclasses call self.history.add_to_history to record user and assistant turns.
They rely on BASE_SYSTEM_TEXT, API_KEY, and MODELS_NAME imported from autodocgenerator.engine.config.config.

Data Contract

Entity	Type	Role	Notes
`history`	`History`	Stores system, user, and assistant messages	Initialized with `BASE_SYSTEM_TEXT`
`api_key`	`str`	API credential	Default from config
`regen_models_name`	`list[str]`	Rotated list of model identifiers	Shuffled when `use_random=True`
`current_model_index`	`int`	Pointer to the next model to try	Reset on each call cycle
Return of `get_answer`	`str`	Answer from underlying `generate_answer`	Also updates history

Note – Base implementations of generate_answer return the literal string "answer"; real behavior is supplied by GPTModel and AsyncGPTModel.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

1.6.6.3

Apr 16, 2026

1.6.6.1

Apr 11, 2026

1.6.6.0

Apr 11, 2026

1.6.5.9

Apr 11, 2026

1.6.4.9

Apr 9, 2026

1.6.4.7

Apr 9, 2026

1.6.3.7

Apr 5, 2026

1.6.3.5

Apr 4, 2026

1.6.3.4

Apr 4, 2026

1.6.3.1

Apr 4, 2026

1.6.0.9

Apr 3, 2026

1.6.0.8

Apr 3, 2026

1.6.0.6

Apr 2, 2026

1.6.0.5

Apr 2, 2026

1.6.0.4

Apr 2, 2026

1.6.0.3

Apr 2, 2026

1.6.0.2

Apr 2, 2026

1.6.0.1

Apr 2, 2026

1.6.0.0

Apr 2, 2026

1.5.9.9

Apr 2, 2026

1.4.9.6

Mar 22, 2026

1.4.9.5

Mar 20, 2026

1.4.9.2

Mar 20, 2026

1.4.9.1

Mar 20, 2026

1.4.9.0

Mar 20, 2026

1.1.9.0

Mar 20, 2026

1.1.8.9

Mar 20, 2026

1.1.8.8

Mar 20, 2026

1.0.6.8

Mar 19, 2026

1.0.6.6

Mar 19, 2026

1.0.5.6

Mar 18, 2026

1.0.5.0

Mar 18, 2026

1.0.4.0

Mar 18, 2026

1.0.3.9

Mar 18, 2026

1.0.3.5

Mar 18, 2026

1.0.3.3

Mar 18, 2026

0.9.3.1

Feb 7, 2026

0.9.3.0

Feb 7, 2026

0.9.2.8

Feb 6, 2026

0.9.2.7

Feb 6, 2026

0.9.2.5

Feb 5, 2026

0.9.0.4

Jan 28, 2026

0.9.0.3

Jan 28, 2026

This version

0.9.0.2

Jan 28, 2026

0.9.0.1

Jan 28, 2026

0.9.0.0

Jan 28, 2026

0.8.9.9

Jan 27, 2026

0.8.9.8

Jan 27, 2026

0.8.9.7

Jan 27, 2026

0.8.9.6

Jan 27, 2026

0.8.9.5

Jan 26, 2026

0.8.9.1

Jan 26, 2026

0.8.9

Jan 26, 2026

0.8.8

Jan 26, 2026

0.8.7

Jan 26, 2026

0.8.6

Jan 26, 2026

0.8.5.9

Jan 26, 2026

0.8.5.8

Jan 26, 2026

0.8.5.7

Jan 26, 2026

0.8.5.6

Jan 26, 2026

0.8.5.4

Jan 26, 2026

0.8.5.3

Jan 26, 2026

0.8.5.2

Jan 26, 2026

0.8.5.1

Jan 26, 2026

0.8.5

Jan 25, 2026

0.8.4

Jan 25, 2026

0.8.3

Jan 25, 2026

0.8.1

Jan 25, 2026

0.8.0

Jan 25, 2026

0.7.9

Jan 25, 2026

0.7.6

Jan 25, 2026

0.7.5

Jan 25, 2026

0.7.4

Jan 23, 2026

0.7.3

Jan 23, 2026

0.7.2

Jan 23, 2026

0.7.1

Jan 23, 2026

0.7.0

Jan 23, 2026

0.6.9

Jan 23, 2026

0.6.8

Jan 23, 2026

0.6.5

Jan 22, 2026

0.6.3

Jan 22, 2026

0.6.2

Jan 22, 2026

0.6.1

Jan 22, 2026

0.6.0

Jan 21, 2026

0.5.9

Jan 21, 2026

0.5.8

Jan 21, 2026

0.5.5

Jan 21, 2026

0.5.4

Jan 21, 2026

0.5.3

Jan 19, 2026

0.5.2

Jan 19, 2026

0.5.1

Jan 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autodocgenerator-0.9.0.2.tar.gz (45.4 kB view details)

Uploaded Jan 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

autodocgenerator-0.9.0.2-py3-none-any.whl (39.2 kB view details)

Uploaded Jan 28, 2026 Python 3

File details

Details for the file autodocgenerator-0.9.0.2.tar.gz.

File metadata

Download URL: autodocgenerator-0.9.0.2.tar.gz
Upload date: Jan 28, 2026
Size: 45.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for autodocgenerator-0.9.0.2.tar.gz
Algorithm	Hash digest
SHA256	`85cbef124e3ee59b4a93fa90e5c37e37e14116588b2398696550f1b937c66cdf`
MD5	`2f78a0a9a82ebe28af0c330c58c56f93`
BLAKE2b-256	`ca6e7c89e3101b56dac8e421ddc7cea6defa0c44c259d5a86a22a827f60ad0f3`

See more details on using hashes here.

File details

Details for the file autodocgenerator-0.9.0.2-py3-none-any.whl.

File metadata

Download URL: autodocgenerator-0.9.0.2-py3-none-any.whl
Upload date: Jan 28, 2026
Size: 39.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure

File hashes

Hashes for autodocgenerator-0.9.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`843dc20dee25523b760e4ecd0f5fb883ece3ac72a1bfcd0accbd99ec35ad17a4`
MD5	`562be6bb869eaa8030abcf86ce7dd3fa`
BLAKE2b-256	`2adb60596c3c1a3e554b4d186cf9971f2247d97560949cfec0e3a0de07c16e7c`

See more details on using hashes here.

autodocgenerator 0.9.0.2

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

Executive Navigation Tree

1. PowerShell bootstrap (Windows agents)

2. Shell bootstrap (Linux/macOS agents)

3. Add required secret for the installer

4. Complete workflow example

Installation Scripts (install.ps1 / install.sh)

DocFactory – Orchestrator of Module Pipeline

gen_doc – Orchestrator of Documentation Generation

write_docs_by_parts Interaction

async_write_docs_by_parts Async Generation

gen_doc_parts Orchestration

async_gen_doc_parts Asynchronous Orchestration

generate_discribtions_for_code – LLM‑Driven Doc Generation

Manager – Core Coordination Component

Package Initializer – Logger Bootstrap

BaseModule – Abstract Generation Unit

CustomModule – Context‑Aware Description Generator

CustomModuleWithOutContext – Context‑Free Description Generator

Custom Intro Processor Functions

CodeMix – Repository Content Packager

logging.py Classes and Flow

progress_base.py Classes and Flow

IntroLinks – HTML Link Intro Builder

IntroText – General Introduction Generator

External Interactions

Execution Flow

Logic Flow

Functional Role

Visible Interactions

Data Contract

get_order – Semantic Title Sorting

Data Contract

StructureSettings – Runtime Documentation Layout Settings

ProjectSettings – Project‑Specific Prompt Builder

pyproject.toml Project Metadata

read_config – YAML Configuration Loader

extract_links_from_start – Anchor Extraction

split_text_by_anchors – Chunk Partitioning

split_data – Input Chunking for Symbol‑Limited Processing

split_data Loop Adjustments

split_data Chunk Assembly

compress – Text Chunk Compression

compress_and_compare – Batch Compression & Comparison

async_compress – Async Single‑Item Compression

async_compress_and_compare – Parallel Batch Compression

compress_to_one – Recursive Full‑Project Compression

AsyncGPTModel – Asynchronous LLM Wrapper

GPTModel – Synchronous LLM Wrapper

Model / AsyncModel – Shared Foundations

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Installation Scripts (`install.ps1` / `install.sh`)

`DocFactory` – Orchestrator of Module Pipeline

`gen_doc` – Orchestrator of Documentation Generation

write_docs_by_parts Interaction

async_write_docs_by_parts Async Generation

gen_doc_parts Orchestration

async_gen_doc_parts Asynchronous Orchestration

`generate_discribtions_for_code` – LLM‑Driven Doc Generation

`Manager` – Core Coordination Component

`BaseModule` – Abstract Generation Unit

`CustomModule` – Context‑Aware Description Generator

`CustomModuleWithOutContext` – Context‑Free Description Generator

`CodeMix` – Repository Content Packager

`logging.py` Classes and Flow

`progress_base.py` Classes and Flow

`IntroLinks` – HTML Link Intro Builder

`IntroText` – General Introduction Generator

`get_order` – Semantic Title Sorting

`StructureSettings` – Runtime Documentation Layout Settings

`ProjectSettings` – Project‑Specific Prompt Builder

`pyproject.toml` Project Metadata

`read_config` – YAML Configuration Loader

`extract_links_from_start` – Anchor Extraction

`split_text_by_anchors` – Chunk Partitioning

`split_data` – Input Chunking for Symbol‑Limited Processing

split_data Loop Adjustments

split_data Chunk Assembly

`compress` – Text Chunk Compression

`compress_and_compare` – Batch Compression & Comparison

`async_compress` – Async Single‑Item Compression

`async_compress_and_compare` – Parallel Batch Compression

`compress_to_one` – Recursive Full‑Project Compression

`AsyncGPTModel` – Asynchronous LLM Wrapper

`GPTModel` – Synchronous LLM Wrapper

`Model` / `AsyncModel` – Shared Foundations