
This project helps you create documentation for your projects.

Project description

Executive Navigation Tree

autodocconfig.yml Options
The autodocconfig.yml file is a YAML configuration used by ADG. The available top‑level keys are:

  • project_name: string – Name of the project.
  • language: string – Language for the generated documentation (default en).
  • ignore_files: list of glob patterns – Files or directories that should be excluded from processing.
  • project_settings: mapping – Controls ADG behavior:
    • save_logs: boolean – Whether to save logs (true/false).
    • log_level: integer – Verbosity level (e.g., 1).
  • project_additional_info: mapping – Arbitrary key‑value pairs that become part of the project’s metadata.
  • custom_descriptions: list of strings – Custom prompts or descriptions that will be turned into documentation modules.

Example structure (as shown in the repository):

project_name: "Auto Doc Generator"
language: "en"
project_settings:
  save_logs: true
  log_level: 1
project_additional_info:
  global idea: "This project was created to help developers make documentations for them projects"
custom_descriptions:
  - "explain how install workflow with install.ps1 and install.sh scripts ..."
  - "how to use Manager class what parameters i need to give ..."
  - "explain how to write autodocconfig.yml file what options are available"

ProjectConfigSettings – Runtime configuration container

ProjectConfigSettings holds transient flags used by the generation engine (e.g., save_logs, log_level).

  • Methods – load_settings(data) iterates over a dict and assigns each key/value to the instance via setattr, enabling dynamic injection from external sources (CLI, CI).
  • Data flow – Input: dict[str, any]; Output: the same object with updated attributes; no side‑effects beyond attribute mutation.

Config – Core documentation‑generator settings

Config aggregates all static options required by the Manager pipeline.

  • ignore_files – Glob patterns excluded during repository scanning (e.g., byte‑code, virtual‑env folders).
  • language – ISO code passed to the LLM for localized output.
  • project_name – Identifier used for title generation and ProjectSettings.
  • project_additional_info – Arbitrary key/value pairs injected into ProjectSettings.
  • pcs – Instance of ProjectConfigSettings controlling runtime flags.
  • Fluent setters – set_language, set_pcs, set_project_name, add_project_additional_info, add_ignore_file each return self to allow chaining (e.g., Config().set_language('ru').add_ignore_file('*.tmp')).
  • get_project_settings() – Constructs a ProjectSettings object (from autodocgenerator.preprocessor.settings) with the configured project_name and any supplemental info, then returns it. This object is later consumed by the pre‑processor to embed project metadata into generated docs.
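The fluent-setter pattern above can be sketched as follows (a simplified illustration with assumed defaults, not the library's exact implementation):

```python
class Config:
    """Sketch of the fluent-setter pattern: every setter returns self."""

    def __init__(self):
        self.language = "en"
        self.project_name = ""
        self.ignore_files = []
        self.project_additional_info = {}

    def set_language(self, language):
        self.language = language
        return self  # returning self enables method chaining

    def set_project_name(self, name):
        self.project_name = name
        return self

    def add_ignore_file(self, pattern):
        self.ignore_files.append(pattern)
        return self

    def add_project_additional_info(self, key, value):
        self.project_additional_info[key] = value
        return self
```

Returning self from each setter is what makes one-line configuration chains like Config().set_language('ru').add_ignore_file('*.tmp') possible.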

Interactions

  • Config is instantiated in the CI entry‑point and supplied to Manager.
  • Manager reads ignore_files to prune the file‑system walk, queries language for prompt localisation, and passes pcs to the logging subsystem.
  • ProjectSettings produced by get_project_settings is handed to the Preprocessor, which annotates source files before chunking.

Side effects
All setters mutate the Config instance in‑place; load_settings may overwrite existing flags. No I/O occurs here—persistence is handled elsewhere (e.g., cache cleanup in Manager).

Config Module Constants & Environment Loading

The module defines several multi‑line string templates (BASE_SYSTEM_TEXT, BASE_PART_COMPLITE_TEXT, BASE_INTRODACTION_CREATE_TEXT, BASE_INTRO_CREATE, BASE_SETTINGS_PROMPT) that drive the documentation generation workflow.
It also loads API_KEY from the environment (via dotenv) and validates its presence, raising an exception if missing.
MODELS_NAME enumerates the model identifiers used by the AI‑driven pre‑processor.

Configuration Loading

read_config parses the user‑provided autodocconfig.yml. It extracts:

  • ignore_files – patterns added to Config.ignore_files.
  • language, project_name, project_additional_info – stored in a fresh Config instance via fluent setters.
  • project_settings – mapped onto a ProjectConfigSettings object via load_settings.
  • custom_descriptions – each string is wrapped in a CustomModule (from factory.modules.general_modules).

The function returns a tuple (Config, list[CustomModule]), ready for the generation stage.


ProjectSettings Prompt Builder

ProjectSettings stores project_name and arbitrary key‑value metadata via add_info.
The prompt property concatenates BASE_SETTINGS_PROMPT with the project name and each metadata entry, producing the system‑level prompt consumed by the compressor and description generators.
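The prompt assembly described above could look roughly like this (a sketch; the real BASE_SETTINGS_PROMPT template and exact formatting live in the library's config module and are assumed here):

```python
BASE_SETTINGS_PROMPT = "Project context:"  # placeholder for the real template


class ProjectSettings:
    """Sketch: stores project_name plus metadata, exposes a prompt property."""

    def __init__(self, project_name):
        self.project_name = project_name
        self.info = {}

    def add_info(self, key, value):
        self.info[key] = value
        return self

    @property
    def prompt(self):
        # Concatenate the base template, the project name, and each metadata entry.
        lines = [BASE_SETTINGS_PROMPT, f"Project name: {self.project_name}"]
        lines += [f"{key}: {value}" for key, value in self.info.items()]
        return "\n".join(lines)
```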


Title: Using the Manager class

The Manager class is instantiated with the following parameters:

  • project_path – Path to the root of the project (e.g., ".").
  • config – An instance of Config loaded from autodocconfig.yml.
  • sync_model – A synchronous GPTModel object (created with the API key).
  • async_model – An asynchronous AsyncGPTModel object (created with the API key).
  • progress_bar – An object implementing a progress interface, e.g., ConsoleGtiHubProgress().

Example usage (mirrors the script in autodocgenerator/auto_runner/run_file.py):

from autodocgenerator.manage import Manager
from autodocgenerator.factory.base_factory import DocFactory
from autodocgenerator.factory.modules.intro import IntroLinks
from autodocgenerator.ui.progress_base import ConsoleGtiHubProgress
from autodocgenerator.auto_runner.config_reader import read_config, Config
from autodocgenerator.engine.models.gpt_model import GPTModel, AsyncGPTModel
from autodocgenerator.engine.config.config import API_KEY

# Load configuration and custom modules
with open("autodocconfig.yml", "r", encoding="utf-8") as f:
    config_data = f.read()
config, custom_modules = read_config(config_data)

# Prepare GPT models
sync_model = GPTModel(API_KEY, use_random=False)
async_model = AsyncGPTModel(API_KEY)

# Create Manager instance
manager = Manager(
    project_path=".",          # path to the project
    config=config,            # Config object
    sync_model=sync_model,    # synchronous model
    async_model=async_model,  # asynchronous model
    progress_bar=ConsoleGtiHubProgress()  # progress display
)

# Generate documentation
manager.generate_code_file()                     # scans code files
manager.generete_doc_parts(max_symbols=6000)     # creates doc fragments
manager.factory_generate_doc(DocFactory(*custom_modules))  # applies custom modules
manager.order_doc()                              # orders sections
manager.factory_generate_doc(DocFactory(IntroLinks()))      # adds intro links
manager.clear_cache()                            # cleans temporary data

# Retrieve final documentation
output_doc = manager.read_file_by_file_key("output_doc")
print(output_doc)

GPTModel – Synchronous Answer Generation

GPTModel extends Model (which itself inherits ParentModel).

  • Instantiates a synchronous Groq client and a BaseLogger.
  • generate_answer builds the request payload from either the full conversation history or a single prompt.
  • It loops over regen_models_name, attempting client.chat.completions.create; on failure it logs a warning and advances current_model_index.
  • When the list is exhausted, ModelExhaustedException is raised.
  • The final answer is extracted from chat_completion.choices[0].message.content and logged at two verbosity levels before being returned.
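The rotation-and-fallback loop described above can be sketched generically (call_model stands in for the real client.chat.completions.create call; the logging and index bookkeeping are simplified):

```python
class ModelExhaustedException(Exception):
    """Raised when every model in the rotation list has failed."""


def generate_with_rotation(models, call_model):
    # Try each model name in turn; a failing call advances to the next one,
    # mirroring the current_model_index advancement described above.
    for name in models:
        try:
            return call_model(name)
        except Exception:
            continue  # the real code logs a warning here before moving on
    raise ModelExhaustedException("no model in the list is available")
```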

AsyncGPTModel – Asynchronous Answer Generation

Mirrors GPTModel but uses AsyncGroq and async def generate_answer.

  • All control flow (model rotation, error handling, logging) is identical, allowing the caller to await the result.
  • The method returns the generated string once the asynchronous request resolves.

ModelExhaustedException

class ModelExhaustedException(Exception):
    """If in list of models no one model is available for use."""
    ...

A lightweight sentinel exception raised by the model‑selection logic when all entries in MODELS_NAME are unavailable. It propagates up to the Manager, which catches it to trigger fallback handling.

ParentModel – Shared Model Configuration

ParentModel stores the API key, a mutable History object, and the shuffled list regen_models_name that drives model rotation.

  • current_model_index tracks which entry in regen_models_name is active.
  • If use_random is true the order is randomized on each instance, enabling simple load‑balancing across MODELS_NAME.

IntroLinks – HTML Link Extraction

Responsibility
Collects every HTML anchor from the full‑document markdown (full_data) and produces a concise block of link references suitable for inclusion at the top of the generated documentation.

Interactions

  • Receives a pre‑populated info dict from DocFactory.generate_doc.
  • Calls get_all_html_links (post‑processor) to parse info["full_data"].
  • Passes the extracted link list, the shared Model instance, and the target language to get_links_intro, which formats the links using the LLM.

Technical Flow

  1. links = get_all_html_links(info.get("full_data")) – regex/HTML parser returns List[Dict].
  2. intro_links = get_links_intro(links, model, info.get("language")) – invokes the model’s generate_answer to craft natural‑language link introductions.
  3. Returns intro_links (markdown string).

Data Flow

  • Input: info["full_data"] (raw doc), info["language"].
  • Output: Markdown block containing formatted links.
  • Side Effects: None; model history is updated inside get_links_intro via Model.get_answer.

IntroText – Project Introduction Generation

Responsibility
Creates a high‑level introductory paragraph that summarizes the project’s purpose, using the global metadata (global_data).

Interactions

  • Consumes info["global_data"] supplied by DocFactory.
  • Utilises the same shared Model instance to ask the LLM for a project‑specific intro via get_introdaction.

Technical Flow

  1. intro = get_introdaction(info.get("global_data"), model, info.get("language")) – triggers an LLM call.
  2. Returns the generated paragraph as a markdown string.

Data Flow

  • Input: info["global_data"], info["language"].
  • Output: Single‑paragraph markdown intro.
  • Side Effects: Model history updated inside get_introdaction.

Both classes inherit from BaseModule, exposing a uniform generate(info, model) API used by DocFactory to stitch their outputs into the final documentation before the progress bar cleanup.

HTML‑Link Extraction (get_all_html_links)

Responsibility – Scans a markdown string for <a name="…"></a> anchors and returns a list of markdown‑style links (#anchor).

Interactions – Called by post‑processing pipelines that need to reference generated sections; uses only the BaseLogger for diagnostic output.

Technical Details – Compiles a regex r'<a name=["\']?(.*?)["\']?>', iterates with re.finditer, prefixes each captured name with #, and logs count and content.

Data Flow – Input: raw documentation string. Output: list[str] of #anchor links. No side‑effects beyond logging.
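The anchor extraction is straightforward to sketch from the regex quoted above (logging omitted):

```python
import re

# The pattern quoted in the section above: matches <a name="..."> with
# optional single or double quotes around the name.
ANCHOR_RE = re.compile(r'<a name=["\']?(.*?)["\']?>')


def get_all_html_links(data):
    # Each captured anchor name becomes a markdown-style "#anchor" link.
    return ["#" + match.group(1) for match in ANCHOR_RE.finditer(data)]
```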

Global Intro Generation (get_introdaction)

Responsibility – Produces a one‑paragraph project introduction from global_data via the shared Model instance.

Interactions – Consumes info["global_data"] supplied by DocFactory; uses the same Model (e.g., GPTModel) passed through the uniform generate(info, model) API.

Technical Details – Builds a system‑prompt with BASE_INTRO_CREATE, injects the selected language, and calls model.get_answer_without_history. Returns the raw markdown paragraph.

Data Flow – Input: global_data: str, language: str. Output: markdown string. Side‑effect: updates the model’s internal conversation history.

Link‑Based Intro Generation (get_links_intro)

Responsibility – Crafts an introduction that references a list of section links.

Interactions – Receives the link list from get_all_html_links; forwards it to the LLM using BASE_INTRODACTION_CREATE_TEXT.

Technical Details – Constructs a three‑message prompt (language system, intro template, user‑provided links) and calls model.get_answer_without_history.

Data Flow – Input: links: list[str], language. Output: generated intro markdown. Logs progress at level 1.

Custom Description Generation (generete_custom_discription)

Responsibility – Iterates over split documentation chunks, asking the LLM to produce a titled, anchored description for a user‑defined topic.

Interactions – Uses the same Model instance; each iteration may break early if a satisfactory answer is returned.

Technical Details – For each chunk it sends a strict system prompt (rules, context, title request) and a user prompt containing custom_description. The LLM’s response must start with <a name="URL"></a> followed by the answer or special tokens (!noinfo).

Data Flow – Input: splited_data: str, custom_description: str, language. Output: the first non‑empty LLM response that satisfies the rules. Side‑effects limited to model history updates and logging.

Code Description Generator

generate_discribtions_for_code sends each source file through a fixed instruction prompt that asks the model to enumerate public components, parameters, and usage examples.
Results are collected in a list; progress is tracked with a sub‑task.

Output – list of markdown‑formatted documentation strings, one per input file.

Semantic Ordering (get_order)

Responsibility – Receives a dictionary mapping anchors to chunk text and returns the chunks reordered according to LLM‑determined semantic grouping.

Interactions – Works after split_text_by_anchors; supplies the title list (list(chanks.keys())) to the LLM and rebuilds the final document order.

Technical Details – Sends a user‑only prompt requesting a comma‑separated, #‑prefixed title list; parses the response, then concatenates the corresponding chunk values.

Data Flow – Input: chanks: dict[str, str]. Output: single markdown string with reordered sections. Logs each step and the final ordering.
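The rebuild step after the LLM reply can be sketched as follows (a minimal illustration of parsing the comma-separated, #-prefixed title list and concatenating chunks in that order; error handling and logging are omitted):

```python
def reorder_chunks(chunks, model_response):
    # The LLM replies with something like "#usage, #setup"; split on commas,
    # strip whitespace, and rebuild the document in the returned order.
    order = [title.strip() for title in model_response.split(",")]
    return "\n".join(chunks[title] for title in order if title in chunks)
```

Filtering on `title in chunks` guards against the model inventing titles that were never in the input mapping.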


get_BASE_COMPRESS_TEXT Factory

def get_BASE_COMPRESS_TEXT(start, power):
    return f"""
You will receive a large code snippet (up to ~{start} characters).
...

Purpose: Returns a formatted instruction block whose size scales with start and power.
Logic Flow: Interpolates the supplied parameters into a template that specifies analysis, summary length, and a strict usage‑example clause. The returned string is later injected into prompts that guide the AI to produce concise summaries and runnable examples.

Compress Function Workflow

The compress routine receives raw text, a ProjectSettings instance, a GPT Model, and a numeric compress_power.
It builds a three‑message prompt: the project‑specific system prompt, a static compression template from get_BASE_COMPRESS_TEXT, and the user payload. The model’s get_answer_without_history returns a shortened version, which is returned directly to the caller.

Inputs – data: str, project_settings.prompt, compress_power.
Outputs – compressed string.
Side‑effects – none (pure function).


Asynchronous Compression‑And‑Compare Pipeline

async_compress mirrors the synchronous prompt creation but runs under an asyncio.Semaphore to limit concurrency.
async_compress_and_compare spawns one coroutine per element, gathers results, then re‑chunks them into groups of compress_power.
Progress is updated after each coroutine finishes.

Key parameters – semaphore (max 4 concurrent calls by default).


Synchronous Compression‑And‑Compare Pipeline

compress_and_compare iterates over a list of file contents, groups them in batches of compress_power, and concatenates each batch’s compressed results.
It uses a BaseProgress sub‑task to report progress. The resulting list length equals ceil(len(data)/compress_power).

Assumptions – compress_power ≥ 2; model is synchronous.
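The batching arithmetic described above (output length equals ceil(len(data)/compress_power)) can be sketched independently of the model calls:

```python
import math


def batch(data, compress_power):
    # Group inputs into consecutive batches of size compress_power;
    # the last batch may be shorter. Number of batches is
    # ceil(len(data) / compress_power).
    return [data[i:i + compress_power] for i in range(0, len(data), compress_power)]
```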


Document Generation Pipeline

gen_doc orchestrates the end‑to‑end documentation flow:

  1. Model Instantiation – creates a synchronous GPTModel and an asynchronous AsyncGPTModel using the global API_KEY.
  2. Manager Construction – passes the project root, parsed Config, both models, and a ConsoleGtiHubProgress bar to Manager.
  3. Code Extraction – manager.generate_code_file() scans the repository and caches source files.
  4. Chunked AI Prompting – manager.generete_doc_parts(max_symbols=6000) splits code into ≤6000‑symbol blocks and queries the GPT models.
  5. Custom Module Injection – manager.factory_generate_doc(DocFactory(*custom_modules)) lets each CustomModule inject user‑defined sections.
  6. Ordering & Intro Links – manager.order_doc() reorders parts; a second factory call adds IntroLinks.
  7. Cache Cleanup – manager.clear_cache() removes temporary artifacts.

Finally, manager.read_file_by_file_key("output_doc") returns the assembled markdown string, which the CI step writes to README.md.

Inputs: project path, Config, list of CustomModule.
Outputs: rendered documentation (output_doc).
Side effects: filesystem writes to the .auto_doc_cache folder and progress output to the console.

Documentation Generation Pipeline

gen_doc_parts (sync) and async_gen_doc_parts (async) invoke the splitter, then iterate over chunks, calling the respective part‑writer, concatenating results, and feeding a sliding “context window” (result[-3000:]) to preserve continuity.
Both functions drive a BaseProgress sub‑task, log final length, and return the full assembled documentation.
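The sliding-context loop described above can be sketched as follows (write_part stands in for the real per-chunk writer; progress and logging are omitted):

```python
def gen_doc_parts(chunks, write_part):
    # Concatenate per-chunk output, feeding the tail of the result so far
    # (a sliding 3000-character window) into each subsequent call so the
    # model keeps continuity with previously generated sections.
    result = ""
    for part_id, chunk in enumerate(chunks):
        result += write_part(part_id, chunk, prev_info=result[-3000:])
    return result
```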

DocFactory – Module‑Level Documentation Assembly

DocFactory receives an ordered list of BaseModule instances.

  • generate_doc creates a sub‑task in the supplied BaseProgress, iterates over modules, invokes module.generate(info, model), concatenates results, and logs each module’s output.
  • The final documentation string is returned after progress cleanup.

Data Flow Summary

  1. Caller creates a GPTModel/AsyncGPTModel with optional history.
  2. generate_answer → selects model name → API call → logs → returns answer.
  3. Model.get_answer updates history before/after the call.
  4. DocFactory feeds the same model instance to each documentation module, stitching their outputs into the final doc.

Synchronous Part Documentation

write_docs_by_parts builds a system‑role prompt containing language, part ID, BASE_PART_COMPLITE_TEXT, and optional prev_info.
It sends the prompt to model.get_answer_without_history, strips surrounding ``` fences, logs the raw and trimmed answer, and returns the cleaned markdown.
Inputs: part_id, part, model, optional prev_info, language.
Outputs: formatted documentation string for that part.
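The fence-stripping step mentioned above can be sketched like this (a minimal illustration of removing surrounding ``` fences, with an optional language tag, from an LLM reply):

```python
def strip_fences(answer):
    # Remove surrounding ``` fences (e.g. ```markdown ... ```) if present.
    text = answer.strip()
    if text.startswith("```"):
        # Drop the opening fence line, including any language tag after ```.
        text = text.split("\n", 1)[1] if "\n" in text else ""
    if text.endswith("```"):
        text = text[:-3]
    return text.strip()
```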

Asynchronous Part Documentation

async_write_docs_by_parts mirrors the synchronous flow but runs inside an asyncio.Semaphore, uses await async_model.get_answer_without_history, and optionally calls update_progress.
It returns the same trimmed markdown.

Runtime Interactions with Manager & Preprocessor

  • Config is instantiated at CI entry‑point, then supplied to Manager.
  • Manager reads ignore_files from the constants, queries language for localisation, and forwards pcs to the logging subsystem.
  • The settings object produced by get_project_settings (built from the above templates) is handed to the Preprocessor, which annotates source files before chunking.

All setters mutate the Config instance in‑place; load_settings can overwrite flags, but no I/O occurs within this fragment—the persistence layer lives elsewhere (e.g., cache cleanup in Manager).

History – Conversation Buffer

Provides add_to_history(role, content) and initializes with the system prompt (BASE_SYSTEM_TEXT).
The buffer is consumed by Model.get_answer* helpers to maintain a turn‑based dialogue.

Data Splitting Logic

split_data receives the full source text and a max_symbols limit.
It iteratively chops oversized fragments ( > 1.5 × limit ) in half, then packs the pieces into split_objects ensuring each object stays ≤ 1.25 × limit.
Inputs: full_code_mix: str, max_symbols: int.
Outputs: List[str] of code parts ready for LLM processing.
Side‑effects: logs progress via BaseLogger.
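The halving-then-packing strategy described above can be sketched as follows (an illustrative reconstruction from the stated 1.5× and 1.25× thresholds, not the library's exact code; logging is omitted):

```python
def split_data(full_code_mix, max_symbols):
    # Phase 1: halve any fragment larger than 1.5x the limit until none remain.
    fragments = [full_code_mix]
    while any(len(f) > 1.5 * max_symbols for f in fragments):
        halved = []
        for fragment in fragments:
            if len(fragment) > 1.5 * max_symbols:
                mid = len(fragment) // 2
                halved += [fragment[:mid], fragment[mid:]]
            else:
                halved.append(fragment)
        fragments = halved

    # Phase 2: pack pieces so each output object stays within 1.25x the limit.
    split_objects, current = [], ""
    for fragment in fragments:
        if len(current) + len(fragment) <= 1.25 * max_symbols:
            current += fragment
        else:
            if current:
                split_objects.append(current)
            current = fragment
    if current:
        split_objects.append(current)
    return split_objects
```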

BaseLogger – Singleton Facade

BaseLogger implements the Borg‑style singleton via __new__, guaranteeing a single façade instance throughout the process. The façade holds a reference to a concrete BaseLoggerTemplate (e.g., FileLoggerTemplate) set by set_logger. Calls to log() delegate to logger_template.global_log(), which respects the configured log_level before emitting the message.
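One common way to get the single-facade behavior described above is a __new__-based singleton; the sketch below assumes this shape (a Borg-style variant would share __dict__ state instead, and the real class may differ):

```python
class BaseLogger:
    """Sketch of a singleton facade: __new__ always returns one shared instance."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.logger_template = None
        return cls._instance

    def set_logger(self, template):
        # Swap in a concrete template (e.g., a file-backed one).
        self.logger_template = template

    def log(self, record):
        # Delegate to the active template, which applies its own level filter.
        if self.logger_template is not None:
            self.logger_template.global_log(record)
```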

Structured Log Objects

BaseLog supplies the common payload (message, level) and a timestamp prefix (_log_prefix). Sub‑classes (ErrorLog, WarningLog, InfoLog) override format() to prepend a severity tag ([ERROR], [WARNING], [INFO]) to the timestamped text. The formatted string is what the logger templates write or print.

File‑Based Persistence

FileLoggerTemplate extends BaseLoggerTemplate. Its log() opens file_path in append mode and writes log.format() + "\n". Because BaseLoggerTemplate.log() is overridden, global_log() still applies the level filter before persisting.

Progress Reporting Implementations

LibProgress wraps rich’s Progress, creating a base task and optional sub‑tasks; update_task() advances either the current sub‑task or the base task.
ConsoleGtiHubProgress provides a lightweight, stdout‑only alternative using ConsoleTask. Both classes inherit from the abstract BaseProgress, which defines the required interface (create_new_subtask, update_task, remove_subtask).

Data flow: UI components invoke BaseLogger.log(ErrorLog(...)) → BaseLogger forwards to the active template → formatted string written to console or file. Progress objects receive create_new_subtask/update_task calls from the documentation pipeline, emitting visual feedback without side‑effects beyond stdout or rich rendering.


Download files

Source Distribution

autodocgenerator-0.8.9.1.tar.gz (32.0 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
  • SHA256: 08aec57b0e0c01794c2ff25e4f0dc31b19566681805d5462f5b8188055cc0425
  • MD5: e4658663d8e25c00878e88034660e213
  • BLAKE2b-256: 18d2ebb86711ecbac85af27b6037a036fc6080d0125219c68470ced60b095c37

Built Distribution

autodocgenerator-0.8.9.1-py3-none-any.whl (31.9 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
  • SHA256: ede342a40fae27564fe5d776049e7a4af5540f5b6c794671ccc511eda3642c93
  • MD5: 74b98c66548945aa06e608939b0e8fb3
  • BLAKE2b-256: 2ed5957fcf46ac8fb3af9b00d7ea779c1cb126d59f760b463c9fac7d3650311b
