Project description

This project helps you create documentation for your projects automatically.
Executive Navigation Tree

- 📂 Installation & Setup
- ⚙️ Configuration & Management
- 🧩 Core Models
- 📄 Documentation Generation
- 📦 Compression Utilities
- 📈 Logging & Progress
- 🧪 Assumptions & Limitations
- 🔧 Miscellaneous
Installation is handled by the `install.ps1` and `install.sh` bootstrap scripts. On Windows (PowerShell), run `irm https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex`; on Linux-based systems, run `curl -sSL https://raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash`. You must also add a secret variable named `GROCK_API_KEY` to your GitHub Actions repository settings, containing your API key from the Grock docs (https://grockdocs.com); without it the workflow cannot authenticate.
How to use the Manager class
The Manager class is instantiated with the following parameters:

| Parameter | Type / Description | How it is passed |
|---|---|---|
| `project_path` | `str` – path to the root of the project | Positional argument |
| `project_settings` | `ProjectSettings` object – settings for the project | Positional argument |
| `pcs` | `ProjectConfigSettings` object – configuration settings | Positional argument |
| `sync_model` | `GPTModel` instance – synchronous GPT model | Named argument |
| `async_model` | `AsyncGPTModel` instance – asynchronous GPT model | Named argument |
| `ignore_files` | `list[str]` – list of file patterns to ignore | Named argument |
| `progress_bar` | `BaseProgress` subclass instance – progress UI | Named argument |
| `language` | `str` – language code (e.g., `"en"` for English) | Named argument |
Full example of usage

```python
from autodocgenerator.manage import Manager
from autodocgenerator.engine.models.gpt_model import GPTModel, AsyncGPTModel
from autodocgenerator.ui.progress_base import ConsoleGtiHubProgress
from autodocgenerator.preprocessor.settings import ProjectSettings
from .config_reader import Config, read_config, ProjectConfigSettings

# 1. Load configuration
with open("autodocconfig.yml", "r", encoding="utf-8") as file:
    config_data = file.read()
config: Config = read_config(config_data)

# 2. Retrieve required objects from the config
project_settings: ProjectSettings = config.get_project_settings()
pcs: ProjectConfigSettings = config.pcs
ignore_list: list[str] = config.ignore_files

# 3. Create GPT model instances (API_KEY comes from the environment; see config.py)
sync_model = GPTModel(API_KEY, use_random=False)
async_model = AsyncGPTModel(API_KEY)

# 4. Instantiate the Manager
manager = Manager(
    project_path=".",  # path to the current project
    project_settings=project_settings,
    pcs=pcs,
    sync_model=sync_model,
    async_model=async_model,
    ignore_files=ignore_list,
    progress_bar=ConsoleGtiHubProgress(),
    language="en",
)

# 5. Use the manager to generate documentation (example sequence)
manager.generate_code_file()
manager.generate_global_info_file(use_async=False, max_symbols=8000)
manager.generete_doc_parts(use_async=False, max_symbols=5000)

# (Further steps such as factory generation, cache clearing, etc. can follow
# the pattern shown in the original `gen_doc` function.)
```
The autodocconfig.yml file is a YAML document that can contain the following top-level keys, as shown in the repository's example:

- `project_name` – string, the name of the project (e.g., `"Auto Doc Generator"`).
- `language` – string, language code for the documentation (e.g., `"en"`).
- `ignore_files` – optional list of glob patterns to exclude from processing (e.g., `"*.pyc"`, `"__pycache__"`).
- `project_settings` – mapping with configuration for the generator itself:
  - `save_logs` – boolean, whether to keep generation logs.
  - `log_level` – integer, verbosity level (e.g., `2`).
- `project_additional_info` – mapping for arbitrary additional metadata; any key/value pair can be added (e.g., `global idea: "This project was created to help developers..."`).
- `custom_descriptions` – list of strings, each describing a custom documentation request that will be processed by the generator.

These keys are read by `autodocgenerator.auto_runner.config_reader.read_config` and used to build the generation configuration. No other options are defined in the provided context.
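For illustration, a configuration using those keys might look like the fragment below (the values are hypothetical; only the key names come from the schema above):

```yaml
project_name: "Auto Doc Generator"
language: "en"
ignore_files:
  - "*.pyc"
  - "__pycache__"
project_settings:
  save_logs: true
  log_level: 2
project_additional_info:
  global idea: "This project was created to help developers..."
custom_descriptions:
  - "Explain the installation workflow"
```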
# autodocgenerator package initialisation
```python
# autodocgenerator/__init__.py
print("ADG")

from .ui.logging import BaseLogger, BaseLoggerTemplate, InfoLog, ErrorLog, WarningLog

logger = BaseLogger()
logger.set_logger(BaseLoggerTemplate())
```
Responsibility
This module is the entry point for the autodocgenerator package.
Its sole purpose is to bootstrap a global logger that can be used by every sub‑module (engine, factory, pre‑processor, UI, etc.) without the need to instantiate a logger repeatedly.
Interaction with the rest of the system
- `ui.logging` – imports the concrete logging classes (`BaseLogger`, `BaseLoggerTemplate`, …), which implement a thin wrapper around Rich-styled console output.
- All other modules import `autodocgenerator.logger` (e.g. `from autodocgenerator import logger`) and call `logger.info(...)`, `logger.error(...)`, etc. This centralises log configuration and guarantees a consistent format across the whole application.

Key objects & flow
- `BaseLogger()` – creates a logger instance with default settings (level, handlers).
- `BaseLoggerTemplate()` – provides the concrete Rich-styled formatter and handler configuration.
- `logger.set_logger(BaseLoggerTemplate())` – binds the template to the logger, finalising the output pipeline.
- The `print("ADG")` statement is a harmless side effect used during development to confirm that the package has been imported correctly.

Assumptions
- The `ui.logging` module is importable and its classes conform to the expected interface (`set_logger` accepts a template instance).
- No additional configuration (e.g., log file path) is required at import time; defaults are sufficient for normal operation.

Inputs / Outputs
- Input: None – the module runs automatically on import.
- Output: a configured `logger` object available as `autodocgenerator.logger`; console output of the string "ADG" the first time the package is imported.

Side effects
- Writes "ADG" to STDOUT on first import (can be silenced by removing the `print` line).
- Registers a global logger that influences the logging behaviour of every downstream component.
Config Reader – autodocgenerator.auto_runner.config_reader
Loads autodocconfig.yml, transforms YAML into a runnable Config object and a tiny ProjectConfigSettings holder.
The module is imported by run_file.py; no external calls are required.
ProjectConfigSettings
- Attributes: `save_logs` (bool), `log_level` (int).
- `load_settings(data)` copies any key/value from the supplied dict onto the instance via `setattr`.
- Used by the manager to control logging and other runtime flags.

Config
- Holds defaults: `ignore_files`, `language`, `project_name`, `project_additional_info`, `custom_modules`, and a `pcs` instance.
- Fluent setters (`set_language`, `set_pcs`, …) return `self` for chaining.
- `get_project_settings()` creates a ProjectSettings object and injects any extra info.
- `get_doc_factory()` builds two DocFactory instances – one for user-defined `custom_modules`, another for built-in intro modules (`IntroLinks`, optional `IntroText`).

read_config
- `yaml.safe_load` → dict.
- Populates ignore patterns, language, name, additional info.
- Instantiates `ProjectConfigSettings` and applies the `project_settings` section.
- Adds custom description modules via `CustomModule`.
- Returns a fully populated Config.

Inputs: raw YAML string. Outputs: `Config` object. No side effects besides object creation.
Run File – autodocgenerator.auto_runner.run_file
Entry‑point script (python -m autodocgenerator.auto_runner.run_file).
- Instantiates synchronous (`GPTModel`) and asynchronous (`AsyncGPTModel`) LLM wrappers using the global `API_KEY`.
- Creates a Manager with project path, settings, `pcs`, ignore list, a console progress bar (`ConsoleGtiHubProgress`), and language.
- Calls the manager's pipeline: `generate_code_file`, `generate_global_info_file` (sync, 8 k symbols), `generete_doc_parts` (sync, 5 k symbols), `factory_generate_doc` for both factories (custom + intro), then `clear_cache`.
- Returns the final assembled document via `read_file_by_file_key("output_doc")`.

The module's top-level flow:
- Reads autodocconfig.yml.
- Calls `read_config` → `Config`.
- Extracts `project_settings` and both factories.
- Invokes `gen_doc` with those objects and the current directory.
- Stores the generated documentation in `output_doc`.
Assumptions – YAML follows the expected schema; API_KEY is defined; all imported factories and UI components conform to their interfaces. No external files are written until the manager’s clear_cache step.
Prompt‑Template Constants
The module defines a collection of multi‑line string constants (BASE_SYSTEM_TEXT, BASE_PART_COMPLITE_TEXT, BASE_INTRODACTION_CREATE_TEXT, BASE_INTRO_CREATE, BASE_SETTINGS_PROMPT).
These are static prompt templates used by the AutoDoc engine to instruct the LLM at various stages:
- BASE_SYSTEM_TEXT – Global instruction for incremental snippet analysis.
- BASE_PART_COMPLITE_TEXT – Guidelines for generating concise documentation of a code fragment.
- BASE_INTRODACTION_CREATE_TEXT – Rules for building the “Executive Navigation Tree”.
- BASE_INTRO_CREATE – Template for a high‑level project overview.
- BASE_SETTINGS_PROMPT – Prompt that turns the system into a persistent project knowledge base.
All templates are referenced by the PromptManager (or similar) to build the messages sent to the language model, ensuring consistent behavior across processing phases.
Helper Function get_BASE_COMPRESS_TEXT
```python
def get_BASE_COMPRESS_TEXT(start, power):
    return f""" … """
```

- Purpose – dynamically creates a compression prompt that adapts to the size of an incoming code snippet (`start`) and a scaling factor (`power`).
- Interaction – called by the CompressionEngine right before a large snippet is sent to the LLM, providing a concise analysis request and a strict usage-example skeleton.
- Inputs – `start` (approx. character count of the snippet) and `power` (division factor controlling the allowed summary length).
- Output – a formatted prompt string containing placeholders for analysis, summary length, and an example code block.
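As a rough sketch of how such a builder behaves (the real template text lives in `config.py` and is elided above; the prompt body below is purely illustrative):

```python
def get_BASE_COMPRESS_TEXT(start: int, power: int) -> str:
    # Hypothetical body: the actual template string is defined in config.py.
    # `start` is the approximate character count of the incoming snippet;
    # `power` divides it to cap the allowed summary length.
    target_length = max(1, start // power)
    return (
        "Analyse the following code snippet and compress it.\n"
        f"Keep the summary under roughly {target_length} characters.\n"
        "Finish with a minimal usage-example skeleton."
    )

prompt = get_BASE_COMPRESS_TEXT(start=4000, power=8)
```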
Environment Loading & Global Settings
```python
import os
from dotenv import load_dotenv

load_dotenv()
API_KEY = os.getenv("API_KEY")
if API_KEY is None:
    raise Exception("API_KEY is not set in environment variables.")

MODELS_NAME = ["openai/gpt-oss-120b", "llama-3.3-70b-versatile", "openai/gpt-oss-safeguard-20b"]
```
- Responsibility – loads the `.env` file, extracts the required `API_KEY`, and aborts early if it is missing, guaranteeing that the LLM client can authenticate.
- `MODELS_NAME` – provides a default list of model identifiers the engine may select from when generating prompts.
- Interaction – these globals are imported by the LLMClient and ModelSelector components, which rely on `API_KEY` for authentication and `MODELS_NAME` for model resolution.
Summary
config.py centralises all static prompts, the dynamic compression‑prompt builder, and essential runtime configuration (environment variables and default model list). It serves as the single source of truth for textual instructions and credentials, enabling other engine modules (PromptManager, CompressionEngine, LLMClient) to operate without hard‑coded strings or duplicated logic. This separation keeps the core processing code clean and makes updates to prompts or credentials straightforward.
Exception ModelExhaustedException
```python
class ModelExhaustedException(Exception):
    """Raised when no model in the rotation list is usable."""
```

- Role – signals that every model in `regen_models_name` has failed, forcing the caller to abort or retry with a new configuration.
- Used by – `GPTModel` and `AsyncGPTModel` during the retry loop.
Core hierarchy (model.py)
| Class | Responsibility | Key members |
|---|---|---|
| `History` | Holds the chat history sent to the LLM. Initialized with `BASE_SYSTEM_TEXT` (global system prompt). | `history` list, `add_to_history()` |
| `ParentModel` | Supplies shared configuration: API key, history container, model-list shuffling, and current-model index. | `api_key`, `history`, `current_model_index`, `regen_models_name` |
| `Model` (sync) | Implements thin wrappers around `generate_answer()`. Provides `get_answer()` (records user → assistant exchange) and a no-history shortcut. | `generate_answer()` (placeholder), `get_answer*()` |
| `AsyncModel` (async) | Same contract as `Model` but with async methods. | `generate_answer()`, `get_answer*()` (awaited) |
Assumptions: BASE_SYSTEM_TEXT, API_KEY, and MODELS_NAME are defined in config.config. The history starts with the system prompt unless overridden.
Concrete providers (gpt_model.py)
- `GPTModel` – synchronous wrapper around the Groq client.
  - Initializes `self.client = Groq(api_key=self.api_key)` and a `BaseLogger`.
  - `generate_answer()` builds the message payload (history or explicit `prompt`), then iterates over `regen_models_name`, trying each model until a successful `chat.completions.create` call. On failure it logs a warning, advances the index (wrap-around), and retries. When a model succeeds it returns `chat_completion.choices[0].message.content` and logs the result.
- `AsyncGPTModel` – asynchronous counterpart using `AsyncGroq`.
  - Same retry logic, but `await`-ed; logs "Generating answer asynchronously…".
Both raise ModelExhaustedException if the rotation list becomes empty.
Side effects: Mutates self.history (via parent methods), updates self.current_model_index, writes log entries via BaseLogger.
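The rotation-and-retry behaviour can be sketched with stub models. This is a simplified illustration, not the real `gpt_model.py` code: the real implementation advances a wrap-around index rather than removing failed models, and logs each failure.

```python
class ModelExhaustedException(Exception):
    """Mirrors the exception above: no model in the rotation is usable."""

def generate_with_rotation(models, call):
    # `models` plays the role of regen_models_name; `call(name)` stands in
    # for one chat.completions.create attempt and raises on failure.
    remaining = list(models)
    while remaining:
        name = remaining[0]
        try:
            return call(name)
        except Exception:
            # The real code logs a warning and retries the next model.
            remaining.remove(name)
    raise ModelExhaustedException("all models in the rotation failed")

# Usage: the first model fails, the second succeeds.
def fake_call(name):
    if name == "model-a":
        raise RuntimeError("rate limited")
    return f"answer from {name}"

result = generate_with_rotation(["model-a", "model-b"], fake_call)
```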
📦 autodocgenerator.factory – Overview
The factory package glues together document‑generation modules and the language model.
__init__.py is intentionally empty – it only marks the directory as a Python package. In the future it will expose helper functions (e.g. create_model()) that decide whether to instantiate GPTModel or AsyncGPTModel based on runtime settings.
🧩 BaseModule (abstract)
Location: factory/base_factory.py
- Defines the contract for every generation step.
- Subclasses must implement `generate(info: dict, model: Model) → str`.
- No state is required; the base `__init__` is a placeholder.
🏗️ DocFactory
Location: factory/base_factory.py
- Constructor `DocFactory(*modules)` stores a list of `BaseModule` instances.
- Method `generate_doc(info, model, progress)` orchestrates:
  - Creates a sub-task in `BaseProgress` sized to the number of modules.
  - Calls each module's `generate`, concatenating results (`output += … + "\n\n"`).
  - Logs module activity via `BaseLogger` (`InfoLog`).
  - Advances the progress bar and finally removes the sub-task.
- Returns the full documentation string.
- Side effects: progress UI updates, log entries.
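The orchestration above can be sketched as follows. Progress and logging are omitted for brevity, and the stub modules (`Greeting`, `Body`) are hypothetical; the real classes live in `factory/base_factory.py`:

```python
class BaseModule:
    """Contract for one generation step (cf. factory/base_factory.py)."""
    def generate(self, info: dict, model) -> str:
        raise NotImplementedError

class DocFactory:
    def __init__(self, *modules):
        self.modules = list(modules)

    def generate_doc(self, info: dict, model) -> str:
        # The real code also drives BaseProgress and BaseLogger per module.
        output = ""
        for module in self.modules:
            output += module.generate(info, model) + "\n\n"
        return output

# Usage with two stub modules.
class Greeting(BaseModule):
    def generate(self, info, model):
        return f"# {info['name']}"

class Body(BaseModule):
    def generate(self, info, model):
        return "Some body text."

doc = DocFactory(Greeting(), Body()).generate_doc({"name": "Demo"}, model=None)
```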
📄 Built‑in Modules
| Module | Purpose | Key Call |
|---|---|---|
| `modules.intro.IntroLinks` | Extracts HTML links from `info["full_data"]` and asks the model for a linked introduction. | `get_links_intro(…)` |
| `modules.intro.IntroText` | Generates a textual introduction from `info["global_data"]`. | `get_introdaction(…)` |
| `modules.general_modules.CustomModule` | Produces a custom description defined at instantiation (`discription`). | `generete_custom_discription(split_data(...), model, …)` |
All modules inherit BaseModule, receive the same info dictionary and a Model instance, and return a markdown‑compatible string.
🔄 Integration Flow
- Configuration builds a list of desired modules (e.g., `IntroLinks()`, `CustomModule("API overview")`).
- `DocFactory` is instantiated with that list.
- The main application supplies `info` (parsed code, language, etc.), a concrete `Model`, and a `BaseProgress` UI object.
- `DocFactory.generate_doc` returns the assembled documentation, while progress and logging keep the user informed.
Assumptions: info contains keys used by modules (code_mix, full_data, global_data, language).
Outputs: a single markdown string.
Side‑effects: UI progress updates, logging to the configured sink.
Future additions to factory/__init__.py will expose convenience constructors that hide the module‑selection logic from callers.
Manager – Central Orchestrator for Documentation Generation
The Manager class glues together all preprocessing, LLM‑model, factory and UI components of AutoDocGenerator.
It owns a per‑project cache folder (.auto_doc_cache) where intermediate artefacts are stored:
| key | filename |
|---|---|
| `code_mix` | `code_mix.txt` – raw repository dump |
| `global_info` | `global_info.md` – compressed project summary |
| `logs` | `report.txt` – run-time log file |
| `output_doc` | `output_doc.md` – final markdown document |
Core responsibilities
- Initialise logging (`BaseLogger` → `FileLoggerTemplate`) and a progress UI (`BaseProgress`).
- Provide thin helpers (`read_file_by_file_key`, `get_file_path`) for cache I/O.
- Drive the three generation stages:
  - Code mix – `generate_code_file()` builds a plain-text representation of the repository using `preprocessor.code_mix.CodeMix`.
  - Global info – `generate_global_info_file()` (placeholder implementation) would compress the code mix via `spliter` + `compressor`.
  - Doc parts – `generete_doc_parts()` splits the mixed code and global info into chunks and feeds them to either a synchronous (`gen_doc_parts`) or asynchronous (`async_gen_doc_parts`) LLM model, producing a first draft (`output_doc.md`).
- Post-process the draft with a factory (`factory_generate_doc`). The supplied `DocFactory` aggregates custom modules (e.g., `IntroText`, `IntroLinks`, `CustomModule`) and calls `DocFactory.generate_doc(info, sync_model, progress_bar)`. The factory result is prepended to the existing document.
- Clean up cache artefacts (`clear_cache`), respecting the `save_logs` flag from `ProjectConfigSettings`.
Key Methods
| Method | What it does | Important I/O / side effects |
|---|---|---|
| `__init__` | Stores configuration, creates cache folder, wires logger and progress bar. | Writes a log file path; creates `CACHE_FOLDER_NAME` if missing. |
| `read_file_by_file_key` | Reads a cached file by logical key. | Returns UTF-8 string content. |
| `get_file_path` | Resolves the absolute path for a cached file. | Pure path construction. |
| `generate_code_file` | Instantiates `CodeMix`, writes repository dump to cache. | Updates progress bar; logs start/completion. |
| `generate_global_info_file` | (stub) Reads code mix, would compress it, writes placeholder `"ss"` to cache. | Updates progress bar; logs start/completion. |
| `generete_doc_parts` | Calls either `gen_doc_parts` or `async_gen_doc_parts` to produce a draft. | Writes `output_doc.md`; logs steps; updates progress. |
| `factory_generate_doc` | Loads cached artefacts, builds an `info` dict, runs the provided `DocFactory`. Prepends factory output to the draft. | Writes updated `output_doc.md`; logs module list and input sizes; updates progress. |
| `clear_cache` | Removes the log file unless `pcs.save_logs` is True. | File-system side effect. |
Interaction with Other Sub‑systems
- Pre-processor – `CodeMix`, `split_data`, `gen_doc_parts`, `async_gen_doc_parts` (all under `autodocgenerator.preprocessor`).
- LLM engine – `Model` / `AsyncModel` instances supplied at construction; passed transparently to compression and doc-part generators.
- Factory – concrete factories live in `autodocgenerator.factory`; `factory_generate_doc` supplies them with the unified `info` dict and the synchronous model.
- UI – `BaseProgress` (and subclasses) receive `update_task()` calls after each major step; `BaseLogger` writes human-readable logs to the cache.
Assumptions & Limitations
- The cache directory must be writable; otherwise initialization fails.
- `generate_global_info_file` currently writes a dummy string – the real implementation should invoke the commented-out compression pipeline.
- `generete_doc_parts` expects the supplied model to implement the same interface as `gen_doc_parts` / `async_gen_doc_parts`.
- All file reads/writes are performed synchronously; large projects may benefit from streaming I/O in future revisions.
This documentation covers the final orchestration layer (Manager) of the AutoDocGenerator pipeline. It is intended for developers extending the generation workflow, adding new factory modules, or swapping model back‑ends.
CodeMix – Repository → Text Mixer
Responsibility
CodeMix walks a source‑tree, filters unwanted files/folders, and writes a single UTF‑8 document that first lists the directory hierarchy and then embeds every included file wrapped in <file path="…"> tags. The output is later consumed by the pre‑processor (Manager.generate_code_file) to feed the LLM pipeline.
Key API
| Member / Method | Purpose | I/O & Side effects |
|---|---|---|
| `__init__(root_dir=".", ignore_patterns=None)` | Resolves the repository root and stores ignore patterns. Instantiates a lightweight `BaseLogger`. | Creates `Path` objects; no FS writes. |
| `should_ignore(path: Path) -> bool` | Returns `True` if `path` matches any pattern in `ignore_patterns` (glob, basename, or any part). | Pure calculation; uses `fnmatch`. |
| `build_repo_content(output_file="repomix-output.txt")` | Generates the mixed file: writes a "Repository Structure" tree, a separator, then each file's content. Logs ignored entries via `InfoLog`. | Writes (or overwrites) `output_file`; may raise if the directory is not writable. Errors while reading a file are captured and written as a line in the output. |
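The matching rule described for `should_ignore` (glob, basename, or any path part) can be sketched as a standalone function. This is an approximation of the documented semantics, not the actual class method:

```python
from fnmatch import fnmatch
from pathlib import Path

def should_ignore(path: Path, ignore_patterns: list[str]) -> bool:
    # A pattern matches on the full path string, the basename,
    # or any individual path component.
    for pattern in ignore_patterns:
        if fnmatch(str(path), pattern) or fnmatch(path.name, pattern):
            return True
        if any(fnmatch(part, pattern) for part in path.parts):
            return True
    return False

ignored = should_ignore(Path("src/__pycache__/mod.cpython-311.pyc"),
                        ["*.pyc", "__pycache__"])
```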
Interaction with the System
- Manager – calls `CodeMix(root_dir, ignore_patterns).build_repo_content(cache_path)` to obtain the repository dump before compression or doc-part generation.
- Logging UI – uses `BaseLogger` (from `autodocgenerator.ui.logging`) to emit human-readable messages that appear in the per-run cache folder.
- LLM factories – the generated file becomes the `code_mix_file` artefact loaded later by `Manager.factory_generate_doc`.
Assumptions & Limitations
- The supplied `root_dir` must exist and be readable; the process aborts on permission errors.
- `ignore_patterns` are interpreted as Unix-style glob strings; complex regexes are not supported.
- File reading is performed synchronously with `errors="ignore"` – binary or badly encoded files will be silently corrupted.
- The placeholder newline sequence `"\n\n\n"` after each file is intentional for later parsing but could be refined.
Typical Usage
```python
from autodocgenerator.preprocessor.code_mix import CodeMix, ignore_list

mix = CodeMix(root_dir="my_project", ignore_patterns=ignore_list)
mix.build_repo_content("codemix.txt")  # creates the mixed repository snapshot
```
The resulting codemix.txt is the canonical input for the rest of the AutoDocGenerator pipeline.
Overview
compressor.py provides the text‑compression & summarisation stage of the AutoDocGenerator pipeline.
It feeds raw code snippets (or any repository content) to a LLM model, asks the model to shrink the input according to a configurable compress power, and aggregates the results so that later stages receive a single, concise representation of the whole project.
Dependencies & Interaction
- Models – `Model` / `AsyncModel` from `engine.models.gpt_model` are used to call `get_answer_without_history`.
- ProjectSettings – supplies the system prompt (`project_settings.prompt`).
- Configuration – `get_BASE_COMPRESS_TEXT` builds the LLM instruction that encodes the desired compression level.
- Progress UI – `BaseProgress` visualises work; each public routine creates a sub-task, updates it per element, and removes it on completion.
compress
compress(data: str, project_settings, model, compress_power) -> str
Creates a three‑part prompt (system‑prompt, compression instruction, user data) and returns the model’s compressed answer.
Assumptions – compress_power is a positive int; model implements the synchronous API.
compress_and_compare
Batch version for a list of strings.
- Splits the input list into chunks of size `compress_power`.
- Calls `compress` for each element, concatenating results per chunk.
- Returns a list whose length ≈ `len(data) / compress_power`.
Side‑effect: updates the supplied progress_bar.
async_compress & async_compress_and_compare
Asynchronous counterparts that respect a semaphore (max 4 concurrent calls).
async_compress builds the same prompt as compress but awaits model.get_answer_without_history.
async_compress_and_compare launches all tasks, gathers results, then re‑chunks them exactly like compress_and_compare.
compress_to_one
Iteratively reduces a list of fragments to a single compressed document:
- While more than one chunk exists, call either the async or sync batch compressor with a dynamic `compress_power` (downgraded to 2 when the list is short).
- Increment an internal iteration counter (useful for logging).
Returns the final string.
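The iterative reduction can be sketched as follows. The `compress` callable here stands in for the LLM-backed batch compressor, and the exact downgrade threshold is an assumption based on the description above:

```python
def compress_to_one(chunks, compress, compress_power=4):
    # `compress(group)` stands in for the LLM call that merges a group of
    # fragments into one string (compress_and_compare in the real code).
    while len(chunks) > 1:
        # Downgrade to 2 when the list is shorter than compress_power.
        power = compress_power if len(chunks) >= compress_power else 2
        chunks = [
            compress(chunks[i:i + power])
            for i in range(0, len(chunks), power)
        ]
    return chunks[0]

# Usage with a trivial "compressor" that just joins fragments.
result = compress_to_one(["a", "b", "c", "d", "e"], lambda g: "".join(g), 4)
```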
generate_describtions_for_code
Takes the fully‑compressed code, asks the model to produce developer‑facing documentation per fragment.
The system prompt enforces a strict format (components, parameters, usage example) and forbids hallucination.
Outputs a list of markdown‑formatted descriptions, one per input element, while driving the progress UI.
Key Assumptions
- All model objects are already authenticated and reachable.
- `project_settings.prompt` is a valid system instruction.
- `compress_power` ≤ length of `data` unless the fallback to 2 is applied.
Outputs
- Strings (compressed text or generated documentation).
- Progress bar side‑effects for CLI feedback.
## Post‑processing Helpers (postprocess.py)
Responsibility – After the LLM has generated raw markdown, this module extracts navigation data (section titles, HTML anchors) and creates short introductory texts that link to those sections. It does not modify the documentation content itself; it only builds auxiliary strings used by the final renderer.
Interactions
- LLM models – `Model` (or its subclasses) from `engine.models.model` is called via `get_answer_without_history` to ask the model to write an introduction that lists provided links.
- Configuration – `BASE_INTRODACTION_CREATE_TEXT` and `BASE_INTRO_CREATE` supply the system prompts that enforce language and style.
- Logging UI – `BaseLogger` + `InfoLog` produce console feedback; no state is mutated outside the logger.
Key Functions
| Function | Purpose | Important I/O | Side effects |
|---|---|---|---|
| `generate_markdown_anchor(header: str) -> str` | Normalises a heading into a GitHub-style markdown anchor (`#my-title`). | `header` – raw heading text. Returns anchor string prefixed with `#`. | None. |
| `get_all_topics(data: str) -> tuple[list[str], list[str]]` | Scans a markdown document for level-2 headings (`## …`) and returns both the raw titles and their generated anchors. | `data` – full markdown. Returns `(titles, anchors)`. | None. |
| `get_all_html_links(data: str) -> list[str]` | Finds custom HTML anchors (`<a name="…">`) inside the doc, extracts the name, and returns a list of `#name` links (max length 25). | `data` – full markdown. Returns list of link strings. | Logs extraction progress via `BaseLogger`. |
| `get_links_intro(links: list[str], model: Model, language: str = "en") -> str` | Sends the list of links to the LLM with a system prompt (`BASE_INTRODACTION_CREATE_TEXT`) and receives a short introductory paragraph that references them. | `links` – list of `#…` strings. Returns generated intro text. | Logs start/completion; makes a synchronous LLM call. |
| `get_introdaction(global_data: str, model: Model, language: str = "en") -> str` | Similar to `get_links_intro` but creates a full-document introduction based on the entire compressed markdown (`global_data`). | `global_data` – whole doc. Returns intro paragraph. | Logs via `BaseLogger`; synchronous LLM call. |
| `generete_custom_discription(splited_data: str, model: Model, custom_description: str, language: str = "en") -> str` | Iterates over pre-split fragments, asking the model to produce a custom description (title + `<a name='…'>` anchor) respecting strict "no-hallucination" rules. Stops at the first fragment that yields a non-empty answer. | `splited_data` – iterable of fragments, `custom_description` – user-requested topic. Returns the description or an empty string. | Logs through `BaseLogger`; may perform several LLM calls. |
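The two pure helpers can be sketched as below. The slug rules are an approximation of GitHub's behaviour (lowercase, punctuation dropped, spaces to hyphens); the real functions in postprocess.py may differ in edge cases:

```python
import re

def generate_markdown_anchor(header: str) -> str:
    # Approximation of GitHub's anchor slugging.
    slug = header.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)   # drop punctuation
    slug = re.sub(r"\s+", "-", slug)       # spaces become hyphens
    return "#" + slug

def get_all_topics(data: str):
    # Collect level-2 headings and their generated anchors.
    titles = [line[3:].strip() for line in data.splitlines()
              if line.startswith("## ")]
    return titles, [generate_markdown_anchor(t) for t in titles]

titles, anchors = get_all_topics("## My Title\ntext\n## Another Section!\n")
```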
Assumptions
- The input markdown follows the convention of `##` headings and optional `<a name="…">` tags.
- `model` implements the synchronous `get_answer_without_history` API and is already authenticated.
- The language code (`en`, `ru`, …) is supported by the LLM.
## Project Settings (settings.py)
Responsibility – Holds static project metadata used to build the system prompt for the LLM. It aggregates a base prompt (BASE_SETTINGS_PROMPT) with the project name and any key/value pairs supplied by the caller.
Key API
- `ProjectSettings(project_name: str)` – ctor stores the name.
- `add_info(key, value)` – registers additional metadata.
- `prompt` (property) – concatenates `BASE_SETTINGS_PROMPT`, the project name, and each `key: value` line into a single string ready for insertion into LLM prompts.
Assumptions & Side‑effects
- `BASE_SETTINGS_PROMPT` is a valid multi-line instruction.
- No I/O; purely in-memory string assembly.
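The prompt assembly can be sketched as follows. The exact concatenation format and the `BASE_SETTINGS_PROMPT` text are assumptions; see settings.py for the real implementation:

```python
BASE_SETTINGS_PROMPT = "You are a project knowledge base."  # placeholder text

class ProjectSettings:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self._info: dict[str, str] = {}

    def add_info(self, key: str, value: str) -> None:
        self._info[key] = value

    @property
    def prompt(self) -> str:
        # Base prompt, project name, then one "key: value" line per entry.
        lines = [BASE_SETTINGS_PROMPT, f"Project name: {self.project_name}"]
        lines += [f"{k}: {v}" for k, v in self._info.items()]
        return "\n".join(lines)

settings = ProjectSettings("Auto Doc Generator")
settings.add_info("global idea", "helps developers document projects")
```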
## Split‑Data Stub (spliter.py)
Only the beginning of the file is shown; the core function split_data(data: str, max_symbols: int) -> list[str] is intended to break a large markdown string into smaller chunks that respect a token/character limit (max_symbols).
Typical Interaction
- Called by the compression pipeline (see the compress section of the global documentation) to produce a list of fragments that are later fed to the LLM.
- Uses `BASE_PART_COMPLITE_TEXT` as part of the prompt for each fragment, and `BaseProgress` to visualise progress.
Assumptions
- `data` contains the full documentation text.
- `max_symbols` is a positive integer smaller than the total length; the function will ensure no fragment exceeds it.
Overall Role in the System
These three modules sit at the post‑compression stage: spliter prepares size‑limited fragments, settings supplies the contextual prompt, and postprocess extracts navigation anchors and asks the LLM to generate concise introductions and custom descriptions. Together they turn a raw, compressed markdown blob into a navigable, developer‑friendly documentation set ready for final rendering.
The loop normalises the raw splited_by_files list so that no fragment exceeds 1.5 × max_symbols.
- Over‑length chunks are cut in half and re‑inserted.
- Afterwards a second pass packs the normalised pieces into split_objects, each respecting 1.25 × max_symbols.
Inputs – splited_by_files (list of strings), max_symbols (int).
Outputs – split_objects (list of size‑limited fragments) used by the documentation‑generation pipeline.
Side‑effects – logs progress via BaseLogger.
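The two passes described above can be sketched as a standalone function (`normalise_chunks` is a hypothetical name; the real logic lives inside spliter.py and also drives `BaseLogger`):

```python
def normalise_chunks(splited_by_files: list[str], max_symbols: int) -> list[str]:
    # Pass 1: cut over-length fragments in half and re-insert them until
    # none exceeds 1.5 x max_symbols.
    chunks = list(splited_by_files)
    i = 0
    while i < len(chunks):
        if len(chunks[i]) > 1.5 * max_symbols:
            half = len(chunks[i]) // 2
            chunks[i:i + 1] = [chunks[i][:half], chunks[i][half:]]
        else:
            i += 1
    # Pass 2: pack neighbouring pieces into split_objects, each respecting
    # 1.25 x max_symbols.
    split_objects, current = [], ""
    for piece in chunks:
        if len(current) + len(piece) <= 1.25 * max_symbols:
            current += piece
        else:
            if current:
                split_objects.append(current)
            current = piece
    if current:
        split_objects.append(current)
    return split_objects

parts = normalise_chunks(["a" * 40, "b" * 5], max_symbols=20)
```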
Generates documentation for a single markdown fragment synchronously.
- Builds an LLM prompt:
  - System message fixing the response language.
  - System message containing `BASE_PART_COMPLITE_TEXT` (the "write-part" instruction).
  - Optional system message with the previous part's output (`prev_info`).
  - User message with the current `part`.
- Calls `model.get_answer_without_history`.
- Strips surrounding markdown fences and returns the clean text.
Parameters
- `part: str` – fragment to document.
- `model: Model` – synchronous LLM wrapper.
- `global_info: str` – currently unused, reserved for future context.
- `prev_info: str | None` – tail of the previous generation to keep continuity.
- `language: str` – target language (default "en").
Returns – generated markdown string.
Async counterpart of the above. It receives an AsyncModel, runs the same prompt logic inside an async with semaphore block, optionally calls update_progress, and returns the cleaned answer.
Orchestrates full documentation creation in a synchronous pipeline.
- Calls `split_data(full_code_mix, max_symbols)` to obtain size-limited fragments.
- Iterates over fragments, invoking `write_docs_by_parts`.
- Concatenates results, keeping the last 3 000 characters as context for the next call (`result = result[-3000:]`).
- Updates a `BaseProgress` sub-task after each fragment.
Outputs – a single markdown string containing the whole generated documentation.
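The sliding-context loop can be sketched with a stub in place of the LLM call (`write_part` is hypothetical; the real code calls `write_docs_by_parts` and updates the progress bar per fragment):

```python
def gen_doc_parts(fragments, write_part, context_window=3000):
    # `write_part(part, prev_info)` stands in for the LLM-backed
    # write_docs_by_parts call.
    output, context = [], ""
    for part in fragments:
        generated = write_part(part, context)
        output.append(generated)
        # Keep only the tail of the running result as context for the next
        # fragment, mirroring `result = result[-3000:]`.
        context = ("".join(output))[-context_window:]
    return "".join(output)

doc = gen_doc_parts(["one", "two"], lambda p, prev: f"[{p}]", context_window=5)
```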
Async version of gen_doc_parts.
- Splits the input, creates a semaphore (default 4 concurrent calls) and a progress sub‑task.
- Fires off `async_write_docs_by_parts` for every fragment, gathers results with `asyncio.gather`, and joins them.
Both generators feed the post‑compression stage of the system, turning chunked code into a navigable, developer‑friendly documentation set ready for final rendering.
Logging subsystem – runtime diagnostics and persistence
The logging module defines a tiny hierarchy of log objects (BaseLog, ErrorLog, WarningLog, InfoLog) that know how to format themselves with a timestamp and severity tag.
BaseLoggerTemplate implements the filtering logic (log_level < 0 → all, otherwise only logs with level ≤ log_level) and a generic log() that prints to stdout.
FileLoggerTemplate overrides log() to append the formatted text to a user‑specified file, enabling persistent build‑time diagnostics.
BaseLogger is a singleton façade used throughout the generator pipeline. The generator creates a single BaseLogger instance, injects a concrete template via set_logger(), and calls log() wherever status messages are needed. The façade forwards calls to the configured template’s global_log(), guaranteeing a single point of control for all log output.
Assumptions: callers provide a BaseLog subclass instance; log_level is an integer where higher values represent more detailed logs.
Side‑effects: writing to stdout or appending to a file; no mutation of log objects after creation.
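A simplified sketch of the hierarchy and the filtering rule stated above (`log_level < 0` passes everything; otherwise only logs with `level <= log_level` pass). Class and method names mirror the description; the concrete levels and formatting are assumptions.

```python
from datetime import datetime

class BaseLog:
    level, tag = 0, "LOG"

    def __init__(self, message: str):
        self.message = message

    def format(self) -> str:
        # Timestamp plus severity tag, as described above.
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        return f"[{stamp}] [{self.tag}] {self.message}"

class ErrorLog(BaseLog):
    level, tag = 0, "ERROR"

class WarningLog(BaseLog):
    level, tag = 1, "WARNING"

class InfoLog(BaseLog):
    level, tag = 2, "INFO"

class BaseLoggerTemplate:
    def __init__(self, log_level: int = -1):
        self.log_level = log_level

    def should_log(self, log: BaseLog) -> bool:
        # Negative log_level means "log everything".
        return self.log_level < 0 or log.level <= self.log_level

    def log(self, log: BaseLog) -> None:
        if self.should_log(log):
            print(log.format())
```

A `FileLoggerTemplate` would override `log()` to append `log.format()` to a file instead of printing.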
Progress subsystem – visual feedback for long‑running steps
BaseProgress declares the abstract API (create_new_subtask, update_task, remove_subtask).
LibProgress implements this API using Rich’s Progress widget: it creates a base “General progress” task, spawns optional sub‑tasks, and advances the appropriate task on each update_task() call. Removing a sub‑task simply discards the reference, letting Rich finish the base task.
ConsoleGtiHubProgress is a fallback that prints simple textual updates to the console. It uses the helper ConsoleTask to track current progress and emit percentage‑based messages.
The generator injects one of these concrete progress objects into its orchestration layer, allowing the same orchestration code to drive either rich terminal UI or plain console output without modification.
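The abstract API and a console-style fallback can be sketched as below. The three method names come from the text; the internals (and collecting messages instead of printing them) are illustrative choices, not the library's actual implementation.

```python
class BaseProgress:
    # Abstract API named in the description above.
    def create_new_subtask(self, name: str, total: int) -> None:
        raise NotImplementedError

    def update_task(self, advance: int = 1) -> None:
        raise NotImplementedError

    def remove_subtask(self) -> None:
        raise NotImplementedError

class ConsoleProgress(BaseProgress):
    """Plain-text fallback emitting percentage-based updates."""

    def __init__(self):
        self.name, self.total, self.done = None, 0, 0
        self.messages = []  # collected here for inspection; could print instead

    def create_new_subtask(self, name, total):
        self.name, self.total, self.done = name, total, 0

    def update_task(self, advance=1):
        self.done += advance
        percent = 100 * self.done // max(self.total, 1)
        self.messages.append(f"{self.name}: {percent}%")

    def remove_subtask(self):
        self.name, self.total, self.done = None, 0, 0
```

Because the orchestration layer only calls the `BaseProgress` API, a Rich-backed implementation and this console fallback are interchangeable.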
Install scripts – bootstrap for CI workflow
install.ps1 (PowerShell) and install.sh (Bash) create the .github/workflows directory, write a reusable GitHub Actions workflow (autodoc.yml), and generate autodocconfig.yml containing the current folder name as project_name and a fixed language (en).
Both scripts are idempotent (they use -Force / mkdir -p), and they output a short success message. The generated workflow file is later used by the CI system to invoke the AutoDoc generator, while the config file supplies project‑specific metadata required by the generator at runtime.
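The behavior described above can be approximated in a few lines of shell. This is a sketch, not the real `install.sh`: the actual workflow body and config contents are assumptions, but the idempotent `mkdir -p`, the two generated files, and the success message follow the description.

```shell
# Idempotent: safe to re-run.
mkdir -p .github/workflows

# Placeholder workflow; the real script writes the full AutoDoc job here.
cat > .github/workflows/autodoc.yml <<'EOF'
name: AutoDoc
EOF

# Config seeded with the current folder name and a fixed language.
cat > autodocconfig.yml <<EOF
project_name: $(basename "$PWD")
language: en
EOF

echo "✅ Workflow and config generated"
```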
Project metadata (pyproject.toml)
This pyproject.toml is the canonical source of truth for the AutoDoc generator package.
It supplies the CI‑generated workflow and the runtime config with the information the
generator needs to resolve templates, locate source files and embed project‑specific
metadata (name, version, authors, license, etc.) into the produced documentation.
Core responsibilities
- Declares the package name `autodocgenerator`, version `0.7.9`, description and licensing – values later interpolated into the generated `README` and docs.
- Lists all runtime dependencies (e.g. `pyyaml`, `pydantic`, `openai`) that the AutoDoc engine imports when the CI workflow runs.
- Defines the build‑system (`poetry-core`) so the CI job can `pip install .` before invoking the generator.
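A hedged excerpt of what such a `pyproject.toml` might contain. Only the name, version, Python range, dependencies, and build backend are taken from the description above; everything else (exact table layout, pins) is illustrative.

```toml
[project]
name = "autodocgenerator"
version = "0.7.9"
requires-python = ">=3.11,<3.13"   # Python 3.11 – 3.12, per the assumptions below
dependencies = ["pyyaml", "pydantic", "openai"]

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```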
Interaction with other components
- The workflow‑creation script (the part that uses `-Force` / `mkdir -p`) copies this file into the repository’s root; the CI runner reads it to install exact versions, guaranteeing reproducible documentation builds.
- The config generator reads the `[project]` fields to fill placeholders in the `autodoc_config.yml` that the generator consumes at runtime.
Assumptions & side‑effects
- Assumes Python 3.11 – 3.12 (as declared by `requires-python`).
- The presence of this file triggers Poetry to resolve and lock the dependency graph; missing entries will cause the CI step to fail.
- No runtime side‑effects beyond package installation; it is purely declarative.
Success: The script reports “✅ Workflow and config generated” after writing this file, signalling that the CI pipeline can now safely invoke the AutoDoc generator.
File details
Details for the file autodocgenerator-0.8.0.tar.gz.
File metadata
- Download URL: autodocgenerator-0.8.0.tar.gz
- Upload date:
- Size: 43.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | b9addff300309ee98d411c5006e854c81bf5278ac35b42d166baa46cba40d4be |
| MD5 | a853aad783aaee494e07ac69820090ec |
| BLAKE2b-256 | c7e4f985ed0c19d692ca8189615bf18fb3cb5a2067469c564fbaee8911099a69 |
File details
Details for the file autodocgenerator-0.8.0-py3-none-any.whl.
File metadata
- Download URL: autodocgenerator-0.8.0-py3-none-any.whl
- Upload date:
- Size: 36.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | dbe6bb3dbf24a0f91e4ceb40499930847369e9abd3f5d490afa2b5237b8d46cd |
| MD5 | a1220e01ebe576029344ababe043f882 |
| BLAKE2b-256 | cb3242cc326abf346bdced727e616ff608e780848bce4e8eacd3cce78679e1b5 |