Auto‑Doc Generator

This project helps you create documentation for your projects.
1. Project Goal
To automate the creation of comprehensive, Markdown‑style documentation for any codebase.
Given a repository, the tool asks a Groq LLM to generate natural‑language sections, optionally enriches those sections with vector embeddings, and emits a single, ready‑to‑publish document. It supports custom sections, ordering, and fine‑grained control over the output through a YAML configuration file.
2. Core Logic & Principles
| Layer | Responsibility | How It Works |
|---|---|---|
| CLI / UI | Entry point (`run_file.py`) and progress display (`ConsoleGitHubProgress`) | Parses arguments, loads the configuration file (`autodocconfig.yml`), and starts a `Manager`. |
| Configuration | `Config`, `StructureSettings`, and custom module imports | Defines ignore lists, language, metadata, chunk sizes, and which introductory modules to render. |
| Manager | Orchestrates the whole pipeline | 1️⃣ Pre‑processes source code: splits and compresses into ≤ `max_symbols` chunks. 2️⃣ Uses the LLM wrapper (`GPTModel`) to summarise or render each chunk. 3️⃣ Passes the resulting text through a `DocFactory` that creates an ordered list of `BaseModule` subclasses, each rendering a Markdown fragment into a shared `DocSchema`. 4️⃣ Optionally produces embeddings with the Google Gemini API. 5️⃣ Re‑orders and finalises the document before writing `output_doc.md` or returning the string. |
| Factory | `DocFactory` | Instantiates `BaseModule` objects (e.g., `IntroText`, `IntroLinks`, user‑supplied modules) and calls their `render()` methods in a defined sequence. |
| LLM Layer | `GPTModel` / `AsyncGPTModel` | Thin wrapper around the Groq API that handles key rotation, request batching, and history‑based conversation context. |
| Post‑processing | Embedding, semantic re‑ordering, anchor splitting | Stores 768‑dim vectors per doc part and can reorder sections based on semantic similarity. |
| Schema | `DocSchema` | A typed in‑memory representation of the document (headers, parts, global sections) that all modules manipulate before final output. |
| Utilities | Splitters, compressors, settings helpers | Deterministic text chunking (`split_text_by_anchors`) and iterative LLM compression (`compress_to_one`). |
The architecture follows a Layered + Factory pattern: UI → Service (Manager) → Model → Post‑processor, ensuring that every component has a single, well‑defined responsibility and can be swapped independently.
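The deterministic chunking step in the Utilities layer can be sketched as follows. This is a simplified re‑implementation under assumed behaviour, not the library's actual `split_text_by_anchors`; the function name `split_into_chunks` is illustrative.

```python
def split_into_chunks(text: str, max_symbols: int) -> list[str]:
    """Split text into chunks of at most max_symbols characters,
    breaking only at line boundaries (a single over-long line stays whole).

    Illustrative sketch of the deterministic-chunking idea; not the
    library's real split_text_by_anchors implementation.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for line in text.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget
        if current and current_len + len(line) > max_symbols:
            chunks.append("".join(current))
            current, current_len = [], 0
        current.append(line)
        current_len += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Because the split points depend only on the input text and the budget, re-running the pipeline on unchanged sources yields identical chunks, which keeps LLM summaries cacheable.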
3. Key Features
- CLI and API Entry Points – run the generator from the command line or import `gen_doc` in Python code.
- Configurable Workflow – `autodocconfig.yml` controls ignore patterns, language, metadata, chunk size, and custom module injection.
- Dynamic Sectioning – `DocFactory` creates any number of Markdown sections; users can add new `BaseModule` subclasses for bespoke documentation blocks.
- LLM‑Driven Content – uses Groq for natural‑language generation; supports both synchronous and asynchronous operation.
- Embedding Layer – optional vector embeddings via the Google Gemini API for semantic indexing or future search features.
- Compression and Chunking – source files are split into manageable chunks and optionally compressed with iterative LLM summarisation.
- Progress Reporting – Rich progress bars (`ConsoleGitHubProgress`) provide real‑time feedback during long runs.
- Extensibility – new modules, a different LLM provider, or disabled embeddings, all without touching the core pipeline.
4. Dependencies
| Category | Package / Tool | Purpose |
|---|---|---|
| LLM | `groq` (Python SDK) | Communicates with Groq APIs. |
| Embeddings | `google-genai` (Google Gemini SDK) | Generates 768‑dim vector embeddings. |
| CLI/Progress | `rich`, `typer` | Rich console output and progress bars. |
| Configuration | `pyyaml` | Reads `autodocconfig.yml`. |
| Data Structures | `pydantic` / `typing` | Defines schema types (`DocSchema`, `DocContent`, etc.). |
| Utilities | `tqdm`, `textwrap` | Optional progress helpers and text formatting. |
| Environment | `dotenv` (optional) | Loads API keys from `.env`. |
Environment Variables
- `GROQ_API_KEYS` – comma‑separated list of Groq API keys.
- `GOOGLE_EMBEDDING_API_KEY` – key for the Google Gemini embedding endpoint.
Python ≥ 3.11 is required (matching the `requires-python` constraint in `pyproject.toml`) for the type annotations and pattern matching used in the code.
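The comma‑separated key list might be consumed like this; `load_groq_keys` is an illustrative helper, not part of the library's API.

```python
import os


def load_groq_keys() -> list[str]:
    """Parse the comma-separated GROQ_API_KEYS variable into a list of keys.

    Illustrative helper: shows how a comma-separated env var is typically
    consumed; the library's own loading code may differ.
    """
    raw = os.environ.get("GROQ_API_KEYS", "")
    # Strip whitespace around each key and drop empty entries
    return [key.strip() for key in raw.split(",") if key.strip()]
```

Supplying several keys lets the wrapper rotate between them when one hits a rate limit.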
In short, Auto‑Doc Generator is a modular, LLM‑powered tool that turns raw source code into a polished, Markdown README‑style document, all driven by a clear, layered architecture and a user‑configurable pipeline.
Executive Navigation Tree
- 📘 Overview
- ⚙️ Configuration
  - pyproject
  - data-contract
  - gpt-model-documentation
  - async-gpt-model-documentation
  - module-documentation
  - base-module
  - doc-factory
  - schema-doc_schema
  - autodocconfig-structure-explanation
  - custom-module
  - custom-module-without-context
  - postprocessor-custom_intro
  - custom-intro-utilities
  - example-usage
  - integration-highlights
  - intro-links
  - intro-text
- 🧩 Manager
- 🔧 Preprocessor
- 📦 Postprocessor
- 🛠️ Utilities
- ⚙️ Settings & Logging
- 📦 Installation
autodocgenerator Module Initializer
Functional Role
Bootstraps the Auto‑Doc Generator library on import. Prints a styled ASCII banner and exposes the central logging singleton (logger) for the entire package.
Visible Interactions
- Calls `BaseLogger` from `autodocgenerator.ui.logging`.
- Invokes `BaseLoggerTemplate` to configure log handlers.
- No I/O beyond stdout; does not touch the filesystem or external services.
Logic Flow
- Define `_print_welcome()` – builds colour constants, assembles an ASCII logo, prints the banner and version line.
- Immediately execute `_print_welcome()` at import time.
- Import `BaseLogger`, `BaseLoggerTemplate`, `InfoLog`, `ErrorLog`, `WarningLog`.
- Instantiate `logger = BaseLogger()`.
- Attach a template via `logger.set_logger(BaseLoggerTemplate())`.
- Export `logger` (module‑level singleton) for downstream modules.
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `_print_welcome` | function → `None` | Side‑effect: prints banner | Executes on import; no return. |
| `logger` | `BaseLogger` instance | Central logging hub | Shared across package; configured with `BaseLoggerTemplate`. |
| `BaseLogger`, `BaseLoggerTemplate` | classes (from `ui.logging`) | Provide logging API | Imported but not instantiated beyond `logger`. |
| `InfoLog`, `ErrorLog`, `WarningLog` | classes | Log‑level helpers | Imported for external use; not instantiated here. |
Assumption – The banner display is purely cosmetic; it does not affect documentation generation logic.
Usage
```python
import autodocgenerator as adg

adg.logger.info("Library loaded")
```
The module performs no conditional checks; any import will emit the welcome banner and prepare logger for immediate use.
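The import-time bootstrap pattern described above can be sketched in miniature. All class bodies here are placeholders under assumed behaviour; the real classes live in `autodocgenerator.ui.logging` and do more than this.

```python
# Sketch of the import-time singleton pattern (illustrative placeholder
# classes; the real BaseLogger/BaseLoggerTemplate do more than this).

class BaseLoggerTemplate:
    """Placeholder template that formats messages."""
    def format(self, level: str, message: str) -> str:
        return f"[{level}] {message}"


class BaseLogger:
    def __init__(self) -> None:
        self._template: BaseLoggerTemplate | None = None

    def set_logger(self, template: BaseLoggerTemplate) -> None:
        self._template = template

    def info(self, message: str) -> str:
        # Returns the formatted line; a real logger would emit it to a handler
        assert self._template is not None, "logger not configured"
        return self._template.format("INFO", message)


def _print_welcome() -> None:
    # The real module prints a styled ASCII banner and version line
    print("Auto-Doc Generator")


_print_welcome()                         # runs once, at import time
logger = BaseLogger()                    # module-level singleton
logger.set_logger(BaseLoggerTemplate())  # attach the formatting template
```

Because Python caches modules in `sys.modules`, the banner prints only on the first import and every consumer shares the same `logger` instance.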
Project Context: Auto Doc Generator
The provided code snippets are part of the Auto Doc Generator project, which aims to help developers generate documentation for their projects. The project utilizes a layered architecture, incorporating components such as a config reader, a manager for the documentation generation pipeline, and various modules for customizing the output.
The code is organized into several modules and submodules, including:
- `autodocgenerator.auto_runner`: contains the `config_reader` and `run_file` modules, responsible for reading configuration files and generating documentation, respectively.
- `autodocgenerator.config`: defines the `Config` and `ProjectBuildConfig` classes, which store project settings and build configurations.
- `autodocgenerator.engine`: includes the `GPTModel` and `AsyncGPTModel` classes, which interact with the Groq API for language modelling tasks.
- `autodocgenerator.factory`: contains the `DocFactory` class and various module classes (e.g., `CustomModule`, `IntroLinks`, `IntroText`) that contribute to the documentation generation process.
- `autodocgenerator.postprocessor`: includes the `Embedding` class, which handles embedding generation for the documentation.
- `autodocgenerator.preprocessor`: contains modules for preprocessing and splitting code files (not shown in the provided snippets).
- `autodocgenerator.ui`: defines the `ConsoleGitHubProgress` class, which displays progress updates during the documentation generation process.
Key Components and Responsibilities
The following components play crucial roles in the Auto Doc Generator project:
- Config Reader: reads configuration files (e.g., `autodocconfig.yml`) and populates the `Config` object with project settings.
- Manager: orchestrates the documentation generation pipeline, utilizing various modules and models to produce the final output.
- Doc Factory: instantiates and coordinates the rendering of custom modules, including intro text and links.
- GPT Model: interacts with the Groq API to generate language model outputs for the documentation.
- Embedding: generates embeddings for the documentation using the Google Gemini API.
The documentation generation process involves the following steps:
- Config Reading: the `config_reader` module reads the configuration file and populates the `Config` object.
- Manager Initialization: the `Manager` class is instantiated with the `Config` object, project path, and other necessary components (e.g., `GPTModel`, `Embedding`).
- Code File Generation: the `Manager` generates code files based on the project path and configuration settings.
- Custom Module Rendering: the `DocFactory` renders custom modules, including intro text and links, using the `Manager` instance.
- Language Model Interaction: the `GPTModel` interacts with the Groq API to generate language model outputs for the documentation.
- Embedding Generation: the `Embedding` class generates embeddings for the documentation using the Google Gemini API.
- Final Output: the `Manager` saves the generated documentation to a file (e.g., `output_doc.md`).
The following constraints are critical to the Auto Doc Generator project:
- Layered Architecture: The project follows a layered architecture, with separate components for configuration, documentation generation, and embedding generation.
- Modular Design: The project uses a modular design, with separate modules for custom documentation components, language modeling, and embedding generation.
- Configurability: The project allows for configuration settings to be loaded from a file, enabling users to customize the documentation generation process.
To further develop the Auto Doc Generator project, the following steps can be taken:
- Implement Additional Custom Modules: Develop new custom modules for specific documentation needs, such as code snippets or diagrams.
- Integrate with Other Tools: Integrate the Auto Doc Generator with other development tools, such as IDEs or version control systems.
- Enhance Configurability: Expand the configuration settings to allow for more customization options, such as output formats or styling.
Install Workflow Using Remote Scripts and a GitHub Secret
To set up the project you can run a remote installer that pulls the required files straight from the repository’s main branch.
PowerShell (Windows) – execute the command

```powershell
irm raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex
```

This invokes the PowerShell script that performs all necessary installation and configuration steps.

Bash (Linux/Unix) – run

```bash
curl -sSL raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash
```

This downloads the shell script and pipes it directly into the shell interpreter.
Both scripts expect an API key for the Groq service. To supply this key in a GitHub Actions workflow, create a repository secret named `GROCK_API_KEY` and paste the key obtained from the Groq documentation site. The workflow can then reference this secret using `${{ secrets.GROCK_API_KEY }}` so the installer can authenticate with Groq during automated runs.
Pyproject.toml – Package Metadata
Purpose
Defines the packaging, dependencies, and build configuration for the autodocgenerator project, enabling poetry or pip to install the library in an isolated environment.
Data Contract
| Section | Key | Value | Notes |
|---|---|---|---|
| `[project]` | `name` | `"autodocgenerator"` | Package identifier. |
| | `version` | `"1.0.3.5"` | Semantic versioning. |
| | `description` | `"This Project helps you to create docs for your projects"` | Short package description. |
| | `authors` | list | Maintainer contact. |
| | `license` | `"MIT"` | License identifier. |
| | `readme` | `"README.md"` | Documentation entry. |
| | `requires-python` | `">=3.11,<4.0"` | Python runtime constraint. |
| | `dependencies` | list | Runtime libraries (e.g., `groq`, `google-genai`, `rich`). |
| `[build-system]` | `requires` | `["poetry-core>=2.0.0"]` | Build backend requirement. |
| | `build-backend` | `"poetry.core.masonry.api"` | Build backend implementation. |
Key Dependencies
- LLM & Embedding: `groq`, `google-genai`, `openai`.
- I/O & Caching: `CacheControl`, `filelock`, `zstandard`.
- Logging & UI: `rich`, `rich_progress`.
- Configuration: `pyyaml`, `python-dotenv`.
- Type Checking: `pydantic`, `typing_extensions`.
Side effect: This file is read during packaging and installation; any missing or incompatible dependency will prevent the library from running.
These documents provide a concise, accurate view of the installation helper script and package configuration, aligned with the Auto‑Doc Generator architecture.
Data Contract
The following data entities are exchanged between components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `Config` | Object | Project settings | Stores project name, language, ignore files, and additional information. |
| `ProjectBuildConfig` | Object | Build settings | Stores settings for the build process, such as log level and save logs. |
| `CustomModule` | Object | Custom documentation module | Represents a custom module, such as intro text or links. |
| `GPTModel` | Object | Language model | Interacts with the Groq API for language modelling tasks. |
| `Embedding` | Object | Embedding generator | Generates embeddings for the documentation using the Google Gemini API. |
| `DocFactory` | Object | Documentation factory | Instantiates and coordinates the rendering of custom modules. |
Overview of GPT Model
The GPT Model is a core component of the Auto Doc Generator project, responsible for generating answers to user prompts using the Groq API. The model is designed to handle multiple APIs and models, allowing for flexibility and fault tolerance.
Technical Logic Flow
The technical logic flow of the GPT Model involves the following steps:
- Initialization: the GPT Model is initialized with an API key, history, and a list of models.
- Prompt Processing: the model receives a prompt, either a list of dictionaries or a single string.
- History Management: the model checks whether to use the history; if so, it retrieves it from the `History` class.
- Model Selection: the model selects a model from the list of available models; if a model fails, it tries the next one.
- API Call: the model makes an API call to the selected model using the Groq API.
- Answer Generation: the model generates an answer based on the API response.
- History Update: the model updates the history with the user's prompt and the generated answer.
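The model-selection fallback in the steps above can be sketched as follows. `ask_with_fallback` and `call_api` are illustrative names, not the library's API; the real `GPTModel` wraps the Groq SDK.

```python
class AllModelsFailed(Exception):
    """Raised when every model in the list has failed (illustrative)."""


def ask_with_fallback(call_api, models_list: list[str], prompt: str) -> str:
    """Try each model in order and return the first successful answer.

    call_api(model_name, prompt) is a stand-in for the actual Groq SDK
    call; any exception it raises counts as a model failure.
    """
    last_error: Exception | None = None
    for model_name in models_list:
        try:
            return call_api(model_name, prompt)
        except Exception as exc:
            # Remember the failure and fall through to the next model
            last_error = exc
    raise AllModelsFailed(f"all models failed; last error: {last_error!r}")
```

Keeping the fallback loop separate from the API call makes it trivial to unit-test with a fake `call_api` and to reuse for both synchronous and asynchronous wrappers.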
Data Contract
The following data entities are exchanged between the GPT Model and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `api_key` | `str` | API key | Used to authenticate with the Groq API |
| `history` | `History` | Conversation history | Stores the conversation history |
| `models_list` | `list[str]` | List of models | Available models to try in order |
| `prompt` | `list[dict[str, str]]` or `str` | Prompt | User's prompt to generate an answer for |
| `answer` | `str` | Answer | Generated answer to the user's prompt |
Critical Constraints
The following constraints are critical to the GPT Model:
- Model Availability: The model must be able to handle multiple models and APIs, allowing for flexibility and fault tolerance.
- History Management: The model must be able to manage the conversation history, including adding and retrieving entries.
- API Call: The model must be able to make API calls to the selected model using the Groq API.
Next Steps
To further develop the GPT Model, the following steps can be taken:
- Improve Model Selection: Improve the model selection logic to choose the best model based on the prompt and conversation history.
- Add More Models: Add more models to the list of available models, allowing for greater flexibility and fault tolerance.
- Enhance History Management: Enhance the history management logic to store and retrieve more context, allowing for better answer generation.
The Async GPT Model is an asynchronous version of the GPT Model, designed to handle asynchronous API calls and improve performance.
Technical Logic Flow
The technical logic flow of the Async GPT Model involves the following steps:
- Initialization: the Async GPT Model is initialized with an API key, history, and a list of models.
- Prompt Processing: the model receives a prompt, either a list of dictionaries or a single string.
- History Management: the model checks whether to use the history; if so, it retrieves it from the `History` class.
- Model Selection: the model selects a model from the list of available models; if a model fails, it tries the next one.
- API Call: the model makes an asynchronous API call to the selected model using the Groq API.
- Answer Generation: the model generates an answer based on the API response.
- History Update: the model updates the history with the user's prompt and the generated answer.
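The same fallback idea carries over to the asynchronous flow with `async`/`await`. The helper below is a hedged sketch; `ask_with_fallback_async` and `call_api` are illustrative names, not the library's API.

```python
import asyncio


class AllModelsFailed(Exception):
    """Raised when every model in the list has failed (illustrative)."""


async def ask_with_fallback_async(call_api, models_list, prompt):
    """Async variant: await each model in order until one succeeds.

    call_api is an async stand-in for the Groq SDK call; exceptions
    count as model failures and trigger fallback to the next model.
    """
    last_error = None
    for model_name in models_list:
        try:
            return await call_api(model_name, prompt)
        except Exception as exc:
            last_error = exc
    raise AllModelsFailed(f"all models failed; last error: {last_error!r}")
```

An async caller can launch several such requests concurrently with `asyncio.gather`, which is where the performance gain over the synchronous model comes from.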
Data Contract
The following data entities are exchanged between the Async GPT Model and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `api_key` | `str` | API key | Used to authenticate with the Groq API |
| `history` | `History` | Conversation history | Stores the conversation history |
| `models_list` | `list[str]` | List of models | Available models to try in order |
| `prompt` | `list[dict[str, str]]` or `str` | Prompt | User's prompt to generate an answer for |
| `answer` | `str` | Answer | Generated answer to the user's prompt |
Critical Constraints
The following constraints are critical to the Async GPT Model:
- Model Availability: The model must be able to handle multiple models and APIs, allowing for flexibility and fault tolerance.
- History Management: The model must be able to manage the conversation history, including adding and retrieving entries.
- API Call: The model must be able to make asynchronous API calls to the selected model using the Groq API.
Next Steps
To further develop the Async GPT Model, the following steps can be taken:
- Improve Model Selection: Improve the model selection logic to choose the best model based on the prompt and conversation history.
- Add More Models: Add more models to the list of available models, allowing for greater flexibility and fault tolerance.
- Enhance History Management: Enhance the history management logic to store and retrieve more context, allowing for better answer generation.
The module documentation provides an overview of the various modules used in the Auto Doc Generator project.
The BaseModule is an abstract base class that defines the interface for all modules.
Technical Logic Flow
The technical logic flow of the BaseModule involves the following steps:
- Initialization: the `BaseModule` is initialized with no parameters.
- Generation: the `generate` method is called with an `info` dictionary and a `model` object.
- Result: the `generate` method returns a result, which is a string or a dictionary.
Data Contract
The following data entities are exchanged between the BaseModule and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `result` | `str` or `dict` | Result | Result of the generation process |
The DocFactory is a class that manages the generation of documentation using multiple modules.
Technical Logic Flow
The technical logic flow of the DocFactory involves the following steps:
- Initialization: the `DocFactory` is initialized with a list of modules and a boolean flag `with_splited`.
- Generation: the `generate_doc` method is called with an `info` dictionary, a `model` object, and a `progress` object.
- Result: the `generate_doc` method returns a `DocHeadSchema` object.
Data Contract
The following data entities are exchanged between the DocFactory and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `progress` | `BaseProgress` | Progress | Progress object used to track the generation process |
| `result` | `DocHeadSchema` | Result | Result of the generation process |
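The factory pattern described above boils down to iterating over modules and collecting their rendered fragments in order. `SimpleDocFactory` below is an illustrative sketch, not the library's `DocFactory` (which also handles `with_splited` and progress reporting).

```python
class SimpleDocFactory:
    """Illustrative sketch of the factory idea: call each module's
    generate() in a defined sequence and collect named fragments."""

    def __init__(self, modules):
        # modules: list of (name, module) pairs, rendered in list order
        self.modules = modules

    def generate_doc(self, info: dict, model) -> dict[str, str]:
        parts: dict[str, str] = {}
        for name, module in self.modules:
            parts[name] = module.generate(info, model)
        return parts
```

Because module order is just list order, reordering sections or injecting a custom module means editing the list passed to the constructor.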
DocContent, DocHeadSchema, DocInfoSchema – In‑Memory Document Model
Purpose
Represent generated documentation as structured data, enabling embedding association, ordering, and merging.
| Class | Role | Key Methods |
|---|---|---|
| `DocContent` | Leaf node | `init_embedding(Embedding)` |
| `DocHeadSchema` | Container of parts | `add_parts(name, DocContent)`, `get_full_doc()`, `__add__` |
| `DocInfoSchema` | Root schema | Holds `global_info`, `code_mix`, and `doc` |
DocContent
- `content: str` – raw Markdown snippet.
- `embedding_vector: Any | None` – optional vector produced by `Embedding.get_vector`.
- `init_embedding` – stores the vector for future similarity searches.
DocHeadSchema
- Maintains `content_orders` (preserved order) and `parts` (name → `DocContent`).
- `add_parts` ensures unique names by suffixing incremented integers.
- `get_full_doc` concatenates parts in order using a specified separator.
- `__add__` merges another `DocHeadSchema`, appending its ordered parts.
DocInfoSchema
- Holds a consolidated view: project metadata, the raw code mix, and the final `DocHeadSchema`.
Usage Context
- The `Manager` builds a `DocInfoSchema` during pipeline execution.
- After LLM generation, each part is wrapped in a `DocContent`, added to `DocHeadSchema`, and optionally embedded.
Note: All classes reside under `autodocgenerator.schema.doc_schema`. No external validation beyond Pydantic's `BaseModel` is performed; the code relies on the correctness of upstream utilities (`Embedding`, `Model`, `ProjectSettings`).
The document is organized into several top‑level fields that control different aspects of the generator:
- project-name – short text identifying the project.
- language – language code used for the generated documentation.
- ignore-files – list of glob patterns telling the tool which files or directories to exclude. Typical entries include build outputs, caches, virtual environments, IDE settings, binary data, and log directories.
- build-settings – controls runtime behaviour:
- save-logs – flag to keep the internal logs of the generation process.
- log-level – numeric level controlling verbosity of output.
- structure-settings – dictates how the final document is assembled:
- include-intro-links – whether a link to the introductory section is added.
- include-intro-text – whether the introductory text itself is inserted.
- include-order – whether a section order is generated.
- use-global-file – whether a single global file is used for all content.
- max-doc-part-size – maximum size in characters for each generated document fragment.
- project-additional-info – free‑text field for a high‑level description or tagline of the project.
- custom-descriptions – a sequence of strings that can be used to override or supplement the autogenerated text. These may reference installation scripts and instructions for using the manager class or configuring the generator.
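Once parsed from YAML, a configuration following the keys above might look like the dictionary below. All values are illustrative examples, not the generator's defaults, and `has_expected_keys` is a made-up helper for demonstration.

```python
# Illustrative parsed form of an autodocconfig.yml using the documented
# keys (example values, not the generator's defaults).
example_config = {
    "project-name": "Auto-Doc Generator",
    "language": "en",
    "ignore-files": ["*.pyc", ".venv/*", "build/*", "logs/*"],
    "build-settings": {
        "save-logs": True,
        "log-level": 2,
    },
    "structure-settings": {
        "include-intro-links": True,
        "include-intro-text": True,
        "include-order": True,
        "use-global-file": False,
        "max-doc-part-size": 4000,
    },
    "project-additional-info": "LLM-powered documentation generator.",
    "custom-descriptions": ["Describe the installation scripts."],
}


def has_expected_keys(config: dict) -> bool:
    """Hypothetical sanity check: only documented top-level keys appear.

    Every key is optional, so we only verify nothing unknown is present.
    """
    allowed = {
        "project-name", "language", "ignore-files", "build-settings",
        "structure-settings", "project-additional-info",
        "custom-descriptions",
    }
    return set(config) <= allowed
```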
Each of these sections is optional, but the generator expects the keys in the form shown in order to parse the settings correctly. Adjust the patterns in ignore-files and the booleans in structure-settings to tailor the output to the repository's layout and documentation style.
Custom Module
The CustomModule is a class that generates custom documentation based on a description.
Technical Logic Flow
The technical logic flow of the CustomModule involves the following steps:
- Initialization: the `CustomModule` is initialized with a description.
- Generation: the `generate` method is called with an `info` dictionary and a `model` object.
- Result: the `generate` method returns a result, which is a string.
Data Contract
The following data entities are exchanged between the CustomModule and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `result` | `str` | Result | Result of the generation process |
The CustomModuleWithOutContext is a class that generates custom documentation without context.
Technical Logic Flow
The technical logic flow of the CustomModuleWithOutContext involves the following steps:
- Initialization: the `CustomModuleWithOutContext` is initialized with a description.
- Generation: the `generate` method is called with an `info` dictionary and a `model` object.
- Result: the `generate` method returns a result, which is a string.
Data Contract
The following data entities are exchanged between the CustomModuleWithOutContext and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `result` | `str` | Result | Result of the generation process |
autodocgenerator/postprocessor/custom_intro.py
Purpose
Collects and enriches introductory documentation fragments.
The module contains helper functions that
- extract anchor links from a markdown string,
- generate an introduction section via an LLM,
- produce link lists for a navigation header,
- create custom descriptions for arbitrary code snippets.
All interactions with the LLM are performed through the Model interface passed to each helper.
| Function | Inputs | Outputs | Notes |
|---|---|---|---|
| `get_all_html_links(data: str) -> list[str]` | `data`: markdown document | `links`: list of `#anchor` strings | Uses a regex on `<a name="…">` to capture anchors longer than five characters. Logs extraction steps via `BaseLogger`. |
| `get_links_intro(links: list[str], model: Model, language: str = "en") -> str` | `links`: list of anchor strings, `model`: LLM wrapper, `language`: optional | `intro_links`: generated markdown section | Builds a 3‑message prompt and sends it to `model.get_answer_without_history`. Logs before and after. |
| `get_introdaction(global_data: str, model: Model, language: str = "en") -> str` | `global_data`: project‑wide summary, `model`, `language` | `intro`: introductory markdown | Prompt contains the `BASE_INTRO_CREATE` template. |
| `generete_custom_discription(splited_data: str, model: Model, custom_description: str, language: str = "en") -> str` | `splited_data`: iterable of code or text chunks, `model`, `custom_description`, `language` | `result`: description or empty string | Iterates over chunks; stops once the result no longer contains "!noinfo" or "No information found". |
| `generete_custom_discription_without(model: Model, custom_description: str, language: str = "en") -> str` | `model`, `custom_description`, `language` | `result`: tagged description | Enforces a strict tag format at the beginning of the LLM response. |
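The anchor-extraction step can be sketched with a regex, following the behaviour described for `get_all_html_links` (the regex details and the `extract_anchor_links` name are assumptions; the real function also logs via `BaseLogger`).

```python
import re


def extract_anchor_links(data: str) -> list[str]:
    """Find <a name="..."> tags and return '#name' anchor links,
    keeping only names longer than five characters.

    Simplified re-implementation of the documented behaviour of
    get_all_html_links; the library's regex may differ.
    """
    names = re.findall(r'<a name="([^"]+)">', data)
    return [f"#{name}" for name in names if len(name) > 5]
```

These anchors become the raw material for the navigation section that `get_links_intro` asks the LLM to render.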
Internal Prompt Flow
```
system → "use language X"
system → specific instruction template
user   → data or task text
```
The LLM response is returned directly; no post‑processing occurs in this module.
```python
from autodocgenerator.manage import Manager
from autodocgenerator.config import Config
from autodocgenerator.engine.models import Model
from autodocgenerator.postprocessor.embedding import Embedding
from autodocgenerator.ui.progress_base import BaseProgress
# DocFactory lives in the factory layer (exact import path assumed
# from the package layout described above)
from autodocgenerator.factory import DocFactory

# Create a configuration object
config = Config()

# Create an LLM model
llm_model = Model()

# Create an embedding model
embedding_model = Embedding()

# Create a progress bar
progress_bar = BaseProgress()

# Create a manager object
manager = Manager(
    project_directory="path/to/project",
    config=config,
    llm_model=llm_model,
    embedding_model=embedding_model,
    progress_bar=progress_bar,
)

# Generate the code mix file
manager.generate_code_file()

# Generate the global information file
manager.generate_global_info()

# Generate the documentation parts
manager.generete_doc_parts()

# Generate the documentation using the factory
manager.factory_generate_doc(DocFactory())

# Create the embedding layer
manager.create_embedding_layer()

# Order the documentation
manager.order_doc()

# Save the documentation
manager.save()
```
Notes
- The `Manager` class is responsible for orchestrating the entire documentation generation process.
- It takes several parameters, including the project directory, configuration, LLM model, embedding model, and progress bar.
- It exposes a method for each pipeline step: generating the code mix file, the global information file, the documentation parts, and the factory‑rendered documentation.
- It stores state in attributes such as the documentation information, configuration, project directory, progress bar, LLM model, and embedding model.
- It exchanges data entities with other components: the code mix, global information, documentation parts, factory documentation, and embeddings.
Interaction Flow in the Pipeline
- Anchor Extraction – `manager.generete_doc_parts()` produces `doc_parts`; `get_all_html_links` pulls anchors and feeds them to `get_links_intro`.
- Introduction Generation – `global_info` is supplied to `get_introdaction`; custom sections are rendered via `generete_custom_discription`.
- Vector Embedding – `manager.create_embedding_layer()` creates an `Embedding` instance; `get_vector` is called per doc chunk to build semantic indices.
- Sorting – `sort_vectors` is used downstream in post‑processing to order sections by similarity to a root vector.
All LLM prompts are built from constants in engine.config.config and executed through the Model interface, ensuring a single point for API interaction. Logging is performed through BaseLogger for traceability.
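The similarity-based ordering in the Sorting step can be sketched as follows. `cosine_similarity` and `sort_by_similarity` are illustrative helpers (the real `sort_vectors` operates on the 768-dim Gemini embeddings and may use a different metric).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sort_by_similarity(root: list[float],
                       parts: list[tuple[str, list[float]]]):
    """Order (name, vector) document parts by descending similarity
    to a root vector -- a sketch of the sort_vectors idea."""
    return sorted(parts,
                  key=lambda p: cosine_similarity(root, p[1]),
                  reverse=True)
```

Sections whose embeddings sit closest to the root (e.g., the project overview) surface first, giving the final document a semantically coherent reading order.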
Intro Links
IntroLinks is a class that generates the introduction links section.
Technical Logic Flow
The technical logic flow of the IntroLinks involves the following steps:
- Generation: the `generate` method is called with an `info` dictionary and a `model` object.
- Result: the `generate` method returns a result, which is a string.
Data Contract
The following data entities are exchanged between the IntroLinks and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `result` | `str` | Result | Result of the generation process |
IntroText is a class that generates the introduction text section.
Technical Logic Flow
The technical logic flow of the IntroText involves the following steps:
- Generation: the `generate` method is called with an `info` dictionary and a `model` object.
- Result: the `generate` method returns a result, which is a string.
Data Contract
The following data entities are exchanged between the IntroText and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `info` | `dict` | Information | Dictionary containing information about the project |
| `model` | `Model` | Model | Model object used for generation |
| `result` | `str` | Result | Result of the generation process |
The Manager class is responsible for orchestrating the entire documentation generation process. It takes in several parameters, including the project directory, configuration, LLM model, embedding model, and progress bar.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_directory` | str | Project directory | The directory where the project is located |
| `config` | Config | Configuration | The configuration object containing project settings |
| `llm_model` | Model | LLM model | The LLM model used for generation |
| `embedding_model` | Embedding | Embedding model | The embedding model used for generating embeddings |
| `progress_bar` | BaseProgress | Progress bar | The progress bar used to track the generation process |
The Manager class has several methods that are used to perform different tasks:
- `generate_code_file`: generates the code mix file.
- `generate_global_info`: generates the global information file.
- `generete_doc_parts`: generates the documentation parts.
- `factory_generate_doc`: generates the documentation using the factory.
- `create_embedding_layer`: creates the embedding layer.
- `order_doc`: orders the documentation.
- `clear_cache`: clears the cache.
- `save`: saves the documentation.
The Manager class has several attributes that are used to store information:
- `doc_info`: stores the documentation information.
- `config`: stores the configuration.
- `project_directory`: stores the project directory.
- `progress_bar`: stores the progress bar.
- `llm_model`: stores the LLM model.
- `embedding_model`: stores the embedding model.
- `logger`: stores the logger.
The following data entities are exchanged between the Manager and other components:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `code_mix` | str | Code mix | The code mix generated by the CodeMix class |
| `global_info` | str | Global information | The global information generated by the compress_to_one function |
| `doc_parts` | str | Documentation parts | The documentation parts generated by the gen_doc_parts function |
| `factory_doc` | str | Factory documentation | The documentation generated by the factory |
| `embedding` | Embedding | Embedding | The embedding generated by the Embedding class |
Manager Data Contract – Relevance to Custom Intro & Embedding
The Manager orchestration layer passes the following entities to the functions described above:
| Entity | Type | Role | Notes |
|---|---|---|---|
| `code_mix` | str | Source mix | Handed to generete_custom_discription for content extraction. |
| `global_info` | str | Project overview | Used with get_introdaction. |
| `doc_parts` | str | Chunked docs | Input for get_all_html_links and get_links_intro. |
| `factory_doc` | str | Rendered intro blocks | May contain anchor tags extracted by get_all_html_links. |
| `embedding` | Embedding | Vector provider | Created by Manager for sort_vectors usage. |
Warning
The `generete_custom_discription` function assumes that `splited_data` is iterable over chunk strings; passing a single string will iterate character by character, causing erroneous prompts. Ensure `splited_data` is a list of meaningful code or documentation fragments.
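The pitfall above follows from how Python iterates strings versus lists. A minimal illustration (`iterate_chunks` is a hypothetical stand-in, not a library function):

```python
# Hypothetical illustration of the splited_data pitfall: iterating a plain
# string yields single characters, while a list yields whole fragments.
def iterate_chunks(splited_data):
    """Return the items an iteration over splited_data would produce."""
    return [chunk for chunk in splited_data]

wrong = iterate_chunks("def foo(): ...")                      # a bare string
right = iterate_chunks(["def foo(): ...", "class Bar: ..."])  # list of fragments
```

With the string, each "chunk" sent to the LLM would be a single character; with the list, each chunk is a meaningful fragment.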
The Manager class is the central orchestrator for generating documentation.
Its constructor requires a project root path and a set of dependencies – a language‑model instance (llm_model), an embedding model (embedding_model), a progress bar object, and a Config instance that holds ignore patterns, language, project name, and additional metadata.
```python
# Example: basic Manager instantiation and workflow
from autodocgenerator.manage import Manager
from autodocgenerator.engine.models.gpt_model import GPTModel
from autodocgenerator.postprocessor.embedding import Embedding
from autodocgenerator.ui.progress_base import ConsoleGtiHubProgress

# create required sub-systems
llm = GPTModel(GROQ_API_KEYS, use_random=False)
embedder = Embedding(GOOGLE_EMBEDDING_API_KEY)

# project path and configuration supplied elsewhere
project_path = "./myproject"
config = ...  # Config instance produced by read_config

manager = Manager(
    project_path,
    config=config,
    llm_model=llm,
    embedding_model=embedder,
    progress_bar=ConsoleGtiHubProgress()
)
```
After the object is created, the following public methods can be called in the order shown below (the order is dictated by the typical documentation‑generation pipeline):
| Method | Purpose | Key parameters |
|---|---|---|
| `generate_code_file()` | Walks the project, collects source files, and builds an internal representation of the code structure. | None |
| `generate_global_info(compress_power=4)` | Aggregates global project information (e.g., README, package metadata) and optionally compresses the data using the supplied power. | `compress_power` – integer compression factor |
| `generete_doc_parts(max_symbols, with_global_file)` | Splits large documents into parts not exceeding max_symbols. If with_global_file is true, global information is included. | `max_symbols` – int, `with_global_file` – bool |
| `factory_generate_doc(factory, to_start=False, with_splited=False)` | Uses a DocFactory instance to produce documentation sections from custom or built-in modules. The to_start flag indicates whether the generated content should be prepended to the document; with_splited controls whether content is split. | `factory` – DocFactory instance, `to_start` – bool, `with_splited` – bool |
| `order_doc()` | Reorders the document sections according to a predefined or custom sequence. | None |
| `create_embedding_layer()` | Builds an embedding representation for the entire document, useful for semantic search or summarization. | None |
| `clear_cache()` | Removes temporary files and caches created during processing. | None |
| `save()` | Persists the generated documentation to disk (e.g., writes a markdown file). | None |
Once all transformations are complete, the full rendered document is accessible via:
```python
full_text = manager.doc_info.doc.get_full_doc()
```
Typical usage pattern (as shown in run_file.py):
```python
manager.generate_code_file()
if structure_settings.use_global_file:
    manager.generate_global_info(compress_power=4)
manager.generete_doc_parts(
    max_symbols=structure_settings.max_doc_part_size,
    with_global_file=structure_settings.use_global_file
)
manager.factory_generate_doc(DocFactory(*custom_modules))
if structure_settings.include_order:
    manager.order_doc()

# Add introductory sections if requested
extra_modules = []
if structure_settings.include_intro_text:
    extra_modules.append(IntroText())
if structure_settings.include_intro_links:
    extra_modules.append(IntroLinks())
manager.factory_generate_doc(
    DocFactory(*extra_modules, with_splited=False),
    to_start=True
)

manager.create_embedding_layer()
manager.clear_cache()
manager.save()

# retrieve the final document
document = manager.doc_info.doc.get_full_doc()
```
This workflow shows how the Manager interacts with custom modules, global file handling, ordering, and post‑processing steps to produce the final documentation output.
## Pre‑processor: CodeMix
Purpose – CodeMix compiles a repository’s file tree and content into a single markdown string suitable for LLM ingestion.
| Attribute | Type | Role | Notes |
|---|---|---|---|
| `root_dir` | Path | Base directory | Resolved absolute path. |
| `ignore_patterns` | list[str] | Wildcards to skip | Default empty; can be overridden. |
| `logger` | BaseLogger | Logging instance | Reports ignored paths. |
Methods
| Method | Parameters | Returns | Notes |
|---|---|---|---|
| `__init__(self, root_dir=".", ignore_patterns=None)` | root_dir, ignore_patterns | – | Stores state; creates a BaseLogger. |
| `should_ignore(self, path: str) -> bool` | path | bool | Determines if a file/dir matches any ignore pattern. |
| `build_repo_content(self) -> str` | – | str | Concatenates a tree view and file contents. |
Logic
- Compute `relative_path` from `root_dir`.
- For each `pattern` in `ignore_patterns`, check `fnmatch` against the full relative string, against the file/directory name, and against any part of the path.
- Return `True` on the first match.
Logic Flow
- Start with `"Repository Structure:"`.
- Recursively walk `root_dir`, sorted by name.
- For each item that is not ignored, append an indented name (`/` for directories).
- Append a separator of 20 `"="` characters.
- Second traversal: for every non-ignored file, append a `<file path="...">` tag, the file's UTF-8 content, and a newline. If file reading fails, capture the exception string.
- Join all lines with `"\n"` and return.
Side Effect – The method logs every ignored path at level 1.
`ignore_list` – a module-level list of glob patterns and directory names to exclude, such as `*.pyc`, `__pycache__`, `.git`, `venv`, etc. It is passed to CodeMix when instantiated.
Instantiates CodeMix on a hard-coded project directory and prints the Russian confirmation message "Файл успешно создан!" ("The file was created successfully!").
## Data Contract – CodeMix ↔ Manager
| Entity | Type | Role | Notes |
|---|---|---|---|
| `repo_content` | str | Repository dump | Returned by build_repo_content. |
| `ignore_patterns` | list[str] | Exclusion rules | Influences traversal and logging. |
| `root_dir` | Path | Repository root | Determines relative paths. |
The Manager consumes `repo_content` as the initial source mix for downstream compression and LLM prompting. There is no direct interaction with `ignore_list`; it is purely configuration.
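The three-way matching rule described for `should_ignore` can be sketched as follows. This is a standalone approximation of the behaviour described above, not the library's actual code:

```python
import fnmatch
from pathlib import PurePosixPath

def should_ignore(relative_path: str, ignore_patterns: list[str]) -> bool:
    """Sketch of the described rules: a pattern may match the full relative
    path, the final file/directory name, or any single path component."""
    parts = PurePosixPath(relative_path).parts
    for pattern in ignore_patterns:
        if fnmatch.fnmatch(relative_path, pattern):
            return True                              # full relative string
        if fnmatch.fnmatch(parts[-1], pattern):
            return True                              # file/directory name
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True                              # any part of the path
    return False
```

For example, the pattern `__pycache__` matches `src/__pycache__/mod.py` via the path-component check even though neither the full path nor the final name equals the pattern.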
settings.py – Project Prompt Builder
autodocgenerator/postprocessor/embedding.py
Purpose
Provide vector embeddings for document parts using Google Gemini, and utility helpers to sort by semantic distance.
## Post‑processor: Sorting Logic
Purpose – The sorting module is responsible for extracting anchor links from a markdown document, dividing the text into named chunks, and re‑ordering those chunks using an LLM.
The script can be executed standalone, opening a local README and printing the anchor mapping produced by split_text_by_anchors.
- `extract_links_from_start` feeds `split_text_by_anchors`, which yields a dictionary of anchored sections for the post-processor.
- `get_order` receives a `Model` instance from the Manager and re-orders the extracted anchor titles; the result is used for semantic re-arrangement in later stages.
- `CodeMix.build_repo_content` supplies the `Manager` with a single string of repository structure and file bodies, which becomes the `code_mix` passed to `generete_custom_discription`.
Key Dependencies
- `engine.models.model.Model` – LLM interface for ordering.
- `ui.logging.BaseLogger` – unified logging for both modules.
- `re`, `fnmatch`, `pathlib`, `os` – standard-library utilities.
All logic is strictly derived from the provided snippets; no external assumptions have been made.
### extract_links_from_start(chunks)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `chunks` | list[str] | Chunk collection | Input list of markdown sections. |
| `links` | list[str] | Anchor URLs | Captured from leading `<a name…>` tags. |
| `have_to_del_first` | bool | Indicates if the first chunk is a placeholder | Set true when no anchor is found at the start of a chunk. |
Logic Flow
- Iterate over each `chunk`.
- Strip whitespace and apply the regex `^<a name=["']?(.*?)["']?</a>` to find a leading anchor.
- If an anchor name longer than 5 characters is found, append `#anchor` to `links`.
- If no anchor is found for a chunk, set `have_to_del_first` to true.
- Return `(links, have_to_del_first)`.
Warning – The function assumes that anchors, if present, are strictly at the beginning of the chunk. Any leading markdown or whitespace may cause a false negative.
### split_text_by_anchors(text: str)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `text` | str | Full markdown document | Input to split. |
| `chunks` | list[str] | Preliminary split by anchor boundaries | Uses a look-ahead regex. |
| `result_chanks` | list[str] | Stripped, non-empty chunks | Result of the split. |
| `all_links` | list[str] | Anchor URLs extracted | From extract_links_from_start. |
| `result` | dict[str, str] | Mapping of anchor → chunk | Returned value. |
Logic Flow
- Split `text` on the pattern that precedes an anchor (`(?=<a name…>)`).
- Strip each fragment and discard empty ones.
- Retrieve the links and flag via `extract_links_from_start`.
- If the first anchor occurs far into the file (`start_link_index > 10`) or the flag is set, remove the first chunk (assumed to be a placeholder).
- Verify that the number of links matches the number of chunks; otherwise raise an `Exception`.
- Build a dictionary mapping each link to its corresponding chunk and return it.
Critical – The function throws an exception if the anchor count diverges from the chunk count, guarding against malformed documents.
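The look-ahead split at the heart of this flow can be sketched in a few lines. This is an illustrative reduction (`split_by_anchors` is a stand-in name), assuming anchors of the form `<a name=…>`:

```python
import re

def split_by_anchors(text: str) -> list[str]:
    """Split a document at anchor boundaries using a look-ahead, so each
    resulting chunk keeps its own leading <a name=…> tag."""
    chunks = re.split(r'(?=<a name=)', text)
    return [c.strip() for c in chunks if c.strip()]
```

Because the pattern is a look-ahead, the anchor tag itself is not consumed; it stays at the head of the chunk, where `extract_links_from_start` expects to find it.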
### get_order(model: Model, chanks: list[str]) -> list
| Entity | Type | Role | Notes |
|---|---|---|---|
| `model` | Model | LLM wrapper | Imported from engine.models.model. |
| `chanks` | list[str] | Anchor titles | Titles to be ordered. |
| `result` | str | LLM response | Raw comma-separated titles. |
| `new_result` | list[str] | Ordered list | Stripped titles. |
Logic Flow
- Log "Start ordering" using `BaseLogger`.
- Build a user prompt that requests a semantic ordering of the titles.
- Call `model.get_answer_without_history(prompt)`.
- Split the response on commas, strip whitespace, and log the final list.
- Return the ordered list.
Warning – The prompt string is hard‑coded; any changes to the LLM template require editing this file.
| Function | Inputs | Outputs | Notes |
|---|---|---|---|
| `bubble_sort_by_dist(arr: list) -> list` | arr: list of (key, distance) tuples | Sorted list by ascending distance | Implements an O(n²) bubble sort, used only for small result sets. |
| `get_len_btw_vectors(vector1, vector2) -> float` | Two NumPy arrays | Euclidean distance | Calls np.linalg.norm(vector1, vector2). |
| `sort_vectors(root_vector, other: dict[str, Any]) -> list[str]` | root_vector: reference vector, other: mapping id → vector | List of keys sorted by proximity to root_vector | Builds a distance list, bubble-sorts it, and extracts keys. |
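The distance-sort combination can be sketched in pure Python. The library uses NumPy; here `math.dist` computes the same Euclidean distance, and the bubble sort mirrors `bubble_sort_by_dist` for small result sets:

```python
import math

def sort_vectors(root_vector, other: dict) -> list[str]:
    """Sketch: order keys of `other` by Euclidean distance to root_vector."""
    distances = [(key, math.dist(root_vector, vec)) for key, vec in other.items()]
    # O(n²) bubble sort by ascending distance, as described above
    for i in range(len(distances)):
        for j in range(len(distances) - i - 1):
            if distances[j][1] > distances[j + 1][1]:
                distances[j], distances[j + 1] = distances[j + 1], distances[j]
    return [key for key, _ in distances]
```

The nearest section ends up first, which is what the post-processor relies on when re-ordering document parts around a root vector.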
| Attribute | Type | Role | Notes |
|---|---|---|---|
| `client` | genai.Client | Google GenAI client | Initialized with an API key. |
Methods
| Method | Parameters | Returns | Notes |
|---|---|---|---|
| `__init__(api_key: str)` | api_key: Gemini API key | – | Stores the client instance. |
| `get_vector(prompt: str)` | prompt: text to embed | np.ndarray of shape (768,) | Calls self.client.models.embed_content with model gemini-embedding-2-preview. Raises if embeddings are missing. |
Usage Pattern
```python
embed = Embedding(api_key="YOUR_KEY")
vec = embed.get_vector("sample document text")
```
compressor.py – Text Compression Pipeline
compress(data: str, project_settings: ProjectSettings, model: Model, compress_power) -> str
Purpose – Off‑load a single large text chunk to the LLM, requesting a compressed summary of the specified power.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | str | Raw source chunk | Input to the LLM |
| `project_settings` | ProjectSettings | System-level prompt builder | Supplies project_settings.prompt |
| `model` | Model | LLM client | Must expose get_answer_without_history |
| `compress_power` | int | Compression factor | Influences the base prompt length |
Logic
- Build a three-step prompt list: the system instruction from `project_settings.prompt`, a base compression hint `get_BASE_COMPRESS_TEXT(len(data), compress_power)`, then the raw `data` as user content.
- Call `model.get_answer_without_history(prompt=prompt)` – no history is retained for this operation.
- Return the LLM answer as a string.
compress_and_compare(data: list, model: Model, project_settings: ProjectSettings, compress_power: int = 4, progress_bar: BaseProgress = BaseProgress()) -> list
Purpose – Batch‑compress a list of text fragments, concatenating every compress_power items into one compressed block while tracking progress.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | list[str] | Collection of fragments | Each element is compressed independently |
| `model` | Model | LLM client | Same as in compress |
| `project_settings` | ProjectSettings | Prompt source | |
| `compress_power` | int | Chunk grouping size | Defaults to 4 |
| `progress_bar` | BaseProgress | UI progress helper | Instantiated with a default if not supplied |
Logic
- Pre-allocate `compress_and_compare_data` with a size equal to `ceil(len(data)/compress_power)`.
- Create a sub-task in `progress_bar` titled "Compare all files".
- For each element `el` in `data`:
  - Determine its destination index `curr_index = i // compress_power`.
  - Append the compressed result of `el` plus a newline to the corresponding slot.
  - Update progress.
- Remove the sub-task after iteration and return the list of compressed blocks.
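The destination-index bookkeeping in this loop can be sketched as follows. The real function calls the LLM via `compress`; here a caller-supplied `summarise` stand-in keeps the sketch runnable:

```python
import math

def group_compress(data: list[str], compress_power: int, summarise) -> list[str]:
    """Sketch: pack every `compress_power` fragments into one block, each
    fragment summarised independently and joined by newlines."""
    slots = [""] * math.ceil(len(data) / compress_power)
    for i, el in enumerate(data):
        curr_index = i // compress_power      # destination block for this fragment
        slots[curr_index] += summarise(el) + "\n"
    return slots
```

With five fragments and `compress_power=2`, the result has `ceil(5/2) = 3` blocks: two pairs and a final singleton.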
compress_to_one(data: list, model: Model, project_settings: ProjectSettings, compress_power: int = 4, progress_bar: BaseProgress = BaseProgress())
Purpose – Iteratively reduce a list of fragments into a single string by repeatedly applying compress_and_compare.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | list[str] | Initial fragments | Mutated on each iteration |
| `model` | Model | LLM client | |
| `project_settings` | ProjectSettings | Prompt source | |
| `compress_power` | int | Base grouping factor | If remaining items < compress_power+1, the power is lowered to 2 |
| `progress_bar` | BaseProgress | Progress indicator | |
Logic
- Loop while `len(data) > 1`:
  - If the current list length is less than `compress_power + 1`, set `new_compress_power = 2` to force the final merge.
  - Call `compress_and_compare` with the current `data` and `new_compress_power`.
  - Replace `data` with the returned list.
- Return the lone string once only one item remains.
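The control flow of this iterative reduction can be sketched with the LLM step replaced by a plain join, so the loop and the power-lowering rule are visible:

```python
def reduce_to_one(data: list[str], compress_power: int = 4) -> str:
    """Sketch of compress_to_one's loop: repeatedly merge groups of
    fragments until a single string remains."""
    while len(data) > 1:
        # drop to groups of 2 when too few items remain, forcing a final merge
        power = compress_power if len(data) >= compress_power + 1 else 2
        merged = []
        for i in range(0, len(data), power):
            merged.append(" ".join(data[i:i + power]))  # stand-in for LLM compression
        data = merged
    return data[0]
```

Starting from five fragments with `compress_power=4`, the first pass yields two blocks, the lowered power of 2 then merges them into one.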
spliter.py – Text Partitioning
split_data – Text Partitioning
Purpose
Break a single source string into a list of smaller chunks that respect an upper symbol limit.
The algorithm first trims overly long fragments by repeatedly halving any piece that exceeds max_symbols * 1.5.
After that, it greedily packs fragments into split_objects ensuring each does not grow beyond max_symbols * 1.25.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | str | Raw source code | The text to be partitioned |
| `max_symbols` | int | Size threshold | Maximum allowed length for a chunk |
| `split_objects` | list[str] | Resulting chunks | Returned to the caller |
| `logger` | BaseLogger | Diagnostics | Emits split progress |
Step‑by‑Step Flow
- Initial split – `splited_by_files = data.split("\n")`.
- Recursive halving – while any fragment exceeds 150 % of `max_symbols`, it is split in half and re-inserted.
- Packing – fragments are appended to the current chunk until adding one would exceed 125 % of `max_symbols`; then a new chunk is started.
- Logging – reports the number of parts produced.
Critical Assumptions
The algorithm assumes line‑based splitting is a suitable approximation for logical code boundaries.
No external token counting is performed; length is measured in raw characters.
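The two-phase strategy (recursive halving, then greedy packing) can be sketched as a runnable approximation using the 150 %/125 % thresholds described above; the library's actual implementation may differ in details:

```python
def split_data(data: str, max_symbols: int) -> list[str]:
    """Sketch: halve oversized fragments, then greedily pack under the limit."""
    fragments = data.split("\n")
    # Phase 1: recursively halve any fragment above 150 % of the limit
    i = 0
    while i < len(fragments):
        frag = fragments[i]
        if len(frag) > max_symbols * 1.5:
            mid = len(frag) // 2
            fragments[i:i + 1] = [frag[:mid], frag[mid:]]   # re-insert both halves
        else:
            i += 1
    # Phase 2: greedily pack fragments up to 125 % of the limit
    chunks, current = [], ""
    for frag in fragments:
        if current and len(current) + len(frag) > max_symbols * 1.25:
            chunks.append(current)
            current = ""
        current += frag + "\n"
    if current:
        chunks.append(current)
    return chunks
```

Lengths are raw character counts, matching the stated assumption that no token counting is performed.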
write_docs_by_parts – LLM Chunk Documentation
Purpose
Send a single code fragment (part) to the LLM and receive its markdown documentation, optionally attaching contextual information from preceding or global sections.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `part` | str | Chunk to document | The text produced by split_data |
| `model` | Model | LLM client | Uses get_answer_without_history |
| `project_settings` | ProjectSettings | Prompt base | Supplies the prompt property |
| `prev_info` | str \| None | Tail of previous docs | |
| `language` | str | Localization | Prepends a language instruction |
| `global_info` | str \| None | Project-wide context | |
| `logger` | BaseLogger | Logging | Records start/end and output length |
Logic
- Assemble a system prompt hierarchy: the language directive, project meta-information (`project_settings.prompt`), the base template `BASE_PART_COMPLITE_TEXT`, and optional global relations or a prior-documentation snippet.
- Append the user message containing the code `part`.
- Call `model.get_answer_without_history`.
- Strip leading and trailing Markdown fences (```` ``` ````) if present.
- Return the cleaned answer string.
Warning
The function mutates `answer` only when fence delimiters are detected. If the LLM returns a string without fences, it is returned unchanged.
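The fence-stripping step can be sketched as a small helper (`strip_fences` is an illustrative name for the logic described, not a library function):

```python
def strip_fences(answer: str) -> str:
    """Remove a leading/trailing ``` fence pair if the LLM wrapped its
    answer; otherwise return the answer unchanged."""
    lines = answer.strip().splitlines()
    if lines and lines[0].startswith("```"):
        lines = lines[1:]
    if lines and lines[-1].startswith("```"):
        lines = lines[:-1]
    return "\n".join(lines)
```

An answer such as ```` ```markdown … ``` ```` is unwrapped to its body; a bare answer passes through untouched.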
gen_doc_parts – Full‑Project Documentation Assembly
Purpose
Coordinate splitting and LLM generation for an entire source mix, producing a contiguous markdown document.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `full_code_mix` | str | Aggregated source | Input from the pre-processor |
| `max_symbols` | int | Chunk size | Passed to split_data |
| `model` | Model | LLM client | Shared with write_docs_by_parts |
| `project_settings` | ProjectSettings | Prompt source | Provides context |
| `language` | str | Localization | e.g., "en" |
| `progress_bar` | BaseProgress | UI helper | Optional; defaults to an empty instance |
| `global_info` | str \| None | Project relations | |
| `logger` | BaseLogger | Diagnostics | Reports progress and total length |
Flow
- Split `full_code_mix` via `split_data`.
- Sub-task creation – `progress_bar.create_new_subtask`.
- Iterative generation – for each `el` in `splitted_data`:
  - Call `write_docs_by_parts(el, ...)`.
  - Append the result to `all_result`, separated by two newlines.
  - Truncate `result` to the last 3000 characters to preserve context for the next part.
  - Update the progress bar.
- Cleanup – remove the sub-task.
- Return the concatenated markdown string `all_result`.
Side Effects
- Emits detailed `InfoLog` entries (lengths, per-part content at level 2).
- Progress-bar state changes do not affect the content.
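The assembly loop with its 3000-character context tail can be sketched as follows; `document_chunk` stands in for the `write_docs_by_parts` LLM call, and the function name is illustrative:

```python
def assemble_docs(chunks: list[str], document_chunk) -> str:
    """Sketch of gen_doc_parts' loop: document each chunk, append with a
    blank line, and carry only the tail forward as context."""
    all_result, prev_info = "", None
    for el in chunks:
        part_doc = document_chunk(el, prev_info)   # stand-in for the LLM call
        all_result += part_doc + "\n\n"
        prev_info = all_result[-3000:]             # context window for the next part
    return all_result
```

Keeping only the last 3000 characters bounds the prompt size regardless of how long the accumulated document grows.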
ProjectSettings(project_name: str)
Purpose – Assemble a dynamic system prompt incorporating project metadata.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_name` | str | Identifier for the target repo | Passed at construction |
| `info` | dict | Arbitrary key/value pairs | Added via add_info |
Methods
- `add_info(key, value)` – stores a new metadata entry.
- `prompt` (property) – generates the full prompt string by concatenating `BASE_SETTINGS_PROMPT`, the project-name line, and every key/value pair from `info`. Each entry ends with a newline.
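A minimal sketch of this class as described; `BASE_SETTINGS_PROMPT` here is a placeholder constant, and the exact line wording is assumed, not taken from the library:

```python
BASE_SETTINGS_PROMPT = "You are a documentation assistant."  # placeholder text

class ProjectSettings:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self.info = {}

    def add_info(self, key, value):
        """Store a new metadata entry."""
        self.info[key] = value

    @property
    def prompt(self) -> str:
        """Base prompt + project-name line + one line per info entry,
        each entry ending with a newline."""
        result = BASE_SETTINGS_PROMPT + "\n"
        result += f"Project name: {self.project_name}\n"
        for key, value in self.info.items():
            result += f"{key}: {value}\n"
        return result
```

Because `prompt` is a property, every metadata entry added before the call is reflected in the generated system prompt.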
Logging Infrastructure – BaseLogger & Templates
Purpose
Provide a lightweight, level‑aware logging façade that can be swapped between console and file output without affecting downstream code.
Component Overview
| Class | Responsibility | Key Methods |
|---|---|---|
| `BaseLog` | Base log-message holder | format() – string representation |
| `ErrorLog` / `WarningLog` / `InfoLog` | Subclasses with a severity prefix | format() overrides |
| `BaseLoggerTemplate` | Strategy for output (console/file) | log(), global_log() |
| `FileLoggerTemplate` | Appends logs to a file | log() |
| `BaseLogger` | Singleton façade for the rest of the system | set_logger(), log() |
Interaction Flow
- Instantiate a concrete `BaseLoggerTemplate` (console via `print`, or `FileLoggerTemplate`).
- Inject it into the singleton `BaseLogger` via `set_logger`.
- Create a log instance (`ErrorLog(...)`, etc.).
- Route it through `BaseLogger.log()` → `logger_template.global_log()` → `print` or file write.
Note: `global_log` enforces the configured `log_level`. If `log_level < 0`, all messages are printed.
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `log` | BaseLog | Input to BaseLogger.log | message: str, level: int |
| `log_level` | int | Configurable threshold | Output when >= the message level |
| `file_path` | str | Destination for file logs | Optional; defaults to console |
Example Usage
```python
logger = BaseLogger()
logger.set_logger(FileLoggerTemplate("my.log"))
logger.log(InfoLog("Process started", level=1))
```
Progress Tracking – Rich and Console Back‑ends
Purpose
Offer a minimal progress interface that can be replaced by a rich terminal bar or a simple console printout, enabling UI‑agnostic task progress reporting.
Component Overview
| Class | Responsibility | Key Methods |
|---|---|---|
| `BaseProgress` | Abstract progress interface | create_new_subtask, update_task, remove_subtask |
| `LibProgress` | Rich-based implementation | Overrides all abstract methods |
| `ConsoleTask` | Lightweight console counter | progress() prints the percentage |
| `ConsoleGtiHubProgress` | Console fallback with a default "General" task | Overrides the progress methods |
Interaction Flow
- Create a concrete progress instance (`LibProgress` or `ConsoleGtiHubProgress`).
- Create a sub-task: `create_new_subtask(name, total_len)`.
- Advance with `update_task()` per iteration.
- Remove the sub-task once done: `remove_subtask()`. If no sub-task exists, the "General" task advances.
Assumption: the `Progress` object passed to `LibProgress` is a `rich.progress.Progress` instance; its `add_task` returns a task ID used by `update_task`.
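The progress contract can be sketched with a simplified console implementation. The class names and percentage bookkeeping here are illustrative stand-ins for the described `BaseProgress`/`ConsoleTask` pair, not the library's code:

```python
class ConsoleTask:
    """Lightweight counter that reports completion as a percentage."""
    def __init__(self, name: str, total_len: int):
        self.name, self.total_len, self.done = name, total_len, 0

    def progress(self) -> int:
        self.done += 1
        # a real implementation would print this percentage to the console
        return int(self.done / self.total_len * 100)

class ConsoleProgress:
    """Minimal sketch of the BaseProgress interface."""
    def __init__(self):
        self.task = None

    def create_new_subtask(self, name: str, total_len: int):
        self.task = ConsoleTask(name, total_len)

    def update_task(self) -> int:
        return self.task.progress()

    def remove_subtask(self):
        self.task = None
```

A caller creates a sub-task sized to the work, calls `update_task()` once per iteration, and removes the sub-task when finished, exactly the sequence listed above.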
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `name` | str | Sub-task identifier | Displayed in the console or progress bar |
| `total_len` | int | Total units of work | Determines the progress percentage |
| `progress` | rich.progress.Progress | Rich progress-bar instance | Only used by LibProgress |
Installation Helper – PowerShell Script (install.ps1)
Purpose
Automate the creation of a GitHub Actions workflow and a default autodocconfig.yml file for a project, ensuring the Auto‑Doc Generator is ready to run without manual setup.
Key Steps
- Create the `.github/workflows` directory if missing.
- Write an `autodoc.yml` workflow that references the reusable `reuseble_agd.yml` template.
- Generate a YAML configuration file containing:
  - the project name (derived from the current folder),
  - the language,
  - ignored file patterns,
  - build and structure settings.
Side effect: Emits a green “✅ Done!” message to the console upon completion.
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_name` | str | Derived from the current folder | Used for configuration metadata |
| `language` | str | Localization setting | Default en |
| `ignore_files` | list[str] | File glob patterns to skip | Passed to Config |
| `build_settings` | dict | Runtime flags | e.g., save_logs, log_level |
| `structure_settings` | dict | Document layout flags | e.g., include_intro_text |
Install.sh Script – Workflow Automation
Purpose
Creates a minimal CI setup for Auto‑Doc Generator in an existing repository by:
- Generating a GitHub Actions workflow that re-uses the shared `reuseble_agd.yml` template.
- Writing a default `autodocconfig.yml` that captures project metadata, ignore patterns, and build/structure settings.
This eliminates manual configuration steps, enabling immediate autodocgenerator runs from the command line or within CI.
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_name` | str | Derived from the current working directory | Populates project_name in autodocconfig.yml. |
| `language` | str | Localization flag | Defaults to "en" in the config. |
| `ignore_files` | list[str] | File glob patterns to skip | Directly copied into the config. |
| `build_settings` | dict | Runtime flags for the generator | e.g., save_logs, log_level. |
| `structure_settings` | dict | Document layout flags | e.g., include_intro_*, use_global_file. |
| `workflow_path` | str | Target path for the GitHub Action | .github/workflows/autodoc.yml. |
| `config_path` | str | Target path for the config | autodocconfig.yml. |
Side effect: prints a green “✅ Done!” message once each file is written.
Step‑by‑Step Logic
1. Create the workflow directory

```shell
mkdir -p .github/workflows
```

Creates `.github/workflows` if missing, ensuring GitHub can discover the workflow.

2. Write the GitHub Actions workflow

```shell
cat <<EOF > .github/workflows/autodoc.yml
name: AutoDoc
on: [workflow_dispatch]
jobs:
  run:
    permissions:
      contents: write
    uses: Drag-GameStudio/ADG/.github/workflows/reuseble_agd.yml@main
    secrets:
      GROCK_API_KEY: \${{ secrets.GROCK_API_KEY }}
EOF
```

A single-file workflow that triggers manually and forwards the `GROCK_API_KEY` secret to the reusable template.

3. Emit the first completion message

```shell
echo "✅ Done! .github/workflows/autodoc.yml has been created."
```

Immediate visual confirmation for the user.

4. Write the default `autodocconfig.yml`

```shell
cat <<EOF > autodocconfig.yml
project_name: "$(basename "$PWD")"
language: "en"
...
EOF
```

The file is populated with:
- project metadata – name, language;
- `ignore_files` – a comprehensive list covering Python bytecode, caches, virtualenvs, logs, VCS data, etc.;
- `build_settings` – flags controlling log persistence and verbosity;
- `structure_settings` – toggles for intro sections, ordering, and global-file generation, plus `max_doc_part_size`.

5. Emit the second completion message

```shell
echo "✅ Done! autodocconfig.yml has been created."
```
Outputs
- File `.github/workflows/autodoc.yml` – ready to run the Auto-Doc Generator in a CI context.
- File `autodocconfig.yml` – an initial configuration that can be edited to suit a particular project.
- Console output – two success messages indicating successful file creation.
Note: The script assumes a POSIX‑compatible shell (Bash) and that the current directory is a valid repository root.