This project helps you create documentation for your projects.
Project description
Executive Navigation Tree
- 📦 Installation & Workflow
- ⚙️ Configuration
- 🏗️ Model & Architecture
- 📂 Modules & Management
- 📝 Document Generation
- 📄 Content & Descriptions
- 🔗 Link & Text Processing
- 🛠️ Utilities
- 📦 Compression
- ❓ Miscellaneous
Installation workflow overview
Windows PowerShell execution
- Open a PowerShell terminal with administrative rights.
- Run the following one‑liner, which downloads the PowerShell installer script directly from the project's repository and executes it in the same session:
`irm <raw‑script‑url> | iex`
- The command uses `irm` (an alias for `Invoke-RestMethod`) to fetch the script content and pipes it to `iex` (`Invoke-Expression`) for immediate execution.
Linux/macOS shell execution
- Open a terminal.
- Execute the following command to retrieve the shell installer script from the repository and run it with `bash`:
`curl -sSL <raw‑script‑url> | bash`
- `curl` fetches the script silently (`-s`, with `-S` still reporting errors) while following redirects (`-L`). The output is streamed to `bash` for execution.
GitHub Actions secret configuration
- In the GitHub repository, navigate to Settings → Secrets and variables → Actions.
- Add a new secret named `GROCK_API_KEY`.
- Paste the API key you obtained from the Grock documentation into the value field.
- Save the secret; the workflow will now have access to `GROCK_API_KEY` as an environment variable during runs.
Workflow behavior
- When the GitHub Action triggers, it references the `GROCK_API_KEY` secret to authenticate calls to the Grock service.
- The appropriate installer command (PowerShell on Windows runners, Bash on Linux/macOS runners) is invoked, pulling the latest installer script from the repository and executing it automatically.
Key points to remember
- Use the raw file URL from the repository for both the `irm` and `curl` commands.
- Ensure the secret is correctly named and stored; GitHub masks its value in logs.
- Run the commands in a clean environment to avoid conflicts with existing installations.

The configuration file uses a top‑level mapping with several sections:
Project information
- `project_name`: a short title for the documentation generator.
- `language`: the language code for the generated text (e.g., "en").
Build section
- `save_logs`: set to `true` to keep generation logs, `false` to discard them.
- `log_level`: numeric level controlling verbosity (higher values give more detail).
Structure section
- `include_intro_links`: `true` adds navigation links at the beginning.
- `include_order`: `true` keeps the original order of the processed files.
- `max_doc_part_size`: maximum size of each documentation chunk, expressed as an integer.
Additional information
- `global idea`: a free‑form description that will be inserted into the documentation as a project overview.
Custom descriptions
- A list of strings that define extra prompts for the generator. Each item can contain placeholders and URLs for installation instructions or other guidance.
When creating the file, follow standard YAML syntax, using proper indentation for nested mappings and list items. Use boolean values (`true`/`false`) and integers where indicated. The custom description strings can be written on separate lines prefixed with a hyphen.
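A minimal configuration sketch assembled from the section descriptions above. The exact key spellings are assumptions reconstructed from the prose; check them against your project before use:

```yaml
project_name: "My Project Docs"
language: "en"

build_settings:
  save_logs: true
  log_level: 2

structure_settings:
  include_intro_links: true
  include_order: true
  max_doc_part_size: 5000

project_additional_info:
  - "global idea: A CLI tool that generates markdown documentation."

# Custom description prompts; items may contain placeholders and URLs.
custom_descriptions:
  - "Describe the installation steps."
```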
Config Reader – YAML Parsing
The read_config function deserialises a YAML string into three concrete objects used throughout the runner.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `file_data` | `str` | Raw YAML payload | Must be UTF‑8 encoded |
| `config` | `Config` | Global project configuration | Populated via `Config` setters |
| `custom_modules` | `list[CustomModule \| CustomModuleWithOutContext]` | Extension points for documentation generators | Determined by leading `%` token |
| `structure_settings_object` | `StructureSettings` | Controls output segmentation and linking | Loads arbitrary keys from the `structure_settings` dict |
Logic flow
- `yaml.safe_load` → `data` (dict).
- Instantiate `Config` & `ProjectBuildConfig`.
- Pull `ignore_files`, `language`, `project_name`, `project_additional_info`, `build_settings` from `data`.
- `pcs.load_settings(build_settings)`, then chain `config.set_language(...).set_project_name(...).set_pcs(pcs)`.
- Iterate `ignore_files` → `config.add_ignore_file`.
- Iterate `project_additional_info` → `config.add_project_additional_info`.
- Build the `custom_modules` list: `%` prefix → `CustomModuleWithOutContext`, else `CustomModule`.
- Load `structure_settings` into a fresh `StructureSettings` via `load_settings`.
- Return `(config, custom_modules, structure_settings_object)`.
Deterministic: No conditionals beyond data‑driven branches; identical input yields identical output.
Project Build Config Model (ProjectBuildConfig)
A simple container for build‑time flags.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `save_logs` | `bool` | Enable persistent logging | Default `False` |
| `log_level` | `int` | Verbosity selector | Default `-1` (unspecified) |
| `load_settings` | method | Populate attributes from dict | Direct `setattr` loop |
No methods beyond `load_settings`; the object is attached to `Config` via `set_pcs`.
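A plausible sketch of this container. The direct `setattr` loop comes from the notes above; the class body itself is an assumption:

```python
class ProjectBuildConfig:
    """Container for build-time flags (sketch, not the actual source)."""

    def __init__(self) -> None:
        self.save_logs = False   # persistent logging off by default
        self.log_level = -1      # -1 means "unspecified"

    def load_settings(self, settings: dict) -> None:
        # Copy every key/value pair onto the instance verbatim.
        for key, value in settings.items():
            setattr(self, key, value)
```

Note that the loop accepts arbitrary keys, so a typo in the YAML silently becomes a new attribute rather than an error.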
ProjectSettings – Prompt Builder
| Entity | Type | Role |
|---|---|---|
| `project_name` | `str` | Identifier inserted into prompt |
| `info` | `dict` | Additional key‑value pairs |
| `prompt` (property) | `str` | Concatenation of `BASE_SETTINGS_PROMPT`, project name, and each info entry (each on its own line) |
Logic
- `add_info` stores arbitrary metadata.
- `prompt` assembles the base prompt, the project name, then iterates `self.info` to append `"{key}: {value}"` lines.
Note: All functions rely exclusively on the LLM interface (`get_answer_without_history`) and a progress‑bar abstraction; no file I/O occurs here.
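The prompt assembly described above can be sketched as follows. The `BASE_SETTINGS_PROMPT` constant value and the exact line layout are assumptions:

```python
BASE_SETTINGS_PROMPT = "You are documenting the following project."  # assumed placeholder

class ProjectSettings:
    def __init__(self, project_name: str) -> None:
        self.project_name = project_name
        self.info: dict = {}

    def add_info(self, key: str, value: str) -> None:
        self.info[key] = value

    @property
    def prompt(self) -> str:
        # Base prompt, then project name, then one "key: value" line per entry.
        lines = [BASE_SETTINGS_PROMPT, self.project_name]
        for key, value in self.info.items():
            lines.append(f"{key}: {value}")
        return "\n".join(lines)
```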
Data Contract
| Entity | Type | Role | Notes |
|---|---|---|---|
| `print("ADG")` | side‑effect (stdout) | Simple identification signal emitted at import time | No return value; executed once per interpreter session |
| `BaseLogger` | class (import) | Core logging facility used throughout the package | Imported but not instantiated elsewhere in this file |
| `BaseLoggerTemplate` | class (import) | Provides the default formatting/handler configuration for the logger | Passed to `logger.set_logger` |
| `logger` | `BaseLogger` instance | Shared logger instance exposed as a module‑level variable | Other modules can `from autodocgenerator import logger` |
| `InfoLog`, `ErrorLog`, `WarningLog` | classes (import) | Specialized log record types | Imported for external use; not instantiated here |
⚠️ Note – The module does not perform file I/O, network calls, or alter global state beyond the stdout side‑effect and logger creation.
Execution Flow (Step‑by‑Step)
- Import phase – Python evaluates the file linearly.
- `print` execution – Immediately writes `"ADG"` to the console.
- Symbol import – Retrieves logger‑related classes from `autodocgenerator.ui.logging`.
- Logger instantiation – `BaseLogger()` creates a logger object.
- Template binding – `logger.set_logger(BaseLoggerTemplate())` attaches the default template to the logger.
- Export – The module's namespace now contains the ready‑to‑use `logger` and the imported log‑type classes.
No additional functions or conditional branches are present; the module’s behavior is fully deterministic and repeatable on each import.
Core Model Hierarchy (ParentModel, Model, AsyncModel)
Responsibility – Supplies shared state (API key, history, model rotation) for concrete generators.
Visible interactions – Other modules import Model/AsyncModel via gpt_model.py; they receive a pre‑configured instance from the orchestrator.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `api_key` | `str` | Authentication token | Defaulted to `API_KEY` from config |
| `history` | `History` | Conversational buffer | Injected or created lazily |
| `use_random` | `bool` | Controls shuffling of `MODELS_NAME` | Randomised on each instantiation |
| `current_model_index` | `int` | Index of the active model | Starts at 0 |
| `regen_models_name` | `list[str]` | Rotation list of model identifiers | Shuffled when `use_random=True` |
Logic flow
- `ParentModel.__init__` stores `api_key` & `history`.
- Copies the global `MODELS_NAME`; shuffles if `use_random`.
- Exposes `regen_models_name` & `current_model_index` for child classes.
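The copy-then-shuffle behaviour can be sketched as below. The placeholder model names and the constructor signature are assumptions; the key point is that the global list is copied before any shuffle:

```python
import random

MODELS_NAME = ["model-a", "model-b", "model-c"]  # placeholder identifiers

class ParentModel:
    def __init__(self, api_key: str, history=None, use_random: bool = False) -> None:
        self.api_key = api_key
        self.history = history
        # Copy the global list so shuffling never mutates MODELS_NAME itself.
        self.regen_models_name = list(MODELS_NAME)
        if use_random:
            random.shuffle(self.regen_models_name)
        self.current_model_index = 0
```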
Abstract Base Module (BaseModule)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `BaseModule` | ABC | Contract for all doc‑generation blocks | Requires `generate(info: dict, model: Model)` |
| `__init__` | method | No‑op constructor | Allows subclass‑specific init |
| `generate` | abstractmethod | Core payload generator | Must return a string fragment |
Assumption – Sub‑classes provide concrete logic; the base class itself does not produce output.
Documentation Orchestrator (DocFactory)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `modules` | `list[BaseModule]` | Ordered generators supplied at construction | Stored as‑is |
| `logger` | `BaseLogger` | Centralised logging | Uses `InfoLog` |
| `generate_doc` | method | Executes each module, aggregates results, updates progress | Returns the full markdown document |
Logic flow
- Initialise `output = ""`.
- Call `progress.create_new_subtask("Generate parts", len(self.modules))`.
- Iterate `module` in `self.modules`:
  - `module_result = module.generate(info, model)`.
  - Append `module_result` and two newlines to `output`.
  - Log module completion (`InfoLog`).
  - Log raw module output at level 2.
  - `progress.update_task()`.
- After the loop, `progress.remove_subtask()` and return `output`.
Warning – The `__main__` guard instantiates `BaseModule()` directly, which is abstract and would raise `TypeError` if executed.
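The aggregation loop above can be sketched like this. The progress method names are taken from the steps listed; logging is omitted and the `model` parameter is passed through untyped, both assumptions of this sketch:

```python
class DocFactory:
    def __init__(self, *modules) -> None:
        self.modules = list(modules)

    def generate_doc(self, info: dict, model, progress) -> str:
        output = ""
        progress.create_new_subtask("Generate parts", len(self.modules))
        for module in self.modules:
            # Each module returns one markdown fragment.
            module_result = module.generate(info, model)
            output += module_result + "\n\n"
            progress.update_task()
        progress.remove_subtask()
        return output
```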
Custom Content Modules (CustomModule, CustomModuleWithOutContext)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `discription` | `str` | User‑provided header for the custom block | Set in ctor |
| `generate` (both) | method | Calls the post‑processor to build a custom description | Returns a string |
`CustomModule` –
- Split `info["code_mix"]` into ≤ 5000‑symbol chunks via `split_data`.
- Invoke `generete_custom_discription` with the chunks, model, description, and language.

`CustomModuleWithOutContext` –
- Directly call `generete_custom_discription_without` with model, description, and language (no code context).
Both rely exclusively on the imported post‑processor functions; no side effects beyond the returned string.
Intro Extraction Modules (IntroLinks, IntroText)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `generate` | method | Produces introductory material | Returns a string |
| `links` / `intro` | `str` | Intermediate data from helpers | Obtained from the `info` dict |
`IntroLinks` –
- `get_all_html_links(info["full_data"])` → `links`.
- `get_links_intro(links, model, info["language"])` → `intro_links`.

`IntroText` –
- `get_introdaction(info["global_data"], model, info["language"])` → `intro`.
Both modules delegate all heavy lifting to the imported custom_intro helpers and simply forward the resulting markdown snippet.
Manager – Orchestrator of Project‑wide Documentation Pipeline
| Entity | Type | Role | Notes |
|---|---|---|---|
| `CACHE_FOLDER_NAME` | `str` | Fixed cache directory name | `".auto_doc_cache"` |
| `FILE_NAMES` | `dict[str, str]` | Maps logical keys to cache filenames | Used by `get_file_path` |
| `__init__` | method | Sets configuration, logger, progress UI; creates the cache folder | `progress_bar` defaults to a fresh `BaseProgress()` instance |
| `read_file_by_file_key` | method | Returns raw text of a cached file | Reads UTF‑8; key resolved via `FILE_NAMES` |
| `get_file_path` | method | Constructs the absolute cache path for a given key | Combines `project_directory`, `CACHE_FOLDER_NAME`, and `FILE_NAMES` |
| `generate_code_file` | method | Builds a code‑mix file from the repository | Uses `CodeMix.build_repo_content` |
| `generete_doc_parts` | method | Splits `code_mix` into ≤ 5000‑symbol chunks and generates markdown via `gen_doc_parts` | Writes the result to `output_doc` |
| `factory_generate_doc` | method | Invokes a `DocFactory` to prepend additional modules to the existing doc | Merges new fragments with the current output |
| `order_doc` | method | Re‑orders markdown sections by anchor using `split_text_by_anchors` & `get_order` | Overwrites `output_doc` |
| `clear_cache` | method | Optionally removes the log file based on `config.pbc.save_logs` | No other side‑effects |
Warning – The default argument `progress_bar: BaseProgress = BaseProgress()` creates a mutable instance at import time; repeated `Manager` constructions share the same progress object.
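The warning above is an instance of Python's evaluate-once default semantics. A self-contained illustration with stand-in classes (not the real `Manager`/`BaseProgress`):

```python
class Progress:
    """Stand-in for BaseProgress."""
    def __init__(self) -> None:
        self.updates = 0

class Manager:
    # The default is evaluated ONCE, when the class body is executed,
    # so every Manager built without an explicit bar shares one Progress.
    def __init__(self, progress_bar: Progress = Progress()) -> None:
        self.progress_bar = progress_bar
```

The usual fix is to default to `None` and create a fresh `Progress()` inside `__init__`.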
Initialization Flow
- Store `project_directory`, `config`, the optional models, and `progress_bar`.
- Initialise `BaseLogger` and attach a `FileLoggerTemplate` targeting the cache `logs` file.
- Ensure the cache folder exists (`os.mkdir` if absent).
Core Operations
1. `generate_code_file`
- Log start (`InfoLog`).
- Instantiate `CodeMix` with `project_directory` and `config.ignore_files`.
- Call `cm.build_repo_content` → writes `code_mix.txt`.
- Log completion and advance the progress bar.

2. `generete_doc_parts`
- Load `code_mix.txt`.
- Log start, invoke `gen_doc_parts(full_code_mix, max_symbols, sync_model, config.language, progress_bar)`.
- Persist the returned markdown to `output_doc.md`.
- Log finish and update progress.

3. `factory_generate_doc`
- Load the current `output_doc.md` and `code_mix.txt`.
- Assemble the `info` dict (`language`, `full_data`, `code_mix`).
- Log a detailed start message including module names and input sizes.
- Call `doc_factory.generate_doc(info, sync_model, progress_bar)`.
- Prepend new fragments to the existing doc and write back.
- Update progress.

4. `order_doc`
- Read the current `output_doc.md`.
- Split by markdown anchors (`split_text_by_anchors`).
- If the split succeeded, reorder sections via `get_order(sync_model, parts)`.
- Overwrite `output_doc.md` with the ordered content.

5. `clear_cache`
- If `config.pbc.save_logs` is `False`, delete the `report.txt` log file.
All side‑effects are confined to file system writes within the hidden cache directory and logger emissions; no network or external state is accessed beyond the injected Model instances.
Module Initialization & Logger Configuration
The autodocgenerator/__init__.py module performs three concrete actions when the package is imported:
- Emits the literal string `"ADG"` to stdout via `print`.
- Imports the public logger classes from `autodocgenerator.ui.logging`: `from .ui.logging import BaseLogger, BaseLoggerTemplate, InfoLog, ErrorLog, WarningLog`.
- Instantiates a singleton‑style logger and binds a default template: `logger = BaseLogger()` followed by `logger.set_logger(BaseLoggerTemplate())`.
These steps make a ready‑to‑use logger object available to any sub‑module that imports autodocgenerator.
Asynchronous Generator (AsyncGPTModel)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `client` | `AsyncGroq` | Async LLM client | Instantiated with `api_key` |
| `logger` | `BaseLogger` | Async‑compatible logger | Same log classes as the sync version |
| `generate_answer` | async method | Async request/response loop | Returns an awaitable `str` |
Logic flow (mirrors `GPTModel` but using `await`):
- Log async start.
- Resolve `messages` from history or `prompt`.
- `while True` loop with the same exhaustion check and model rotation.
- `await self.client.chat.completions.create(...)`.
- On failure: log warning, rotate the index, continue.
- After success, extract `result`, log both the model used and the answer, then `return result`.
Interaction pattern – Consumed by the orchestrator (gen_doc) via await model.generate_answer(...); shares the same rotation logic as the sync counterpart.
Synchronous Generator (GPTModel)
| Entity | Type | Role | Notes |
|---|---|---|---|
| `client` | `Groq` | Remote LLM client | Created with `api_key` |
| `logger` | `BaseLogger` | Structured logging | Uses `InfoLog`, `ErrorLog`, `WarningLog` |
| `generate_answer` | method | Core request/response loop | Returns `str` |
Logic flow
- Log start of generation.
- Choose `messages` from `history` or the supplied `prompt`.
- Loop:
  - If `regen_models_name` is empty → log error & raise `ModelExhaustedException`.
  - Pick `model_name` at `current_model_index`.
  - Attempt `self.client.chat.completions.create(messages=messages, model=model_name)`.
  - On exception: log warning, advance the index (wrap‑around), retry.
- Extract `result` from `chat_completion.choices[0].message.content`.
- Log success & result (level 2).
- Return `result`.
Determinism – Outcome depends only on input data and external API responses; no hidden branches.
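The exhaustion-and-rotation loop can be sketched independently of the API client. How the rotation list actually empties is not spelled out above; this sketch assumes a failed model is dropped from the list, and `call_model` stands in for the real completion call:

```python
class ModelExhaustedException(Exception):
    """Raised when every model in the rotation has failed."""

def generate_with_rotation(call_model, models: list[str]) -> str:
    remaining = list(models)
    index = 0
    while True:
        if not remaining:
            raise ModelExhaustedException("all models failed")
        name = remaining[index]
        try:
            return call_model(name)          # success ends the loop
        except Exception:
            remaining.pop(index)             # assumption: drop the failed model
            if remaining:
                index %= len(remaining)      # wrap the index around
```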
Document Generation Orchestrator (gen_doc)
Coordinates model instantiation, manager setup, and final document retrieval.
| Entity | Type | Role | Notes |
|---|---|---|---|
| `project_path` | `str` | Root of the source tree | Passed to `Manager` |
| `config` | `Config` | Project‑wide settings | From `read_config` |
| `custom_modules` | `list[CustomModule \| CustomModuleWithOutContext]` | Doc factories | Forwarded to `DocFactory` |
| `structure_settings` | `StructureSettings` | Output segmentation flags | Controls ordering & intro links |
Step‑by‑step
- Instantiate `GPTModel` (sync) & `AsyncGPTModel` (async) with the global `API_KEY`.
- Build a `Manager` with the path, config, models, and a `ConsoleGtiHubProgress` bar.
- Call `manager.generate_code_file()`.
- Split docs via `manager.generete_doc_parts(max_symbols=structure_settings.max_doc_part_size)`.
- Feed custom factories: `manager.factory_generate_doc(DocFactory(*custom_modules))`.
- If `include_order` → `manager.order_doc()`.
- If `include_intro_links` → `manager.factory_generate_doc(DocFactory(IntroLinks()))`.
- Clean the temporary cache, then return `manager.read_file_by_file_key("output_doc")`.
generate_descriptions_for_code – LLM‑driven Doc Generation
| Entity | Type | Role |
|---|---|---|
| `data` | `list[str]` | Code snippets |
| `model` | `Model` | LLM |
| `project_settings` | `ProjectSettings` | Unused (present for signature) |
| `progress_bar` | `BaseProgress` | Progress |
| return | `list[str]` | Model answers (descriptions) |
Logic
- For each `code`, create a two‑message prompt (instruction block + `CONTEXT: {code}`), call `model.get_answer_without_history`, append the answer, and update progress.
gen_doc_parts – Synchronous Batch Documentation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `full_code_mix` | `str` | Complete source to split | |
| `max_symbols` | `int` | Chunk size for `split_data` | |
| `model` | `Model` | LLM used for each part | |
| `language` | `str` | Output language | |
| `progress_bar` | `BaseProgress` | Sub‑task progress tracker | |
| return | `str` | Concatenated documentation of all parts | |
Logic
- Call `split_data` → list of parts.
- Create a sub‑task in `progress_bar` with total length equal to the number of parts.
- Iterate the parts: invoke `write_docs_by_parts`, append the result to `all_result`, and keep the last 3000 characters of the current result for the next iteration (`prev_info`). Update the progress bar each loop.
- Remove the sub‑task, log the final length, and return the assembled document.
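The loop above, with its rolling 3000‑character context window, can be sketched as follows; `write_part` stands in for `write_docs_by_parts`:

```python
def gen_doc_parts_sketch(parts, write_part):
    """write_part(part, prev_info) -> doc fragment; mirrors the loop above."""
    all_result = ""
    prev_info = ""
    for part in parts:
        result = write_part(part, prev_info)
        all_result += result + "\n\n"
        # Only the tail of the latest fragment is carried forward as context.
        prev_info = result[-3000:]
    return all_result
```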
async_gen_doc_parts – Asynchronous Batch Documentation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `full_code_mix` | `str` | Source code | |
| `global_info` | `str` | Passed to each async task (unused in the prompt) | |
| `max_symbols` | `int` | Chunk size | |
| `model` | `AsyncModel` | Async LLM | |
| `language` | `str` | Output language | |
| `progress_bar` | `BaseProgress` | Sub‑task progress manager | |
| return | `str` | Full documentation assembled from async tasks | |
Logic
- Split the source via `split_data`.
- Initialise a sub‑task in `progress_bar`.
- Create a semaphore (4 permits).
- Build a list of `async_write_docs_by_parts` tasks, each receiving the shared semaphore and a lambda that updates the progress bar.
- `await asyncio.gather(*tasks)` → list of part documents.
- Concatenate the results with double newlines, clean up the sub‑task, log the final length, and return.
Critical assumption: All logging is performed through `BaseLogger`; no file I/O occurs in this module.
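The semaphore-plus-gather pattern above, reduced to its essentials; `answer` stands in for the model call, and the 4-permit limit matches the description:

```python
import asyncio

async def generate_parts(parts, answer, limit: int = 4) -> str:
    """Run one task per part, at most `limit` in flight, preserving order."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(part):
        async with semaphore:          # concurrency guard
            return await answer(part)

    # gather preserves input order regardless of completion order.
    results = await asyncio.gather(*(worker(p) for p in parts))
    return "\n\n".join(results)
```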
write_docs_by_parts – Synchronous Part‑wise Doc Generation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `part` | `str` | Code fragment to document | |
| `model` | `Model` | Synchronous LLM interface | Provides `get_answer_without_history` |
| `prev_info` | `str` | Optional prior output | Inserted into the prompt when present |
| `language` | `str` | Target language for docs | Default `"en"` |
| return | `str` | Generated documentation for the part | May be trimmed of surrounding code‑fence markers |
Logic
- Build a system‑message list: language hint, `BASE_PART_COMPLITE_TEXT`, optional previous info, then the user message containing `part`.
- Call `model.get_answer_without_history(prompt)`.
- Strip leading/trailing markdown fences, log the length and content, and return the cleaned answer.
async_write_docs_by_parts – Async Part‑wise Doc Generation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `part` | `str` | Code fragment | |
| `async_model` | `AsyncModel` | Async LLM interface | Provides `await get_answer_without_history` |
| `global_info` | `str` | Unused in prompt construction | Present for signature compatibility |
| `semaphore` | `asyncio.Semaphore` | Concurrency limiter | Acquired via `async with` |
| `prev_info` | `str` | Optional prior output | |
| `language` | `str` | Target language | |
| `update_progress` | callable | Optional progress callback | Invoked after the answer is received |
| return | `str` | Documentation for the part | Fence‑stripped like the sync version |
Logic mirrors the synchronous variant, wrapped in async with semaphore: and awaiting the model call. Progress is reported if update_progress is supplied.
Anchor Generation
Logic
- Create a prompt with three system messages: language, analyst role, and a rule‑enforced template demanding a single anchor tag with no filenames, extensions, generic terms, or URLs.
- Append a user message containing the task.
- Call `model.get_answer_without_history`.
- Return the raw answer.
Cross‑Component Interaction
All functions rely on BaseLogger for internal diagnostics and on a Model implementation (e.g., GPTModel) to obtain LLM responses. No other modules are referenced; constants are imported from engine.config.config. The module therefore acts as a post‑processing helper that extracts navigation anchors and orchestrates LLM‑driven intro and custom description creation.
generete_custom_discription – Context‑Sensitive Custom Description
| Entity | Type | Role | Notes |
|---|---|---|---|
| `splited_data` | iterable of `str` | Chunked documentation pieces | Iterated until a satisfactory result |
| `model` | `Model` | LLM interface | |
| `custom_description` | `str` | User‑specified description task | |
| `language` | `str` | Prompt language | Default `"en"` |
| return | `str` | First LLM answer that passes the filters | Empty string if none succeed |
Logic
- Loop over each `sp_data` in `splited_data`.
- Build a multi‑system‑message prompt: language, analyst role, context (`sp_data`), the constant `BASE_CUSTOM_DISCRIPTIONS`, and the task.
- Invoke `model.get_answer_without_history`.
- If the result does not contain `"!noinfo"` or `"No information found"` (or those markers appear after position 30), break and keep the answer.
- Otherwise reset `result` and continue.
- Return the final `result`.
generete_custom_discription_without – Stand‑Alone Description Generation
extract_links_from_start – Anchor Extraction
| Entity | Type | Role | Notes |
|---|---|---|---|
| `chunks` | `list[str]` | Text blocks to scan | Expected to start with an `<a name=…>` tag |
| `links` | `list[str]` | Collected anchors | Prefixed with `#` |
| `pattern` | `str` | Regex `^<a name=["']?(.*?)["']?></a>` | Captures the name attribute at the very start of a chunk |
| return | `list[str]` | Anchor list (only names > 5 chars) | Empty list if none match |
Logic
- Initialise an empty `links` list.
- For each `chunk` → `chunk.strip()` → `re.search(pattern)`.
- If there is a match and `len(anchor_name) > 5` → append `"#" + anchor_name`.
- Return `links`.
Assumption: Only leading anchors are considered; embedded anchors are ignored.
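A runnable sketch of the steps above. The pattern is an assumed reconstruction (the `>` before `</a>` is inferred from the matching regex in `get_all_html_links`):

```python
import re

# Assumed reconstruction of the leading-anchor pattern described above.
PATTERN = r'^<a name=["\']?(.*?)["\']?></a>'

def extract_links_from_start(chunks):
    links = []
    for chunk in chunks:
        match = re.search(PATTERN, chunk.strip())
        # Only anchors whose name is longer than 5 characters are kept.
        if match and len(match.group(1)) > 5:
            links.append("#" + match.group(1))
    return links
```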
get_all_html_links – HTML Anchor Extraction
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | `str` | Source markdown/HTML text | Expected to contain `<a name="…"></a>` anchors |
| return | `list[str]` | Collected link identifiers | Each returned as `#anchor_name` (anchors longer than 5 chars) |
Logic
- Instantiate a fresh `BaseLogger`.
- Log a start message.
- Compile the regex `r'<a name=["\']?(.*?)["\']?></a>'`.
- Iterate over `re.finditer`; for each match, capture group 1.
- If the captured name length is > 5, prepend `#` and append to `links`.
- Log the count and list of links (debug level 1).
- Return the list.
Note – No filesystem or network access; pure string processing.
get_introdaction – Global Introduction Generation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `global_data` | `str` | Full documentation content | Sent as the user prompt |
| `model` | `Model` | LLM interface | Same contract as above |
| `language` | `str` | Prompt language | Default `"en"` |
| return | `str` | Generated introduction text | No logging performed in this fragment |
Logic
- Assemble the prompt: language system message, the constant `BASE_INTRO_CREATE`, and `global_data`.
- Call `model.get_answer_without_history`.
- Return the answer.
get_links_intro – Intro Generation with Links
| Entity | Type | Role | Notes |
|---|---|---|---|
| `links` | `list[str]` | Anchor list from `get_all_html_links` | Serialized via `str()` for the prompt |
| `model` | `Model` | LLM interface | Must implement `get_answer_without_history` |
| `language` | `str` | Prompt language selector | Default `"en"` |
| return | `str` | Generated introductory markdown | Contains the supplied links |
Logic
- Create a `BaseLogger`.
- Build a system‑user prompt array: set the language, inject the constant `BASE_INTRODACTION_CREATE_LINKS`, and pass the stringified `links`.
- Log generation start.
- Call `model.get_answer_without_history(prompt=prompt)`.
- Log completion and the raw result (debug level 1).
- Return the LLM's answer.
split_text_by_anchors – Chunk Segmentation
| Entity | Type | Role | Notes |
|---|---|---|---|
| `text` | `str` | Full markdown source | Contains `<a name=…>` anchors |
| `pattern` | `str` | Look‑ahead regex `(?=<a name=["']?[^"'>\s]{6,200}["']?></a>)` | Splits before each valid anchor |
| `result_chanks` | `list[str]` | Trimmed non‑empty chunks | One per anchor |
| `all_links` | `list[str]` | Output of `extract_links_from_start` | Must align with `result_chanks` |
| return | `dict[str, str]` or `None` | Mapping `#anchor` → chunk | `None` if the counts differ |
Logic
- `re.split` on `pattern` → raw `chunks`.
- Strip and filter empty entries → `result_chanks`.
- Call `extract_links_from_start(result_chanks)` → `all_links`.
- If `len(all_links) != len(result_chanks)` → return `None`.
- Build a dict pairing each link with its corresponding chunk.
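The segmentation can be sketched end to end. Both regexes are assumed reconstructions of the patterns quoted above (with the inferred `>` before `</a>`), and the leading-anchor extraction is inlined to keep the sketch self-contained:

```python
import re

ANCHOR_LOOKAHEAD = r'(?=<a name=["\']?[^"\'>\s]{6,200}["\']?></a>)'
LEADING_ANCHOR = r'^<a name=["\']?(.*?)["\']?></a>'

def split_text_by_anchors(text):
    # Split *before* every anchor so each anchor stays with its section.
    chunks = [c.strip() for c in re.split(ANCHOR_LOOKAHEAD, text)]
    result_chanks = [c for c in chunks if c]
    all_links = []
    for chunk in result_chanks:
        match = re.search(LEADING_ANCHOR, chunk)
        if match and len(match.group(1)) > 5:
            all_links.append("#" + match.group(1))
    if len(all_links) != len(result_chanks):
        return None  # e.g. a preamble chunk without a leading anchor
    return dict(zip(all_links, result_chanks))
```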
get_order – Semantic Title Ordering
| Entity | Type | Role | Notes |
|---|---|---|---|
| `model` | `Model` | LLM interface | Provides `get_answer_without_history` |
| `chanks` | `dict[str, str]` | Anchor‑to‑content map | Keys are `#anchor` strings |
| `logger` | `BaseLogger` | Diagnostic output | Uses `InfoLog` at various levels |
| return | `str` | Concatenated content in LLM‑suggested order | Ends with a newline after each chunk |
Logic
- Log start and input keys/values.
- Build a single‑message prompt asking the model to return a comma‑separated list of the titles (keys) sorted semantically, preserving the leading "#".
- Call `model.get_answer_without_history(prompt)`.
- Split the result on commas, strip whitespace → `new_result`.
- Iterate `new_result`; for each key `el`, append `chanks[el]` and a newline to `order_output`, logging each addition.
- Return `order_output`.
split_data – Text Chunking Engine
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | `str` | Raw source text | May contain newline separators |
| `max_symbols` | `int` | Upper size limit for a chunk (symbols) | Used with 1.25× and 1.5× heuristics |
| return | `list[str]` | List of chunk strings | Each approximately ≤ `max_symbols` |
Logic
- Split `data` on newlines (`"\n"`).
- Repeatedly scan the list; any element longer than 1.5 × `max_symbols` is cut in half (first half kept, second half inserted after). Loop until no element exceeds the threshold.
- Accumulate elements into `split_objects`, starting a new chunk when the current one would exceed 1.25 × `max_symbols`. Newlines are inserted between concatenated parts.
- Log start and completion via `BaseLogger`.
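The two heuristics can be sketched as below (logging omitted; the exact tie-breaking of the real implementation is an assumption):

```python
def split_data(data: str, max_symbols: int):
    """Sketch of the halving + greedy-packing heuristics described above."""
    lines = data.split("\n")
    # Halve any line longer than 1.5x the limit until every line fits.
    i = 0
    while i < len(lines):
        if len(lines[i]) > 1.5 * max_symbols:
            half = len(lines[i]) // 2
            lines[i], rest = lines[i][:half], lines[i][half:]
            lines.insert(i + 1, rest)
        else:
            i += 1
    # Greedily pack lines into chunks of at most ~1.25x the limit.
    split_objects, current = [], ""
    for line in lines:
        if current and len(current) + len(line) + 1 > 1.25 * max_symbols:
            split_objects.append(current)
            current = line
        else:
            current = current + "\n" + line if current else line
    if current:
        split_objects.append(current)
    return split_objects
```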
CodeMix – Repository Snapshot Builder
| Entity | Type | Role | Notes |
|---|---|---|---|
| `root_dir` | `Path` | Base directory for scanning | Resolved at init |
| `ignore_patterns` | `list[str]` | Glob patterns to exclude | Defaults to an empty list |
| `logger` | `BaseLogger` | Progress logger | Uses `InfoLog` |
| `should_ignore(path)` | `bool` | Determines exclusion | Checks path, basename, and each part against patterns |
| `build_repo_content(output_file)` | `None` | Writes the repository tree and file contents to `output_file` | Inserts `<file path="…">` tags before each file block |
| return | `None` | Side‑effect: file creation | Prints a completion message in `__main__` |
Logic
- Open `output_file` for writing.
- Write a "Repository Structure:" header.
- Walk `root_dir.rglob("*")` sorted; for each `path` not ignored, compute depth → indentation → write a directory or file line.
- Write a separator line (`"=" * 20`).
- Walk again; for each non‑ignored file, write `<file path="relative_path">`, then the file's raw text, then two newlines. Errors are caught and written as `"Error reading …"`.
Warning: Files matching any pattern in `ignore_patterns` (e.g., `*.pyc`, `venv`, `.git`) are silently skipped.
compress – Single‑File LLM Compression
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | `str` | Raw source text | – |
| `project_settings` | `ProjectSettings` | Supplies the system prompt via `project_settings.prompt` | – |
| `model` | `Model` | LLM interface; provides `get_answer_without_history` | – |
| `compress_power` | `int` | Controls the token budget for `BASE_COMPRESS_TEXT` | – |
| return | `str` | LLM‑generated compressed text | – |
Logic
- Build the `prompt` list: the system prompt from settings, the token‑budget prompt from `get_BASE_COMPRESS_TEXT(10000, compress_power)`, then the user content `data`.
- Call `model.get_answer_without_history(prompt=prompt)`.
- Return the answer unchanged.
compress_and_compare – Sync Batch Compression
| Entity | Type | Role | Notes |
|---|---|---|---|
| `data` | `list[str]` | Files to compress | – |
| `model` | `Model` | LLM instance | – |
| `project_settings` | `ProjectSettings` | Prompt source | – |
| `compress_power` | `int` | Chunk size (default 4) | – |
| `progress_bar` | `BaseProgress` | Visual progress | Default instance |
| return | `list[str]` | Concatenated chunks, one per `compress_power` files | – |
Logic
- Allocate a result list sized `ceil(len(data) / compress_power)`.
- Initialise a sub‑task on `progress_bar`.
- For each element `el` at index `i`: compute `curr_index = i // compress_power`; append `compress(el, …)` plus a newline to that slot; update progress.
- Remove the sub‑task and return the list.
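The slot arithmetic above, isolated from the LLM call; `compress_one` stands in for the per-item `compress(...)` invocation:

```python
from math import ceil

def group_compress(data, compress_one, compress_power: int = 4):
    """Concatenate per-item results into ceil(len(data)/compress_power) slots."""
    result = ["" for _ in range(ceil(len(data) / compress_power))]
    for i, el in enumerate(data):
        curr_index = i // compress_power   # slot shared by compress_power items
        result[curr_index] += compress_one(el) + "\n"
    return result
```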
async_compress – Async Single Compression
| Entity | Type | Role |
|---|---|---|
| `data` | `str` | Source text |
| `project_settings` | `ProjectSettings` | Prompt source |
| `model` | `AsyncModel` | Async LLM |
| `compress_power` | `int` | Token budget |
| `semaphore` | `asyncio.Semaphore` | Concurrency guard |
| `progress_bar` | `BaseProgress` | Progress update |
| return | `str` | Compressed result |
Logic
- Acquire the semaphore, build a prompt identical to `compress`, await `model.get_answer_without_history`, update progress, release the semaphore, and return the answer.
async_compress_and_compare – Async Batch
| Entity | Type | Role |
|---|---|---|
| `data` | `list[str]` | Files |
| `model` | `AsyncModel` | LLM |
| `project_settings` | `ProjectSettings` | Prompt |
| `compress_power` | `int` | Chunk size |
| `progress_bar` | `BaseProgress` | Sub‑task |
| return | `list[str]` | Chunked concatenations |
Logic
- Create a semaphore (4 permits) and spawn an `async_compress` task for each file.
- `await asyncio.gather` → `compressed_elements`.
- Group the results by `compress_power`, join with newlines, and add a trailing newline.
compress_to_one – Iterative Reduction
| Entity | Type | Role |
|---|---|---|
| `data` | `list[str]` | Initial chunks |
| `model` | `Model` | LLM |
| `project_settings` | `ProjectSettings` | Prompt |
| `compress_power` | `int` | Base chunk size |
| `use_async` | `bool` | Switch between sync/async |
| `progress_bar` | `BaseProgress` | Progress |
| return | `str` | Single aggregated compressed block |
Logic
- Loop while `len(data) > 1`: adjust `compress_power` (minimum 2); call either `async_compress_and_compare` via `asyncio.run` or `compress_and_compare`; increment the iteration counter. The final remaining element is returned.
File details
Details for the file autodocgenerator-0.9.0.1.tar.gz.
File metadata
- Download URL: autodocgenerator-0.9.0.1.tar.gz
- Upload date:
- Size: 39.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f69fbceb219c8bfe9d1885d18d7e6580f8eb7acb75c825827ec807bf35947c94` |
| MD5 | `0a97208244ced88c59ca61b4f9c41f5f` |
| BLAKE2b-256 | `89c2b016b9f03d9ab9a61f36f51d97b7aee439dd452572dbdf06dd25459e3198` |
File details
Details for the file autodocgenerator-0.9.0.1-py3-none-any.whl.
File metadata
- Download URL: autodocgenerator-0.9.0.1-py3-none-any.whl
- Upload date:
- Size: 36.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.12.12 Linux/6.11.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `940103fdb11f07e3c4c5a34c2a54f77ec889fad341ae82c3750066c4f7921eec` |
| MD5 | `4bc6196dabc825bbcf0057831822dd2f` |
| BLAKE2b-256 | `1197180c8a22c9cdc13a7f506c359efa85540b1ef990f705bf227619c044c550` |