ManualForge

ManualForge is a configuration-driven framework for generating management manuals. Define your data sources, fields, and templates in YAML, and get a formatted report.

Built on Kedro pipelines with Polars for data processing and Typst for document rendering.

Philosophy

ManualForge separates what you want to produce from how it's produced.

What: Defined in conf/base/parameters_manualforge.yml — your data sources, expected columns, standardization rules, sort orders, summary dimensions, and report templates.
How: Implemented by the pipeline nodes — reusable data processing functions that read from your config.

To create a new manual for a different domain, you only need to edit the config file (and optionally provide new templates). No Python code changes required.

Features

| Capability | Description |
| --- | --- |
| Multi-sheet Excel ingestion | Auto-detect headers, filter cover sheets, merge into structured DataFrames |
| Field standardization | Mapping files + exact matching + fuzzy matching (difflib / duckdb) |
| Config-driven summaries | Define group-by dimensions, sort orders, ability categories, and output paths in YAML |
| Typst report generation | Jinja2 templates → Typst source → PDF compilation |
| Pipeline hooks | Shell command hooks at pipeline/node granularity for pre/post processing |

Quick Start

# 1. Install dependencies
pip install -r requirements.txt

# 2. Copy and customize configuration
cp conf/examples/parameters_manualforge.yml.example conf/base/parameters_manualforge.yml
cp conf/examples/catalog.yml.example          conf/base/catalog.yml
cp conf/examples/hooks.yml.example            conf/base/hooks.yml
cp conf/examples/parameters.yml.example       conf/base/parameters.yml
cp conf/examples/credentials.yml.example      conf/local/credentials.yml

# 3. Edit the config files to point to your data sources
#    (conf/base/ is gitignored — your real configs stay local)

# 4. Run the pipeline
kedro run

# Run specific node groups
kedro run --tags conversion        # Excel → Parquet only
kedro run --tags standardization   # Standardization only
kedro run --tags csv               # Summary tables only
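
Step 2 can also be scripted. The following is a minimal sketch, not part of ManualForge: the helper name `copy_example_configs` and the `overwrite` flag are hypothetical, and the file list simply mirrors the `cp` commands above.

```python
from pathlib import Path
import shutil

# Destination directory for each example file, mirroring the cp commands above.
EXAMPLE_TARGETS = {
    "parameters_manualforge.yml.example": "conf/base",
    "catalog.yml.example": "conf/base",
    "hooks.yml.example": "conf/base",
    "parameters.yml.example": "conf/base",
    "credentials.yml.example": "conf/local",
}


def copy_example_configs(project_root: Path, overwrite: bool = False) -> list[Path]:
    """Copy conf/examples/*.example into place, dropping the .example suffix."""
    created = []
    for name, target_dir in EXAMPLE_TARGETS.items():
        source = project_root / "conf" / "examples" / name
        target = project_root / target_dir / name.removesuffix(".example")
        if target.exists() and not overwrite:
            continue  # keep configs that were already customized
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        created.append(target)
    return created
```

Calling `copy_example_configs(Path("."))` from the project root skips files that already exist, so rerunning it should not clobber local edits unless `overwrite=True` is passed.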

Project Structure

├── conf/
│   ├── base/                          # ★ Gitignored — copy from examples/
│   │   ├── parameters_manualforge.yml # Central project configuration
│   │   ├── catalog.yml                # Kedro data catalog
│   │   ├── hooks.yml                  # Pipeline hooks (shell commands)
│   │   └── parameters.yml             # Pipeline parameters
│   ├── examples/                      # ★ Tracked example templates
│   │   ├── parameters_manualforge.yml.example
│   │   ├── catalog.yml.example
│   │   ├── hooks.yml.example
│   │   ├── parameters.yml.example
│   │   └── credentials.yml.example
│   ├── local/                         # Local-only (gitignored)
│   │   └── credentials.yml
│   └── logging.yml
├── data/                              # Gitignored except .gitkeep
│   ├── 01_raw/                        # Raw Excel/CSV + mapping files
│   ├── 02_intermediate/              # Parquet, reconcile reports
│   ├── 03_primary/                   # Standardized data
│   ├── 04_feature/                   # Summary tables (CSV + Markdown)
│   └── 08_reporting/                 # Typst sources & compiled PDFs
├── scripts/                          # Auxiliary scripts
│   ├── convert_csv_to_md.py          # CSV → Markdown conversion
│   ├── extract_rule_field_mapping.py # Rule field extraction
│   ├── extract_rule_overview.py      # Rule overview extraction
│   └── render_with_forge.py          # Markdown → DOCX/PDF rendering
├── src/manualforge/                  # Framework source code
│   ├── config.py                     # Configuration helper utilities
│   ├── hooks.py                      # Kedro pipeline hooks
│   ├── io/                           # Custom Kedro datasets (PolarsExcelDataset)
│   ├── pipelines/                    # Pipeline definitions & node functions
│   └── settings.py                   # Kedro project settings
├── templates/                        # Jinja2 Typst templates
│   └── report.typ.j2
├── pyproject.toml                    # Project metadata & dependencies
└── requirements.txt

Configuration Guide

The central configuration file is conf/base/parameters_manualforge.yml. Copy it from conf/examples/ and customize the sections described below.

1. Data Sources

Define your Excel files, expected headers, and sheet filtering rules:

datasources:
  primary_data:
    filepath: "data/01_raw/your_data.xlsx"
    sheet:
      exclude_names: ["封面", "封皮"]   # cover-sheet names ("cover", "front cover")
      name_becomes_column: "sheet_name"
    header_detection:
      mode: keyword_match
      expected_headers:
        - "column_a"
        - "column_b"
    cleaning:
      drop_rows_where:
        column_a: ["column_a"]   # drop residual header rows
      fill_null: forward
      deduplicate: true
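
To make these options concrete, here is a rough sketch of what they could mean for one sheet. It uses plain Python lists in place of Polars, and it assumes that `keyword_match` selects the first row containing every expected header; the function names `find_header_row` and `clean_records` are hypothetical and are not ManualForge's actual node functions.

```python
def find_header_row(rows, expected_headers):
    """Return the index of the first row containing every expected header."""
    wanted = set(expected_headers)
    for index, row in enumerate(rows):
        cells = {str(cell).strip() for cell in row if cell is not None}
        if wanted <= cells:
            return index
    raise ValueError(f"no row contains all of {sorted(wanted)}")


def clean_records(rows, expected_headers, drop_rows_where):
    """Apply the cleaning options to raw sheet rows and return dict records."""
    start = find_header_row(rows, expected_headers)
    header = [str(cell).strip() for cell in rows[start]]
    records, previous, seen = [], {}, set()
    for row in rows[start + 1:]:
        # pad short rows with None so every header has a value
        record = dict(zip(header, list(row) + [None] * len(header)))
        # drop_rows_where: skip residual header rows repeated inside the data
        if any(record.get(col) in bad for col, bad in drop_rows_where.items()):
            continue
        # fill_null: forward -> reuse the last seen value in each column
        for col in header:
            if record[col] is None:
                record[col] = previous.get(col)
        previous = record
        # deduplicate: true -> keep only the first occurrence of each record
        key = tuple(record[col] for col in header)
        if key not in seen:
            seen.add(key)
            records.append(record)
    return records


rows = [
    ["Department report", None],   # title row above the real header
    ["column_a", "column_b"],      # detected header
    ["HR", 1],
    [None, 2],                     # null filled forward from "HR"
    ["column_a", "column_b"],      # residual header row, dropped
    ["HR", 1],                     # duplicate, dropped
]
cleaned = clean_records(rows, ["column_a", "column_b"], {"column_a": ["column_a"]})
```

With this input, `cleaned` holds two records, both with `column_a` equal to `"HR"`.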

2. Field Standardization

Define which fields to standardize, their mapping files, and special corrections:

standardization:
  fields:
    - name: "dept_name"
      mapping_file: "data/01_raw/dept_list"
      case_corrections:
        wrong_name: "correct_name"
      special_mappings:
        alias: "canonical_name"
      fuzzy:
        enabled: true
        threshold: 0.8
        method: difflib             # difflib | duckdb
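
The matching stages can be illustrated with the standard-library `difflib` module. This is a sketch under assumptions: the order of the stages (explicit corrections, then exact match, then fuzzy match) and the function name `standardize` are guesses for illustration, and the `duckdb` method is not covered.

```python
import difflib


def standardize(value, canonical, case_corrections=None,
                special_mappings=None, threshold=0.8):
    """Map a raw value to a canonical name, or return None if nothing matches."""
    value = value.strip()
    # 1. explicit corrections and aliases from the config take priority
    for table in (case_corrections or {}, special_mappings or {}):
        if value in table:
            return table[value]
    # 2. exact match against the names loaded from the mapping file
    if value in canonical:
        return value
    # 3. fuzzy match: best candidate whose similarity ratio is >= threshold
    matches = difflib.get_close_matches(value, canonical, n=1, cutoff=threshold)
    return matches[0] if matches else None


canonical = ["Human Resources", "Finance", "Engineering"]
fixed_typo = standardize("Finanse", canonical)
fixed_alias = standardize("HR", canonical, special_mappings={"HR": "Human Resources"})
unmatched = standardize("Zzz", canonical)
```

Here `"Finanse"` resolves to `"Finance"` (similarity about 0.86, above the 0.8 threshold), the alias `"HR"` resolves through `special_mappings`, and `"Zzz"` returns `None` so that it could be listed for manual review.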

3. Sort Orders

Define reusable sort order lists referenced by summaries:

sort_orders:
  model_names:
    - "Model A"
    - "Model B"
  dep_names:
    - "HR"
    - "Finance"
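
A sort order list like this typically replaces alphabetical ordering with a fixed ranking. The sketch below shows one way to apply it to plain dict records; the function name `sort_by_order` and the choice to place unlisted values last are assumptions, not documented ManualForge behavior.

```python
def sort_by_order(records, column, order):
    """Sort records so that values in `column` follow `order`; unlisted values go last."""
    rank = {name: position for position, name in enumerate(order)}
    # sorted() is stable, so records with the same rank keep their input order
    return sorted(records, key=lambda record: rank.get(record[column], len(order)))


dep_names = ["HR", "Finance"]
records = [
    {"department": "Finance"},
    {"department": "Legal"},   # not in dep_names
    {"department": "HR"},
]
ordered = [r["department"] for r in sort_by_order(records, "department", dep_names)]
```

With the `dep_names` list above, `ordered` is `["HR", "Finance", "Legal"]`.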

4. Summaries

Define what summary tables to generate:

summaries:
  my_summary:
    description: "Fields grouped by model and department"
    group_by: ["model", "department"]
    struct_columns: ["module", "system", "field_name"]
    sort_by:
      department: dep_names
    output:
      csv: "data/04_feature/my_summary.csv"
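
As a rough illustration of how such a summary definition could be evaluated, the sketch below groups plain dict records instead of a Polars DataFrame. It assumes that `struct_columns` are collected into a list of small records per group and that `sort_by` maps a column to a named list from `sort_orders`; the function name `summarize` and the `items` key are hypothetical.

```python
from collections import defaultdict


def summarize(records, group_by, struct_columns, sort_by=None, sort_orders=None):
    """Group records and collect the struct columns of each group into a list."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[col] for col in group_by)
        groups[key].append({col: record[col] for col in struct_columns})
    rows = [{**dict(zip(group_by, key)), "items": items}
            for key, items in groups.items()]
    # Apply sort keys last-to-first so the first key in sort_by has priority.
    for column, order_name in reversed(list((sort_by or {}).items())):
        order = (sort_orders or {})[order_name]
        rank = {value: position for position, value in enumerate(order)}
        rows.sort(key=lambda row: rank.get(row[column], len(order)))
    return rows


records = [
    {"model": "Model A", "department": "Finance", "module": "m1", "system": "s1", "field_name": "f1"},
    {"model": "Model A", "department": "HR", "module": "m2", "system": "s2", "field_name": "f2"},
    {"model": "Model A", "department": "Finance", "module": "m3", "system": "s3", "field_name": "f3"},
]
summary = summarize(
    records,
    group_by=["model", "department"],
    struct_columns=["module", "system", "field_name"],
    sort_by={"department": "dep_names"},
    sort_orders={"dep_names": ["HR", "Finance"]},
)
```

The result has one row per (model, department) pair, with the HR row first and the two Finance fields gathered under a single row.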

5. Reports

Define report templates and output:

reports:
  my_report:
    description: "Rules cookbook"
    template_source: inline
    data_source: rules_data
    output_typ: "data/08_reporting/output.typ"
    typst_compile:
      enabled: true
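
The template → Typst source → PDF flow can be sketched as follows. To stay dependency-free, this example swaps in `string.Template` from the standard library for Jinja2, so it is not how ManualForge renders templates; the function names and the sample template are hypothetical. The only external command used is `typst compile <input> <output>`, and it runs only when the Typst CLI is found on the PATH.

```python
import shutil
import subprocess
from pathlib import Path
from string import Template

# Stand-in for a Jinja2 template such as templates/report.typ.j2.
REPORT_TEMPLATE = Template("= $title\n\n$body\n")


def render_typst(title, rules):
    """Render report data into Typst markup: a heading plus a bullet list."""
    body = "\n".join(f"- {rule}" for rule in rules)
    return REPORT_TEMPLATE.substitute(title=title, body=body)


def write_and_compile(source, output_typ, compile_enabled=True):
    """Write the Typst source and, if enabled and available, compile it to PDF."""
    output_typ = Path(output_typ)
    output_typ.parent.mkdir(parents=True, exist_ok=True)
    output_typ.write_text(source, encoding="utf-8")
    command = ["typst", "compile", str(output_typ), str(output_typ.with_suffix(".pdf"))]
    if compile_enabled and shutil.which("typst"):
        subprocess.run(command, check=True)
    return command
```

A call such as `write_and_compile(render_typst("Rules cookbook", ["Rule one"]), "data/08_reporting/output.typ")` writes the `.typ` file and returns the compile command it built. Real data would also need Typst special characters (for example `#` or `*`) escaped before rendering, which this sketch leaves out.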

Data Layers

| Layer | Directory | Description |
| --- | --- | --- |
| Raw | `data/01_raw/` | Source Excel/CSV files, mapping files |
| Intermediate | `data/02_intermediate/` | Parquet, reconcile reports |
| Primary | `data/03_primary/` | Standardized data |
| Feature | `data/04_feature/` | Summary tables (CSV + Markdown) |
| Reporting | `data/08_reporting/` | Typst sources & PDF output |

Requirements

Python >= 3.10
Typst CLI (for PDF compilation)

Development

pip install -e ".[dev]"
ruff check src/
pytest
