Schema-driven Python library for typed Excel import/export workflows with Pydantic and locale-aware workbooks.

ExcelAlchemy

Chinese README · About · Architecture · Locale Policy · Changelog · Migration Notes

ExcelAlchemy is a schema-driven Python library for Excel import and export workflows. It turns Pydantic models into typed workbook contracts: generate templates, validate uploads, map failures back to rows and cells, and produce locale-aware result workbooks.

This repository is also a design artifact. It documents a series of deliberate engineering choices: src/ layout, Pydantic v2 migration, pandas removal, pluggable storage, uv-based workflows, and locale-aware workbook output.

The current stable release is 2.0.0, the first public stable release of ExcelAlchemy 2.0.

At a Glance

  • Build Excel templates directly from typed Pydantic schemas
  • Validate uploaded workbooks and write failures back to rows and cells
  • Keep storage pluggable through ExcelStorage
  • Render workbook-facing text in zh-CN or en
  • Stay lightweight at runtime with openpyxl instead of pandas
  • Protect behavior with contract tests, ruff, and pyright

Screenshots

Template: Excel template screenshot
Import Result: Excel import result screenshot

Minimal Example

from pydantic import BaseModel

from excelalchemy import ExcelAlchemy, FieldMeta, ImporterConfig, Number, String


class Importer(BaseModel):
    age: Number = FieldMeta(label='Age', order=1)
    name: String = FieldMeta(label='Name', order=2)


alchemy = ExcelAlchemy(ImporterConfig(Importer, locale='en'))
template = alchemy.download_template_artifact(filename='people-template.xlsx')

excel_bytes = template.as_bytes()
template_data_url = template.as_data_url()  # compatibility path for older browser integrations

Modern Annotated Example

from typing import Annotated

from pydantic import BaseModel, Field

from excelalchemy import Email, ExcelAlchemy, ExcelMeta, ImporterConfig


class Importer(BaseModel):
    email: Annotated[
        Email,
        Field(min_length=10),
        ExcelMeta(label='Email', order=1, hint='Use your work email'),
    ]


alchemy = ExcelAlchemy(ImporterConfig(Importer, locale='en'))
template = alchemy.download_template_artifact(filename='people-template.xlsx')

For browser downloads, prefer template.as_bytes() with a Blob, or return the bytes from your backend with Content-Disposition: attachment. A top-level navigation to a long data: URL is less reliable in modern browsers.
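As a framework-agnostic sketch of the backend option, the helper below builds the response headers a server might send alongside the workbook bytes. `attachment_headers` is a hypothetical helper written for illustration and is not part of ExcelAlchemy; `excel_bytes` is a placeholder for the value returned by template.as_bytes().

```python
from urllib.parse import quote

# Media type registered for .xlsx workbooks.
XLSX_MEDIA_TYPE = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'


def attachment_headers(filename: str, payload: bytes) -> dict[str, str]:
    """Build headers that ask the browser to save the payload as a file.

    Hypothetical helper for illustration; not part of ExcelAlchemy.
    """
    # Percent-encode the name so non-ASCII filenames survive (RFC 6266 / RFC 5987 form).
    encoded = quote(filename, safe='')
    return {
        'Content-Type': XLSX_MEDIA_TYPE,
        'Content-Disposition': f"attachment; filename*=UTF-8''{encoded}",
        'Content-Length': str(len(payload)),
    }


# Placeholder bytes; in practice this would be template.as_bytes().
excel_bytes = b'PK\x03\x04 placeholder'
headers = attachment_headers('people-template.xlsx', excel_bytes)
```

Any web framework that accepts a body plus a header mapping can return `excel_bytes` with these headers; the browser then saves the file instead of navigating to it.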

Repository Scope

  • A library for building Excel workflows from typed schemas.
  • A reference implementation of “facade outside, focused components inside”.
  • A portfolio project that emphasizes architecture, migration strategy, and maintainability.

Non-Goals

  • Not a general spreadsheet analysis library.
  • Not a pandas-first data wrangling tool.
  • Not a GUI spreadsheet editor.
  • Not a fully generic forms framework.

Why This Exists

Many internal systems still receive business data through Excel. The painful part is rarely “reading a file”; it is keeping templates, validation rules, row-level error reporting, and backend integration consistent across projects.

ExcelAlchemy treats Excel as a typed contract:

  • the model defines the shape
  • field metadata defines the workbook experience
  • import execution is separated from parsing
  • storage is an interchangeable strategy, not a hard-coded implementation
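The first two points can be illustrated without the library at all. The sketch below uses standard-library dataclasses rather than Pydantic and FieldMeta, so every name in it (`Person`, `header_row`, the `label` and `order` metadata keys) is an assumption chosen to mirror the examples above, not ExcelAlchemy's implementation.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Person:
    # The annotations define the shape; the metadata defines what the workbook shows.
    name: str = field(default='', metadata={'label': 'Name', 'order': 2})
    age: int = field(default=0, metadata={'label': 'Age', 'order': 1})


def header_row(model: type) -> list[str]:
    """Return the column labels sorted by their declared order."""
    ordered = sorted(fields(model), key=lambda f: f.metadata['order'])
    return [f.metadata['label'] for f in ordered]


print(header_row(Person))  # ['Age', 'Name']
```

Because the column order comes from explicit metadata rather than declaration order, the model can be rearranged without changing the template users see.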

Architecture

ExcelAlchemy exposes a small public surface and delegates the real work to internal components.

flowchart TD
    A[ExcelAlchemy Facade]
    A --> B[ExcelSchemaLayout]
    A --> C[ExcelHeaderParser / Validator]
    A --> D[RowAggregator]
    A --> E[ImportExecutor]
    A --> F[ExcelRenderer / writer.py]
    A --> G[ExcelStorage Protocol]

    G --> H[MinioStorageGateway]
    G --> I[Custom Storage]

    B --> J[FieldMeta / FieldMetaInfo]
    E --> K[Pydantic Adapter]
    F --> L[i18n Display Messages]
    E --> M[Runtime Error Messages]

See the full breakdown in docs/architecture.md.

Workflow

flowchart LR
    A[Pydantic model + FieldMeta] --> B[ExcelAlchemy facade]
    B --> C[Template rendering]
    B --> D[Worksheet parsing]
    D --> E[Header validation]
    D --> F[Row aggregation]
    F --> G[Import executor]
    G --> H[Import result workbook]
    C --> I[Workbook for users]
    H --> I
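To make the middle of that flow concrete, here is a minimal standard-library sketch of header validation followed by row checks that record failures by row and column. It is an illustration under stated assumptions: `CellError`, `validate_header`, and `validate_rows` are hypothetical names, and the two rules (a numeric Age, a non-empty Name) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellError:
    row: int      # 1-based worksheet row number (the header is row 1)
    column: str   # header label of the offending column
    message: str


def validate_header(header: list[str], expected: list[str]) -> list[str]:
    """Return the expected columns that are missing from the uploaded header."""
    return [label for label in expected if label not in header]


def validate_rows(header: list[str], rows: list[list[str]]) -> list[CellError]:
    """Check each data row and record failures by row and column."""
    errors: list[CellError] = []
    for offset, row in enumerate(rows):
        record = dict(zip(header, row))
        row_number = offset + 2  # data starts directly below the header row
        if not record.get('Age', '').isdigit():
            errors.append(CellError(row_number, 'Age', 'must be a whole number'))
        if not record.get('Name', '').strip():
            errors.append(CellError(row_number, 'Name', 'must not be empty'))
    return errors


header = ['Age', 'Name']
rows = [['31', 'Ada'], ['abc', ''], ['40', 'Grace']]

missing = validate_header(header, expected=['Age', 'Name'])
errors = validate_rows(header, rows)
```

Keeping the row number and column label on every error is what allows a result workbook to point users at the exact cell to fix.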

Design Principles

This repository is guided by explicit design principles rather than accidental convenience. The full mapping is in ABOUT.md; the short version is:

  1. Schema first.
  2. Explicit metadata over implicit conventions.
  3. Composition over monoliths.
  4. Adapters at integration boundaries.
  5. Protocols over concrete backends.
  6. Progressive modernization over one-shot rewrites.
  7. Runtime simplicity over hidden magic.
  8. User-facing clarity over clever internals.
  9. Tests should protect behavior, not implementation accidents.
  10. Migration-friendly seams are part of the design.

Quick Start

Install

pip install ExcelAlchemy

If you want the built-in Minio backend:

pip install "ExcelAlchemy[minio]"

Locale-Aware Workbook Output

locale affects workbook-facing display text such as:

  • header hint text
  • column comments
  • result workbook column titles
  • row validation status labels

The public locale policy is documented in docs/locale.md. In short:

  • runtime exceptions are standardized in English
  • workbook display locales currently support zh-CN and en
  • workbook display defaults to zh-CN for the 2.x line

from pydantic import BaseModel

from excelalchemy import ExcelAlchemy, FieldMeta, ImporterConfig, Number, String


class Importer(BaseModel):
    age: Number = FieldMeta(label='Age', order=1)
    name: String = FieldMeta(label='Name', order=2)


zh_template = ExcelAlchemy(ImporterConfig(Importer, locale='zh-CN')).download_template_artifact()
en_template = ExcelAlchemy(ImporterConfig(Importer, locale='en')).download_template_artifact()

The same locale also controls import result workbooks:

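# create_func and storage are assumed to be defined elsewhere:
# a creator callback and an ExcelStorage implementation.
alchemy = ExcelAlchemy(
    ImporterConfig(
        Importer,
        creator=create_func,
        storage=storage,
        locale='en',
    )
)
# import_data is awaited, so this line must run inside an async function.
result = await alchemy.import_data('people.xlsx', 'people-result.xlsx')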

Storage Protocol

Storage is modeled as a protocol, not a product decision.

from excelalchemy import ExcelAlchemy, ExcelStorage, ExporterConfig, UrlStr
from excelalchemy.core.table import WorksheetTable


class InMemoryExcelStorage(ExcelStorage):
    def read_excel_table(self, input_excel_name: str, *, skiprows: int, sheet_name: str) -> WorksheetTable:
        # Load the named workbook and return its rows as a WorksheetTable.
        ...

    def upload_excel(self, output_name: str, content_with_prefix: str) -> UrlStr:
        # Persist the rendered workbook and return a URL for it.
        ...


# Importer is the model class from the earlier examples.
alchemy = ExcelAlchemy(ExporterConfig(Importer, storage=InMemoryExcelStorage()))

Use the built-in Minio implementation when you want it, but the library no longer requires Minio to define its architecture.
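For readers unfamiliar with protocol-style boundaries, the following sketch shows the general pattern with typing.Protocol from the standard library. `WorkbookStorage`, `InMemoryStorage`, and `save_result` are hypothetical, simplified stand-ins; the real ExcelStorage methods and signatures are the ones shown in the example above.

```python
from typing import Protocol


class WorkbookStorage(Protocol):
    """Hypothetical, simplified storage contract (not ExcelStorage itself)."""

    def read(self, name: str) -> bytes: ...

    def upload(self, name: str, content: bytes) -> str: ...


class InMemoryStorage:
    """Test double that satisfies WorkbookStorage structurally, without inheritance."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self._files[name]

    def upload(self, name: str, content: bytes) -> str:
        self._files[name] = content
        return f'memory://{name}'


def save_result(storage: WorkbookStorage, name: str, content: bytes) -> str:
    """Calling code depends only on the protocol, never on a concrete backend."""
    return storage.upload(name, content)


storage = InMemoryStorage()
url = save_result(storage, 'people-result.xlsx', b'workbook-bytes')
```

Swapping in an object store or a filesystem adapter then means writing another class with the same two methods; `save_result` does not change.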

Why These Design Choices

Why no pandas?

ExcelAlchemy uses openpyxl plus an internal WorksheetTable abstraction. WorksheetTable is intentionally narrow and only models the operations the core workflow needs; it is not a pandas-compatible public table layer. The project was not using pandas for analysis, joins, or vectorized computation; it was mostly using it as a transport layer. Removing pandas:

  • simplified installation
  • removed the numpy dependency chain
  • made behavior more explicit
  • better aligned the code with the actual problem domain
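To give a sense of how little a transport-only table needs, here is a hypothetical narrow table in plain Python. `SimpleTable` is an illustration, not the library's WorksheetTable, whose actual interface is not described here; the rows are hard-coded, whereas in practice they would come from a workbook reader such as openpyxl.

```python
from dataclasses import dataclass


@dataclass
class SimpleTable:
    """Hypothetical narrow table: a header plus rows, nothing more."""

    header: list[str]
    rows: list[list[object]]

    def column(self, label: str) -> list[object]:
        """Return every value in the column with the given header label."""
        index = self.header.index(label)
        return [row[index] for row in self.rows]

    def records(self) -> list[dict[str, object]]:
        """Return each row as a dict keyed by header label."""
        return [dict(zip(self.header, row)) for row in self.rows]


table = SimpleTable(header=['Age', 'Name'], rows=[[31, 'Ada'], [40, 'Grace']])
print(table.column('Name'))  # ['Ada', 'Grace']
```

Nothing here needs vectorized computation, joins, or numpy, which is the point of the trade-off described above.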

Why a Pydantic adapter layer?

The project used to lean on Pydantic internals more directly. That becomes fragile during major-version upgrades. Now the design is:

  • FieldMeta owns Excel metadata
  • the Pydantic adapter reads model structure
  • the adapter does not own the domain semantics

This is what made the Pydantic v2 migration practical without rewriting the public API.

Why a facade?

The public object should stay small. The internal object graph can evolve. ExcelAlchemy is the facade; parsing, rendering, execution, storage, and schema layout are delegated to separate collaborators.

Why a storage protocol?

Excel workflows should not be locked to Minio, S3, or any one persistence strategy. ExcelStorage keeps the boundary stable while allowing object storage, local filesystem adapters, in-memory test doubles, and custom infrastructure integrations to share the same import/export contract.

Evolution

This repository intentionally records its evolution:

  • src/ layout migration
  • CI and release modernization
  • Pydantic metadata decoupling
  • Pydantic v2 migration
  • Python 3.12-3.14 modernization
  • internal architecture split
  • pandas removal
  • storage abstraction
  • i18n foundation and locale-aware workbook text

These are not incidental refactors; they are the story of the codebase. See ABOUT.md for the migration rationale behind each step.

Pydantic v1 vs v2

The short version:

Topic | v1-style risk | Current v2 design
Field access | Tight coupling to __fields__ / ModelField | Adapter over model_fields
Metadata ownership | Excel metadata mixed with validation internals | FieldMetaInfo owns Excel metadata
Validation integration | Deep reliance on internals | Adapter + explicit runtime validation
Upgrade path | Brittle | Layered

More detail is documented in ABOUT.md.

Docs Map

Development

The project uses uv for local development and CI.

uv sync --extra development
uv run pre-commit install
uv run ruff check .
uv run pyright
uv run pytest --cov=excelalchemy --cov-report=term-missing:skip-covered tests
uv build

License

MIT. See LICENSE.
