A Python library for reading and writing Excel files with Pydantic-based schemas.
ExcelAlchemy
Chinese README · About · Architecture · Locale Policy · Changelog · Migration Notes
ExcelAlchemy is a schema-driven Excel import/export library for Python. It turns Pydantic models into Excel templates, validates spreadsheet input back into application data, and keeps the import/export workflow explicit, typed, and extensible.
This repository is also a design artifact.
It documents a series of deliberate engineering choices: src/ layout, Pydantic v2 migration, pandas removal, pluggable storage, uv-based workflows, and locale-aware workbook output.
The release currently being prepared is 2.0.0rc1, the first public release candidate for ExcelAlchemy 2.0.
What This Project Is
- A library for building Excel workflows from typed schemas.
- A reference implementation of “facade outside, focused components inside”.
- A portfolio project that emphasizes architecture, migration strategy, and maintainability.
What This Project Is Not
- Not a general spreadsheet analysis library.
- Not a pandas-first data wrangling tool.
- Not a GUI spreadsheet editor.
- Not a fully generic forms framework.
Why This Exists
Many internal systems still receive business data through Excel. The painful part is rarely “reading a file”; it is keeping templates, validation rules, row-level error reporting, and backend integration consistent across projects.
ExcelAlchemy treats Excel as a typed contract:
- the model defines the shape
- field metadata defines the workbook experience
- import execution is separated from parsing
- storage is an interchangeable strategy, not a hard-coded implementation
Highlights
- Pydantic v2-based schema extraction and validation
- Locale-aware workbook text with locale='zh-CN' | 'en'
- Pluggable storage via ExcelStorage
- No pandas runtime dependency
- Python 3.12-3.14 support, with 3.14 as the primary target
- uv-based development and CI workflow
- Contract tests that protect import/export behavior during refactors
Architecture
ExcelAlchemy exposes a small public surface and delegates the real work to internal components.
flowchart TD
A[ExcelAlchemy Facade]
A --> B[ExcelSchemaLayout]
A --> C[ExcelHeaderParser / Validator]
A --> D[RowAggregator]
A --> E[ImportExecutor]
A --> F[ExcelRenderer / writer.py]
A --> G[ExcelStorage Protocol]
G --> H[MinioStorageGateway]
G --> I[Custom Storage]
B --> J[FieldMeta / FieldMetaInfo]
E --> K[Pydantic Adapter]
F --> L[i18n Display Messages]
E --> M[Runtime Error Messages]
See the full breakdown in docs/architecture.md.
Workflow
flowchart LR
A[Pydantic model + FieldMeta] --> B[ExcelAlchemy facade]
B --> C[Template rendering]
B --> D[Worksheet parsing]
D --> E[Header validation]
D --> F[Row aggregation]
F --> G[Import executor]
G --> H[Import result workbook]
C --> I[Workbook for users]
H --> I
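The import half of this workflow can be sketched in plain Python. This is only an illustration under stated assumptions: `validate_header`, `aggregate_rows`, `run_import`, and `ImportResult` are hypothetical names, not ExcelAlchemy's API, and the worksheet is a plain list of rows.

```python
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    """Hypothetical result container: imported records plus row-level errors."""
    imported: list[dict[str, str]] = field(default_factory=list)
    errors: dict[int, str] = field(default_factory=dict)


def validate_header(header: list[str], expected: list[str]) -> list[str]:
    """Return the expected labels that are missing from the header row."""
    return [label for label in expected if label not in header]


def aggregate_rows(header: list[str], rows: list[list[str]]) -> list[dict[str, str]]:
    """Turn positional cells into label-keyed records."""
    return [dict(zip(header, row)) for row in rows]


def run_import(table: list[list[str]], expected: list[str]) -> ImportResult:
    """Header validation, then row aggregation, then per-row execution."""
    header, *rows = table
    missing = validate_header(header, expected)
    if missing:
        raise ValueError(f"missing columns: {missing}")

    result = ImportResult()
    for index, record in enumerate(aggregate_rows(header, rows), start=1):
        if not record["Age"].isdigit():
            # A bad row is reported, not fatal: the other rows still import.
            result.errors[index] = "Age must be a whole number"
            continue
        result.imported.append(record)
    return result


table = [["Age", "Name"], ["31", "Ada"], ["n/a", "Bob"]]
result = run_import(table, expected=["Age", "Name"])
```

A header problem stops the import before any row runs, while a row problem is recorded against its row number so that it could be written back into a result workbook.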
Design Principles
This repository is guided by explicit design principles rather than accidental convenience. The full mapping is in ABOUT.md; the short version is:
- Schema first.
- Explicit metadata over implicit conventions.
- Composition over monoliths.
- Adapters at integration boundaries.
- Protocols over concrete backends.
- Progressive modernization over one-shot rewrites.
- Runtime simplicity over hidden magic.
- User-facing clarity over clever internals.
- Tests should protect behavior, not implementation accidents.
- Migration-friendly seams are part of the design.
Quick Start
Install
pip install ExcelAlchemy
If you want the built-in Minio backend:
pip install "ExcelAlchemy[minio]"
Minimal Example
from pydantic import BaseModel

from excelalchemy import ExcelAlchemy, FieldMeta, ImporterConfig, Number, String


class Importer(BaseModel):
    age: Number = FieldMeta(label='Age', order=1)
    name: String = FieldMeta(label='Name', order=2)


alchemy = ExcelAlchemy(ImporterConfig(Importer, locale='en'))
template_base64 = alchemy.download_template()
Locale-Aware Workbook Output
locale affects workbook-facing display text such as:
- header hint text
- column comments
- result workbook column titles
- row validation status labels
The public locale policy is documented in docs/locale.md. In short:
- runtime exceptions are standardized in English
- workbook display locales currently support zh-CN and en
- workbook display defaults to zh-CN for the 2.x line
from pydantic import BaseModel

from excelalchemy import ExcelAlchemy, FieldMeta, ImporterConfig, Number, String


class Importer(BaseModel):
    age: Number = FieldMeta(label='Age', order=1)
    name: String = FieldMeta(label='Name', order=2)


zh_template = ExcelAlchemy(ImporterConfig(Importer, locale='zh-CN')).download_template()
en_template = ExcelAlchemy(ImporterConfig(Importer, locale='en')).download_template()
The same locale also controls import result workbooks:
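# create_func and storage are supplied by the application; the await below must run inside an async function.
alchemy = ExcelAlchemy(
    ImporterConfig(
        Importer,
        creator=create_func,
        storage=storage,
        locale='en',
    )
)
result = await alchemy.import_data("people.xlsx", "people-result.xlsx")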
Storage Extension Point
Storage is modeled as a protocol, not a product decision.
from excelalchemy import ExcelAlchemy, ExcelStorage, ExporterConfig
from excelalchemy.core.table import WorksheetTable
from excelalchemy.types.identity import UrlStr


class InMemoryExcelStorage(ExcelStorage):
    def read_excel_table(self, input_excel_name: str, *, skiprows: int, sheet_name: str) -> WorksheetTable:
        ...

    def upload_excel(self, output_name: str, content_with_prefix: str) -> UrlStr:
        ...


alchemy = ExcelAlchemy(ExporterConfig(Importer, storage=InMemoryExcelStorage()))
Use the built-in Minio implementation when you want it, but the library no longer requires Minio to define its architecture.
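The idea of storage as a protocol can be shown without the library at all. The sketch below uses `typing.Protocol` from the standard library; `TableStorage`, `InMemoryStorage`, and `export_names` are hypothetical names, and the method signatures are simplified stand-ins rather than the real `ExcelStorage` interface.

```python
from typing import Protocol


class TableStorage(Protocol):
    """Hypothetical storage contract: read rows in, upload output out."""

    def read_rows(self, name: str) -> list[list[str]]: ...

    def upload(self, name: str, content: bytes) -> str: ...


class InMemoryStorage:
    """Satisfies TableStorage structurally; useful as a test double."""

    def __init__(self) -> None:
        self.tables: dict[str, list[list[str]]] = {}
        self.uploads: dict[str, bytes] = {}

    def read_rows(self, name: str) -> list[list[str]]:
        return self.tables[name]

    def upload(self, name: str, content: bytes) -> str:
        self.uploads[name] = content
        return f"memory://{name}"


def export_names(storage: TableStorage, source: str, target: str) -> str:
    """Depends only on the protocol, so any backend can be swapped in."""
    header, *rows = storage.read_rows(source)
    column = header.index("Name")
    names = [row[column] for row in rows]
    return storage.upload(target, "\n".join(names).encode("utf-8"))


storage = InMemoryStorage()
storage.tables["people"] = [["Age", "Name"], ["31", "Ada"], ["45", "Bob"]]
url = export_names(storage, "people", "names.txt")
```

The calling code never names a concrete backend, which is what lets an object store, a local directory, or an in-memory double be used interchangeably.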
Why These Design Choices
Why no pandas?
ExcelAlchemy uses openpyxl plus an internal WorksheetTable abstraction.
The project was not using pandas for analysis, joins, or vectorized computation; it was mostly using it as a transport layer.
Removing pandas:
- simplified installation
- removed the numpy dependency chain
- made behavior more explicit
- better aligned the code with the actual problem domain
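When a DataFrame is only a transport layer, a small header-plus-rows container covers the same ground. The sketch below is an assumption about what such an abstraction might look like; `SimpleTable` is a hypothetical name and is not the library's `WorksheetTable`. Any iterable of rows works as input, for example the tuples yielded by openpyxl's `worksheet.iter_rows(values_only=True)`.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleTable:
    """Hypothetical header-plus-rows container; not the library's WorksheetTable."""
    header: tuple[str, ...]
    rows: tuple[tuple[object, ...], ...]

    @classmethod
    def from_rows(cls, raw: Iterable[Iterable[object]], *, skiprows: int = 0) -> "SimpleTable":
        """Build a table from raw rows, skipping leading title rows."""
        data = [tuple(row) for row in raw][skiprows:]
        if not data:
            raise ValueError("no header row found")
        header = tuple(str(cell) for cell in data[0])
        return cls(header=header, rows=tuple(data[1:]))

    def records(self) -> list[dict[str, object]]:
        """Return one dict per data row, keyed by header label."""
        return [dict(zip(self.header, row)) for row in self.rows]


raw = [["People import"], ["Age", "Name"], [31, "Ada"]]
table = SimpleTable.from_rows(raw, skiprows=1)
```

Nothing here needs vectorized computation, which is the point: the problem is moving labeled rows around, not analyzing them.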
Why a Pydantic adapter layer?
The project used to lean on Pydantic internals more directly. That becomes fragile during major-version upgrades. Now the design is:
- FieldMeta owns Excel metadata
- the Pydantic adapter reads model structure
- the adapter does not own the domain semantics
This is what made the Pydantic v2 migration practical without rewriting the public API.
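The split between metadata ownership and model introspection can be illustrated with the standard library. To stay dependency-free, this sketch swaps in `dataclasses` as a stand-in for Pydantic; `ExcelMeta`, `Person`, and `ModelAdapter` are hypothetical names, and the real adapter reads Pydantic's `model_fields` rather than `dataclasses.fields`.

```python
from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class ExcelMeta:
    """Hypothetical Excel-side metadata, owned by the domain layer."""
    label: str
    order: int


@dataclass
class Person:
    # Metadata is attached to each field, not derived from validation internals.
    name: str = field(metadata={"excel": ExcelMeta("Name", 2)})
    age: int = field(metadata={"excel": ExcelMeta("Age", 1)})


class ModelAdapter:
    """The only place that knows how the modelling framework exposes fields."""

    def __init__(self, model: type) -> None:
        self._model = model

    def excel_columns(self) -> list[tuple[str, ExcelMeta]]:
        """Return (field name, metadata) pairs sorted by declared column order."""
        pairs = [(f.name, f.metadata["excel"]) for f in fields(self._model)]
        return sorted(pairs, key=lambda pair: pair[1].order)


columns = ModelAdapter(Person).excel_columns()
labels = [meta.label for _, meta in columns]
```

If the framework changes how it exposes fields, only `excel_columns` needs to change; the metadata type and everything that consumes it stay as they are.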
Why a facade?
The public object should stay small.
The internal object graph can evolve.
ExcelAlchemy is the facade; parsing, rendering, execution, storage, and schema layout are delegated to separate collaborators.
Evolution
This repository intentionally records its evolution:
- src/ layout migration
- CI and release modernization
- Pydantic metadata decoupling
- Pydantic v2 migration
- Python 3.12-3.14 modernization
- internal architecture split
- pandas removal
- storage abstraction
- i18n foundation and locale-aware workbook text
These are not incidental refactors; they are the story of the codebase. See ABOUT.md for the migration rationale behind each step.
Pydantic v1 vs v2
The short version:
| Topic | v1-style risk | Current v2 design |
|---|---|---|
| Field access | Tight coupling to __fields__ / ModelField | Adapter over model_fields |
| Metadata ownership | Excel metadata mixed with validation internals | FieldMetaInfo owns Excel metadata |
| Validation integration | Deep reliance on internals | Adapter + explicit runtime validation |
| Upgrade path | Brittle | Layered |
More detail is documented in ABOUT.md.
Docs Map
- README.md: product + design overview
- README_cn.md: Chinese usage-oriented guide
- ABOUT.md: engineering rationale and evolution notes
- docs/architecture.md: component map and boundaries
Development
The project uses uv for local development and CI.
uv sync --extra development
uv run pre-commit install
uv run ruff check .
uv run pyright
uv run pytest --cov=excelalchemy --cov-report=term-missing:skip-covered tests
uv build
License
MIT. See LICENSE.