The world's first universal framework for standardized data digitization
airalogy
The world's first universal framework for data digitization and automation
Key Features
Airalogy lets you create fully custom protocols (Airalogy Protocols) for defining how data is collected, validated, and processed.
| Area | Highlights |
|---|---|
| Airalogy Markdown (AIMD) | Define rich, custom data fields directly in Markdown: variables ({{var}}), procedural steps ({{step}}), checkpoints ({{check}}), and more. |
| Model-based Data Validation | Attach a model to every protocol for strict type checking. Supported types include datetime, enums, nested models, and lists, plus Airalogy-specific built-in types (UserName, CurrentTime, AiralogyMarkdown, file IDs, ...). |
| Assigner for Auto-Computation | Use the declarative @assigner decorator to compute field values automatically. |
Requirements
Python ≥ 3.13
Installation
pip install airalogy
Release and PyPI publishing steps for maintainers are documented in RELEASING.md.
Quick Start
Use one typed AIMD
protocol.aimd
# Serum sample collection
Participant: {{var|subject_name: UserName, title="Participant name"}}
Collection time: {{var|collected_at: CurrentTime}}
Serum volume (mL): {{var|serum_volume: float, gt=0}}
Ice-bath time (min): {{var|ice_time: int = 0, ge=0}}
Sample photo: {{var|sample_photo: FileIdPNG, description="Upload collection photo"}}
{{step|collect}} Collect serum sample as per standard procedure.
{{step|verify_labels, 2}} Verify labels and IDs.
{{step|ice_hold, 2, duration="10m", timer="countdown"}} Immediately place sample on ice.
{{check|info_confirmed}} Confirm details and metadata.
- Run airalogy check to validate the AIMD and use it directly.
- Need an explicit model file? airalogy generate_model protocol.aimd -o model.py auto-generates the Pydantic model that matches these types.
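To make the typed {{var|...}} syntax above more concrete, here is a minimal sketch of how such fields could be pulled out of AIMD text with the standard library. It is not Airalogy's parser: the regular expression, the function names (split_top_level, parse_var, extract_vars), and the returned dict layout are all assumptions made for illustration, and it only handles the field shapes shown in the example above.

```python
import re

# Hypothetical pattern: captures everything between "{{var|" and "}}".
VAR_PATTERN = re.compile(r"\{\{var\|(.*?)\}\}")


def split_top_level(text: str, sep: str = ",") -> list[str]:
    """Split on `sep`, ignoring separators inside double-quoted strings."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_var(body: str) -> dict:
    """Parse 'name: type = default, key=value, ...' into a plain dict."""
    head, *options = split_top_level(body)
    name, _, rest = head.partition(":")
    type_name, _, default = rest.partition("=")
    field = {
        "name": name.strip(),
        "type": type_name.strip() or None,    # None when no type is given
        "default": default.strip() or None,   # kept as raw text, not converted
        "options": {},
    }
    for option in options:
        key, _, value = option.partition("=")
        field["options"][key.strip()] = value.strip().strip('"')
    return field


def extract_vars(aimd_text: str) -> list[dict]:
    """Return one dict per {{var|...}} template found in the text."""
    return [parse_var(m.group(1)) for m in VAR_PATTERN.finditer(aimd_text)]


if __name__ == "__main__":
    sample = 'Ice-bath time (min): {{var|ice_time: int = 0, ge=0}}'
    for field in extract_vars(sample):
        print(field["name"], field["type"], field["default"], field["options"])
```

Defaults and constraint values stay as raw strings here; converting them to typed values is left to the real tooling (airalogy check and airalogy generate_model).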
Extended: add model and assigner
protocol/
├─ protocol.aimd # Airalogy Markdown
├─ model.py # Optional: Define data validation model
└─ assigner.py # Optional: Define auto-computation logic
protocol.aimd
# Reagent preparation
Solvent name: {{var|solvent_name}}
Target solution volume (L): {{var|target_solution_volume}}
Solute name: {{var|solute_name}}
Solute molar mass (g/mol): {{var|solute_molar_mass}}
Target molar concentration (mol/L): {{var|target_molar_concentration}}
Required solute mass (g): {{var|required_solute_mass}}
model.py
from pydantic import BaseModel, Field


class VarModel(BaseModel):
    solvent_name: str
    target_solution_volume: float = Field(gt=0)
    solute_name: str
    solute_molar_mass: float = Field(gt=0)
    target_molar_concentration: float = Field(gt=0)
    required_solute_mass: float = Field(gt=0)
assigner.py
from airalogy.assigner import AssignerResult, assigner


@assigner(
    assigned_fields=["required_solute_mass"],
    dependent_fields=[
        "target_solution_volume",
        "solute_molar_mass",
        "target_molar_concentration",
    ],
    mode="auto",
)
def calculate_required_solute_mass(dependent_fields: dict) -> AssignerResult:
    target_solution_volume = dependent_fields["target_solution_volume"]
    solute_molar_mass = dependent_fields["solute_molar_mass"]
    target_molar_concentration = dependent_fields["target_molar_concentration"]

    required_solute_mass = (
        target_solution_volume * target_molar_concentration * solute_molar_mass
    )

    return AssignerResult(
        assigned_fields={
            "required_solute_mass": required_solute_mass,
        },
    )
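As a quick sanity check of the formula used by the assigner (mass = volume × concentration × molar mass), the sketch below mirrors the function body in plain Python without importing Airalogy. The plain dict return value and the sample numbers (0.5 L of 0.1 mol/L NaCl, molar mass 58.44 g/mol) are illustrative assumptions, not part of the protocol above.

```python
def calculate_required_solute_mass(dependent_fields: dict) -> dict:
    """Plain-Python mirror of the assigner body: m = V * c * M."""
    volume_l = dependent_fields["target_solution_volume"]           # L
    concentration = dependent_fields["target_molar_concentration"]  # mol/L
    molar_mass = dependent_fields["solute_molar_mass"]              # g/mol
    return {"required_solute_mass": volume_l * concentration * molar_mass}  # g


# Illustrative inputs: 0.5 L of a 0.1 mol/L NaCl solution.
example = {
    "target_solution_volume": 0.5,
    "target_molar_concentration": 0.1,
    "solute_molar_mass": 58.44,
}
result = calculate_required_solute_mass(example)
print(f"{result['required_solute_mass']:.3f} g")  # prints "2.922 g"
```

Testing the arithmetic separately like this keeps the decorated assigner thin: the decorator only declares which fields feed which result.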
Command Line Interface
Airalogy provides a CLI tool for common operations. After installation, you can use the airalogy command:
$ airalogy --help
usage: airalogy [-h] [-v] {check,c,generate_model,gm,generate_assigner,ga,pack,unpack} ...

Airalogy CLI - Tools for Airalogy

positional arguments:
  {check,c,generate_model,gm,generate_assigner,ga,pack,unpack}
                        Available commands
    check (c)           Check AIMD syntax
    generate_model (gm)
                        Generate VarModel
    generate_assigner (ga)
                        Generate Assigner
    pack                Pack a protocol directory or record JSON files into a single-file archive
    unpack              Unpack an Airalogy archive

options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
Syntax Checking
Check AIMD syntax:
# Check default protocol.aimd file
airalogy check
# Check specific AIMD file
airalogy check my_protocol.aimd
# Using alias
airalogy c my_protocol.aimd
Model Generation
Generate VarModel from AIMD file:
# Generate model.py from protocol.aimd
airalogy generate_model
# Generate with custom output file
airalogy generate_model my_protocol.aimd -o my_model.py
# Using alias
airalogy gm my_protocol.aimd -o custom_model.py
Assigner Extraction
Extract inline assigner code blocks into assigner.py:
# Generate assigner.py from protocol.aimd (and strip inline blocks)
airalogy generate_assigner
# Using alias
airalogy ga my_protocol.aimd -o assigner.py
Single-file Archives
Airalogy uses one unified archive suffix, .aira. The concrete payload type is stored in the internal manifest as kind, for example protocol or records.
Pack a protocol directory into a shareable .aira file:
airalogy pack ./my_protocol -o my_protocol.aira
Pack one or more record JSON files into a .aira file:
airalogy pack ./record.json ./record-history.json -o records.aira
If you want the record bundle to carry the related protocol definition as well, embed the protocol directory:
airalogy pack ./record.json -o record_bundle.aira --protocol-dir ./my_protocol
Unpack either archive type:
airalogy unpack ./my_protocol.aira -o ./extracted_protocol
airalogy unpack ./record_bundle.aira -o ./extracted_bundle
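If packing needs to happen from a script rather than a shell, the commands above can be wrapped with subprocess. This is a sketch under stated assumptions: the helper names build_pack_command and pack are hypothetical, and only the flags shown above (-o and --protocol-dir) are used.

```python
import shutil
import subprocess
from pathlib import Path


def build_pack_command(inputs, output, protocol_dir=None):
    """Build the argv list for `airalogy pack` using only the flags shown above."""
    command = ["airalogy", "pack", *map(str, inputs), "-o", str(output)]
    if protocol_dir is not None:
        # Embed the related protocol definition alongside the records.
        command += ["--protocol-dir", str(protocol_dir)]
    return command


def pack(inputs, output, protocol_dir=None):
    """Run `airalogy pack`; raises if the CLI is missing or exits non-zero."""
    if shutil.which("airalogy") is None:
        raise RuntimeError("airalogy CLI not found on PATH")
    subprocess.run(build_pack_command(inputs, output, protocol_dir), check=True)
    return Path(output)
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything is run.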
Notes:
- Protocol archives preserve the original protocol directory layout, including files/.
- Record archives accept JSON files containing either one record object or a list of record objects.
- Both archive kinds use the same .aira suffix; inspect _airalogy_archive/manifest.json to determine whether the payload is a protocol archive or a record bundle.
- Protocol packing excludes .env and common cache artifacts by default so local secrets are not bundled accidentally.
- Record archives currently bundle JSON records and optional embedded protocol directories. They do not automatically dereference remote Airalogy file IDs into raw file bytes.
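The manifest check mentioned in the notes could look roughly like the sketch below. It rests on an assumption this README does not state: that a .aira file is a ZIP container. If the actual container format differs, the function raises instead of guessing. The function name read_archive_kind is hypothetical; the manifest path and the kind key come from the text above.

```python
import json
import zipfile

# Manifest location named in the notes above.
MANIFEST_PATH = "_airalogy_archive/manifest.json"


def read_archive_kind(archive_path) -> str:
    """Return the manifest's `kind` value, ASSUMING a ZIP-based .aira file."""
    if not zipfile.is_zipfile(archive_path):
        raise ValueError("not a ZIP file; the ZIP assumption does not hold here")
    with zipfile.ZipFile(archive_path) as archive:
        manifest = json.loads(archive.read(MANIFEST_PATH))
    return manifest["kind"]  # e.g. "protocol" or "records"
```

When in doubt, unpacking with airalogy unpack and reading the manifest from the extracted directory avoids relying on the container format at all.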
Document Conversion (MarkItDown)
Airalogy provides a unified API to convert documents into Markdown.
pip install "airalogy[markitdown]"
# or (uv)
uv add "airalogy[markitdown]"
from airalogy.convert import to_markdown
print(to_markdown("report.pdf", backend="markitdown").text)
See docs: docs/airalogy/en/apis/convert.md / docs/airalogy/zh/apis/convert.md.
Development Setup
We use uv for environment management and builds, and ruff for linting and formatting.
Set up the project environment:
uv sync
Install all optional backends (extras) as well:
uv sync --all-extras
Or install a specific extra (example: markitdown):
uv sync --extra markitdown
Testing
uv run pytest
License
Apache 2.0
Cite This Framework
@misc{yang2025airalogyaiempowereduniversaldata,
title={Airalogy: AI-empowered universal data digitization for research automation},
author={Zijie Yang and Qiji Zhou and Fang Guo and Sijie Zhang and Yexun Xi and Jinglei Nie and Yudian Zhu and Liping Huang and Chou Wu and Yonghe Xia and Xiaoyu Ma and Yingming Pu and Panzhong Lu and Junshu Pan and Mingtao Chen and Tiannan Guo and Yanmei Dou and Hongyu Chen and Anping Zeng and Jiaxing Huang and Tian Xu and Yue Zhang},
year={2025},
eprint={2506.18586},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2506.18586},
}