The world's first universal framework for standardized data digitization

airalogy

The world's first universal framework for data digitization and automation

Key Features

Airalogy lets you create fully custom protocols (Airalogy Protocols) for defining how data is collected, validated, and processed.

  • Airalogy Markdown (AIMD): Define rich, custom data fields directly in Markdown, including variables ({{var}}), procedural steps ({{step}}), checkpoints ({{check}}), and more.
  • Model-based Data Validation: Attach a model to every protocol for strict type checking. Supported types include datetime, enums, nested models, and lists, plus Airalogy-specific built-in types (UserName, CurrentTime, AiralogyMarkdown, file IDs, ...).
  • Assigner for Auto-Computation: Use the declarative @assigner decorator to compute field values automatically.

Requirements

Python ≥ 3.13

Installation

pip install airalogy

Release and PyPI publishing steps for maintainers are documented in RELEASING.md.

Quick Start

Use a single typed AIMD file

protocol.aimd

# Serum sample collection
Participant: {{var|subject_name: UserName, title="Participant name"}}
Collection time: {{var|collected_at: CurrentTime}}
Serum volume (mL): {{var|serum_volume: float, gt=0}}
Ice-bath time (min): {{var|ice_time: int = 0, ge=0}}
Sample photo: {{var|sample_photo: FileIdPNG, description="Upload collection photo"}}

{{step|collect}} Collect serum sample as per standard procedure.
{{step|verify_labels, 2}} Verify labels and IDs.
{{step|ice_hold, 2, duration="10m", timer="countdown"}} Immediately place sample on ice.

{{check|info_confirmed}} Confirm details and metadata.
  • Run airalogy check to validate the AIMD and use it directly.
  • Need an explicit model file? Run airalogy generate_model protocol.aimd -o model.py to auto-generate a Pydantic model that matches these types.
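
To make the field syntax above more concrete, here is a minimal sketch that lists the field names declared in an AIMD snippet using a regular expression. It is not Airalogy's parser: the helper `list_fields` and the pattern are hypothetical, and they only cover the {{var|...}}, {{step|...}}, and {{check|...}} forms shown in this example. For real validation, use airalogy check.

```python
import re

# Hypothetical helper for illustration; this is NOT Airalogy's parser.
# It only recognises the {{var|...}}, {{step|...}} and {{check|...}} forms
# used in the Quick Start example.
FIELD_PATTERN = re.compile(r"\{\{(var|step|check)\|\s*([A-Za-z_]\w*)")


def list_fields(aimd_text: str) -> dict[str, list[str]]:
    """Group declared field names by kind, in order of appearance."""
    fields: dict[str, list[str]] = {"var": [], "step": [], "check": []}
    for kind, name in FIELD_PATTERN.findall(aimd_text):
        fields[kind].append(name)
    return fields


SAMPLE = """\
Serum volume (mL): {{var|serum_volume: float, gt=0}}
Ice-bath time (min): {{var|ice_time: int = 0, ge=0}}
{{step|verify_labels, 2}} Verify labels and IDs.
{{check|info_confirmed}} Confirm details and metadata.
"""

print(list_fields(SAMPLE))
# {'var': ['serum_volume', 'ice_time'], 'step': ['verify_labels'], 'check': ['info_confirmed']}
```

The sketch ignores types, defaults, and constraints such as gt=0; it only shows that each field is introduced by its kind, a pipe, and an identifier.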

Extended: add model and assigner

protocol/
├─ protocol.aimd  # Airalogy Markdown
├─ model.py       # Optional: Define data validation model
└─ assigner.py    # Optional: Define auto-computation logic

protocol.aimd

# Reagent preparation
Solvent name: {{var|solvent_name}}
Target solution volume (L): {{var|target_solution_volume}}
Solute name: {{var|solute_name}}
Solute molar mass (g/mol): {{var|solute_molar_mass}}
Target molar concentration (mol/L): {{var|target_molar_concentration}}
Required solute mass (g): {{var|required_solute_mass}}

model.py

from pydantic import BaseModel, Field

class VarModel(BaseModel):
    solvent_name: str
    target_solution_volume: float = Field(gt=0)
    solute_name: str
    solute_molar_mass: float = Field(gt=0)
    target_molar_concentration: float = Field(gt=0)
    required_solute_mass: float = Field(gt=0)

assigner.py

from airalogy.assigner import AssignerResult, assigner


@assigner(
    assigned_fields=["required_solute_mass"],
    dependent_fields=[
        "target_solution_volume",
        "solute_molar_mass",
        "target_molar_concentration",
    ],
    mode="auto",
)
def calculate_required_solute_mass(dependent_fields: dict) -> AssignerResult:
    target_solution_volume = dependent_fields["target_solution_volume"]
    solute_molar_mass = dependent_fields["solute_molar_mass"]
    target_molar_concentration = dependent_fields["target_molar_concentration"]

    # mass (g) = volume (L) * concentration (mol/L) * molar mass (g/mol)
    required_solute_mass = (
        target_solution_volume * target_molar_concentration * solute_molar_mass
    )

    return AssignerResult(
        assigned_fields={
            "required_solute_mass": required_solute_mass,
        },
    )
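
As a quick sanity check of the formula used by the assigner above, the following standalone sketch runs the same arithmetic without importing Airalogy. The function name `required_solute_mass` and the input values (0.5 L of 0.1 mol/L NaCl, molar mass 58.44 g/mol) are illustrative assumptions, not part of the protocol.

```python
def required_solute_mass(
    volume_l: float,
    molar_mass_g_per_mol: float,
    concentration_mol_per_l: float,
) -> float:
    """mass (g) = volume (L) * concentration (mol/L) * molar mass (g/mol)."""
    return volume_l * concentration_mol_per_l * molar_mass_g_per_mol


# Illustrative inputs (assumed for this example): 0.5 L of 0.1 mol/L NaCl.
mass = required_solute_mass(
    volume_l=0.5,
    molar_mass_g_per_mol=58.44,
    concentration_mol_per_l=0.1,
)
print(f"{mass:.3f} g")  # prints "2.922 g"
```

The units cancel to grams, which matches the required_solute_mass field in the model.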

Command Line Interface

Airalogy provides a CLI tool for common operations. After installation, you can use the airalogy command:

$ airalogy --help
usage: airalogy [-h] [-v] {check,c,generate_model,gm,generate_assigner,ga,pack,unpack} ...

Airalogy CLI - Tools for Airalogy

positional arguments:
  {check,c,generate_model,gm,generate_assigner,ga,pack,unpack}
                        Available commands
    check (c)           Check AIMD syntax
    generate_model (gm)
                        Generate VarModel
    generate_assigner (ga)
                        Generate Assigner
    pack                Pack a protocol directory or record JSON files into a single-file archive
    unpack              Unpack an Airalogy archive

options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit

Syntax Checking

Check AIMD syntax:

# Check default protocol.aimd file
airalogy check

# Check specific AIMD file
airalogy check my_protocol.aimd

# Using alias
airalogy c my_protocol.aimd

Model Generation

Generate VarModel from AIMD file:

# Generate model.py from protocol.aimd
airalogy generate_model

# Generate with custom output file
airalogy generate_model my_protocol.aimd -o my_model.py

# Using alias
airalogy gm my_protocol.aimd -o custom_model.py

Assigner Extraction

Extract inline assigner code blocks into assigner.py:

# Generate assigner.py from protocol.aimd (and strip inline blocks)
airalogy generate_assigner

# Using alias
airalogy ga my_protocol.aimd -o assigner.py

Single-file Archives

Airalogy uses one unified archive suffix, .aira. The concrete payload type is stored in the internal manifest as kind, for example protocol or records.

Pack a protocol directory into a shareable .aira file:

airalogy pack ./my_protocol -o my_protocol.aira

Pack one or more record JSON files into a .aira file:

airalogy pack ./record.json ./record-history.json -o records.aira

If you want the record bundle to carry the related protocol definition as well, embed the protocol directory:

airalogy pack ./record.json -o record_bundle.aira --protocol-dir ./my_protocol

Unpack either archive type:

airalogy unpack ./my_protocol.aira -o ./extracted_protocol
airalogy unpack ./record_bundle.aira -o ./extracted_bundle

Notes:

  • Protocol archives preserve the original protocol directory layout, including files/.
  • Record archives accept JSON files containing either one record object or a list of record objects.
  • Both archive kinds use the same .aira suffix; inspect _airalogy_archive/manifest.json to determine whether the payload is a protocol archive or a record bundle.
  • Protocol packing excludes .env and common cache artifacts by default so local secrets are not bundled accidentally.
  • Record archives currently bundle JSON records and optional embedded protocol directories. They do not automatically dereference remote Airalogy file IDs into raw file bytes.
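
Since both archive kinds share the .aira suffix, a small script can read the manifest after unpacking to tell them apart. The sketch below is a rough illustration under two assumptions that are not confirmed here: that airalogy unpack places the manifest at _airalogy_archive/manifest.json inside the output directory, and that the payload type is stored under a top-level "kind" key. The helper name `archive_kind` is hypothetical.

```python
import json
from pathlib import Path


def archive_kind(unpacked_dir: str) -> str:
    """Return the `kind` recorded in an unpacked archive's manifest."""
    # Assumption: the manifest sits at this path inside the unpack output
    # directory and stores the payload type under a top-level "kind" key.
    manifest_path = Path(unpacked_dir) / "_airalogy_archive" / "manifest.json"
    with manifest_path.open(encoding="utf-8") as f:
        manifest = json.load(f)
    return manifest["kind"]
```

For example, after airalogy unpack ./record_bundle.aira -o ./extracted_bundle, calling archive_kind("./extracted_bundle") would return the stored kind if the layout matches these assumptions.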

Document Conversion (MarkItDown)

Airalogy provides a unified API to convert documents into Markdown.

pip install "airalogy[markitdown]"
# or (uv)
uv add "airalogy[markitdown]"

from airalogy.convert import to_markdown

print(to_markdown("report.pdf", backend="markitdown").text)

See the docs: docs/airalogy/en/apis/convert.md (English) and docs/airalogy/zh/apis/convert.md (Chinese).

Development Setup

We use uv for environment management and builds, and ruff for linting and formatting.

Set up the project environment:

uv sync

Install all optional backends (extras) as well:

uv sync --all-extras

Or install a specific extra (example: markitdown):

uv sync --extra markitdown

Testing

uv run pytest

License

Apache 2.0

Cite This Framework

@misc{yang2025airalogyaiempowereduniversaldata,
      title={Airalogy: AI-empowered universal data digitization for research automation}, 
      author={Zijie Yang and Qiji Zhou and Fang Guo and Sijie Zhang and Yexun Xi and Jinglei Nie and Yudian Zhu and Liping Huang and Chou Wu and Yonghe Xia and Xiaoyu Ma and Yingming Pu and Panzhong Lu and Junshu Pan and Mingtao Chen and Tiannan Guo and Yanmei Dou and Hongyu Chen and Anping Zeng and Jiaxing Huang and Tian Xu and Yue Zhang},
      year={2025},
      eprint={2506.18586},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2506.18586}, 
}
