Maya-inspired numerical encodings for machine learning: Vigesimal Feature Decomposition (VFD) and Maya Calendar Encoding (MCE)

maya-encoding

Python 3.9+ · License: MIT

Maya-inspired numerical encodings for machine learning.

maya-encoding provides two scikit-learn-compatible transformers that use the mathematical structure of the ancient Maya number system and calendar to build richer feature representations.

Overview

| Encoder | Input | What it does | Use case |
| --- | --- | --- | --- |
| VFDEncoder | Numeric features | Decomposes into base-20 digits, bars (÷5), dots (%5) | Multi-scale numeric patterns |
| MayaCalendarEncoder | Dates | Extracts Tzolk'in (260d), Haab' (365d), Long Count cycles | Temporal feature engineering |

Installation

pip install maya-encoding

With optional dependencies:

pip install maya-encoding[viz]         # matplotlib visualization
pip install maya-encoding[benchmarks]  # xgboost, seaborn for benchmarks
pip install maya-encoding[dev]         # development tools (ruff, pytest)

Quick Start

VFD: Numeric Feature Encoding

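import numpy as np
from maya_encoding import VFDEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor

# Placeholder training data (illustrative only): non-negative integer features
rng = np.random.default_rng(0)
X_train = rng.integers(0, 1000, size=(200, 3))
y_train = X_train.sum(axis=1)

# VFD decomposes numbers into vigesimal digits, bars, and dots
encoder = VFDEncoder(components='full')

# Works in sklearn pipelines like any other transformer
pipe = Pipeline([
    ('encode', encoder),
    ('model', RandomForestRegressor())
])
pipe.fit(X_train, y_train)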

How it works — the number 347 becomes:

347 = 17×20 + 7

Level 0 (ones):     digit=7,  bars=1, dots=2
Level 1 (twenties): digit=17, bars=3, dots=2

Feature vector: [7, 1, 2, 17, 3, 2]  →  normalized: [0.37, 0.33, 0.50, 0.89, 1.00, 0.50]

Three "zoom levels" per number: coarse magnitude (digits), medium grouping (bars), and fine residual (dots).
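
The arithmetic above can be reproduced in a few lines of plain Python. This is a minimal sketch, not the library's implementation: the helper names `vfd_features` and `vfd_normalize` are hypothetical, and the normalization constants (19 for digits, 3 for bars, 4 for dots) are inferred from the worked example rather than taken from the source code.

```python
def vfd_features(n, n_levels):
    """Return [digit, bars, dots] per vigesimal level, least significant first."""
    feats = []
    for _ in range(n_levels):
        n, digit = divmod(n, 20)       # peel off one base-20 digit
        bars, dots = divmod(digit, 5)  # bars = digit // 5, dots = digit % 5
        feats.extend([digit, bars, dots])
    return feats


def vfd_normalize(feats):
    """Scale each component by its largest possible value (assumed: 19, 3, 4)."""
    maxima = (19, 3, 4)
    return [round(v / maxima[i % 3], 2) for i, v in enumerate(feats)]


raw = vfd_features(347, n_levels=2)
print(raw)                 # [7, 1, 2, 17, 3, 2]
print(vfd_normalize(raw))  # [0.37, 0.33, 0.5, 0.89, 1.0, 0.5]
```

Dividing by the per-component maximum keeps every feature in [0, 1], which matches the normalized vector shown in the example.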

Passthrough Mode: Best of Both Worlds

Use passthrough=True to keep original features alongside VFD features — ideal for tree-based models:

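from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from maya_encoding import VFDEncoder

# Original features + VFD features combined
pipe = Pipeline([
    ('encode', VFDEncoder(passthrough=True)),
    ('model', GradientBoostingRegressor())
])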

MCE: Temporal Feature Encoding

import numpy as np
from maya_encoding import MayaCalendarEncoder

# Encode dates using Maya calendar cycles
encoder = MayaCalendarEncoder(
    components=['tzolkin', 'haab', 'long_count'],
    cyclical=True,  # sine/cosine for smooth cycle boundaries
)

dates = np.array(["2024-01-01", "2024-06-15", "2024-12-21"])
features = encoder.fit_transform(dates)

The Maya calendar provides interlocking cycles with periods of 13, 20, 260, 365, and 360 days. The coprime 13- and 20-day counts combine into the 260-day Tzolk'in, so the encoder covers several time scales at once, where standard cyclical encoding requires choosing each period by hand.
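
Which of these periods are actually coprime, and how long it takes two cycles to realign, can be checked with the standard library. The sketch below uses only `math.gcd` and `math.lcm` (Python 3.9+); nothing in it comes from maya-encoding itself, and the constant names are illustrative.

```python
from math import gcd, lcm

DAY_NUMBERS, DAY_NAMES = 13, 20      # the two counts that make up the Tzolk'in
TZOLKIN, HAAB, TUN = 260, 365, 360   # cycle lengths in days

# 13 and 20 share no factor, so the pair repeats only after 13 * 20 days.
print(gcd(DAY_NUMBERS, DAY_NAMES), lcm(DAY_NUMBERS, DAY_NAMES))  # 1 260

# Tzolk'in and Haab' share a factor of 5; together they repeat every
# 18,980 days, which is exactly 52 Haab' years.
print(gcd(TZOLKIN, HAAB), lcm(TZOLKIN, HAAB))  # 5 18980

# The 360-day tun is a multiple of 20, so it shares that factor with the Tzolk'in.
print(gcd(TZOLKIN, TUN))  # 20
```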

Explore Maya Numbers

from maya_encoding import maya_decompose, to_vigesimal, to_bars_dots

# Convert to vigesimal
digits = to_vigesimal(347)  # [7, 17] (LSB first)

# Full decomposition
info = maya_decompose(347)
# {'digits': [7, 17], 'bars': [1, 3], 'dots': [2, 2], 'n_levels': 2}

# Visualize
from maya_encoding.visualization.glyphs import render_maya_text
print(render_maya_text(347))

Explore Maya Calendar

from maya_encoding.core.calendar import (
    gregorian_to_jdn, jdn_to_tzolkin, jdn_to_haab, jdn_to_long_count
)

# December 21, 2012 — end of the 13th b'ak'tun
jdn = gregorian_to_jdn("2012-12-21")
print(jdn_to_tzolkin(jdn))     # (4, 19) → 4 Ajaw
print(jdn_to_haab(jdn))        # (13, 3) → 3 K'ank'in (month 13, day 3)
print(jdn_to_long_count(jdn))  # (13, 0, 0, 0, 0) → 13.0.0.0.0
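
The conversions above rest on modular arithmetic over the Julian Day Number. The sketch below re-derives the three printed values with the standard library only. It assumes the GMT correlation constant 584283 and the conventional epoch date 4 Ajaw 8 Kumk'u; the function names are hypothetical stand-ins, not the package's own implementation.

```python
from datetime import date

GMT_EPOCH_JDN = 584283  # assumed GMT correlation: JDN of Long Count 0.0.0.0.0


def iso_to_jdn(iso):
    """Julian Day Number of a proleptic Gregorian ISO date (year 1 or later)."""
    return date.fromisoformat(iso).toordinal() + 1721425


def long_count_of(jdn):
    """(b'ak'tun, k'atun, tun, uinal, k'in) for a JDN."""
    days = jdn - GMT_EPOCH_JDN
    baktun, days = divmod(days, 144000)
    katun, days = divmod(days, 7200)
    tun, days = divmod(days, 360)
    uinal, kin = divmod(days, 20)
    return baktun, katun, tun, uinal, kin


def tzolkin_of(jdn):
    """(number 1-13, zero-based day-name index); the epoch fell on 4 Ajaw."""
    days = jdn - GMT_EPOCH_JDN
    return (days + 3) % 13 + 1, (days + 19) % 20


def haab_of(jdn):
    """(zero-based month index, day of month); the epoch fell on 8 Kumk'u."""
    days = jdn - GMT_EPOCH_JDN
    return divmod((days + 348) % 365, 20)  # 348 = 17 * 20 + 8


jdn = iso_to_jdn("2012-12-21")
print(jdn)                 # 2456283
print(tzolkin_of(jdn))     # (4, 19)
print(haab_of(jdn))        # (13, 3)
print(long_count_of(jdn))  # (13, 0, 0, 0, 0)
```

Month index 18 in this sketch corresponds to the 5-day Wayeb' period, since days 360 to 364 of the Haab' fall there.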

Results at a Glance

VFD — California Housing Regression (R², 5-fold CV)

| Encoding | Linear Regression | Ridge | Random Forest | Gradient Boosting |
| --- | --- | --- | --- | --- |
| Raw + Scaled | 0.5530 | 0.5530 | 0.6561 | 0.6852 |
| VFD-lite | 0.5832 | 0.5812 | 0.5445 | 0.5742 |
| VFD-full | 0.5742 | 0.5723 | 0.5891 | 0.6184 |
| VFD-lite + passthrough | 0.5985 | 0.5968 | 0.6588 | 0.6899 |
| VFD-full + passthrough | 0.5908 | 0.5881 | 0.6615 | 0.6937 |

MCE — Temporal Cycle Detection (R², synthetic data)

| Configuration | Train R² | Test R² |
| --- | --- | --- |
| All components + cyclical | 0.9875 | 0.9146 |
| Tzolk'in only | 0.3656 | 0.0707 |
| Haab' only | 0.6212 | 0.5891 |

Fraud Detection (F1, 5-fold stratified CV)

| Pipeline | Logistic Regression | Random Forest | Gradient Boosting |
| --- | --- | --- | --- |
| Baseline (PCA) | 0.7082 | 0.8961 | 0.8729 |
| VFD (replace amount) | 0.6876 | 0.8971 | 0.8816 |
| VFD + passthrough | 0.6903 | 0.8993 | 0.8816 |

Rule of thumb: Linear models → use VFD directly. Tree-based models → always use passthrough=True.

When to Use Maya Encoding

| Encoder | Strong Fit | Acceptable Fit |
| --- | --- | --- |
| VFDEncoder | Discrete/count data (retail, events, scores), linear models | Continuous features with passthrough=True for tree models |
| MayaCalendarEncoder | Tropical/biological time series (agriculture, epidemiology, climate) | General time series with unexplained seasonal variance |

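VFD decomposes numbers into a natural hierarchy: digits (×20), bars (×5), dots (×1). The decomposition is lossless for integers, so the model sees the same information at several scales. In the California Housing benchmark above, linear models gain roughly 0.02 to 0.05 R² (0.5530 → 0.5723 to 0.5985), while tree-based models need passthrough=True to match or slightly exceed the raw baseline (Gradient Boosting: 0.6852 → 0.6937).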
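
The hierarchy can be sanity-checked by inverting it: each level contributes (5 × bars + dots) × 20^level. The sketch below is illustrative only; `from_bars_dots` is a hypothetical helper, not part of the package's documented API.

```python
def from_bars_dots(bars, dots):
    """Rebuild an integer from per-level bars and dots (least significant first)."""
    return sum((5 * b + d) * 20**level
               for level, (b, d) in enumerate(zip(bars, dots)))


# Values from the maya_decompose(347) example: bars=[1, 3], dots=[2, 2]
print(from_bars_dots([1, 3], [2, 2]))  # 347
```

Because the mapping can be inverted this way, bars and dots together carry exactly the information in the base-20 digits.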

MCE provides interlocking cycles with periods of 13, 20, 260, and 365 days that capture patterns Gregorian features can miss. Only 13 and 20 are coprime; 260 and 365 share a factor of 5 and realign every 18,980 days. The 260-day Tzolk'in is often linked to human gestation, maize growing cycles, and tropical astronomical events.

Full guide: When to Use Maya Encoding

API Reference

VFDEncoder

| Parameter | Default | Description |
| --- | --- | --- |
| n_levels | 'auto' | Vigesimal levels (auto-detected from data) |
| components | 'full' | 'full', 'lite' (digits only), 'bars_dots' |
| normalize | True | Normalize features to [0, 1] |
| handle_negative | 'abs_sign' | 'abs_sign', 'shift', 'error' |
| handle_float | 'scale' | 'scale', 'round', 'integer_part' |
| passthrough | False | Keep original features alongside VFD output |
| scale_factor | 'auto' | Decimal precision auto-detection |

MayaCalendarEncoder

| Parameter | Default | Description |
| --- | --- | --- |
| components | ['tzolkin', 'haab', 'long_count'] | Calendar systems to use |
| tzolkin_encoding | 'separate' | 'separate' (number + name) or 'combined' (position 0-259) |
| haab_encoding | 'hierarchical' | 'hierarchical' (with bars/dots) or 'flat' (day 0-364) |
| long_count_levels | 3 | 1–5: k'in, uinal, tun, k'atun, b'ak'tun |
| cyclical | True | Add sine/cosine pairs for smooth cycle boundaries |
| epoch | 'gmt' | 'gmt' (standard), 'spinden', or custom JDN |
| wayeb_flag | True | Binary flag for the 5-day Wayeb' period |
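
The `cyclical` option refers to the usual sine/cosine trick for periodic features. A minimal sketch of that idea follows, assuming a position counted from 0 within a cycle of known period; `cyclical_pair` is a hypothetical helper, and the encoder's actual output layout may differ.

```python
import math


def cyclical_pair(position, period):
    """Map a position in [0, period) to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * position / period
    return math.sin(angle), math.cos(angle)


# Day 259 and day 0 of the 260-day Tzolk'in are adjacent in time. Their
# encodings are close together, unlike the raw integers 259 and 0.
first = cyclical_pair(0, 260)
last = cyclical_pair(259, 260)
middle = cyclical_pair(130, 260)

print(math.dist(first, last) < 0.03)      # True: neighbours across the boundary
print(round(math.dist(first, middle), 6))  # 2.0: opposite sides of the cycle
```

Two columns per cycle are needed because a single sine value is ambiguous: it takes the same value at two different positions within one period.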

Examples

See the examples/ directory.

Development

git clone https://github.com/DanielRegaladoUMiami/maya-encoding.git
cd maya-encoding
pip install -e ".[dev]"
pytest          # Run 124 tests
ruff check .    # Lint

Run benchmarks:

pip install -e ".[benchmarks]"
python benchmarks/run_vfd_benchmarks.py
python benchmarks/run_mce_benchmarks.py

Citation

If you use maya-encoding in your research, please cite:

@software{regalado2026maya,
  author = {Regalado, Daniel},
  title = {maya-encoding: Maya-Inspired Numerical Encodings for Machine Learning},
  year = {2026},
  url = {https://github.com/DanielRegaladoUMiami/maya-encoding}
}

License

MIT License. See LICENSE for details.
