
Python bindings for fiasto - A language-agnostic modern Wilkinson's formula parser and lexer


fiasto-py



Pronounced like fiasco, but with a t instead of a c


(F)ormulas (I)n (AST) (O)ut

Python bindings for fiasto - A language-agnostic modern Wilkinson's formula parser and lexer.

🎯 Features

  • Parse Wilkinson's Formulas: Convert formula strings into structured JSON metadata
  • Tokenize Formulas: Break down formulas into individual tokens with detailed information
  • Python Dictionaries: Returns native Python dictionaries for easy integration

🎯 Simple API

  • parse_formula() - Takes a Wilkinson’s formula string and returns a Python dictionary
  • lex_formula() - Tokenizes a formula string and returns a list of token dictionaries

🚀 Quick Start

Installation

Install from PyPI (recommended):

pip install fiasto-py

Usage

Usage: Parse Formula

import fiasto_py
from pprint import pprint
# Parse a formula into structured metadata
print("="*30)
print("Parse Formula")
print("="*30)
result = fiasto_py.parse_formula("y ~ x1 + x2 + (1|group)")
pprint(result, compact=True)

Output:

==============================
Parse Formula
==============================
{'all_generated_columns': ['y', 'x1', 'x2', 'group'],
 'columns': {'group': {'generated_columns': ['group'],
                       'id': 4,
                       'interactions': [],
                       'random_effects': [{'correlated': True,
                                           'grouping_variable': 'group',
                                           'has_intercept': True,
                                           'includes_interactions': [],
                                           'kind': 'grouping',
                                           'variables': []}],
                       'roles': ['GroupingVariable'],
                       'transformations': []},
             'x1': {'generated_columns': ['x1'],
                    'id': 2,
                    'interactions': [],
                    'random_effects': [],
                    'roles': ['FixedEffect'],
                    'transformations': []},
             'x2': {'generated_columns': ['x2'],
                    'id': 3,
                    'interactions': [],
                    'random_effects': [],
                    'roles': ['FixedEffect'],
                    'transformations': []},
             'y': {'generated_columns': ['y'],
                   'id': 1,
                   'interactions': [],
                   'random_effects': [],
                   'roles': ['Response'],
                   'transformations': []}},
 'formula': 'y ~ x1 + x2 + (1|group)',
 'metadata': {'family': None,
              'has_intercept': True,
              'has_uncorrelated_slopes_and_intercepts': False,
              'is_random_effects_model': True}}
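
The nested dictionary above can be regrouped by role when downstream code needs "all fixed effects" or "all grouping variables" at once. The following is a minimal sketch, not part of fiasto-py: columns_by_role is a hypothetical helper, and parsed is a trimmed, hand-copied version of the output shown above so the example runs without the package installed. It assumes every column entry carries an 'id' and a 'roles' list, as in that output.

```python
from collections import defaultdict

# Trimmed copy of the parse_formula() output shown above (assumed shape).
parsed = {
    "columns": {
        "group": {"id": 4, "roles": ["GroupingVariable"]},
        "x1": {"id": 2, "roles": ["FixedEffect"]},
        "x2": {"id": 3, "roles": ["FixedEffect"]},
        "y": {"id": 1, "roles": ["Response"]},
    },
    "metadata": {"has_intercept": True, "is_random_effects_model": True},
}


def columns_by_role(parsed):
    """Map each role name to its columns, ordered by the parser-assigned id.

    Hypothetical helper for illustration; not a fiasto-py function.
    """
    by_role = defaultdict(list)
    ordered = sorted(parsed["columns"].items(), key=lambda item: item[1]["id"])
    for name, details in ordered:
        for role in details["roles"]:
            by_role[role].append(name)
    return dict(by_role)


roles = columns_by_role(parsed)
print(roles)
```

With the real library, the same helper should accept the dictionary returned by fiasto_py.parse_formula() directly, provided its shape matches the output above.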

Usage: Lex Formula

import fiasto_py
from pprint import pprint
print("="*30)
print("Lex Formula")
print("="*30)
tokens = fiasto_py.lex_formula("y ~ x1 + x2 + (1|group)")
pprint(tokens, compact=True)

Output:

==============================
Lex Formula
==============================
[{'lexeme': 'y', 'token': 'ColumnName'},
 {'lexeme': '~', 'token': 'Tilde'},
 {'lexeme': 'x1', 'token': 'ColumnName'},
 {'lexeme': '+', 'token': 'Plus'},
 {'lexeme': 'x2', 'token': 'ColumnName'},
 {'lexeme': '+', 'token': 'Plus'},
 {'lexeme': '(', 'token': 'FunctionStart'},
 {'lexeme': '1', 'token': 'One'},
 {'lexeme': '|', 'token': 'Pipe'},
 {'lexeme': 'group', 'token': 'ColumnName'},
 {'lexeme': ')', 'token': 'FunctionEnd'}]
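
Because each token is a small dictionary with 'lexeme' and 'token' keys, simple list operations are enough to inspect a formula. The sketch below uses a hand-copied version of the token list above, so it runs without fiasto-py; column_names and rebuild are hypothetical helpers introduced only for illustration.

```python
# Token list copied from the lex_formula() output shown above.
tokens = [
    {"lexeme": "y", "token": "ColumnName"},
    {"lexeme": "~", "token": "Tilde"},
    {"lexeme": "x1", "token": "ColumnName"},
    {"lexeme": "+", "token": "Plus"},
    {"lexeme": "x2", "token": "ColumnName"},
    {"lexeme": "+", "token": "Plus"},
    {"lexeme": "(", "token": "FunctionStart"},
    {"lexeme": "1", "token": "One"},
    {"lexeme": "|", "token": "Pipe"},
    {"lexeme": "group", "token": "ColumnName"},
    {"lexeme": ")", "token": "FunctionEnd"},
]


def column_names(tokens):
    """Return distinct column names in order of first appearance."""
    seen = []
    for tok in tokens:
        if tok["token"] == "ColumnName" and tok["lexeme"] not in seen:
            seen.append(tok["lexeme"])
    return seen


def rebuild(tokens):
    """Join the lexemes back into a whitespace-free formula string."""
    return "".join(tok["lexeme"] for tok in tokens)


names = column_names(tokens)
compact_formula = rebuild(tokens)
print(names)
print(compact_formula)
```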

Simple OLS Regression

import fiasto_py
import polars as pl
import numpy as np
from pprint import pprint

# Load data
mtcars_path = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
df = pl.read_csv(mtcars_path)

# Parse formula
formula = "mpg ~ wt + cyl"
result = fiasto_py.parse_formula(formula)

pprint(result)

# Find the response column(s)
response_cols = [
    col for col, details in result["columns"].items()
    if "Response" in details["roles"]
]

# Find non-response columns
preds = [
    col for col, details in result["columns"].items()
    if "Response" not in details["roles"]
]

# Has intercept
has_intercept = result["metadata"]["has_intercept"]

# Prepare data matrices
X = df.select(preds).to_numpy()
y = df.select(response_cols).to_numpy().ravel()

# Add intercept if metadata says so
if has_intercept:
    X_with_intercept = np.column_stack([np.ones(X.shape[0]), X])
else:
    X_with_intercept = X

# Solve the normal equations (X'X) b = X'y for b, without forming the inverse
XTX = X_with_intercept.T @ X_with_intercept
XTy = X_with_intercept.T @ y
coefficients = np.linalg.solve(XTX, XTy)

# Extract intercept and slopes
if has_intercept:
    intercept = coefficients[0]
    slopes = coefficients[1:]
else:
    intercept = 0.0
    slopes = coefficients

# Calculate R2
y_pred = X_with_intercept @ coefficients
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r_squared = 1 - (ss_res / ss_tot)

# Prep Output
# Combine intercept and slopes into one dict
coef_dict = {"intercept": intercept} | dict(zip(preds, slopes))

# Create a tidy DataFrame
coef_df = pl.DataFrame(
    {
        "term": list(coef_dict.keys()),
        "estimate": list(coef_dict.values())
    }
)

# Print results
print(f"Formula: {formula}")
print(f"R² Score: {r_squared:.3f}")
print(coef_df)
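
The example above solves the normal equations directly. Forming X'X squares the condition number of the design matrix, so for ill-conditioned predictors a least-squares solver is often the safer choice. The sketch below is a swapped-in alternative, not the method used above: it compares both approaches with np.linalg.lstsq on synthetic, noise-free data. The data and the variable names are assumptions made for illustration, so it runs without fiasto-py or the mtcars file.

```python
import numpy as np

# Synthetic, noise-free data (assumed for illustration): y = 1.5 + 2.0*x1 - 0.5*x2
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 2))
y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1]

# Design matrix with a leading column of ones for the intercept
design = np.column_stack([np.ones(n), X])

# Approach used in the example above: solve (X'X) b = X'y
beta_normal = np.linalg.solve(design.T @ design, design.T @ y)

# Alternative: least-squares solver that never forms X'X
beta_lstsq, *_ = np.linalg.lstsq(design, y, rcond=None)

print(beta_normal)
print(beta_lstsq)
```

On well-conditioned data like this the two estimates agree to floating-point precision; the difference only matters when predictors are nearly collinear.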


📋 Supported Formula Syntax

fiasto supports a broad range of Wilkinson's notation, including:

  • Basic formulas: y ~ x1 + x2
  • Interactions: y ~ x1 * x2
  • Smooth terms: y ~ s(z)
  • Random effects: y ~ x + (1|group)
  • Complex random effects: y ~ x + (1+x|group)

Planned Formula Support (Coming Soon)

  • Multivariate models: mvbind(y1, y2) ~ x + (1|g)
  • Non-linear models: y ~ a1 - a2^x, a1 ~ 1, a2 ~ x + (x|g), nl = TRUE

For the complete reference, see the fiasto documentation.

📦 PyPI Package

The package is available on PyPI and can be installed with:

pip install fiasto-py

📚 API Reference

parse_formula(formula: str) -> dict

Parse a Wilkinson's formula string and return structured metadata as a Python dictionary.

Parameters:

  • formula (str): The formula string to parse

Returns:

  • dict: Structured metadata describing the formula

Raises:

  • ValueError: If the formula is invalid or parsing fails

lex_formula(formula: str) -> list[dict]

Tokenize a formula string and return a list of dictionaries, one per token.

Parameters:

  • formula (str): The formula string to tokenize

Returns:

  • list[dict]: One dictionary per token, with 'lexeme' and 'token' keys

Raises:

  • ValueError: If the formula is invalid or lexing fails
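
Since both functions are documented to raise ValueError on bad input, callers that accept user-supplied formulas may want to catch it. The following is a minimal sketch: try_parse is a hypothetical wrapper, and fake_parse is a stand-in parser (not fiasto's actual behavior or error messages) so the example runs without fiasto-py installed.

```python
def try_parse(formula, parse):
    """Return (result, None) on success, or (None, message) if parse raises ValueError.

    Hypothetical wrapper; `parse` is any callable with the documented contract.
    """
    try:
        return parse(formula), None
    except ValueError as exc:
        return None, str(exc)


def fake_parse(formula):
    """Stand-in for a real parser; rejects strings without a tilde."""
    if "~" not in formula:
        raise ValueError("missing '~'")
    return {"formula": formula}


ok_result, ok_error = try_parse("y ~ x1 + x2", fake_parse)
bad_result, bad_error = try_parse("y x1", fake_parse)
print(ok_result, ok_error)
print(bad_result, bad_error)
```

With the package installed, the same wrapper can be called as try_parse("y ~ x1 + x2", fiasto_py.parse_formula).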

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

🙏 Acknowledgments

  • fiasto - The underlying Rust library
  • PyO3 - Python-Rust bindings
  • maturin - Build system for Python extensions
  • PyPI - Python Package Index for distribution

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

