Python bindings for fiasto - A language-agnostic modern Wilkinson's formula parser and lexer
fiasto-py
Pronounced like fiasco, but with a t instead of a c
(F)ormulas (I)n (AST) (O)ut
Python bindings for fiasto - A language-agnostic modern Wilkinson's formula parser and lexer.
🎯 Features
- Parse Wilkinson's Formulas: Convert formula strings into structured JSON metadata
- Tokenize Formulas: Break down formulas into individual tokens with detailed information
- Python Dictionaries: Returns native Python dictionaries for easy integration
🎯 Simple API
- parse_formula() - Takes a Wilkinson's formula string and returns a Python dictionary
- lex_formula() - Tokenizes a formula string and returns a Python dictionary
🚀 Quick Start
Installation
Install from PyPI (recommended):
pip install fiasto-py
Usage
Usage: Parse Formula
import fiasto_py
from pprint import pprint
# Parse a formula into structured metadata
print("="*30)
print("Parse Formula")
print("="*30)
result = fiasto_py.parse_formula("y ~ x1 + x2 + (1|group)")
pprint(result, compact = True)
Output:
==============================
Parse Formula
==============================
{'all_generated_columns': ['y', 'x1', 'x2', 'group'],
 'columns': {'group': {'generated_columns': ['group'],
                       'id': 4,
                       'interactions': [],
                       'random_effects': [{'correlated': True,
                                           'grouping_variable': 'group',
                                           'has_intercept': True,
                                           'includes_interactions': [],
                                           'kind': 'grouping',
                                           'variables': []}],
                       'roles': ['GroupingVariable'],
                       'transformations': []},
             'x1': {'generated_columns': ['x1'],
                    'id': 2,
                    'interactions': [],
                    'random_effects': [],
                    'roles': ['FixedEffect'],
                    'transformations': []},
             'x2': {'generated_columns': ['x2'],
                    'id': 3,
                    'interactions': [],
                    'random_effects': [],
                    'roles': ['FixedEffect'],
                    'transformations': []},
             'y': {'generated_columns': ['y'],
                   'id': 1,
                   'interactions': [],
                   'random_effects': [],
                   'roles': ['Response'],
                   'transformations': []}},
 'formula': 'y ~ x1 + x2 + (1|group)',
 'metadata': {'family': None,
              'has_intercept': True,
              'has_uncorrelated_slopes_and_intercepts': False,
              'is_random_effects_model': True}}
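The dictionary above can be walked with ordinary Python. As a minimal sketch, the snippet below groups column names by role. It uses a trimmed literal copied from the output above instead of calling the library, and the helper name `columns_by_role` is hypothetical, not part of fiasto-py.

```python
# Trimmed literal copied from the parse_formula output shown above
# (only the keys this sketch needs).
result = {
    "columns": {
        "group": {"id": 4, "roles": ["GroupingVariable"]},
        "x1": {"id": 2, "roles": ["FixedEffect"]},
        "x2": {"id": 3, "roles": ["FixedEffect"]},
        "y": {"id": 1, "roles": ["Response"]},
    },
    "metadata": {"has_intercept": True, "is_random_effects_model": True},
}


def columns_by_role(parsed):
    """Group column names by role, ordered by the 'id' field (hypothetical helper)."""
    by_role = {}
    ordered = sorted(parsed["columns"].items(), key=lambda item: item[1]["id"])
    for name, details in ordered:
        for role in details["roles"]:
            by_role.setdefault(role, []).append(name)
    return by_role


roles = columns_by_role(result)
print(roles)
```

Sorting by `id` assumes the ids reflect the order in which columns appear in the formula, which matches the example output but is not stated in the documentation.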
Usage: Lex Formula
import fiasto_py
from pprint import pprint
print("="*30)
print("Lex Formula")
print("="*30)
tokens = fiasto_py.lex_formula("y ~ x1 + x2 + (1|group)")
pprint(tokens, compact = True)
Output:
==============================
Lex Formula
==============================
[{'lexeme': 'y', 'token': 'ColumnName'},
 {'lexeme': '~', 'token': 'Tilde'},
 {'lexeme': 'x1', 'token': 'ColumnName'},
 {'lexeme': '+', 'token': 'Plus'},
 {'lexeme': 'x2', 'token': 'ColumnName'},
 {'lexeme': '+', 'token': 'Plus'},
 {'lexeme': '(', 'token': 'FunctionStart'},
 {'lexeme': '1', 'token': 'One'},
 {'lexeme': '|', 'token': 'Pipe'},
 {'lexeme': 'group', 'token': 'ColumnName'},
 {'lexeme': ')', 'token': 'FunctionEnd'}]
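The token stream is plain data, so it can be filtered or reassembled directly. The sketch below works on a literal copy of the output above, so it runs without fiasto-py. It pulls out the column names, counts token types, and rebuilds the formula string. The helper `rebuild_formula` and its spacing rule are illustrative assumptions, not library behaviour.

```python
from collections import Counter

# Literal copy of the lex_formula output shown above.
tokens = [
    {"lexeme": "y", "token": "ColumnName"},
    {"lexeme": "~", "token": "Tilde"},
    {"lexeme": "x1", "token": "ColumnName"},
    {"lexeme": "+", "token": "Plus"},
    {"lexeme": "x2", "token": "ColumnName"},
    {"lexeme": "+", "token": "Plus"},
    {"lexeme": "(", "token": "FunctionStart"},
    {"lexeme": "1", "token": "One"},
    {"lexeme": "|", "token": "Pipe"},
    {"lexeme": "group", "token": "ColumnName"},
    {"lexeme": ")", "token": "FunctionEnd"},
]


def rebuild_formula(token_list):
    """Join lexemes back into a string, padding '~' and '+' with spaces."""
    spaced = {"Tilde", "Plus"}
    parts = []
    for tok in token_list:
        if tok["token"] in spaced:
            parts.append(f" {tok['lexeme']} ")
        else:
            parts.append(tok["lexeme"])
    return "".join(parts)


column_names = [t["lexeme"] for t in tokens if t["token"] == "ColumnName"]
token_counts = Counter(t["token"] for t in tokens)
formula = rebuild_formula(tokens)

print(column_names)
print(token_counts["ColumnName"], token_counts["Plus"])
print(formula)
```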
Simple OLS Regression
import fiasto_py
import polars as pl
import numpy as np
from pprint import pprint
# Load data
mtcars_path = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
df = pl.read_csv(mtcars_path)
# Parse formula
formula = "mpg ~ wt + cyl"
result = fiasto_py.parse_formula(formula)
pprint(result)
# Find the response column(s)
response_cols = [
    col for col, details in result["columns"].items()
    if "Response" in details["roles"]
]
# Find non-response columns
preds = [
    col for col, details in result["columns"].items()
    if "Response" not in details["roles"]
]
# Has intercept
has_intercept = result["metadata"]["has_intercept"]
# Prepare data matrices
X = df.select(preds).to_numpy()
y = df.select(response_cols).to_numpy().ravel()
# Add intercept if metadata says so
if has_intercept:
    X_with_intercept = np.column_stack([np.ones(X.shape[0]), X])
else:
    X_with_intercept = X
# Solve normal equations: (X'X)^-1 X'y
XTX = X_with_intercept.T @ X_with_intercept
XTy = X_with_intercept.T @ y
coefficients = np.linalg.solve(XTX, XTy)
# Extract intercept and slopes
if has_intercept:
    intercept = coefficients[0]
    slopes = coefficients[1:]
else:
    intercept = 0.0
    slopes = coefficients
# Calculate R2
y_pred = X_with_intercept @ coefficients
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r_squared = 1 - (ss_res / ss_tot)
# Prep Output
# Combine intercept and slopes into one dict
coef_dict = {"intercept": intercept} | dict(zip(preds, slopes))
# Create a tidy DataFrame
coef_df = pl.DataFrame(
    {
        "term": list(coef_dict.keys()),
        "estimate": list(coef_dict.values())
    }
)
# Print results
print(f"Formula: {formula}")
print(f"R² Score: {r_squared:.3f}")
print(coef_df)
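The normal-equation solve above can be sanity-checked by hand in the single-predictor case, where (X'X)^-1 X'y reduces to a closed-form slope and intercept. The sketch below uses only the standard library and a small synthetic data set (not mtcars), generated from y = 1 + 2x. The function name `simple_ols` is hypothetical.

```python
def simple_ols(x, y):
    """Closed-form OLS for one predictor with an intercept.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - SS_res / SS_tot, the same definition as in the example above
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Synthetic data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope, r_squared = simple_ols(x, y)
print(intercept, slope, r_squared)
```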
📋 Supported Formula Syntax
fiasto supports comprehensive Wilkinson's notation including:
- Basic formulas: y ~ x1 + x2
- Interactions: y ~ x1 * x2
- Smooth terms: y ~ s(z)
- Random effects: y ~ x + (1|group)
- Complex random effects: y ~ x + (1+x|group)
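In Wilkinson notation, `x1 * x2` is shorthand for both main effects plus their interaction, `x1 + x2 + x1:x2`. The toy expander below illustrates that convention for a single star term. It is an illustration only and makes no claim about how fiasto represents interactions internally. The name `expand_star` is hypothetical.

```python
from itertools import combinations


def expand_star(term):
    """Expand 'a * b' into main effects and interactions joined with ':'."""
    factors = [f.strip() for f in term.split("*")]
    expanded = []
    # Order 1 gives main effects, order 2 gives two-way interactions, and so on.
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            expanded.append(":".join(combo))
    return expanded


two_way = expand_star("x1 * x2")
three_way = expand_star("a * b * c")
print(two_way)
print(three_way)
```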
Supported Formulas (Coming Soon)
- Multivariate models: mvbind(y1, y2) ~ x + (1|g)
- Non-linear models: y ~ a1 - a2^x, a1 ~ 1, a2 ~ x + (x|g), nl = TRUE
For the complete reference, see the fiasto documentation.
📦 PyPI Package
The package is available on PyPI and can be installed with:
pip install fiasto-py
- PyPI Page: pypi.org/project/fiasto-py
- Source Code: github.com/alexhallam/fiasto-py
- Documentation: This README and inline docstrings
📚 API Reference
parse_formula(formula: str) -> dict
Parse a Wilkinson's formula string and return structured metadata as a Python dictionary.
Parameters:
formula (str): The formula string to parse
Returns:
dict: Structured metadata describing the formula
Raises:
ValueError: If the formula is invalid or parsing fails
lex_formula(formula: str) -> dict
Tokenize a formula string and return a description of each token (its lexeme and token type).
Parameters:
formula (str): The formula string to tokenize
Returns:
dict: Token information for each element in the formula
Raises:
ValueError: If the formula is invalid or lexing fails
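Since both functions are documented to raise ValueError on bad input, callers handling user-supplied formulas may want to catch it. A minimal sketch follows. The wrapper `try_parse` and the stand-in `fake_parser` are hypothetical and exist only so the example runs without fiasto-py installed. The actual error messages produced by the library are not documented here.

```python
def try_parse(formula, parser):
    """Call parser(formula); return (result, None), or (None, message) on ValueError."""
    try:
        return parser(formula), None
    except ValueError as exc:
        return None, str(exc)


def fake_parser(formula):
    """Hypothetical stand-in for a real parser; rejects input without '~'."""
    if "~" not in formula:
        raise ValueError("formula is missing '~'")
    return {"formula": formula}


ok, err = try_parse("y ~ x", fake_parser)
bad, bad_err = try_parse("y x", fake_parser)
print(ok, err)
print(bad, bad_err)
```

With the library installed, `fiasto_py.parse_formula` or `fiasto_py.lex_formula` can be passed as `parser` in place of the stand-in.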
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
🙏 Acknowledgments
- fiasto - The underlying Rust library
- PyO3 - Python-Rust bindings
- maturin - Build system for Python extensions
- PyPI - Python Package Index for distribution
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.