JAX-first hierarchical search and fitting for count, CMF, duration, and linear models.
metacountregressor Cookbook
metacountregressor is a JAX-first package for hierarchical model fitting and metaheuristic structure search across:
- count models
- CMF models
- duration models
- linear models
This cookbook now uses the bundled Example 16-3 data from the linked CSV source and keeps the original source column names.
1. Install
python -m pip install -e .
python -m pip install jax jaxlib jaxopt
Quick import check:
python -c "from metacountregressor import __version__, load_example16_3_raw_data; print(__version__, load_example16_3_raw_data().shape)"
2. Example Data In The Package
The package now exposes the Example 16-3 data directly:
from metacountregressor import load_example16_3_raw_data, load_example16_3_model_data
raw_df = load_example16_3_raw_data()
model_df = load_example16_3_model_data()
2.1 Raw data loader
load_example16_3_raw_data() returns the original CSV columns:
ID, FREQ, LENGTH, INCLANES, DECLANES, WIDTH, MIMEDSH, MXMEDSH, SPEED, URB, FC, AADT, SINGLE, DOUBLE, TRAIN, PEAKHR, GRADEBR, MIGRADE, MXGRADE, MXGRDIFF, TANGENT, CURVES, MINRAD, ACCESS, MEDWIDTH, FRICTION, ADTLANE, SLOPE, INTECHAG, AVEPRE, AVESNOW
2.2 Model-ready loader
load_example16_3_model_data() preserves all source columns and adds:
- OFFSET
- FC_ENCODED
- FC_LABEL
Notes:
- FC remains the original source coding from the Example 16-3 data.
- FC_ENCODED is a clean ordered encoding of the observed FC categories for comparison experiments.
- FC_LABEL is a readable string form like FC_1, FC_2, FC_5.
3. Build The Main ExperimentBuilder
from metacountregressor import ExperimentBuilder, load_example16_3_model_data
df = load_example16_3_model_data()
builder = ExperimentBuilder(
df=df,
id_col="ID",
y_col="FREQ",
offset_col="OFFSET",
group_id_col="FC",
)
Which arguments can be None
In ExperimentBuilder(...):
- id_col: Required. Do not pass None.
- y_col: Required. Do not pass None.
- offset_col: Optional. You can pass None.
- group_id_col: Optional. You can pass None.
In build_evaluator(...):
- variables=None: Uses all candidate columns.
- fixed_override=None: No variable-specific fixed-role restrictions.
- membership_override=None: No variable-specific membership-role restrictions.
- exclude=None: Do not exclude extra columns.
- default_roles=None: Let the package choose family defaults.
In CMF helpers:
- offset_col=None: Allowed.
- group_id_col=None: Allowed.
- variables=None: Allowed.
Helpful inspection:
builder.describe()
builder.suggest_config(max_latent_classes=2)
print(builder.get_family_capabilities())
print(builder.get_search_argument_guide())
4. Main Search Arguments
Shared arguments:
- algo: Use sa, hc, de, or hs.
- R: Number of simulation draws.
- max_iter: Search iterations.
- max_latent_classes: Maximum latent classes allowed.
- variables: Candidate search columns.
- default_roles: Allowed structural roles.
- fixed_override: Restrict roles for named variables.
- membership_override: Restrict membership roles for named variables.
To save results consistently:
from metacountregressor import SearchOutputConfig
output_config = SearchOutputConfig(
output_dir="results",
experiment_name="example16_3_count_search",
search_description="Count model search on Example 16-3 data",
)
5. Role Codes
| Code | Meaning |
|---|---|
| 0 | Excluded |
| 1 | Fixed |
| 2 | Random independent |
| 3 | Random correlated |
| 4 | Grouped random |
| 5 | Heterogeneity in means |
| 6 | Zero inflation |
| 7 | Membership only |
| 8 | Membership plus fixed outcome |
Random-parameter distributions:
- normal
- lognormal
- triangular
- uniform
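As a reading aid, the role codes can be kept in a small lookup so that a default_roles list is easy to audit before a search. This is a minimal sketch: the dictionary copies the table above, while describe_roles is a hypothetical helper and not part of metacountregressor.

```python
# Role codes copied from the cookbook table; the labels are for reading only.
ROLE_CODES = {
    0: "excluded",
    1: "fixed",
    2: "random independent",
    3: "random correlated",
    4: "grouped random",
    5: "heterogeneity in means",
    6: "zero inflation",
    7: "membership only",
    8: "membership plus fixed outcome",
}


def describe_roles(codes):
    """Hypothetical helper: translate a default_roles list into labels."""
    unknown = [code for code in codes if code not in ROLE_CODES]
    if unknown:
        raise ValueError(f"Unknown role codes: {unknown}")
    return [ROLE_CODES[code] for code in codes]


# Audit a restricted role list before passing it as default_roles.
restricted = describe_roles([0, 1, 2, 6])
print(restricted)
```

Printing the labels before a long search is a cheap way to confirm that, for example, zero inflation (6) was left in or out on purpose.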
6. Count Models
6.1 Count search
evaluator = builder.build_count_evaluator(
variables=[
"AADT",
"LENGTH",
"SPEED",
"CURVES",
"TANGENT",
"SLOPE",
"ACCESS",
"URB",
"AVEPRE",
],
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 6, 7, 8],
)
result = builder.run(
evaluator=evaluator,
algo="sa",
max_iter=2000,
seed=42,
output_config=output_config,
)
6.2 Manual count model
manual_spec = builder.make_manual_spec(
fixed_terms=["AADT", "LENGTH", "SPEED"],
rdm_terms=["CURVES:normal"],
rdm_cor_terms=["TANGENT:normal", "SLOPE:lognormal"],
hetro_in_means=["AVEPRE"],
zi_terms=["ACCESS"],
membership_terms=["URB"],
dispersion=1,
latent_classes=2,
)
fit = builder.fit_manual_model(
manual_spec=manual_spec,
model="nb",
R=200,
)
7. CMF Models
CMF models use:
log(mu) = baseline block + local block * log(AADT)
The default CMF route transforms the CMF design and then runs on the main JAX hierarchical architecture.
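The CMF equation can be evaluated by hand for a single observation. The sketch below assumes each block is a plain linear predictor (coefficients times covariates). The function name, coefficient values, and covariate values are all illustrative and are not taken from the package.

```python
import math


def cmf_log_mu(baseline_x, baseline_beta, local_x, local_gamma, aadt):
    """Evaluate log(mu) = baseline block + local block * log(AADT).

    Assumption: each block is a simple dot product of covariates and
    coefficients. The package's internal design may differ.
    """
    if aadt <= 0:
        raise ValueError("AADT must be positive to take its logarithm")
    baseline_block = sum(x * b for x, b in zip(baseline_x, baseline_beta))
    local_block = sum(x * g for x, g in zip(local_x, local_gamma))
    return baseline_block + local_block * math.log(aadt)


# Illustrative values only: two baseline covariates and two local covariates.
log_mu = cmf_log_mu(
    baseline_x=[1.0, 0.0],
    baseline_beta=[0.5, -0.2],
    local_x=[1.0, 2.0],
    local_gamma=[0.1, 0.05],
    aadt=12000.0,
)
print("log(mu):", log_mu, "mu:", math.exp(log_mu))
```

The positivity check on AADT mirrors the reason CMF data with non-positive AADT cannot be used: the local block multiplies log(AADT), which is undefined at zero or below.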
7.1 CMF search
cmf_search = builder.build_evaluator(
model_family="cmf",
aadt_col="AADT",
baseline_vars=["URB", "ACCESS", "GRADEBR"],
local_vars=["CURVES", "SLOPE", "WIDTH"],
variables=["AVEPRE", "AVESNOW", "FC_ENCODED"],
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 6, 7, 8],
)
cmf_result = builder.run_search(
cmf_search,
algo="sa",
max_iter=2000,
seed=7,
)
7.2 Manual CMF model
from metacountregressor import CMFExperimentBuilder
cmf_builder = CMFExperimentBuilder(
df=df,
y_col="FREQ",
aadt_col="AADT",
baseline_vars=["URB", "ACCESS"],
local_vars=["CURVES", "SLOPE"],
)
manual_cmf_spec = cmf_builder.make_manual_cmf_spec(
baseline_fixed=["URB"],
baseline_correlated=["ACCESS"],
local_random=["CURVES"],
local_correlated=["SLOPE"],
hetro_in_means=["AVEPRE"],
zi_terms=["INTECHAG"],
membership_terms=["FC_ENCODED"],
dispersion=1,
latent_classes=2,
)
cmf_fit = cmf_builder.fit_manual_cmf_model(
id_col="ID",
offset_col="OFFSET",
group_id_col="FC",
manual_spec=manual_cmf_spec,
model="nb",
R=200,
)
7.3 Legacy GA-CMF route
legacy_cmf = builder.build_evaluator(
model_family="cmf",
cmf_driver="ga",
aadt_col="AADT",
baseline_vars=["URB", "ACCESS"],
local_vars=["CURVES", "SLOPE"],
)
8. Duration Models
The default duration route now uses the main JAX hierarchical architecture with a lognormal family.
Use the model-ready duration loader:
from metacountregressor import ExperimentBuilder, load_example_duration_data
duration_df = load_example_duration_data()
duration_builder = ExperimentBuilder(
df=duration_df,
id_col="ID",
y_col="DURATION",
offset_col=None,
group_id_col="FC",
)
8.1 Duration search
duration_search = duration_builder.build_evaluator(
model_family="duration",
variables=["WIDTH", "CURVES", "SLOPE", "URB", "FC_ENCODED"],
budget_col="AADT",
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 6, 7, 8],
)
8.2 Manual duration model
duration_spec = duration_builder.make_manual_spec(
fixed_terms=["WIDTH"],
rdm_terms=["CURVES:normal"],
rdm_cor_terms=["SLOPE:normal", "URB:normal"],
hetro_in_means=["AVEPRE"],
membership_terms=["FC_ENCODED"],
latent_classes=2,
)
duration_fit = duration_builder.fit_manual_model(
manual_spec=duration_spec,
model="lognormal",
R=200,
)
9. Linear Models
The default linear route now uses the main JAX hierarchical architecture with a Gaussian family.
Use the model-ready linear loader:
from metacountregressor import ExperimentBuilder, load_example_linear_data
linear_df = load_example_linear_data()
linear_builder = ExperimentBuilder(
df=linear_df,
id_col="ID",
y_col="LINEAR_TARGET",
offset_col=None,
group_id_col="FC",
)
9.1 Linear search
linear_search = linear_builder.build_evaluator(
model_family="linear",
variables=["WIDTH", "CURVES", "SLOPE", "URB", "FC_ENCODED"],
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 6, 7, 8],
)
9.2 Manual linear model
linear_spec = linear_builder.make_manual_spec(
fixed_terms=["WIDTH"],
rdm_terms=["CURVES:normal"],
rdm_cor_terms=["SLOPE:normal", "URB:normal"],
hetro_in_means=["AVEPRE"],
membership_terms=["FC_ENCODED"],
latent_classes=2,
)
linear_fit = linear_builder.fit_manual_model(
manual_spec=linear_spec,
model="gaussian",
R=200,
)
10. Platform-Speed Linear Mixed Effects Example
The package now includes a synthetic example designed for linear mixed-effects style experiments around vehicle speed relative to a platform.
Load it with:
from metacountregressor import load_example_platform_speed_data, ExperimentBuilder
platform_df = load_example_platform_speed_data()
platform_builder = ExperimentBuilder(
df=platform_df,
id_col="PLATFORM_ID",
y_col="RELATIVE_SPEED",
offset_col=None,
group_id_col="PLATFORM_TYPE",
)
Available columns include:
- DIST_TO_PLATFORM
- VEHICLE_SPEED
- RELATIVE_SPEED
- POSTED_SPEED
- APPROACH_ACCEL
- PLATFORM_TYPE
- PLATFORM_HEIGHT
- PLATFORM_WIDTH
- AT_PLATFORM
10.1 Linear mixed-effects style search
platform_linear_search = platform_builder.build_evaluator(
model_family="linear",
variables=[
"DIST_TO_PLATFORM",
"POSTED_SPEED",
"APPROACH_ACCEL",
"PLATFORM_HEIGHT",
"PLATFORM_WIDTH",
"AT_PLATFORM",
],
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 7, 8],
)
This is set up to model speed relative to the platform while allowing:
- random parameters
- correlated random parameters
- grouped effects
- heterogeneity in means
- latent classes
11. Duration Example: Time Until Another Vehicle Speeds Over The Platform
The package also includes a synthetic duration experiment for the time before another vehicle speeds over the platform.
Load it with:
from metacountregressor import load_example_platform_gap_duration_data, ExperimentBuilder
gap_df = load_example_platform_gap_duration_data()
gap_builder = ExperimentBuilder(
df=gap_df,
id_col="PLATFORM_ID",
y_col="DURATION_UNTIL_NEXT_SPEEDING",
offset_col=None,
group_id_col=None,
)
Available columns include:
- DURATION_UNTIL_NEXT_SPEEDING
- PRECEDING_VEHICLE_SPEED
- FOLLOWING_VEHICLE_SPEED
- POSTED_SPEED
- PLATFORM_HEIGHT
- PLATFORM_WIDTH
- APPROACH_VOLUME
11.1 Duration search
gap_duration_search = gap_builder.build_evaluator(
model_family="duration",
variables=[
"PRECEDING_VEHICLE_SPEED",
"FOLLOWING_VEHICLE_SPEED",
"POSTED_SPEED",
"PLATFORM_HEIGHT",
"PLATFORM_WIDTH",
"APPROACH_VOLUME",
],
budget_col="POSTED_SPEED",
mode="single",
max_latent_classes=2,
R=200,
default_roles=[0, 1, 2, 3, 4, 5, 7, 8],
)
This uses the JAX hierarchical lognormal path and is intended for duration-before-speeding style analysis.
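For readers unfamiliar with a lognormal duration family, the sketch below writes out the textbook lognormal log-density and sums it over observations. It is a plain-Python illustration of the family only. The function names are hypothetical, and the package's JAX implementation (with random parameters and latent classes) is not reproduced here.

```python
import math


def lognormal_logpdf(t, mu, sigma):
    """Log-density of a lognormal duration t with log-scale mean mu."""
    if t <= 0 or sigma <= 0:
        raise ValueError("durations and sigma must be strictly positive")
    z = (math.log(t) - mu) / sigma
    return (
        -math.log(t)
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * z * z
    )


def lognormal_loglik(durations, linear_predictors, sigma):
    """Sum of log-densities, one linear predictor per observation."""
    return sum(
        lognormal_logpdf(t, mu, sigma)
        for t, mu in zip(durations, linear_predictors)
    )


# Illustrative durations and linear predictors, not package data.
durations = [2.5, 4.0, 1.2]
linear_predictors = [0.8, 1.3, 0.1]
print("log-likelihood:", lognormal_loglik(durations, linear_predictors, sigma=0.6))
```

Because the density is defined only for strictly positive durations, zero or negative values in the duration column need to be handled before fitting.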
12. What Changing Search Arguments Does
Change the search algorithm
builder.run(evaluator=evaluator, algo="sa", max_iter=2000, seed=1)
builder.run(evaluator=evaluator, algo="de", max_iter=2000, seed=1)
builder.run(evaluator=evaluator, algo="hs", max_iter=2000, seed=1)
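To give a feel for what a structure search does, here is a toy simulated-annealing loop over vectors of role codes. It assumes that algo="sa" refers to simulated annealing. The objective, cooling schedule, and function names are hypothetical stand-ins and do not reflect the package's evaluator or its search internals.

```python
import math
import random


def simulated_annealing(score, n_vars, allowed_roles, max_iter=2000,
                        seed=1, t0=1.0, cooling=0.995):
    """Toy annealer: minimise score(roles) over role-code vectors."""
    rng = random.Random(seed)
    current = [rng.choice(allowed_roles) for _ in range(n_vars)]
    current_score = score(current)
    best, best_score = list(current), current_score
    temp = t0
    for _ in range(max_iter):
        # Propose a neighbour by re-drawing the role of one variable.
        candidate = list(current)
        candidate[rng.randrange(n_vars)] = rng.choice(allowed_roles)
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = list(current), current_score
        temp = max(temp * cooling, 1e-8)
    return best, best_score


# Stand-in objective: number of variables whose role differs from a target.
target = [1, 2, 0, 6]


def toy_score(roles):
    return sum(a != b for a, b in zip(roles, target))


best, best_score = simulated_annealing(toy_score, n_vars=4,
                                       allowed_roles=[0, 1, 2, 6])
print(best, best_score)
```

In a real search the score would be a fit criterion from an estimated model, so each iteration is far more expensive than in this toy. That is why max_iter and R trade off directly against run time.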
Change simulation draws
evaluator = builder.build_count_evaluator(R=500)
Higher R means:
- slower estimation
- more stable simulation-based fitting
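The stability claim can be illustrated with a small Monte Carlo experiment: a simulated average is recomputed under many seeds, and its spread across seeds shrinks as the number of draws grows. This sketch uses a stand-in integrand (the mean of exp(mu + sigma * z) for standard normal z). The function names and parameter values are illustrative, not the package's simulated likelihood.

```python
import math
import random
import statistics


def simulated_mean(R, seed, mu=0.2, sigma=0.5):
    """Average of exp(mu + sigma * z) over R standard-normal draws."""
    rng = random.Random(seed)
    draws = [math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(R)]
    return sum(draws) / R


def spread_across_seeds(R, n_reps=100):
    """Standard deviation of the simulated mean over n_reps seeds."""
    estimates = [simulated_mean(R, seed) for seed in range(n_reps)]
    return statistics.stdev(estimates)


spread_small = spread_across_seeds(R=50)
spread_large = spread_across_seeds(R=800)
print("spread with R=50: ", spread_small)
print("spread with R=800:", spread_large)
```

For plain pseudo-random draws the spread falls roughly with the square root of R, so quadrupling R about halves the simulation noise while roughly quadrupling the work per evaluation.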
Restrict allowed structures
evaluator = builder.build_count_evaluator(
variables=["AADT", "SPEED", "ACCESS"],
default_roles=[0, 1, 2, 6],
)
Restrict specific variables
evaluator = builder.build_count_evaluator(
variables=["AADT", "SPEED", "URB"],
fixed_override={"AADT": [1]},
membership_override={"URB": [7, 8]},
)
13. Consistent Run Output
from metacountregressor import SearchOutputConfig
output_config = SearchOutputConfig(
output_dir="results",
experiment_name="cmf_example16_3",
search_description="CMF search on Example 16-3 data",
)
saved = builder.run_search(
cmf_search,
algo="sa",
max_iter=1000,
output_config=output_config,
)
print(saved["saved_to"])
Each saved JSON file stores:
- experiment name
- search description
- family
- algorithm
- normalized result payload
14. Latent-Class Example: Recover Functional Class
This example checks whether a latent-class model can recover the hidden FC grouping pattern without using FC itself as a direct predictor in the outcome equation.
We keep:
- original truth column: FC
- comparison encoding: FC_ENCODED
We do not place FC or FC_ENCODED in the outcome equation. Instead we let membership variables explain latent class probabilities.
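One common way for membership variables to drive class probabilities is a multinomial-logit (softmax) link, with the first class as the reference. The sketch below illustrates that idea under this assumption. The function name and coefficient values are hypothetical, and the package's exact membership parameterisation is not confirmed here.

```python
import math


def membership_probabilities(x, class_coefs):
    """Softmax over class-specific linear scores of membership variables."""
    scores = [sum(xi * ci for xi, ci in zip(x, coefs)) for coefs in class_coefs]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


# Illustrative inputs: intercept, URB, ACCESS for one road segment.
x = [1.0, 1.0, 0.3]
class_coefs = [
    [0.0, 0.0, 0.0],    # reference class, coefficients fixed at zero
    [-0.5, 1.2, 0.8],   # second class, made-up coefficients
]
probs = membership_probabilities(x, class_coefs)
print(probs)
```

Under this form the outcome equation never sees FC. The class probabilities depend only on the membership variables, which is what makes the later comparison with FC_ENCODED a genuine recovery check.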
14.1 Fit a latent-class count model
latent_spec = builder.make_manual_spec(
fixed_terms=["AADT", "SPEED", "LENGTH"],
rdm_cor_terms=["CURVES:normal", "SLOPE:normal"],
hetro_in_means=["AVEPRE"],
membership_terms=["URB", "ACCESS", "GRADEBR"],
dispersion=1,
latent_classes=2,
)
latent_fit = builder.fit_manual_model(
manual_spec=latent_spec,
model="nb",
R=200,
)
14.2 Compute latent-class probabilities and compare to the true FC grouping
class_probs = builder.compute_latent_class_probabilities(
latent_fit,
true_class_col="FC_ENCODED",
)
print(class_probs.head())
Returned columns include:
- ID
- class_1_prob
- class_2_prob
- FC_ENCODED
14.3 Compare predicted class with the encoded true class
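# argmax returns 0-based column positions: 0 for class_1_prob, 1 for class_2_prob.
# Latent class labels are arbitrary, so also check the swapped labelling.
class_probs["predicted_class"] = (
class_probs[["class_1_prob", "class_2_prob"]]
.to_numpy()
.argmax(axis=1)
)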
agreement = (
class_probs["predicted_class"].to_numpy()
== class_probs["FC_ENCODED"].to_numpy()
).mean()
print("Agreement:", agreement)
This is the cookbook pattern for checking whether the latent-class structure is capturing the observed facility-class segmentation.
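Latent class labels carry no fixed order, so a raw equality check can understate agreement when class 1 happens to correspond to the second FC category. The sketch below scores every one-to-one relabelling of predicted classes onto true labels and keeps the best. best_label_agreement is a hypothetical helper, and the example labels are made up rather than taken from the Example 16-3 data.

```python
from itertools import permutations


def best_label_agreement(predicted, truth):
    """Best agreement over one-to-one maps from predicted to true labels."""
    pred_labels = sorted(set(predicted))
    true_labels = sorted(set(truth))
    if len(pred_labels) > len(true_labels):
        raise ValueError("more predicted classes than true categories")
    best_score, best_map = -1.0, {}
    for assigned in permutations(true_labels, len(pred_labels)):
        mapping = dict(zip(pred_labels, assigned))
        hits = sum(mapping[p] == t for p, t in zip(predicted, truth))
        score = hits / len(truth)
        if score > best_score:
            best_score, best_map = score, mapping
    return best_score, best_map


# Made-up labels: predicted classes are mostly the mirror image of truth.
predicted = [0, 0, 1, 1, 0, 1]
truth = [1, 1, 0, 0, 1, 1]
score, mapping = best_label_agreement(predicted, truth)
print("best agreement:", score, "using mapping:", mapping)
```

The same function also handles a truth column with more categories than latent classes, in which case each latent class is matched to its best-fitting single category.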
15. Common Validation Errors
The package now raises clearer errors for:
- missing columns
- invalid family-specific arguments
- CMF specifications missing aadt_col, baseline_vars, or local_vars
- CMF data with non-positive AADT
- latent-class probability requests on single-class fits
16. Summary
Use these loaders when you want the real Example 16-3 data inside the package:
- load_example16_3_raw_data()
- load_example16_3_model_data()
- load_example_duration_data()
- load_example_linear_data()
Use these builder patterns:
- count: build_count_evaluator(...)
- CMF: build_evaluator(model_family="cmf", ...)
- duration: build_evaluator(model_family="duration", ...)
- linear: build_evaluator(model_family="linear", ...)