ColliderML

Requires Python 3.10+. License: MIT.

A modern machine learning library for high-energy physics data analysis.

Installation

pip install colliderml                 # core + Polars loader + unified load()
pip install 'colliderml[sim]'          # local simulation (needs Docker/Podman)
pip install 'colliderml[remote]'       # SaaS backend client
pip install 'colliderml[tasks]'        # benchmark task reference baselines
pip install 'colliderml[all]'          # everything above + dev tools

For development: pip install -e ".[dev]"

Getting the data

Option 1 — Python one-liner (downloads on first call, then caches):

import colliderml

frames = colliderml.load("ttbar_pu0", max_events=200)
print(frames["particles"])             # Polars DataFrame

Option 2 — CLI (explicit download, then load with the library):

colliderml download --channels ttbar --pileup pu0 --objects particles,tracker_hits,calo_hits,tracks --max-events 200
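When the download is driven from a script, the same command can be assembled as an argument list instead of a hand-written string. The sketch below uses only the flags shown above; the helper name build_download_command is hypothetical and not part of colliderml.

```python
import shlex


def build_download_command(channel, pileup, objects, max_events=None):
    """Assemble the argv list for `colliderml download` (hypothetical helper)."""
    argv = [
        "colliderml", "download",
        "--channels", channel,
        "--pileup", pileup,
        # The CLI example above passes objects as one comma-separated value.
        "--objects", ",".join(objects),
    ]
    if max_events is not None:
        argv += ["--max-events", str(max_events)]
    return argv


argv = build_download_command(
    channel="ttbar",
    pileup="pu0",
    objects=["particles", "tracker_hits", "calo_hits", "tracks"],
    max_events=200,
)
print(shlex.join(argv))
# To actually start the download: subprocess.run(argv, check=True)
```

Passing a list to subprocess.run avoids shell quoting issues, and shlex.join is only used here to print a copy-pasteable command line.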

Downloads are cached in ~/.cache/colliderml by default; set COLLIDERML_DATA_DIR to use a different directory. To list the configs already downloaded, run colliderml list-configs.
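To check from Python where cached data should end up, the documented precedence (environment variable first, then the default) can be mirrored as follows. This is a minimal sketch of that rule, not the library's own lookup code; resolve_data_dir is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None):
    """Return the cache directory implied by the documented rule (hypothetical helper)."""
    env = os.environ if env is None else env
    override = env.get("COLLIDERML_DATA_DIR")
    if override:
        # An explicit override wins over the default location.
        return Path(override).expanduser()
    return Path.home() / ".cache" / "colliderml"


data_dir = resolve_data_dir()
state = "exists" if data_dir.is_dir() else "not created yet"
print(f"{data_dir} ({state})")
```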

Option 3 — HuggingFace only:

from datasets import load_dataset
dataset = load_dataset("CERN/ColliderML-Release-1", "ttbar_pu0_particles", split="train")
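The config name in the call above looks like it is composed of channel, pileup level, and object type. Assuming that pattern holds (it is inferred from this single example and not confirmed here), candidate config names can be generated as in the sketch below. The helper hf_config_name is hypothetical, and whether each generated config exists on the Hub is not verified.

```python
def hf_config_name(channel: str, pileup: str, obj: str) -> str:
    """Compose a config name as <channel>_<pileup>_<object> (assumed pattern)."""
    return f"{channel}_{pileup}_{obj}"


# Object names taken from the CLI example; their availability on the Hub is an assumption.
objects = ("particles", "tracker_hits", "calo_hits", "tracks")
configs = [hf_config_name("ttbar", "pu0", obj) for obj in objects]
print(configs)

# With `datasets` installed, each name could then be tried with:
#   load_dataset("CERN/ColliderML-Release-1", name, split="train")
```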

Running simulations

New in v0.4.0: generate events yourself with the full OpenDataDetector (ODD) pipeline, either locally in a container or remotely via the SaaS backend.

import colliderml

# Local: runs inside the OpenDataDetector software container.
# Needs Docker or Podman; the `[sim]` extra provides the driver.
result = colliderml.simulate(preset="ttbar-quick")
print(result.run_dir)                  # parquet outputs land here

# Remote: submit to the SaaS backend (requires an HF token).
# The `[remote]` extra pulls in requests; no container runtime needed.
result = colliderml.simulate(preset="higgs-portal-quick", remote=True)
print(result.remote_request_id)
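Because local runs need Docker or Podman, a script may want to check for a container runtime before choosing between the two modes. The sketch below only uses the standard library to look for the executables on PATH; find_container_runtime is a hypothetical helper, and the colliderml calls in the comments are the ones shown above.

```python
import shutil


def find_container_runtime(candidates=("docker", "podman")):
    """Return the first container runtime found on PATH, or None (hypothetical helper)."""
    for name in candidates:
        if shutil.which(name) is not None:
            return name
    return None


runtime = find_container_runtime()
if runtime is not None:
    print(f"Found {runtime}; a local run is possible.")
    # colliderml.simulate(preset="ttbar-quick")
else:
    print("No container runtime found; consider the remote backend.")
    # colliderml.simulate(preset="higgs-portal-quick", remote=True)
```

Finding the executable does not guarantee that the daemon is running or that the user may start containers, so this is only a first check.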

CLI equivalents:

colliderml list-presets
colliderml simulate --preset ttbar-quick --local
colliderml simulate --preset higgs-portal-quick --remote
colliderml status <request-id>
colliderml balance

See the Local Simulation and Remote Simulation guides for details.

Benchmark tasks

New in v0.4.0: six built-in benchmark tasks — tracking, jets, anomaly, tracking_latency, tracking_small, and data_loading — with a unified registry and a leaderboard backed by the SaaS backend.

import colliderml.tasks

print(colliderml.tasks.list_tasks())
scores = colliderml.tasks.evaluate("tracking", "my_preds.parquet")
colliderml.tasks.submit("tracking", "my_preds.parquet")   # earn credits on new bests

Reference baselines (scikit-learn for the BDT and Isolation Forest models) ship with the [tasks] extra. See the Benchmark Tasks guide for details.

Using the library

The notebook notebooks/colliderml_loader_exploration.ipynb shows the data-loading and analysis helpers: load_tables, exploding event tables, pileup subsampling, calibration, and plotting.

Full docs: https://opendatadetector.github.io/ColliderML

Development

pytest -v -m "not integration"

Docs are built with VitePress: npm ci --prefix docs && npm run --prefix docs docs:build.

License

MIT License

File details

Details for the file colliderml-0.4.0rc1.tar.gz.

File metadata

  • Download URL: colliderml-0.4.0rc1.tar.gz
  • Size: 592.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for colliderml-0.4.0rc1.tar.gz

Algorithm    Hash digest
SHA256       65dc0524b0038b3c2056f27cb6e61439accf75548fd0f4bdea82c9384ffb3439
MD5          af953ce07e9bebf6990551f94bf5730f
BLAKE2b-256  e19d31385045bba3c2a730129df62183664e941a54a36fb5cf25e95945570d67
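
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below assumes the source distribution was saved under its original name in the current directory; sha256_of is a hypothetical helper.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for colliderml-0.4.0rc1.tar.gz
EXPECTED_SHA256 = "65dc0524b0038b3c2056f27cb6e61439accf75548fd0f4bdea82c9384ffb3439"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


sdist = Path("colliderml-0.4.0rc1.tar.gz")  # assumed download location
if sdist.is_file():
    matches = sha256_of(sdist) == EXPECTED_SHA256
    print("SHA256 matches" if matches else "SHA256 MISMATCH")
else:
    print(f"{sdist} not found; download it first")
```

pip can enforce the same check at install time through hash-pinned requirements files (the --require-hashes mode).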

Provenance

The following attestation bundles were made for colliderml-0.4.0rc1.tar.gz:

Publisher: publish-pypi.yml on OpenDataDetector/ColliderML

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file colliderml-0.4.0rc1-py3-none-any.whl.

File metadata

  • Download URL: colliderml-0.4.0rc1-py3-none-any.whl
  • Size: 83.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for colliderml-0.4.0rc1-py3-none-any.whl

Algorithm    Hash digest
SHA256       8b0d49d9b4b73a86be8193b43a075566ee6dad8262bf248b543a301c433dd34f
MD5          996d192cfd20955b2aff4866cbc17384
BLAKE2b-256  94e34853ecdef755a46c7c4be1837e271b961ab3660912ed2023d9afbe0ce138

Provenance

The following attestation bundles were made for colliderml-0.4.0rc1-py3-none-any.whl:

Publisher: publish-pypi.yml on OpenDataDetector/ColliderML

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
