Skip to main content

A python package used for change point detection at Mozilla.

Project description

mozdetect

A python package containing change point detection techniques for use at Mozilla.

Setup, and Development

Setup

Install uv first using the following:

python -m pip install uv

Install poetry using the following:

python -m pip install poetry

Running

Next, run the following to build the package, and install dependencies. This step can be skipped though since uv run will implicitly build the package:

uv sync

Run a script that uses the built module with the following:

uv run my_script.py

Testing Change Detection Techniques

This section provides an overview about how to add and test new or existing change detection techniques.

Adding New Techniques

All techniques are defined in two parts. The first part is the detector itself that compares two (or more) groups to each other, and returns the result of that comparison. The second part is a timeseries detector that runs across a full timeseries (e.g. a TelemetryTimeSeries) and uses the detector from the first part to detect changes.

Both of these should be defined for any new techniques. This makes it possible to make different timeseries detectors using the same underlying detector technique. See src/mozdetect/detectors/cdf_squared.py for an example implementation of a detector, and src/mozdetect/detectors/cdf_squared.py for an example implementation of the timeseries detector. Note that the detectors will need to be subclasses of the BaseDetector, and BaseTimeSeriesDetector, respectively. Furthermore, they need to specify a name that will be used to access them, e.g. cdf_squared, through the detector_name, and timeseries_detector_name class initialization arguments. These names will be used to access the detectors from the return value of get_detectors/get_timeseries_detectors.

The TelemetryTimeSeries object provides an interface for accessing the data with some helper methods. However, if those are not enough, it's possible to access the raw data that the time series object was built with through TelemetryTimeSeries.raw_data.

The detector only needs to return a dictionary with information about the comparison. However, the timeseries detector must return a list of Detection objects that contain information about the changes detected.

Testing Techniques

After the new technique was added, create a new testing script. This script can exist anywhere, but there's a special folder that can be added to the top-level of the repo called sample-scripts that can contain the script and it will be ignored when making commits. See the example in examples/sample_detection_run.py for how to run the detection. It can be run using the following from the top-level of the repo:

uv run examples/sample_detection_run.py

At the moment, the only detection techniques available use data from BigQuery. This means that you will need to login locally, and ensure that you have access to the mozdata project. Follow these instructions for how to install the tool, then run the following to login and set the project:

gcloud auth login --update-adc
gcloud config set project mozdata

The key things to do in the script are calling get_metric_table to get the data, creating a TelemetryTimeSeries with the data, and then calling the change detection technique with the timeseries object as an argument. The change detection technique class is obtained from mozdetect.get_timeseries_detectors()["name-of-detector"] (name of the detector is given when creating the detector class). Calling detect_changes() on the resulting object will trigger the change detection, and return a list of Detection objects that describe the change that was detected.

Using New Techniques in Alerting/Monitoring

Once a new technique is added, a new release of mozdetect will need to be produced. From there, an update in Treeherder will be needed for the mozdetect package along with a new deployment. Once deployed, it will be usable.

Currently, mozdetect is only used for alerting on telemetry probes so in the monitor field that is added to the probe(s), the field change_detection_technique will need to be used to specify the name of the change detection technique that was added with the detector class - only the timeseries detector classes are used in alerting, and monitoring. Additional arguments to the technique can also be provided through the change_detection_args field.

Pre-commit checks

Pre-commit linting checks must be setup like this (run within the top-level of this repo directory):

uv sync
uv run pre-commit install

Running tests, and linting

Tests all reside in the tests/ folder and can be run using:

uv run pytest

Linting is performed through pre-commit when you commit, however, it's possible to run it directly without performing a commit:

uv run pre-commit run --all-files

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mozdetect-0.1.3a1.tar.gz (23.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mozdetect-0.1.3a1-py3-none-any.whl (25.1 kB view details)

Uploaded Python 3

File details

Details for the file mozdetect-0.1.3a1.tar.gz.

File metadata

  • Download URL: mozdetect-0.1.3a1.tar.gz
  • Upload date:
  • Size: 23.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for mozdetect-0.1.3a1.tar.gz
Algorithm Hash digest
SHA256 72658fd2322300d5a2fa9af45f147fa9f21741375765450eb0f03080bd237b59
MD5 87dd76b0f4b6d0526ac01b42987e11ed
BLAKE2b-256 3ab66e6da3bfcf0a2b15c2b8e27d7441ce41ee0f37c09d8f68b6c827be5b40d1

See more details on using hashes here.

File details

Details for the file mozdetect-0.1.3a1-py3-none-any.whl.

File metadata

  • Download URL: mozdetect-0.1.3a1-py3-none-any.whl
  • Upload date:
  • Size: 25.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for mozdetect-0.1.3a1-py3-none-any.whl
Algorithm Hash digest
SHA256 dd915f3368b60df2790d9a3fcf3718d77773a7f094bb30281d40bdae1b1de47b
MD5 9f9c300e8a4ae8e48e0a7d741c76566d
BLAKE2b-256 31e8d7839b1a73c2053184fece4e37c639d62b948da0a24342d71517711e6780

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page