
Classifier-Based Anomaly Detection for Astronomical Transients


astromcad Package Documentation

Overview

The astromcad package implements the transient anomaly detection methodology presented in https://arxiv.org/abs/2403.14742 and allows anyone to train custom light curve anomaly detectors. In short, the methodology is to repurpose the penultimate layer of a neural network classifier for anomaly detection. Then, to extract anomalies from this latent space, a separate isolation forest is trained on the observations from each class, and the minimum score from any detector is used as the final anomaly score. This isolation forest approach is called MCIF (Multi-Class Isolation Forest).
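The per-class isolation forest idea can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the package's own code: the helper names (fit_per_class_forests, per_class_scores, min_score) are hypothetical, the latent vectors are synthetic, and the sign convention (negating decision_function so that larger values mean "more anomalous") is an assumption made for this example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def fit_per_class_forests(latent, labels, n_estimators=100, seed=0):
    """Fit one isolation forest per unique label, on that class's latent vectors only."""
    classes = np.unique(labels)
    forests = [
        IsolationForest(n_estimators=n_estimators, random_state=seed).fit(latent[labels == c])
        for c in classes
    ]
    return classes, forests


def per_class_scores(forests, latent):
    """Return an (n_samples, n_classes) array of scores, one column per detector.

    Assumed convention: decision_function is negated so larger = more anomalous.
    """
    return np.stack([-f.decision_function(latent) for f in forests], axis=1)


def min_score(forests, latent):
    """Final score: the minimum over all per-class detectors."""
    return per_class_scores(forests, latent).min(axis=1)


# Synthetic stand-in for a 9-dimensional latent space with two classes
rng = np.random.default_rng(0)
latent = np.concatenate([rng.normal(0, 1, (200, 9)), rng.normal(6, 1, (200, 9))])
labels = np.array([0] * 200 + [1] * 200)
classes, forests = fit_per_class_forests(latent, labels)

inlier = np.zeros((1, 9))        # sits at the center of class 0
outlier = np.full((1, 9), 30.0)  # far from both classes
inlier_score = min_score(forests, inlier)[0]
outlier_score = min_score(forests, outlier)[0]
```

Under this convention, taking the minimum means a sample receives a high final score only if every class detector finds it unusual, so a sample that fits any one known class is not flagged.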

Installation


Run pip install astromcad in your command line or terminal.


Usage

astromcad has three main classes: Detect, Custom, and mcif.

Detect

This class allows you to use the pretrained model from the research work for anomaly detection. That model was trained on ZTF simulations, which are described in more detail in the manuscript.

Sample Usage

from astromcad.astromcad import Detect

Detect.init() # Load the pretrained model

Detect.classify(light_curves, host_gals) # Uses the pretrained classifier to get a classification output
Detect.anomaly_score(light_curves, host_gals) # Generates the anomaly score for the given samples

The input host_gals must be an array of shape (N_SAMPLES, 2). Each 2-element entry should be [host redshift, milky way extinction] of that object.

The input light_curves must be an array of shape (N_SAMPLES, 656, 4). For each sample, the 2-dimensional array stores one row per observation: [median passband wavelength, scaled time since trigger, flux / 500, flux error / 500]. If there are fewer than 656 observations, you can pad the input by calling

x_data = Detect.pad(x_data)
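To make the expected shapes concrete, here is a small NumPy sketch that assembles one light curve and its host galaxy entry. It is an illustration only: pad_to_length is a hypothetical helper that zero-pads at the end, which may differ from what Detect.pad does, and every number below is a placeholder rather than real data (the source does not state the wavelength units or the time scaling).

```python
import numpy as np

N_TIMESTEPS = 656  # fixed sequence length expected by the pretrained model
N_FEATURES = 4     # [wavelength, scaled time, flux / 500, flux error / 500]


def pad_to_length(light_curve, length=N_TIMESTEPS):
    """Hypothetical helper: zero-pad an (n_obs, 4) array at the end to (length, 4)."""
    light_curve = np.asarray(light_curve, dtype=float)
    n_obs = light_curve.shape[0]
    if n_obs > length:
        raise ValueError(f"light curve has {n_obs} observations, more than {length}")
    padded = np.zeros((length, light_curve.shape[1]))
    padded[:n_obs] = light_curve
    return padded


# Three made-up observations (placeholder values)
wavelength = np.array([4767.0, 6215.0, 4767.0])  # median passband wavelength
scaled_time = np.array([-0.1, 0.0, 0.2])         # scaled time since trigger
flux = np.array([120.0, 340.0, 210.0])
flux_err = np.array([15.0, 18.0, 16.0])

# One row per observation, columns in the documented order
single = np.stack([wavelength, scaled_time, flux / 500, flux_err / 500], axis=1)

light_curves = np.stack([pad_to_length(single)])  # shape (1, 656, 4)
host_gals = np.array([[0.05, 0.02]])              # [host redshift, Milky Way extinction]
```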

You can also generate a real-time anomaly score plot for any transient.

Detect.plot_real_time(light_curve, host_gal)

Note that this function takes a single light curve, not a list of samples.
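One way to think about a real-time score is as the score recomputed each time a new observation arrives. The sketch below illustrates that idea under assumptions: real_time_scores is a hypothetical helper, score_fn is a placeholder for any scoring callable, and the toy scorer is not the package's model. How plot_real_time computes its curve internally is not stated here.

```python
import numpy as np


def real_time_scores(light_curve, score_fn, n_obs):
    """Score every prefix of a padded (n_timesteps, n_features) light curve.

    score_fn is a placeholder for any callable that maps a batch of shape
    (1, n_timesteps, n_features) to a sequence holding one score.
    """
    scores = []
    for k in range(1, n_obs + 1):
        prefix = np.zeros_like(light_curve)
        prefix[:k] = light_curve[:k]  # keep the first k observations, zero the rest
        scores.append(float(score_fn(prefix[np.newaxis])[0]))
    return np.array(scores)


def toy_score(batch):
    """Toy stand-in scorer (NOT the package's model): sum of the flux column."""
    return batch[:, :, 2].sum(axis=1)


lc = np.zeros((656, 4))
lc[:3, 2] = [0.2, 0.7, 0.4]  # three placeholder flux values
scores = real_time_scores(lc, toy_score, n_obs=3)  # one score per observation seen so far
```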

Custom

This class allows you to create a custom light curve classifier and then use it for anomaly detection. Start by creating your classifier:

from astromcad.astromcad import Custom

det = Custom(656, 4, 2, 9, 12) # n_timesteps, n_features per time step, n_host, latent_size, n_classes
det.create_model()
det.train(X_train, y_train, X_val, y_val, host_gal_train, host_gal_val) # EarlyStopping is initialized in the class
det.create_encoder()
det.init_mcif(X_train, y_train) # Values and labels to init MCIF with
det.score(light_curves, host_gals)

NOTE: If n_host is set to 0 during initialization, do not pass host galaxy information to any other function (or pass None). The class will not use the host galaxy information.

If you want to create your own classifier, be sure to give the input layers and the latent layer memorable names. To initialize a Custom object with this classifier, use

det.custom_model(model, lc_name, context_name, host_name)  # Names of the respective layers
det.create_encoder()
...

To use the trained classifier, call det.classify(light_curves, host_gals). To plot the real-time score evolution, use det.plot_real_time(light_curve).

MCIF

This is an implementation of the Multi-Class Isolation Forest Algorithm, which trains a separate isolation forest for each class of data and uses the minimum score from any detector as the final anomaly score. MCIF is very simple to use.

from astromcad.astromcad import mcif

multi = mcif(n_estimators=100)
multi.train(x_data, labels)
multi.score(x_data)
multi.score_discrete(x_data)

.score(x_data) returns the minimum anomaly score across all detectors, while .score_discrete(x_data) reports the anomaly score from each detector in a list. To see which unique entry in labels each element of the output refers to, use multi.classes.

Example:

multi.train(x_data, labels)  # labels = [Class 1, Class 3, Class 2, Class 1, ...]
multi.score_discrete(x_data) # [0.45, -0.3, 0.2]
multi.classes                # [Class 1, Class 2, Class 3]
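The relationship between the per-detector scores and the final score can be checked by hand with NumPy. This sketch reuses the illustrative numbers from the example above and assumes, as the text states, that the final score is the minimum over detectors. The arrays are typed in manually and do not come from the package.

```python
import numpy as np

# Per-detector scores for one sample, in the class order reported by the detector
classes = np.array(["Class 1", "Class 2", "Class 3"])
per_detector = np.array([0.45, -0.3, 0.2])

final_score = per_detector.min()              # the minimum over all detectors
min_class = classes[per_detector.argmin()]    # the class whose detector gave that minimum
```

Here the final score is -0.3 and it comes from the Class 2 detector, which can be useful when you want to know which known class a sample most resembles.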
