Creates light curve embeddings using ASTROMER

Project description

ASTROMER Python library 🔭

ASTROMER is a transformer-based model pre-trained on millions of light curves. ASTROMER can be fine-tuned on specific datasets to create useful representations that improve the performance of novel deep learning models.

❗ This version of ASTROMER works only on single-band light curves.

🔥 See the official repo here

Install

pip install ASTROMER

How to use it

Currently, there are two pre-trained models: macho and atlas. To load their weights, use:

from ASTROMER.models import SingleBandEncoder

model = SingleBandEncoder()
model = model.from_pretraining('macho')

This will automatically download the weights from the public GitHub weights repository and load them into the SingleBandEncoder instance.

Assume you have a list of variable-length NumPy light curves:

import numpy as np

samples_collection = [ np.array([[5200, 0.3, 0.2],
                                 [5300, 0.5, 0.1],
                                 [5400, 0.2, 0.3]]),

                       np.array([[4200, 0.3, 0.1],
                                 [4300, 0.6, 0.3]]) ]

Light curves are L×3 matrices whose columns are time, magnitude, and magnitude standard deviation. To encode samples, use:

attention_vectors = model.encode(samples_collection,
                                 oids_list=['1', '2'],
                                 batch_size=1,
                                 concatenate=True)

where

  • samples_collection is a list of NumPy-array light curves
  • oids_list is a list of light-curve ids (needed to concatenate 200-length windows)
  • batch_size specifies the number of samples per forward pass
  • when concatenate=True, ASTROMER concatenates all 200-length windows belonging to the same object id; the output is then a list of variable-length attention vectors (see the sketch below)
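
For example, a minimal sketch of inspecting the encoded output (hedged: the exact embedding width depends on the model configuration; the constructor example below uses d_model = 256):

for oid, vectors in zip(['1', '2'], attention_vectors):
    # one attention vector per observation: shape (L_i, d_model)
    print(oid, vectors.shape)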

Fine-tuning or training from scratch

ASTROMER can be easily trained or fine-tuned by using the fit method. First, instantiate the encoder:

from ASTROMER.models import SingleBandEncoder

model = SingleBandEncoder(num_layers=2,
                          d_model=256,
                          num_heads=4,
                          dff=128,
                          base=1000,
                          dropout=0.1,
                          maxlen=200)
model.from_pretraining('macho')

where,

  • num_layers: Number of self-attention blocks
  • d_model: Self-attention block dimension (must be divisible by num_heads)
  • num_heads: Number of heads within the self-attention block
  • dff: Number of neurons in the fully-connected layer applied after the attention blocks
  • base: Positional encoder base (see the sketch below)
  • dropout: Dropout applied to the output of the fully-connected layer
  • maxlen: Maximum length to process in the encoder

Notice you can omit model.from_pretraining('macho') to train from scratch.
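
For reference on base, here is a hedged sketch of the standard transformer-style sinusoidal positional encoding (the usual form; ASTROMER applies it to observation times rather than integer positions, and exact details may differ):

import numpy as np

def positional_encoding(times, d_model=256, base=1000):
    # times: (L,) array of observation times
    i = np.arange(d_model)[np.newaxis, :]                        # (1, d_model)
    angles = times[:, np.newaxis] / np.power(base, (2 * (i // 2)) / d_model)
    pe = np.empty((len(times), d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                        # even dimensions
    pe[:, 1::2] = np.cos(angles[:, 1::2])                        # odd dimensions
    return pe

Training is then launched with fit: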
model.fit(train_data,
          validation_data,
          epochs=2,
          patience=20,
          lr=1e-3,
          project_path='./my_folder',
          verbose=0)

where,

  • train_data: Training data already formatted as tf.data
  • validation_data: Validation data already formatted as tf.data
  • epochs: Number of epochs for training
  • patience: Early stopping patience
  • lr: Learning rate
  • project_path: Path for saving weights and training logs
  • verbose: (0) display information during training, (1) don't

train_data and validation_data should be loaded using the load_numpy or pretraining_records functions; both are in the ASTROMER.preprocessing module.
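
For instance, a minimal sketch using load_numpy on in-memory samples (hedged: train_samples and val_samples are placeholder lists of L×3 arrays, and the batch_size keyword is an assumption for illustration; check the signature in your installed version):

from ASTROMER.preprocessing import load_numpy

# Build tf.data datasets from numpy light curves.
# NOTE: batch_size is assumed for illustration; verify against your version.
train_data = load_numpy(train_samples, batch_size=32)
validation_data = load_numpy(val_samples, batch_size=32)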

For large datasets, it is recommended to use TensorFlow Records (see this tutorial to execute our data pipeline).

Contributing to ASTROMER 🤝

If you train your model from scratch, you can share your pre-trained weights by submitting a pull request to the weights repository.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

astromer-0.1.8.tar.gz (3.0 MB)


Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

astromer-0.1.8-py3-none-any.whl (28.1 kB)


File details

Details for the file astromer-0.1.8.tar.gz.

File metadata

  • Download URL: astromer-0.1.8.tar.gz
  • Upload date:
  • Size: 3.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for astromer-0.1.8.tar.gz

  • SHA256: 7aa7ba98389d39ba3e1d3e7ba8d8f5e85cbec53aba0df67378a7a3f556d7cb59
  • MD5: 116291f58373c31cd604f5b160a99b93
  • BLAKE2b-256: 98142e2f8252fad1e0eb0dc0d16166e0c340e4f253051841f62f166fde19d395

See more details on using hashes here.
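
For example, a minimal sketch of verifying the downloaded sdist against the SHA256 digest above (assumes astromer-0.1.8.tar.gz is in the current directory):

import hashlib

EXPECTED = '7aa7ba98389d39ba3e1d3e7ba8d8f5e85cbec53aba0df67378a7a3f556d7cb59'

with open('astromer-0.1.8.tar.gz', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()

# A mismatch indicates a corrupted or tampered download.
assert digest == EXPECTED, 'hash mismatch'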

File details

Details for the file astromer-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: astromer-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 28.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for astromer-0.1.8-py3-none-any.whl

  • SHA256: 4e2cf7cf01d39bc93f42bdbdbd95f1b3138ce1755a9b4c989c730790a067a078
  • MD5: 10e79ce6497b271664e5873e968dd8f3
  • BLAKE2b-256: 7b1925c1e8348967134e41b02826524d15b3476973dd761b084ba982b35baf5e

See more details on using hashes here.
