Machine learning modelling of the gravitational waves generated by black-hole binaries

## mlgw

Author Stefano Schmidt

Licence CC BY 4.0

Version 2.0.2

## MACHINE LEARNING MODEL FOR THE GRAVITATIONAL WAVES GENERATED BY BLACK-HOLE BINARIES

mlgw (Machine Learning Gravitational Waves) is a tool to quickly generate a GW waveform for a BBH coalescence. It was developed as part of a Master's thesis at the University of Pisa (Italy) under the supervision of Prof. Walter Del Pozzo. It implements an ML model that reproduces the GW waveforms generated by state-of-the-art models, with an arbitrary number of modes. It is faster than standard methods while retaining the same degree of accuracy. The older version of the model (which includes only the dominant 22 mode) is presented in this paper.

To generate (and plot) a wave:

```python
import mlgw.GW_generator as generator
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes

#generating the wave
gen = generator.GW_generator() #creating an instance of the generator (using default model)
theta = np.array([20,10,0.5,-0.3]) #parameters to be given to the generator [m1,m2,s1,s2]
times = np.linspace(-8,0.02, 100000) #time grid: peak of 22 mode at t=0
modes = [(2,2), (3,3), (4,4), (5,5)] #modes to include in the waveform
h_p, h_c = gen.get_WF(theta, times, modes) #returns the plus and cross polarizations of the wave

#plotting the wave
plt.figure(figsize=(15,8))
plt.title("GW by a BBH with [m1,m2,s1,s2] = "+str(theta), fontsize = 15)
plt.plot(times, h_p, c='k') #plot the plus polarization
plt.xlabel("Time (s)", fontsize = 12)
plt.ylabel(r"$h_+$", fontsize = 12)
#inset zooming on the last part of the signal (t >= -0.2 s)
axins = inset_axes(plt.gca(), width="70%", height="30%", loc=2, borderpad = 2.)
axins.plot(times[times >= -0.2], h_p[times >= -0.2], c='k')
plt.show()
```

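The output is a plot of the plus polarization against time, with an inset zooming on the last part of the signal (t >= -0.2 s).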

## The ML model

The waveform is a superposition of modes, each labelled by a tuple (l,m). The model performs a separate fit for each mode and sums them to obtain the full waveform.

The model for each mode consists of a PCA model plus a Mixture of Experts model; it generates the mode amplitude and phase from the input parameters of the BBH. So far the model has been fitted only for aligned BH spins.

A PCA model reduces, through a linear transformation, the dimensionality of a wave represented on a dense grid. It maps the wave to a linear combination of the first K principal components of the dataset.
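The PCA step can be illustrated with a minimal NumPy sketch. It is not the implementation in ML_routines: the function names (`fit_pca`, `reduce`, `reconstruct`) and the toy dataset are assumptions made for illustration only.

```python
import numpy as np

def fit_pca(waves, K):
    """Return the dataset mean and the first K principal components (as rows)."""
    mean = waves.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance
    _, _, Vt = np.linalg.svd(waves - mean, full_matrices=False)
    return mean, Vt[:K]

def reduce(waves, mean, components):
    """Project waves (N, D) onto the K components: output shape (N, K)."""
    return (waves - mean) @ components.T

def reconstruct(coeffs, mean, components):
    """Map reduced coefficients (N, K) back to the dense grid (N, D)."""
    return coeffs @ components + mean

# Toy dataset: 50 "waves" on a 200-point grid, built from 3 underlying shapes
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
basis = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t ** 2])
waves = rng.normal(size=(50, 3)) @ basis

mean, comps = fit_pca(waves, K=3)
coeffs = reduce(waves, mean, comps)
error = np.max(np.abs(reconstruct(coeffs, mean, comps) - waves))
```

Because the toy waves are combinations of three shapes, three components reconstruct them up to rounding error; real waveforms need a larger K, chosen from the accuracy required.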

A Mixture of Experts (MoE) model maps the orbital parameters of the black holes to the reduced representation of the wave. An MoE prediction is a linear combination of regression models (the experts), weighted by the output of a gating function which decides which expert to use. The orbital parameters considered are the mass ratio q=m1/m2 and the z-components s1 and s2 of the two BH spins; the total mass m1+m2 is only a scale factor, so the dependence on it does not need to be fitted. The experts perform a polynomial regression (using data augmentation in a basis function expansion). The terms in the polynomial are specified at training time.
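As a rough illustration of how such a prediction is assembled, the sketch below evaluates a two-expert MoE with a softmax gate on a hand-picked polynomial basis. The function names, the choice of polynomial terms and all the weights are hypothetical; in mlgw the weights are learned at training time and the actual code lives in EM_MoE.

```python
import math

def polynomial_features(q, s1, s2):
    """Hypothetical basis expansion: constant, linear and a few quadratic terms."""
    return [1.0, q, s1, s2, q * q, q * s1, q * s2, s1 * s2]

def softmax(scores):
    """Numerically stable softmax: gating weights that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def moe_predict(x, expert_weights, gating_weights):
    """Gate-weighted sum of linear experts evaluated on the features x."""
    gates = softmax([dot(v, x) for v in gating_weights])
    outputs = [dot(w, x) for w in expert_weights]
    return sum(g * y for g, y in zip(gates, outputs))

# Two experts with made-up weights (in practice they are learned from data)
x = polynomial_features(q=2.0, s1=0.5, s2=-0.3)
expert_weights = [[0.1] * 8, [-0.2] * 8]
gating_weights = [[0.0] * 8, [0.0] * 8]  # zero gating weights give equal mixing
prediction = moe_predict(x, expert_weights, gating_weights)
```

Each expert is linear in the expanded features, so the regression is polynomial in (q, s1, s2); one such MoE would predict a single PCA coefficient.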

A complete model for a mode includes two PCA models, one for the phase and one for the amplitude of the wave, and a MoE model for each of the PCs considered. Each expert takes the form of a basis function regression, and the user can specify the features to use for the regression in the training and test data.

A dataset of GWs must be created to fit the PCA model. It holds waves in the time domain, generated on a fixed reduced grid t' = t/M_tot, where M_tot is the total mass of the BBH. Each mode is time-aligned such that the peak of the 22 mode is at t = 0.
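The alignment and rescaling described above can be sketched in a few lines. The helper `to_reduced_grid` is hypothetical (it is not part of mlgw), and the total mass is assumed to be expressed in solar masses.

```python
def to_reduced_grid(times, amplitude_22, m_tot):
    """Shift the grid so the 22-mode amplitude peak sits at t = 0, then divide by M_tot."""
    i_peak = max(range(len(amplitude_22)), key=lambda i: amplitude_22[i])
    t_peak = times[i_peak]
    return [(t - t_peak) / m_tot for t in times]

# Toy example: a 5-sample amplitude peaking at t = 0.3 s, total mass 30
times = [0.0, 0.1, 0.2, 0.3, 0.4]
amplitude_22 = [0.1, 0.2, 0.5, 1.0, 0.3]
reduced = to_reduced_grid(times, amplitude_22, m_tot=30.0)
```

After the shift the inspiral sits at negative reduced times and the sample at the amplitude peak maps to t' = 0.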

## Usage of mlgw

The model outputs the GW strain h(t; m_1, m_2, s_1, s_2, d_L, i, phi), where m_i and s_i are the BH masses and spins, d_L is the luminosity distance from the source, i is the inclination angle and phi is a reference phase. The (l,m) modes included depend on the model considered: use mlgw.GW_generator.GW_generator.list_modes() to list them.
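To show how the inclination and the reference phase enter the strain, here is a sketch that builds the two polarizations from the 22 mode alone. It assumes the common convention h_+ - i h_x = sum over (l,m) of h_lm times the spin-weight -2 spherical harmonic, and the aligned-spin symmetry h_{l,-m} = (-1)^l conj(h_lm). The conventions used inside mlgw may differ (for instance in signs), the distance scaling is left out, and the function names are hypothetical.

```python
import cmath
import math

NORM_22 = math.sqrt(5.0 / (64.0 * math.pi))

def sYlm_22(iota, phi):
    """Spin-weight -2 spherical harmonic for (l, m) = (2, 2)."""
    return NORM_22 * (1.0 + math.cos(iota)) ** 2 * cmath.exp(2j * phi)

def sYlm_2m2(iota, phi):
    """Spin-weight -2 spherical harmonic for (l, m) = (2, -2)."""
    return NORM_22 * (1.0 - math.cos(iota)) ** 2 * cmath.exp(-2j * phi)

def strain_from_22(h22, iota, phi):
    """Return (h_plus, h_cross) from one complex sample of the 22 mode."""
    h2m2 = h22.conjugate()  # aligned-spin symmetry for l = 2
    h = h22 * sYlm_22(iota, phi) + h2m2 * sYlm_2m2(iota, phi)
    return h.real, -h.imag  # assumed convention: h = h_plus - i h_cross

# Face-on and edge-on binaries for the same mode sample
face_on = strain_from_22(cmath.rect(1.0, -0.7), iota=0.0, phi=0.0)
edge_on = strain_from_22(cmath.rect(1.0, -0.7), iota=math.pi / 2, phi=0.0)
```

With these assumptions a face-on binary carries both polarizations at full strength, while an edge-on binary has a vanishing cross polarization.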

The mlgw package consists of five modules.

• GW_generator: holds the class mode_generator, which builds all the components of the fit for a single mode (i.e. PCA + regressions for each PC). The class GW_generator collects many instances of mode_generator and sums them together, including the dependence on spherical harmonics.
• EM_MoE: holds an implementation of a MoE model, as well as the softmax classifier it requires.
• ML_routines: holds an implementation of the PCA model, as well as a GDA classifier and a routine to do data augmentation.
• GW_helper: provides routines to generate a dataset and to evaluate the closeness between waves. This is useful to assess the model's ability to reproduce the original waves.
• fit_model: provides routines to fit the MoE + PCA model.
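One common way to quantify the closeness between two waves is the mismatch, defined as one minus a normalized overlap. The sketch below is a simplified time-domain version with a flat noise weighting. It is an assumption for illustration, not the routine provided by GW_helper, which may work differently (for example in the frequency domain).

```python
import cmath
import math

def inner(h1, h2):
    """Discrete inner product between two complex time series on the same grid."""
    return sum(a * b.conjugate() for a, b in zip(h1, h2))

def mismatch(h1, h2):
    """1 - normalized overlap; the absolute value removes a constant phase offset."""
    norm = math.sqrt(inner(h1, h1).real * inner(h2, h2).real)
    return 1.0 - abs(inner(h1, h2)) / norm

times = [i * 0.01 for i in range(200)]
h_a = [cmath.exp(-1j * 40 * t) for t in times]
h_b = [cmath.exp(-1j * (40 * t + 0.3)) for t in times]  # same wave, constant phase shift
h_c = [cmath.exp(-1j * 45 * t) for t in times]          # different frequency

m_ab = mismatch(h_a, h_b)
m_ac = mismatch(h_a, h_c)
```

A mismatch close to zero means the two waves agree up to an overall amplitude and phase, while waves that dephase over the grid give a value of order one.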

Class GW_generator provides the method get_WF, which returns the plus and cross polarizations of the waveform. The orbital parameters must be specified: the method accepts N data points as an (N,D) np.array. The D features must follow one of these layouts:

```
D = 3   [q, spin1_z, spin2_z]
D = 4   [m1, m2, spin1_z, spin2_z]
D = 5   [m1, m2, spin1_z, spin2_z, D_L]
D = 6   [m1, m2, spin1_z, spin2_z, D_L, inclination]
D = 7   [m1, m2, spin1_z, spin2_z, D_L, inclination, phi_0]
D = 14  [m1, m2, spin1 (3,), spin2 (3,), D_L, inclination, phi_0, long_asc_nodes, eccentricity, mean_per_ano]
```

Method __call__ accepts only the last layout (D = 14).
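When preparing inputs, it can help to attach names to the entries of a parameter vector according to its length. The helper below is hypothetical (not part of mlgw) and simply mirrors the layouts listed above; the x, y, z ordering of the three spin components in the D = 14 layout is an assumption.

```python
LAYOUTS = {
    3: ["q", "spin1_z", "spin2_z"],
    4: ["m1", "m2", "spin1_z", "spin2_z"],
    5: ["m1", "m2", "spin1_z", "spin2_z", "D_L"],
    6: ["m1", "m2", "spin1_z", "spin2_z", "D_L", "inclination"],
    7: ["m1", "m2", "spin1_z", "spin2_z", "D_L", "inclination", "phi_0"],
    14: ["m1", "m2",
         "spin1_x", "spin1_y", "spin1_z",   # assumed component order
         "spin2_x", "spin2_y", "spin2_z",
         "D_L", "inclination", "phi_0",
         "long_asc_nodes", "eccentricity", "mean_per_ano"],
}

def name_parameters(theta):
    """Map a single parameter vector to a {name: value} dict based on its length."""
    names = LAYOUTS.get(len(theta))
    if names is None:
        raise ValueError("Unsupported number of features: %d" % len(theta))
    return dict(zip(names, theta))

params = name_parameters([20, 10, 0.5, -0.3])
q = params["m1"] / params["m2"]  # mass ratio q = m1/m2
```

A check of this kind catches a wrongly sized input before it reaches the generator.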

The user should also provide a time grid on which to evaluate the WF. The grid must follow the convention that the amplitude peak of the 22 mode occurs at the origin of time (i.e. the inspiral takes place at negative times). An optional mode list can also be provided to control which higher modes are included in the WF.

Method get_modes provides the bare (l,m) modes. The user can choose the output type (amplitude and phase, or real and imaginary part) and which modes are returned.
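The two output types carry the same information. The sketch below converts a single sample between them, assuming the mode is written as h_lm = A exp(i phase); the sign convention of the phase in mlgw may differ, and the function names are hypothetical.

```python
import cmath

def to_real_imag(amplitude, phase):
    """Amplitude and phase -> real and imaginary part of the mode sample."""
    h = cmath.rect(amplitude, phase)
    return h.real, h.imag

def to_amp_phase(real, imag):
    """Real and imaginary part -> amplitude and phase (phase wrapped to (-pi, pi])."""
    return cmath.polar(complex(real, imag))

re, im = to_real_imag(2.0, 0.5)
amp, ph = to_amp_phase(re, im)
```

For a full time series the recovered phase is wrapped to (-pi, pi], so it has to be unwrapped (e.g. with np.unwrap) to obtain a smooth phase evolution.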

## Installation & documentation

To install the package:

```
pip install mlgw
```

It requires numpy and scipy, both available on PyPI.

A number of tutorials are available to the interested user.

• generate_WF.py: generates a WF and shows the basic features of the model.
• test_HM.py: tests the accuracy of the model. It requires a local installation of the EOB model TEOBResumS and compares the mlgw results with those of TEOBResumS.
• play_WF.py: an interactive plot showing how a WF changes when the masses, spins and geometrical variables change.

A number of pre-fitted models are released together with the package. The available models can be listed with mlgw.GW_generator.list_models(). However, users are welcome to fit their own model using the module mlgw.fit_model. Building a model requires two steps:

• Generating a dataset of WFs: a dataset of WFs is created for each (l,m) mode to be included. Here the user chooses the range of orbital parameters covered by the dataset, as well as the length in time of the WFs. See generate_dataset.py for a practical guide.
• Fitting the model on the dataset: for each mode, a PCA model and a MoE model are fitted to the available data. Once the various ML models are properly gathered together, mlgw is ready to be used. See do_the_fit.py for more information.

The tutorials above are intended only to present a basic usage. For more advanced use or for more information, please refer to the code documentation:

```python
import mlgw
help(mlgw)
help(mlgw.<module_name>)
```

For more information on the model you can have a look at the presentation paper: arxiv.org/abs/2011.01958.

For full source code (and much more) see: https://github.com/stefanoschmidt1995/MLGW