User-defined filters for the Fink broker.
This repository contains filters used to define which information will be sent to the community.
How to contribute?
Step 0: Fork this repository
Fork and clone the repository, and create a new folder in
fink_filters. The name of the new folder does not matter much, but try to make it meaningful as much as possible! Let's call it
filter_rrlyr for the sake of this example.
Step 1: Define your filter
A filter is typically a Python routine that selects which alerts need to be sent based on user-defined criteria. Criteria are based on the alert entries: position, flux, properties, ... You can find what's in alert here [link to be added].
In this example, let's imagine you want to receive all alerts flagged as RRLyr by the xmatch module. You would create a file called
filter.py and define a simple routine (see full template in the repo):
@pandas_udf(BooleanType(), PandasUDFType.SCALAR) # <- mandatory def rrlyr(cdsxmatch: Any) -> pd.Series: """ Return alerts identified as RRLyr by the xmatch module. Parameters ---------- cdsxmatch: Spark DataFrame Column Alert column containing the cross-match values Returns ---------- out: pandas.Series of bool Return a Pandas DataFrame with the appropriate flag: false for bad alert, and true for good alert. """ # Here goes your logic mask = cdsxmatch.values == "RRLyr" return pd.Series(mask)
- Note the use of the decorator is mandatory. It is a decorator for Apache Spark, and it specifies the output type as well as the type of operation. Just copy and paste it for simplicity.
- The name of the routine will be used as the name of the Kafka topic. So once the filter loaded, you would subscribe to the topic
rrlyrto receive alerts from this filter. Hence choose a meaningful name!
- The name of the input argument must match the name of an alert entry. Here
cdsxmatchis one column added by the xmatch module.
- You can have several input columns. Just add them one after the other:
@pandas_udf(BooleanType(), PandasUDFType.SCALAR) # <- mandatory def filter_w_several_input(acol: Any, anothercol: Any) -> pd.Series: """ Documentation """ pass
Do not forget to include the
__init__.py file in your new folder to make it a package.
Step 3: Open a pull request
Once your filter is done, we will review it. The criteria for acceptance are:
- The filter works ;-)
- The volume of data to be transfered is tractable on our side. Keep in mind, LSST incoming stream is 10 million alerts per night, or ~1TB/night. Hence your filter must focus on a specific aspect of the stream, to reduce the outgoing volume of alerts. Based on your submission, we will provide estimate of the volume to be transfered.
Step 4: Play!
If your filter is accepted, it will be plugged in the broker, and you will be able to receive your alerts in real-time using the fink-client. Note that we do not keep alerts forever available in the broker. While the retention period is not yet defined, you can expect emitted alerts to be available no longer than one week.
If instead you want to install the package, you can just pip it:
pip install fink_filters
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
|Filename, size||File type||Python version||Upload date||Hashes|
|Filename, size fink_filters-0.1.5-py3-none-any.whl (8.2 kB)||File type Wheel||Python version py3||Upload date||Hashes View hashes|
|Filename, size fink-filters-0.1.5.tar.gz (3.4 kB)||File type Source||Python version None||Upload date||Hashes View hashes|
Hashes for fink_filters-0.1.5-py3-none-any.whl