Distribution-based anomaly detection for time series.
>>> from fossa import LastWindowX2AnomalyDetector
>>> clf = LastWindowX2AnomalyDetector(p_threshold=0.005, normalize=True)
>>> clf.fit(historic_data_df)
>>> clf.predict(new_data)
                     direction
date       category
2018-06-01 hockey          1.0
           footbal         0.0
           soccer         -1.0
           tennis          0.0
pip install fossa
- scikit-learn-like classifier API.
- Pickle-able classifier objects.
- Pure Python.
- Supports Python 3.5+.
- Fully tested.
All anomaly detectors are designed to receive, as the fit parameter, a pandas DataFrame with a two-level multi-index, the first level indexing time windows and the second indexing categories/topics, and a single column of a numeric dtype giving the frequency of each category in each window.
When detecting trends, a similarly indexed DataFrame with detection results is returned, giving detected trends per time window and category.
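A minimal sketch of building input in this format with pandas. The level names (date, category), the column name (value) and the counts themselves are illustrative assumptions, not names required by fossa:

```python
import pandas as pd

# Hypothetical sample data: two daily time windows with per-category counts.
# The second window has no "tennis" row; categories may differ across windows.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2018-06-01"), "hockey"),
        (pd.Timestamp("2018-06-01"), "soccer"),
        (pd.Timestamp("2018-06-01"), "tennis"),
        (pd.Timestamp("2018-06-02"), "hockey"),
        (pd.Timestamp("2018-06-02"), "soccer"),
    ],
    names=["date", "category"],  # assumed level names
)

# A single numeric column giving the frequency of each category per window.
historic_data_df = pd.DataFrame({"value": [12, 40, 7, 15, 38]}, index=index)
```

A frame built this way has the two index levels and single numeric column described above, so it should be suitable to pass to fit.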
All anomaly detector objects in fossa have an identical API:
- fit - Receives a history of time-windowed distributions to train on and fits the detector on it (see the Data Format section for the exact format). The set of categories may differ across time windows, or between the historic windows and the windows given for detection; detection is done for the union of categories over all committee and new time windows.
- partial_fit - The same as fit, but can also incrementally fit an already-fit detector without necessarily ignoring all previously fitted data. Detectors that do not support incremental fitting raise a NotImplementedError exception when this method is called.
- detect_trends - Receives a new DataFrame (in the correct format) and detects, for each of the time windows in it, trends for each category. In addition to the direction column, which indicates trend direction (-1 for a downward trend, 0 for no trend and 1 for an upward trend), the returned DataFrame might contain additional columns detailing detection confidence or probability, such as p-values or committee vote results.
- predict - Like detect_trends, except the returned dataframe always contains only a single column of detected trend directions.
The detectors in this family all operate similarly: every detector compares new time windows to a set of committee windows that represent its idea of the relevant history and characteristic behaviour of the data. One detector might look at the same hour on the same weekday across several weeks, while another might look at the same hour in each of the last 10 or 20 days, or at the preceding few hours.
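To make the committee strategies mentioned above concrete, here is a rough sketch that computes the start times of committee windows for an hourly window. The function names and default sizes are assumptions for illustration only; they do not describe how fossa's detectors select their committees internally:

```python
from datetime import datetime, timedelta


def same_hour_same_weekday(window_start, weeks=4):
    """Same hour on the same weekday over the previous `weeks` weeks."""
    return [window_start - timedelta(weeks=k) for k in range(1, weeks + 1)]


def same_hour_last_days(window_start, days=10):
    """Same hour on each of the previous `days` days."""
    return [window_start - timedelta(days=k) for k in range(1, days + 1)]


def preceding_hours(window_start, hours=3):
    """The `hours` hourly windows immediately before the new window."""
    return [window_start - timedelta(hours=k) for k in range(1, hours + 1)]


new_window = datetime(2018, 6, 1, 14)  # a Friday, 14:00
weekly_committee = same_hour_same_weekday(new_window, weeks=3)
```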
For each of the time windows given to the detect_trends or predict methods, a one-vs-all distribution is generated for each of the categories in the window (and is possibly normalized, depending on the specific detector and its initialization parameters). Then, for each of these distributions, chi-squared tests are performed between it and the corresponding distributions in each of the committee time windows. Each committee member “votes” on whether a trend is detected or not, and a decision is generated by some preset voting rule (for example, majority vote).
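The procedure above can be sketched in pure Python as follows. This is an illustrative approximation under stated assumptions, not fossa's actual implementation: windows are plain dicts of category counts, the test is a chi-squared goodness-of-fit test with one degree of freedom (whose p-value equals erfc(sqrt(stat / 2))), a committee member abstains when the test is undefined, and the voting rule is a strict majority. All function names are hypothetical.

```python
import math


def one_vs_all(window_counts, category):
    """Return (count of `category`, count of all other categories)."""
    own = window_counts.get(category, 0)
    rest = sum(window_counts.values()) - own
    return own, rest


def committee_member_vote(new_counts, member_counts, category, p_threshold=0.005):
    """Vote 1 (upward), -1 (downward) or 0 (no trend) for one committee window."""
    obs_own, obs_rest = one_vs_all(new_counts, category)
    mem_own, mem_rest = one_vs_all(member_counts, category)
    new_total = obs_own + obs_rest
    mem_total = mem_own + mem_rest
    if new_total == 0 or mem_own == 0 or mem_rest == 0:
        return 0  # assumption: abstain when expected counts would be zero
    # Expected counts: the member's proportions scaled to the new window's total.
    exp_own = new_total * mem_own / mem_total
    exp_rest = new_total * mem_rest / mem_total
    stat = (obs_own - exp_own) ** 2 / exp_own + (obs_rest - exp_rest) ** 2 / exp_rest
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared survival function, 1 dof
    if p_value >= p_threshold:
        return 0
    return 1 if obs_own > exp_own else -1


def majority_vote(votes):
    """Return a direction only if more than half of the committee agrees on it."""
    ups = sum(v == 1 for v in votes)
    downs = sum(v == -1 for v in votes)
    if ups > len(votes) / 2:
        return 1
    if downs > len(votes) / 2:
        return -1
    return 0


def detect_direction(new_counts, committee, category, p_threshold=0.005):
    votes = [
        committee_member_vote(new_counts, member, category, p_threshold)
        for member in committee
    ]
    return majority_vote(votes)


# Hypothetical counts: three committee windows and one new window.
committee = [
    {"hockey": 10, "soccer": 50, "tennis": 40},
    {"hockey": 12, "soccer": 48, "tennis": 40},
    {"hockey": 9, "soccer": 52, "tennis": 39},
]
new_window = {"hockey": 40, "soccer": 20, "tennis": 40}
directions = {c: detect_direction(new_window, committee, c) for c in new_window}
```

In this toy data hockey's share jumps well above every committee window and soccer's drops well below, while tennis stays roughly constant, so the three categories receive an upward, a downward and a no-trend decision respectively.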
The current package maintainer (and one of the authors) is Shay Palachy (firstname.lastname@example.org); you are more than welcome to approach him for help. Contributions are very welcome.
git clone email@example.com:shaypal5/fossa.git
Install in development mode, including test dependencies:
cd fossa
pip install -e '.[test]'
To run the tests use:
cd fossa
pytest
The project is documented using the NumPy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.
Additionally, if you update this README.rst file, use python setup.py checkdocs to validate it compiles.