flight-ad

flight-ad is a Python package for anomaly detection in the aviation domain built on top of scikit-learn.

It provides:

  • An implementation of an anomaly detection pipeline;
  • A DataBinder object for loading and transforming the data within the pipeline on the fly;
  • A DataWrangler object for building a data wrangling pipeline;
  • A StatisticalLearner object for binding scikit-learn's pipelines and integrating them into the anomaly detection workflow;
  • Visualization tools for assessing potential anomalies;
  • Reporting tools for analyzing results;
  • Sample airplane sensor data, repackaged from NASA's DASHlink for the purpose of evaluating and advancing data mining capabilities that can be used to promote aviation safety;
  • Adaptations of machine learning algorithms, such as a DBSCAN implementation that calculates the hyperparameter epsilon from the input data.
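The last item mentions a DBSCAN variant that derives epsilon from the input data. The rule flight-ad uses is not described here, so the following is only a minimal sketch of one common heuristic: take a high quantile of each point's distance to its k-th nearest neighbor. The function name `estimate_eps` and the parameters `k` and `quantile` are hypothetical and are not part of flight-ad's API.

```python
import numpy as np


def estimate_eps(X, k=4, quantile=0.9):
    """Estimate a DBSCAN epsilon from the k-th nearest-neighbor distances.

    Hypothetical helper for illustration; not flight-ad's implementation.
    Requires more than k samples.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory, fine for a small example).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbor.
    kth_distances = np.sort(dists, axis=1)[:, k]
    return float(np.quantile(kth_distances, quantile))


# Two tight synthetic clusters, far apart from each other.
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(0.0, 0.1, size=(50, 2)),
    rng.normal(5.0, 0.1, size=(50, 2)),
])
eps = estimate_eps(X)
```

The resulting value could then be passed as the `eps` argument of a density-based clusterer such as scikit-learn's `DBSCAN`. A knee-detection rule on the sorted k-distance curve is a frequent alternative to the fixed quantile used here.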

Installation

The easiest way to install flight-ad is using pip from your virtual environment.

Directly from GitHub:

pip install git+https://github.com/coelhosilva/flight-ad.git

Examples

The following example uses the package to construct an anomaly detection pipeline. Note that the sample dataset may take up roughly 1 GB of disk space.

from flight_ad.datasets import load_dashlink_bindings
from flight_ad.utils.data import DataBinder
from flight_ad.wrangling import DataWrangler
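# NOTE: wrangling_functions is not part of flight-ad; it appears to be a
# user-supplied local module that defines the wrangling steps used below.
from wrangling_functions import preprocess, change_col, resample, select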
from flight_ad.transformations import reshape_df_interspersed
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from flight_ad.cluster import DBSCAN
from flight_ad.learn import FunctionTransformer
from flight_ad.learn import StatisticalLearner
from flight_ad.pipeline import AnomalyDetectionPipeline
from flight_ad.report import clustering_info, silhouette

# Binder
data_bindings = load_dashlink_bindings(download=True)
binder = DataBinder(data_bindings)

# Wrangler
wrangling_steps = [
    ('preprocess_flight', preprocess),
    ('resample_dataframe', resample),
    ('change_col', change_col),
    ('select_col', select),
]
wrangler = DataWrangler(wrangling_steps, memorize='change_col')

# Learner
learning_steps = {
    'preprocessing': [
        ('reshaper', FunctionTransformer(reshape_df_interspersed)),
        ('scaler', StandardScaler()),
        ('pca', PCA())
    ],
    'training': [
        ('dbscan', DBSCAN())
    ]
}
learner = StatisticalLearner(learning_steps, record='pca')

# Pipeline
ad_pipeline = AnomalyDetectionPipeline(binder, wrangler, learner)
ad_pipeline.fit()

# Results
labels, n_clusters, n_noise = clustering_info(learner.pipeline['dbscan'])
avg_silhouette, _ = silhouette(learner.partial_data['pca'], labels)
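
The results step above unpacks cluster labels, a cluster count, and a noise count. As a rough illustration of how such counts can be derived from labels, here is a minimal sketch. It assumes the labels follow scikit-learn's DBSCAN convention, where noise points are labeled -1. The helper `summarize_labels` is hypothetical and is not how `clustering_info` is necessarily implemented.

```python
def summarize_labels(labels):
    """Return (n_clusters, n_noise) for an iterable of integer cluster labels.

    Assumes the scikit-learn DBSCAN convention: the label -1 marks noise.
    Hypothetical helper for illustration; not part of flight-ad.
    """
    labels = [int(label) for label in labels]
    # Noise points carry the label -1 and do not form a cluster.
    n_noise = labels.count(-1)
    # Every distinct non-negative label is one cluster.
    n_clusters = len(set(labels) - {-1})
    return n_clusters, n_noise


example_labels = [0, 0, 1, -1, 1, 2, -1]
n_clusters, n_noise = summarize_labels(example_labels)
print(n_clusters, n_noise)  # prints "3 2"
```

In an anomaly detection setting, the noise points are the natural candidates for closer inspection, since they did not fall into any dense region.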

Package structure

TBD.

Dependencies

flight-ad requires:

  • Python (>=3.6)
  • NumPy
  • pandas
  • scikit-learn
  • matplotlib
  • tqdm

Contributions

We welcome and encourage new contributors to help test flight-ad and add new functionality. Any input, feedback, bug report or contribution is welcome.

To contact the author, email coelho@ita.br.

Citation

If you use flight-ad in a scientific publication, we would appreciate a citation.

BibTeX: TBD.

Citation string: TBD.
