Detecting cumulative changes in data.
canomaly
Project Description
This package detects specific types of anomalies, with an emphasis on cumulative changes.
Installation
This package can be installed from PyPI using
pip install canomaly
or
pip3 install canomaly
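To confirm that the install succeeded, one option is to query the installed version from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and not part of canomaly.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if canomaly is installed, otherwise None.
print(installed_version('canomaly'))
```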
Example Usage
>>> import pandas as pd
>>> from canomaly.searchtools import cumrexpy
>>> # Get some data
>>> data = {
...     'date': [
...         '2018-11-20',
...         '2018-11-21',
...         '2018-11-22',
...         '2018-11-22',
...         '2018-11-23',
...         '2018-11-24'],
...     'email': [
...         'john.doe@example.com',
...         'jane.smith@example.com',
...         'bob-johnson_123@example.com',
...         'sarah@mydomain.co.uk',
...         'frank@mydomain.com',
...         'jessica_lee@mydomain.com'
...     ]
... }
>>> df = pd.DataFrame(data)
>>> df['date'] = pd.to_datetime(df['date'])
>>> # Extract regular expressions
>>> cumrexpy(df, 'email', 'date')
date
2018-11-20 [^john\.doe@example\.com$]
2018-11-21 [^[a-z]{4}\.[a-z]{3,5}@example\.com$]
2018-11-22 [^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$,...
2018-11-23 [^frank@mydomain\.com$, ^[a-z]{4,5}[.@][a-z]+[...
2018-11-24 [^frank@mydomain\.com$, ^[a-z]+[.@_][a-z]+[.@]...
Name: email_grouped, dtype: object
We can look at the results in markdown for clarity.
date | email_grouped |
---|---|
2018-11-20 00:00:00 | ['^john\.doe@example\.com$'] |
2018-11-21 00:00:00 | ['^[a-z]{4}\.[a-z]{3,5}@example\.com$'] |
2018-11-22 00:00:00 | ['^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
2018-11-23 00:00:00 | ['^frank@mydomain\.com$', '^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
2018-11-24 00:00:00 | ['^frank@mydomain\.com$', '^[a-z]+[.@_][a-z]+[.@][a-z]+\.[a-z]{2,3}$', '^bob\-johnson_123@example\.com$'] |
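One way to read the table is to ask, for each date, which patterns appeared or disappeared compared with the previous date, and whether a given value matches any known pattern. The sketch below does this with the standard library only. The pattern lists are copied from the table above; the helpers pattern_changes and matches_any are hypothetical names for illustration and are not part of canomaly.

```python
import re

# Pattern lists copied verbatim from the table above, keyed by date.
patterns_by_date = {
    '2018-11-20': [r'^john\.doe@example\.com$'],
    '2018-11-21': [r'^[a-z]{4}\.[a-z]{3,5}@example\.com$'],
    '2018-11-22': [r'^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$',
                   r'^bob\-johnson_123@example\.com$'],
    '2018-11-23': [r'^frank@mydomain\.com$',
                   r'^[a-z]{4,5}[.@][a-z]+[.@][a-z]+\.[a-z]{2,3}$',
                   r'^bob\-johnson_123@example\.com$'],
    '2018-11-24': [r'^frank@mydomain\.com$',
                   r'^[a-z]+[.@_][a-z]+[.@][a-z]+\.[a-z]{2,3}$',
                   r'^bob\-johnson_123@example\.com$'],
}


def pattern_changes(by_date):
    """Return {date: (added, removed)} relative to the previous date."""
    changes = {}
    previous = set()
    # ISO-formatted date strings sort chronologically.
    for date in sorted(by_date):
        current = set(by_date[date])
        changes[date] = (sorted(current - previous),
                         sorted(previous - current))
        previous = current
    return changes


def matches_any(value, patterns):
    """True if `value` matches at least one regular expression."""
    return any(re.search(p, value) for p in patterns)


changes = pattern_changes(patterns_by_date)
added, removed = changes['2018-11-23']
print(added, removed)

# An email seen on 2018-11-23, checked against the patterns known a day earlier.
print(matches_any('frank@mydomain.com', patterns_by_date['2018-11-22']))
```

On 2018-11-23 the only addition is the literal pattern for frank@mydomain.com, because that address matches none of the patterns known on 2018-11-22. A value that introduces a new pattern in this way is a natural candidate for closer inspection.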
Build Documentation Locally
cd /path/to/canomaly/docs
make html
Download files
Source Distribution
canomaly-0.0.5.tar.gz (16.4 kB)
Built Distribution
canomaly-0.0.5-py3-none-any.whl (15.9 kB)