Spam filtering module with Machine Learning using SVM.

Project description

spampy is a classifier that uses Support Vector Machines (SVMs) to classify raw emails as spam or not spam.

Support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
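As a rough illustration of the idea above, the following minimal sketch trains a linear SVM on a toy dataset by subgradient descent on the regularized hinge loss. It is not spampy's implementation (spampy lists scikit_learn among its dependencies). The function names, the two toy features (counts of the words "free" and "meeting"), and the hyperparameters are all assumptions chosen for illustration.

```python
import random


def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Train a linear SVM with subgradient descent on the hinge loss.

    Labels must be +1 (spam) or -1 (not spam). Hyperparameters are
    illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            # L2 regularization shrinks the weights slightly on every step.
            weights = [w - lr * reg * w for w in weights]
            if margin < 1:
                # The sample is misclassified or inside the margin,
                # so follow the hinge-loss subgradient.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Toy data (hypothetical): [count of "free", count of "meeting"] per email.
samples = [[3, 0], [2, 0], [4, 1], [0, 2], [0, 3], [1, 4]]
labels = [1, 1, 1, -1, -1, -1]

weights, bias = train_linear_svm(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

The classifier only outputs a category, not a probability, which is what "non-probabilistic binary linear classifier" refers to.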

Many email services today provide spam filters that classify emails as spam or non-spam with high accuracy. spampy is a learning project that you can use to filter spam emails.

spampy uses two different datasets for classification. The first is bundled with the project under the spampy/datasets/ folder. The second is the Enron-Spam dataset; the spampy folder contains a shell script that downloads and extracts it for you.

Project tree

  • email_processor: Helper that collects features and labels from the datasets.

  • spam_classifier: Classifies given raw emails.

  • dataset_downloader: Enron dataset downloader, which uses dataset_downloader.sh.
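To give a sense of what "collecting features" from a raw email can mean, here is a minimal sketch of one common approach: a binary bag-of-words vector over a fixed vocabulary. This is an assumption for illustration only, not email_processor's actual code; the function names and the vocabulary are hypothetical.

```python
import re


def tokenize(raw_email):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", raw_email.lower())


def to_feature_vector(raw_email, vocabulary):
    """Return a binary vector: 1 where a vocabulary word occurs in the email."""
    tokens = set(tokenize(raw_email))
    return [1 if word in tokens else 0 for word in vocabulary]


# Hypothetical vocabulary; a real one would be built from a training corpus.
vocabulary = ["free", "meeting", "money", "report"]

features = to_feature_vector("Win FREE money now!!!", vocabulary)
print(features)  # prints [1, 0, 1, 0]
```

Vectors like this, paired with a spam or not-spam label per email, are the kind of input a classifier such as an SVM is trained on.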

Dependency List

  • scikit_learn

  • scipy

  • numpy

  • nltk

  • click (for CLI)

spam_classifier has two main functions, each of which classifies a given raw email:

  • classify_email

  • classify_email_with_enron

CLI

To list the available commands, run python -m spampy -h:

Spam filtering module with Machine Learning using SVM.
Usage
  $ python spampy [<options>]
Options
  --help, -h              Display help message
  --download, -d          Download enron dataset
  --eclassify, -ec        Classify given raw email with enron dataset, prompts for raw email
  --classify, -c          Classify given raw email, prompts for raw email
  --version, -v           Display installed version
Examples
  $ python spampy --help
  $ python spampy --download
  $ python spampy --eclassify
  $ python spampy --classify


