NLP Toolbox

Reason

An easy-to-use natural language processing toolbox for Python.

Toolbox

  • Classifier
  • Machine learning metrics
  • Confusion matrix
  • Word and sentence tokenizer
  • Frequency distribution
  • Bigrams, trigrams and n-grams

Install

Install the latest stable version using pip:

pip install reason

Quick Start

Classification:

>>> from reason.classify import NaiveBayesClassifier
>>> classifier = NaiveBayesClassifier(train_set)
>>> y_pred = classifier.classify(new_data)

>>> from reason.metrics import accuracy
>>> accuracy(y_true, y_pred)
0.9358
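
For readers unfamiliar with the metric, accuracy is the fraction of predictions that match the true labels. The following is a minimal pure-Python sketch of that definition, not Reason's implementation; the name `accuracy_sketch` and the sample labels are hypothetical.

```python
def accuracy_sketch(y_true, y_pred):
    """Return the fraction of predictions equal to the true label.

    Hypothetical helper for illustration; not part of Reason.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of empty input")
    # Count positions where the prediction matches the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: three of the four predictions are correct.
sample_true = ["pos", "neg", "pos", "neg"]
sample_pred = ["pos", "neg", "neg", "neg"]
print(accuracy_sketch(sample_true, sample_pred))  # 0.75
```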

Confusion Matrix:

>>> from reason.metrics import ConfusionMatrix
>>> cm = ConfusionMatrix(y_true, y_pred)

>>> cm
68 21 13
16 70 11
14 10 77

>>> cm[actual, predicted]
16
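
A confusion matrix counts how often each actual label was predicted as each label. The sketch below builds one in plain Python, assuming rows are actual labels and columns are predicted labels (which matches the `cm[actual, predicted]` indexing above, but is otherwise an assumption about layout). The function name and sample data are hypothetical, and this is not Reason's implementation.

```python
def confusion_matrix_sketch(y_true, y_pred):
    """Return (labels, matrix) where matrix[i][j] counts samples whose
    actual label is labels[i] and whose predicted label is labels[j].

    Hypothetical helper for illustration; not part of Reason.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return labels, matrix


# Made-up labels for demonstration.
sample_true = ["a", "a", "b", "b", "c"]
sample_pred = ["a", "b", "b", "b", "a"]
labels, matrix = confusion_matrix_sketch(sample_true, sample_pred)
for row in matrix:
    print(*row)
```

The diagonal holds the correctly classified counts, so summing it and dividing by the total number of samples gives the accuracy.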

>>> from reason.metrics import BinaryConfusionMatrix
>>> bcm = BinaryConfusionMatrix(b_y_true, b_y_pred)

>>> bcm.precision()
0.7837
>>> bcm.recall()
0.8055
>>> bcm.f1_score()
0.7944
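
As a reminder of what these three scores measure, here is a minimal sketch that derives them from true-positive, false-positive and false-negative counts. It is not Reason's implementation; `binary_scores_sketch`, its `positive` parameter and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is a design choice of this sketch.

```python
def binary_scores_sketch(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive label.

    Hypothetical helper for illustration; not part of Reason.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
sample_true = [1, 1, 1, 0, 0, 0]
sample_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = binary_scores_sketch(sample_true, sample_pred)
print(round(precision, 4), round(recall, 4), round(f1, 4))  # 0.6667 0.6667 0.6667
```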

Word Tokenization:

>>> from reason.tokenize import word_tokenize

>>> text = "Testing reason0.1.0, (on: 127.0.0.1). Cool stuff..."
>>> word_tokenize(text, 'alphanumeric')
['Testing', 'reason0.1.0', 'on', '127.0.0.1', 'Cool', 'stuff']
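
To give a sense of how a tokenizer can keep dotted tokens such as version numbers and IP addresses intact while dropping punctuation, here is a rough sketch using the standard `re` module. The pattern and the name `alphanumeric_tokens_sketch` are assumptions made for illustration; Reason's actual 'alphanumeric' pattern may differ.

```python
import re

# Runs of ASCII letters/digits, optionally joined by single dots that are
# followed by more letters/digits (so "127.0.0.1" stays together while a
# trailing "..." is dropped). Assumed pattern, not taken from Reason.
ALNUM_PATTERN = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def alphanumeric_tokens_sketch(text):
    """Return alphanumeric tokens found in text (hypothetical helper)."""
    return ALNUM_PATTERN.findall(text)


sample = "Testing reason0.1.0, (on: 127.0.0.1). Cool stuff..."
print(alphanumeric_tokens_sketch(sample))
# ['Testing', 'reason0.1.0', 'on', '127.0.0.1', 'Cool', 'stuff']
```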

Sentence Tokenization:

>>> from reason.tokenize import sent_tokenize

>>> text = "Hey, what's up? I love using Reason library!"
>>> sents = sent_tokenize(text)
>>> for sent in sents:
...     print(sent)
Hey, what's up?
I love using Reason library!
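
For comparison, a naive sentence splitter can be sketched with a single regular expression that breaks on whitespace following `.`, `!` or `?`. This is only an illustration under that simplifying assumption, not how `sent_tokenize` is implemented; the name `split_sentences_sketch` is hypothetical, and the approach splits incorrectly on abbreviations such as "e.g.".

```python
import re

# Split on whitespace that directly follows sentence-ending punctuation.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences_sketch(text):
    """Return a list of sentences using a naive punctuation rule
    (hypothetical helper, not part of Reason)."""
    stripped = text.strip()
    if not stripped:
        return []
    return SENTENCE_BOUNDARY.split(stripped)


sample = "Hey, what's up? I love using Reason library!"
for sentence in split_sentences_sketch(sample):
    print(sentence)
# Hey, what's up?
# I love using Reason library!
```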

Frequency Distribution:

>>> from reason.analysis import FreqDist

>>> words = ['hey', 'hey', 'oh', 'oh', 'oh', 'yeah']
>>> fd = FreqDist(words)

>>> fd
Frequency Distribution
Most-Common: [('oh', 3), ('hey', 2), ('yeah', 1)]
>>> fd.most_common(2)
[('oh', 3), ('hey', 2)]
>>> fd['yeah']
1
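
The same counts can be reproduced with the standard library's `collections.Counter`, which may help when comparing results. This is only an equivalent sketch; whether `FreqDist` is built on `Counter` is not stated here and should not be assumed.

```python
from collections import Counter

sample_words = ['hey', 'hey', 'oh', 'oh', 'oh', 'yeah']
counts = Counter(sample_words)

# Items ordered from most to least frequent.
print(counts.most_common(2))  # [('oh', 3), ('hey', 2)]
# Look up the count of a single word.
print(counts['yeah'])         # 1
```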

Ngrams:

>>> sent = 'Reason is easy to use'

>>> from reason.util import bigrams
>>> bigrams(sent)
[('Reason', 'is'), ('is', 'easy'), ('easy', 'to'), ('to', 'use')]

>>> from reason.util import trigrams
>>> trigrams(sent)
[('Reason', 'is', 'easy'), ('is', 'easy', 'to'), ('easy', 'to', 'use')]

>>> from reason.util import ngrams
>>> ngrams(sent, 4)
[('Reason', 'is', 'easy', 'to'), ('is', 'easy', 'to', 'use')]
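
An n-gram is a window of n consecutive tokens. The sketch below shows one common way to generate them in plain Python by zipping shifted copies of the token list. It assumes a string input is split on whitespace, which may differ from how Reason tokenizes; `ngrams_sketch` is a hypothetical name, not the library's function.

```python
def ngrams_sketch(tokens_or_text, n):
    """Return a list of n-gram tuples (hypothetical helper).

    Accepts a list of tokens or a string; a string is split on
    whitespace, which is an assumption of this sketch.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if isinstance(tokens_or_text, str):
        tokens = tokens_or_text.split()
    else:
        tokens = list(tokens_or_text)
    # zip stops at the shortest slice, giving len(tokens) - n + 1 windows.
    return list(zip(*(tokens[i:] for i in range(n))))


sample = 'Reason is easy to use'
print(ngrams_sketch(sample, 2))
# [('Reason', 'is'), ('is', 'easy'), ('easy', 'to'), ('to', 'use')]
```

If n is larger than the number of tokens, the result is an empty list.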

Dependencies

  • NumPy
    Used to handle data
  • Pandas
    Used in classify package

NumPy is installed automatically with Reason.

License

MIT -- See LICENSE for details.

Download files

Download the file for your platform.

Source Distribution

reason-0.3.0.tar.gz (224.8 kB)

Uploaded Source

Built Distribution

reason-0.3.0-py3-none-any.whl (236.5 kB)

Uploaded Python 3

File details

Details for the file reason-0.3.0.tar.gz.

File metadata

  • Download URL: reason-0.3.0.tar.gz
  • Upload date:
  • Size: 224.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.4

File hashes

Hashes for reason-0.3.0.tar.gz
Algorithm Hash digest
SHA256 0b0ba63348e4ab14125a3e45743a36058bca136f47da03b9128c42b947eef11a
MD5 73d18b47cb6bfdd094e2e16cf2bd65e9
BLAKE2b-256 074903e48cc1334423e6466c13fe84318e7b9e78976b2dd6bbba84b77bff3c73

File details

Details for the file reason-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: reason-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 236.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.4

File hashes

Hashes for reason-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e7363ad5e67c1a027e8afea4900006905b31b9f0aa61cecb6f846796deecb2e4
MD5 58a8a96cbfc7522bbeb4126d316b948d
BLAKE2b-256 7724f2295eab021edc1add9a7567dbdec7e712343579ed1c4df66e5c2219f312
