Easy-to-use NLP toolbox

Reason

An easy-to-use natural language processing toolbox for Python.

Packages

  • classify
    Naive Bayes classifier
  • metrics
    Confusion matrix, accuracy
  • tokenize
    Regex word and sentence tokenizer
  • stem
    Porter and regex stemmer
  • analysis
    Frequency distribution
  • util
    Bigrams, trigrams and n-grams

Install

Install the latest stable version using pip:

pip install reason

Quick Start

Classification:

>>> from reason.classify import NaiveBayesClassifier
>>> classifier = NaiveBayesClassifier(train_set)
>>> y_pred = classifier.classify(new_data)

>>> from reason.metrics import accuracy
>>> accuracy(y_true, y_pred)
0.9358
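
For readers who want to see what an accuracy score measures, here is a minimal plain-Python sketch. It is not Reason's implementation; the name accuracy_sketch and the example labels are hypothetical, and it assumes accuracy means the fraction of predictions that equal the true label.

```python
def accuracy_sketch(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label.

    Hypothetical helper for illustration only, not part of Reason.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of empty inputs")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels: three of the four predictions are correct.
example_true = ["pos", "neg", "pos", "neg"]
example_pred = ["pos", "neg", "neg", "neg"]
example_accuracy = accuracy_sketch(example_true, example_pred)
```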

Confusion Matrix:

>>> from reason.metrics import ConfusionMatrix
>>> cm = ConfusionMatrix(y_true, y_pred)

>>> cm
68 21 13
16 70 11
14 10 77

>>> cm[actual, predicted]
16
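
A confusion matrix is a table of (actual, predicted) label counts. The sketch below builds one with the standard library only. It is an illustration, not Reason's code: confusion_matrix_sketch and the sample labels are hypothetical, and the layout (rows are actual labels, columns are predicted labels, both in sorted order) is an assumption.

```python
from collections import Counter


def confusion_matrix_sketch(y_true, y_pred):
    """Return (labels, matrix) where matrix[i][j] counts items whose actual
    label is labels[i] and whose predicted label is labels[j].

    Hypothetical helper; row/column orientation is an assumption.
    """
    labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))  # missing pairs count as 0
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


# Made-up labels for three classes.
sample_true = ["a", "a", "b", "b", "c", "c"]
sample_pred = ["a", "b", "b", "b", "c", "a"]
labels, matrix = confusion_matrix_sketch(sample_true, sample_pred)
```

The diagonal holds the correct predictions, so dividing its sum by the total count gives the accuracy.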

>>> from reason.metrics import BinaryConfusionMatrix
>>> bcm = BinaryConfusionMatrix(b_y_true, b_y_pred)

>>> bcm.precision()
0.7837
>>> bcm.recall()
0.8055
>>> bcm.f1_score()
0.7944
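
The three scores above are related: F1 is the harmonic mean of precision and recall. The following sketch shows the standard formulas in plain Python. The function name and the counts (29 true positives, 8 false positives, 7 false negatives) are hypothetical, picked only because they land close to the values shown above; they are not taken from Reason or from any real dataset.

```python
def binary_metrics_sketch(tp, fp, fn):
    """Precision, recall and F1 from raw counts.

    Hypothetical helper for illustration; assumes tp > 0 so that no
    denominator is zero.
    """
    precision = tp / (tp + fp)  # share of predicted positives that are right
    recall = tp / (tp + fn)     # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts, chosen to approximate the scores in the example above.
precision, recall, f1 = binary_metrics_sketch(tp=29, fp=8, fn=7)
```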

Word Tokenization:

>>> from reason.tokenize import word_tokenize

>>> text = "Testing reason0.1.0, (on: 127.0.0.1). Cool stuff..."
>>> word_tokenize(text, 'alphanumeric')
['Testing', 'reason0.1.0', 'on', '127.0.0.1', 'Cool', 'stuff']
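
To give a feel for how a regex word tokenizer can keep tokens such as version numbers and IP addresses whole, here is a small sketch using Python's re module. The pattern is an assumption made for this example and is not necessarily the one Reason uses for 'alphanumeric'; word_tokenize_sketch is a hypothetical name.

```python
import re

# Assumed pattern: a run of word characters, optionally followed by
# ".<word characters>" groups, so "127.0.0.1" stays in one piece while a
# trailing "." or "..." is dropped.
ALNUM_TOKEN = re.compile(r"\w+(?:\.\w+)*")


def word_tokenize_sketch(text):
    """Return all alphanumeric tokens in text (illustrative only)."""
    return ALNUM_TOKEN.findall(text)


tokens = word_tokenize_sketch("Testing reason0.1.0, (on: 127.0.0.1). Cool stuff...")
```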

Sentence Tokenization:

>>> from reason.tokenize import sent_tokenize

>>> text = "Hey, what's up? I love using Reason library!"
>>> sents = sent_tokenize(text)
>>> for sent in sents:
...     print(sent)
Hey, what's up?
I love using Reason library!
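
A very simple regex approach to sentence splitting is to break on whitespace that follows sentence-ending punctuation. The sketch below does that with re.split. It is an assumption-laden illustration rather than Reason's tokenizer: sent_tokenize_sketch is a hypothetical name, and the rule mis-splits abbreviations such as "Dr. Smith".

```python
import re


def sent_tokenize_sketch(text):
    """Split text after '.', '!' or '?' when followed by whitespace.

    Hypothetical helper; the lookbehind keeps the punctuation attached
    to the sentence it ends.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


sentences = sent_tokenize_sketch("Hey, what's up? I love using Reason library!")
```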

Word Stems:

>>> from reason.stem import PorterStemmer

>>> text = 'watched birds flying'
>>> stemmer = PorterStemmer()
>>> stemmer.stem(text)
['watch', 'bird', 'fly']

>>> from reason.stem import regex_stem

>>> regex_pattern = r'^(.*?)(ous)?$'
>>> regex_stem('dangerous', regex_pattern)
danger
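
The pattern above works because the first group is lazy: it grows only until the optional suffix group and the end anchor can match. A minimal sketch of the idea with Python's re module follows. regex_stem_sketch is a hypothetical function that assumes the stem is the first capture group; Reason's own regex_stem may behave differently.

```python
import re


def regex_stem_sketch(word, pattern):
    """Return the first capture group of pattern matched against word,
    or the word unchanged if the pattern does not match.

    Hypothetical helper for illustration only.
    """
    match = re.match(pattern, word)
    return match.group(1) if match else word


suffix_pattern = r"^(.*?)(ous)?$"
stem = regex_stem_sketch("dangerous", suffix_pattern)       # suffix stripped
unchanged = regex_stem_sketch("bird", suffix_pattern)       # no suffix present
```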

Preprocess Text (Tokenizing + Stemming):

>>> from reason import preprocess

>>> text = "What's up? I love using Reason library!"
>>> preprocess(text)
[["what's", 'up', '?'], ['i', 'love', 'us', 'reason', 'librari', '!']]

Frequency Distribution:

>>> from reason.analysis import FreqDist

>>> words = ['hey', 'hey', 'oh', 'oh', 'oh', 'yeah']
>>> fd = FreqDist(words)

>>> fd
Frequency Distribution
Most-Common: [('oh', 3), ('hey', 2), ('yeah', 1)]
>>> fd.most_common(2)
[('oh', 3), ('hey', 2)]
>>> fd['yeah']
1
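
For comparison, the same counts can be reproduced with collections.Counter from the Python standard library. This is only a sketch of the underlying idea and does not claim that FreqDist is built on Counter.

```python
from collections import Counter

words = ["hey", "hey", "oh", "oh", "oh", "yeah"]
counts = Counter(words)           # maps each word to its number of occurrences

top_two = counts.most_common(2)   # the two most frequent (word, count) pairs
yeah_count = counts["yeah"]       # lookup by word
```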

Ngrams:

>>> sent = 'Reason is easy to use'

>>> from reason.util import bigrams
>>> bigrams(sent)
[('Reason', 'is'), ('is', 'easy'), ('easy', 'to'), ('to', 'use')]

>>> from reason.util import trigrams
>>> trigrams(sent)
[('Reason', 'is', 'easy'), ('is', 'easy', 'to'), ('easy', 'to', 'use')]

>>> from reason.util import ngrams
>>> ngrams(sent, 4)
[('Reason', 'is', 'easy', 'to'), ('is', 'easy', 'to', 'use')]
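
An n-gram list is a sliding window of n consecutive tokens. The sketch below shows one common way to build it, by zipping the token list against shifted copies of itself. ngrams_sketch is a hypothetical name, and splitting on whitespace is an assumption; Reason may tokenize the input differently.

```python
def ngrams_sketch(sentence, n):
    """Return all runs of n consecutive whitespace-separated tokens as tuples.

    Hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = sentence.split()
    # zip stops at the shortest shifted copy, which yields len(tokens) - n + 1 windows.
    return list(zip(*(tokens[i:] for i in range(n))))


sent = "Reason is easy to use"
bigram_list = ngrams_sketch(sent, 2)
fourgram_list = ngrams_sketch(sent, 4)
```

If n is larger than the number of tokens, the result is an empty list.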

Dependencies

  • NumPy
    Used to handle data
  • Pandas
    Used in classify package

Note that NumPy is installed automatically with Reason.

License

MIT -- See LICENSE for details.
