Reason

An easy-to-use natural language processing toolbox for Python.

Packages

  • classify
    Naive Bayes classifier
  • metrics
    Confusion matrix, accuracy
  • tag
    POS tagger, regex, lookup and default tagging tools
  • tokenize
    Regex word and sentence tokenizer
  • stem
    Porter and regex stemmer
  • analysis
    Frequency distribution
  • util
    Bigrams, trigrams and ngrams

Install

Install the latest stable version using pip:

pip install reason

Quick Start

Classification:

>>> from reason.classify import NaiveBayesClassifier
>>> classifier = NaiveBayesClassifier(train_set)
>>> y_pred = classifier.classify(new_data)

>>> from reason.metrics import accuracy
>>> accuracy(y_true, y_pred)
0.9358
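
For reference, accuracy is the fraction of predictions that equal the true labels. The following is a minimal pure-Python sketch of that definition, not Reason's own implementation; the name `accuracy_score` and the sample labels are made up for illustration.

```python
def accuracy_score(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    # Count positions where the predicted label equals the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["spam", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "ham"]
print(accuracy_score(y_true, y_pred))  # 0.75
```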

Confusion matrix:

>>> from reason.metrics import ConfusionMatrix
>>> cm = ConfusionMatrix(y_true, y_pred)

>>> cm
68 21 13
16 70 11
14 10 77

>>> cm[actual, predicted]
16
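
A confusion matrix counts how often each actual label was predicted as each label. The sketch below shows the idea with `collections.Counter`, assuming rows are actual labels and columns are predicted labels. It is not how `ConfusionMatrix` is implemented; `confusion_counts`, `as_rows` and the sample data are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count occurrences of each (actual, predicted) label pair."""
    return Counter(zip(y_true, y_pred))


def as_rows(counts, labels):
    """Lay the counts out as rows (actual) by columns (predicted)."""
    return [[counts[(actual, predicted)] for predicted in labels]
            for actual in labels]


# Hypothetical labels, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

counts = confusion_counts(y_true, y_pred)
print(as_rows(counts, ["cat", "dog"]))  # [[1, 1], [1, 2]]
print(counts[("dog", "cat")])           # 1
```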

>>> from reason.metrics import BinaryConfusionMatrix
>>> bcm = BinaryConfusionMatrix(b_y_true, b_y_pred)

>>> bcm.precision()
0.7837
>>> bcm.recall()
0.8055
>>> bcm.f1_score()
0.7944
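
Precision, recall and F1 all derive from the counts of true positives, false positives and false negatives. The sketch below spells out the standard formulas in plain Python. It is an illustration under the assumption that the positive class is labeled `1`, not the code behind `BinaryConfusionMatrix`; `binary_scores` is a hypothetical helper.

```python
def binary_scores(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels: one true positive, one false positive, one false negative.
print(binary_scores([1, 1, 0, 0], [1, 0, 1, 0]))  # (0.5, 0.5, 0.5)
```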

Part-of-speech tagging:

>>> from reason.tag import POSTagger

>>> text = "10 tools from the file"
>>> tagger = POSTagger()
>>> tagger.tag(text)
[('10', 'CD'), ('tools', 'NNS'), ('from', 'IN'), ('the', 'AT'), ('file', 'NN')]

Word tokenization:

>>> from reason.tokenize import word_tokenize

>>> text = "Testing reason0.1.0, (on: 127.0.0.1). Cool stuff..."
>>> word_tokenize(text, 'alphanumeric')
['Testing', 'reason0.1.0', 'on', '127.0.0.1', 'Cool', 'stuff']
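
To illustrate what regex-based word tokenization does, here is a small sketch using only the standard `re` module. The pattern is an assumption chosen to reproduce the output above; it is not necessarily the pattern Reason uses for `'alphanumeric'`, and `simple_word_tokenize` is a hypothetical name.

```python
import re

# Assumed pattern: runs of word characters, optionally joined by single dots,
# so version strings and IP addresses stay in one piece.
TOKEN_PATTERN = re.compile(r"\w+(?:\.\w+)*")


def simple_word_tokenize(text):
    """Return all non-overlapping matches of TOKEN_PATTERN in text."""
    return TOKEN_PATTERN.findall(text)


text = "Testing reason0.1.0, (on: 127.0.0.1). Cool stuff..."
print(simple_word_tokenize(text))
# ['Testing', 'reason0.1.0', 'on', '127.0.0.1', 'Cool', 'stuff']
```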

Sentence tokenization:

>>> from reason.tokenize import sent_tokenize

>>> text = "Hey, what's up? I love using Reason library!"
>>> sents = sent_tokenize(text)
>>> for sent in sents:
...     print(sent)
Hey, what's up?
I love using Reason library!
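
A regex sentence splitter can be sketched in a few lines: split on whitespace that follows sentence-ending punctuation. This is a simplified illustration, not Reason's `sent_tokenize`; the name `simple_sent_tokenize` is hypothetical, and the rule would wrongly split after abbreviations such as "Dr.".

```python
import re


def simple_sent_tokenize(text):
    """Split text on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


text = "Hey, what's up? I love using Reason library!"
for sent in simple_sent_tokenize(text):
    print(sent)
# Hey, what's up?
# I love using Reason library!
```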

Stemming:

>>> from reason.stem import PorterStemmer

>>> text = "watched birds flying"
>>> stemmer = PorterStemmer()
>>> stemmer.stem(text)
['watch', 'bird', 'fly']

>>> from reason.stem import regex_stem

>>> regex_pattern = r'^(.*?)(ous)?$'
>>> regex_stem('dangerous', regex_pattern)
'danger'
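
The pattern above works because the lazy group `(.*?)` captures as little as possible while `(ous)?` absorbs the suffix when it is present. The sketch below reproduces that behavior with the standard `re` module. It assumes the stem is always the first capture group, which may differ from how `regex_stem` is implemented; `simple_regex_stem` is a hypothetical name.

```python
import re


def simple_regex_stem(word, pattern):
    """Return the first capture group of pattern, or word if nothing matches.

    Assumes pattern defines at least one capture group holding the stem.
    """
    match = re.match(pattern, word)
    return match.group(1) if match else word


pattern = r"^(.*?)(ous)?$"
print(simple_regex_stem("dangerous", pattern))  # danger
print(simple_regex_stem("danger", pattern))     # danger
```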

Preprocess text (tokenizing + stemming):

>>> from reason import preprocess

>>> text = "What's up? I love using Reason library!"
>>> preprocess(text)
[["what's", 'up', '?'], ['i', 'love', 'us', 'reason', 'librari', '!']]

Frequency distribution:

>>> from reason.analysis import FreqDist

>>> words = ['hey', 'hey', 'oh', 'oh', 'oh', 'yeah']
>>> fd = FreqDist(words)

>>> fd
Frequency Distribution
Most-Common: [('oh', 3), ('hey', 2), ('yeah', 1)]
>>> fd.most_common(2)
[('oh', 3), ('hey', 2)]
>>> fd['yeah']
1
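
For comparison, the same counts can be obtained with `collections.Counter` from the standard library. This is shown only to illustrate what a frequency distribution holds; it makes no claim about how `FreqDist` is built.

```python
from collections import Counter

words = ['hey', 'hey', 'oh', 'oh', 'oh', 'yeah']
counts = Counter(words)

print(counts.most_common(2))  # [('oh', 3), ('hey', 2)]
print(counts['yeah'])         # 1
```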

N-grams:

>>> sent = "Reason is easy to use"

>>> from reason.util import bigrams
>>> bigrams(sent)
[('Reason', 'is'), ('is', 'easy'), ('easy', 'to'), ('to', 'use')]

>>> from reason.util import trigrams
>>> trigrams(sent)
[('Reason', 'is', 'easy'), ('is', 'easy', 'to'), ('easy', 'to', 'use')]

>>> from reason.util import ngrams
>>> ngrams(sent, 4)
[('Reason', 'is', 'easy', 'to'), ('is', 'easy', 'to', 'use')]
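
An n-gram is a window of n consecutive tokens. The sketch below builds them by zipping shifted copies of the token list. It is an illustration rather than Reason's implementation: `simple_ngrams` is a hypothetical name, and unlike the functions above it takes a list of tokens (here produced with `str.split`) instead of a raw string.

```python
def simple_ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # zip stops at the shortest slice, which yields len(tokens) - n + 1 windows.
    return list(zip(*(tokens[i:] for i in range(n))))


tokens = "Reason is easy to use".split()
print(simple_ngrams(tokens, 2))
# [('Reason', 'is'), ('is', 'easy'), ('easy', 'to'), ('to', 'use')]
print(simple_ngrams(tokens, 4))
# [('Reason', 'is', 'easy', 'to'), ('is', 'easy', 'to', 'use')]
```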

Dependencies

  • NumPy
    Used to handle data
  • Pandas
    Used in classify package

Note that NumPy is installed automatically with Reason.

License

MIT -- See LICENSE for details.
