An NLTK-based toolkit aimed at increasing the understanding of various texts.

Smart Reading

About

smart_reading is a Python module designed to improve the understanding of various forms of text by means of natural language processing. It is heavily based on tools from the Natural Language Toolkit (NLTK), which are either applied directly or provided with an extension.

Installation

The module is available for Python 2.7+, but running it on Python 3+ is recommended for more thorough Unicode support (and prettier graphs). Install it via pip (or any other package installer of your choice):

$ pip install smart_reading

or download the source code from PyPI or GitHub and run the following command in the root folder:

$ python setup.py install
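
Whether the installation succeeded can be checked from Python without actually importing the package. The following is a minimal sketch that uses only the standard library (Python 3.4+); the helper name is_installed is made up for this illustration and is not part of smart_reading.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when no such top-level module exists.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("smart_reading installed:", is_installed("smart_reading"))
    # textract is optional; without it only .txt files can be read.
    print("textract installed:", is_installed("textract"))
```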

Usage

Importing texts

The basic functionality of smart_reading is to provide the user with an analysis of any given text. Text files can be imported via the function smart_reading.book.load(filename). This function uses the textract module to extract text from almost any file format, including .txt, .pdf, .epub and .docx. See the textract online manual for more details on the inner workings of that module. If textract is not found on the system, the program continues with limited functionality, in which only .txt files can be read. Additionally, a given string can be imported as an e-book via smart_reading.book.fromstring(text).
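
The fallback to plain .txt files when textract is missing follows a common optional-dependency pattern. The sketch below illustrates that pattern only; it is not the actual implementation of smart_reading.book.load. The helper name extract_text and the UTF-8 encoding are assumptions made for this example. The only textract call used is textract.process, which returns the extracted text as bytes.

```python
import io
import os

try:
    import textract  # optional dependency
except ImportError:
    textract = None


def extract_text(filename):
    """Return the text content of a file as a string (illustrative helper)."""
    extension = os.path.splitext(filename)[1].lower()
    if extension == ".txt":
        # Plain text needs no extra dependency (UTF-8 is assumed here).
        with io.open(filename, encoding="utf-8") as handle:
            return handle.read()
    if textract is None:
        # Limited functionality: only .txt files can be read.
        raise ValueError(
            "Cannot read %r files: install textract for other formats" % extension
        )
    # textract.process returns bytes, so decode them to a string.
    return textract.process(filename).decode("utf-8")
```

With this layout a missing textract installation only surfaces when a non-.txt file is requested, so plain-text workflows keep working unchanged.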

Three sample texts in different file formats are also included and are available via the function smart_reading.book.sample:

>>> import smart_reading as sr
>>> sr.book.sample() # or sr.book.sample('txt')

Succesfully loaded 'Benn_Ch_II_The_Metaphysicians.txt' as an e-book
Total n.o. tokens: 10420

<smart_reading.book.Book instance at 0x105a546c8>
>>> sr.book.sample('pdf')

Succesfully loaded 'PhysRev.47.777.pdf' as an e-book
Total n.o. tokens: 3192

<smart_reading.book.Book instance at 0x110c0ebd8>
>>> sr.book.sample('epub')

Succesfully loaded 'Mason_Throwing_Sticks.epub' as an e-book
Total n.o. tokens: 13581

<smart_reading.book.Book instance at 0x105a46e60>
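
The three calls above can also be collected in one loop. This is a small sketch that relies only on the sample function shown in the session above; the helper name load_samples is hypothetical. Since only .txt files can be read without textract, the 'pdf' and 'epub' samples presumably require textract to be installed.

```python
def load_samples(book_module, formats=("txt", "pdf", "epub")):
    """Load one sample per format and return the results keyed by format."""
    # book_module is expected to be smart_reading.book (or anything that
    # offers a compatible sample(fmt) function).
    return {fmt: book_module.sample(fmt) for fmt in formats}


if __name__ == "__main__":
    try:
        import smart_reading as sr
    except ImportError:
        print("smart_reading is not installed")
    else:
        samples = load_samples(sr.book)
        print(sorted(samples))
```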

Functionality

TBD

Download files

Source Distribution

smart_reading-0.3.4.tar.gz (322.6 kB)

Built Distribution

smart_reading-0.3.4-py2.py3-none-any.whl (343.3 kB), for Python 2 and Python 3

File details

Details for the file smart_reading-0.3.4.tar.gz.

File hashes

Hashes for smart_reading-0.3.4.tar.gz

Algorithm    Hash digest
SHA256       fdf9581a33ef74e0a67019c830ac64af3c5708e8313b35fd1236233c741b3d76
MD5          f4f5c4b71dbde582fca4ea085a24c70c
BLAKE2b-256  7ff02082d7d461b900ee89006443550d7ef2c19cd0ce02cb09bc2970c5ba6a27
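
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; the function names and the chunk size are choices made for this example. The same approach works for the wheel if its own digest is passed as expected.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed above for smart_reading-0.3.4.tar.gz.
EXPECTED_SHA256 = "fdf9581a33ef74e0a67019c830ac64af3c5708e8313b35fd1236233c741b3d76"


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_published_digest('smart_reading-0.3.4.tar.gz') should return True for an intact download of the source distribution.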

File details

Details for the file smart_reading-0.3.4-py2.py3-none-any.whl.

File hashes

Hashes for smart_reading-0.3.4-py2.py3-none-any.whl

Algorithm    Hash digest
SHA256       576b877c261235576a128d1c6682f1303bee8c1fde2f34625ac29a35a8fd1d58
MD5          cb5c44046add179f68f2e8b3e0e6a3b7
BLAKE2b-256  07027a3d1d67891470ccf61f3ca93a7715e1a8b6cc06fd0081d4f5b9333f7ed7