
Textarium is a Python package for text analysis


textarium: easy-to-use Python package for text analysis.


What is it?

textarium is a Python package that provides flexible functions designed to make text analysis intuitive and easy. It aims to be a high-level tool for preparing text data for complex analysis or NLP modeling.

Installation

Binary installers for the latest released version are available at the Python Package Index (PyPI).

# Type this in your command-line
pip install textarium
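To confirm that the install succeeded, you can ask Python which version it sees. This is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this example and is not part of textarium.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


# Prints the version string if textarium is installed, otherwise None.
print(installed_version("textarium"))
```

If this prints `None`, check that `pip` and `python` point to the same environment.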

Getting started

from textarium import Text
import nltk
nltk.download('wordnet')

s = "This is a text example. You can preprocess and analyze it with this package."
text = Text(s, lang='en')
text.prepare()

print(text.prepared_text)

# Working with a collection of texts
from textarium import Corpus
import nltk
nltk.download('wordnet')

txts = [
    "Hello! My name is Mr.Parker.",
    "I have a website https://parker.com.",
    "It has about 5000 visitors per day.",
    "I track it with a simple html-block like this:",
    "<div>Google.Analytics</div>",
]
c = Corpus(txts, lang='en')
c.info()

# Keep only texts with more than 5 whitespace-separated words
c = c.filter(condition=lambda x: len(x.split()) > 5, attribute="raw_text")
c.info()

c.prepare()

print(c)
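To see which texts a word-count condition like the one above selects, here is a plain-Python sketch that does not use textarium at all. It assumes the condition is called once per raw text string, which is how the `attribute="raw_text"` argument reads; the names `more_than_five_words` and `kept` are illustrative only.

```python
txts = [
    "Hello! My name is Mr.Parker.",
    "I have a website https://parker.com.",
    "It has about 5000 visitors per day.",
    "I track it with a simple html-block like this:",
    "<div>Google.Analytics</div>",
]


def more_than_five_words(raw_text: str) -> bool:
    """True when the text has more than 5 whitespace-separated tokens."""
    return len(raw_text.split()) > 5


# Plain list comprehension standing in for the corpus filter (an assumption,
# not textarium's implementation).
kept = [t for t in txts if more_than_five_words(t)]

for t in kept:
    print(len(t.split()), t)
```

Note that the closing parenthesis of `len(...)` must come before `> 5`. Writing `len(x.split() > 5)` compares a list with an integer and raises a `TypeError`.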

Documentation

The official documentation is hosted on GitHub Pages: https://6b656b.github.io/textarium

License

MIT

