Collection of classes and functions for text analysis
Introduction
Text-analysis-helpers is a collection of classes and functions for text analysis.
Installation
A Python 3 interpreter is required. Installing the package in a virtual environment is recommended, so that its dependencies do not conflict with the packages of the system's Python interpreter.
Install the package using pip, then download the NLTK data sets listed below.
pip install text-analysis-helpers
python -m nltk.downloader "punkt"
python -m nltk.downloader "averaged_perceptron_tagger"
python -m nltk.downloader "maxent_ne_chunker"
python -m nltk.downloader "words"
python -m nltk.downloader "stopwords"
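The same NLTK data can also be fetched from Python instead of the shell. The following is a minimal sketch that assumes nltk is already installed (the downloader commands above suggest it is). The names NLTK_RESOURCES and download_nltk_resources are illustrative and not part of text-analysis-helpers.

```python
# Illustrative helper, not part of text-analysis-helpers.
# The resource names are copied from the shell commands above.
NLTK_RESOURCES = (
    "punkt",
    "averaged_perceptron_tagger",
    "maxent_ne_chunker",
    "words",
    "stopwords",
)


def download_nltk_resources(resources=NLTK_RESOURCES):
    """Download each NLTK resource and return the names that failed."""
    # Imported lazily so this module can be loaded without nltk installed.
    import nltk

    failed = []
    for name in resources:
        # nltk.download returns True on success and False otherwise.
        if not nltk.download(name, quiet=True):
            failed.append(name)
    return failed
```

Calling download_nltk_resources() once after installation should have the same effect as the five downloader commands. An empty return value means every resource was fetched or was already up to date.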
Usage
You can use the HtmlAnalyser object to analyse the contents of a URL.
from text_analysis_helpers.html import HtmlAnalyser
analyser = HtmlAnalyser()
analysis_result = analyser.analyse_url("https://www.bbc.com/sport/formula1/64983451")
analysis_result.save("analysis_result.json")
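The saved file can be read back with the standard library for a quick inspection. This sketch assumes that save writes ordinary JSON, as the file name suggests. The layout of the result is not documented here, so the code only lists top-level fields. The helper names load_analysis_result and top_level_fields are hypothetical.

```python
import json
from pathlib import Path


def load_analysis_result(path):
    """Read a saved analysis result and return the decoded JSON value."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def top_level_fields(result):
    """Return the sorted top-level keys when the result is a JSON object."""
    # The structure of the file is an assumption, so non-objects yield [].
    if isinstance(result, dict):
        return sorted(result)
    return []
```

For example, top_level_fields(load_analysis_result("analysis_result.json")) lists the fields that the analysis stored, which is a reasonable starting point before relying on any particular key.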
You can see the scripts in the examples folder for some usage examples.
There is also a CLI utility that can be used to analyse a URL. For example, to analyse a URL and save the analysis result to a JSON-encoded file, execute the following command in the terminal.
text-analysis-helpers-cli analyse-url --output analysis_result.json https://www.bbc.com/sport/formula1/64983451
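When the command has to be run from a script, for instance once per URL in a list, it can be wrapped with subprocess. The sketch below uses only the subcommand and the --output flag shown above. The function names are illustrative, and it assumes that text-analysis-helpers-cli is on the PATH.

```python
import subprocess


def build_analyse_url_command(url, output_path):
    """Build the argument list for the analyse-url subcommand."""
    return [
        "text-analysis-helpers-cli",
        "analyse-url",
        "--output",
        str(output_path),
        url,
    ]


def analyse_url_with_cli(url, output_path):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    command = build_analyse_url_command(url, output_path)
    # Passing a list (no shell=True) avoids shell quoting issues in the URL.
    return subprocess.run(command, check=True)
```

Keeping the command builder separate from the call that executes it makes the argument list easy to check without touching the network.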
Hashes for text_analysis_helpers-0.6.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 18fbb140be656374da97576cc92a105ab00d410dde4437e600827ec30bbbb8c7
MD5 | 01b78bfa3dac9cb79102f0137d3098d8
BLAKE2b-256 | 0d4eed51c5110f5620da511838c3169996932db78b95a955f9bce681c35508a0
Hashes for text_analysis_helpers-0.6.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0beb17912348a4f5a752d00c7a38d5a3b90ae5baebfe4404ec609f399a1f5235
MD5 | 546aeaa47def0d1997dce2006e95e452
BLAKE2b-256 | f4bb250ef2b482cf783ba2c23661b7a000d4df8942cd415aed6a9d9c477f30e4