A utility library to assist in parsing natural language text.
Zensols Natural Language Parsing
This framework wraps the spaCy framework and creates features from the parsed text. The motivation is to generate those features in an object-oriented fashion that is fast and easy to pickle. Other features include:
- Token normalization as a stream of strings by lemmatization, stop word and/or punctuation filters, up/down casing, Porter stemming, and others.
- Detached features that are safe and easy to pickle to disk.
- Configuration-driven parsing and token normalization using configuration factories.
- Pretty print functionality for easy natural language feature selection.
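The token normalization and detached features listed above can be illustrated with a minimal sketch that uses only the standard library. It does not use spaCy or the zensols.nlp API: the names `STOP_WORDS`, `normalize`, and `detach` are hypothetical and chosen only for this example, and the stop word list is a toy one.

```python
import pickle
import string
from typing import Dict, Iterable, Iterator, List

# Hypothetical, tiny stop word list for illustration only.
STOP_WORDS = frozenset({"a", "an", "the", "is", "of", "and", "to"})


def normalize(tokens: Iterable[str], lower: bool = True,
              remove_stop: bool = True,
              remove_punct: bool = True) -> Iterator[str]:
    """Lazily yield normalized token strings (a stream, not a list)."""
    for tok in tokens:
        # Drop tokens made up entirely of punctuation characters.
        if remove_punct and all(ch in string.punctuation for ch in tok):
            continue
        # Drop stop words, compared case-insensitively.
        if remove_stop and tok.lower() in STOP_WORDS:
            continue
        yield tok.lower() if lower else tok


def detach(tokens: List[str]) -> List[Dict[str, object]]:
    """Build plain-data features that hold no reference to a parser.

    Each feature is a dict of built-in types, so it pickles without
    dragging along a language model or any parser state.
    """
    feats = []
    for i, tok in enumerate(tokens):
        norms = list(normalize([tok]))
        if norms:  # the token survived the filters
            feats.append({"text": tok, "norm": norms[0], "index": i})
    return feats


tokens = ["The", "parser", "is", "fast", "."]
normalized = list(normalize(tokens))

# Round-trip the detached features through pickle.
features = detach(tokens)
restored = pickle.loads(pickle.dumps(features))
```

In this sketch the features are plain dictionaries, so they can be serialized without a loaded model. The library itself may represent detached features differently.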
Documentation
Obtaining / Installing
- The easiest way to install the command line program is via the pip installer: pip3 install zensols.nlp
- Install at least one spaCy model:
python -m spacy download en_core_web_sm
Binaries are also available on PyPI.
Attribution
This project, or example code, uses:
- spaCy for natural language parsing
- msgpack and smart-open for Python disk serialization
- nltk for the Porter stemmer functionality
Changelog
An extensive changelog is available here.
License
Copyright (c) 2020 - 2021 Paul Landes