Multilingual natural language tools, wrapping NLTK and other systems.
Project Description
This package wraps NLTK and other systems to provide convenient natural language tools, such as:
- Tokenizers
- Stopword removers
- Word frequency lookup
- Lemmatizers (which reduce words to their root form, possibly taking part-of-speech tags into account)
- Analyzers for East Asian languages (for example, we currently use a MeCab process to find word breaks in Japanese)
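As a rough illustration of what the first three tools in this list do, here is a minimal sketch in plain Python. It does not use metanl's or NLTK's actual API: the function names (`tokenize`, `remove_stopwords`, `word_frequencies`) and the tiny stopword list are hypothetical, chosen only for this example, and the frequencies are counted from the input text rather than looked up in a corpus.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only;
# real stopword lists are much larger and language-specific).
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is"})

# Runs of word characters; a crude stand-in for a real tokenizer.
TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens that appear in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def word_frequencies(tokens):
    """Count how often each token occurs in the given token list."""
    return Counter(tokens)


tokens = tokenize("The cat and the hat: a cat in the hat.")
content_words = remove_stopwords(tokens)
freqs = word_frequencies(content_words)
print(freqs.most_common())
```

A regex tokenizer like this only works for languages that separate words with spaces or punctuation, which is why text such as Japanese needs a dedicated analyzer to find word breaks.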
For word frequencies in some languages, metanl uses corpora from the University of Leeds Center for Translation Studies (http://corpus.leeds.ac.uk/list.html), whose data is released under the Creative Commons Attribution license.
Author: Rob Speer
Download files
Filename, size | File type | Python version | Upload date |
---|---|---|---|
metanl-0.5.6.tar.gz (23.0 MB) | Source | None | Apr 11, 2013 |