Lightweight natural language processing for the Indonesian language.
Project description
pySastra
Design Plan
Status | Pipeline | Description |
---|---|---|
🟠 | Language | A text-processing pipeline. |
🟡 | Tokenizer | Segment text, and create Doc objects with the discovered segment boundaries. |
🟠 | Lemmatizer | Determine the base forms of words. |
🟡 | Morphology | Assign linguistic features like lemmas, noun case, verb tense etc. based on the word and its part-of-speech tag. |
🟠 | Tagger | Annotate part-of-speech tags on Doc objects. |
🔄 | DependencyParser | Annotate syntactic dependencies on Doc objects. |
🔄 | EntityRecognizer | Annotate named entities, e.g. persons or products, on Doc objects. |
🔄 | TextCategorizer | Assign categories or labels to Doc objects. |
🔄 | Matcher | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
🔄 | PhraseMatcher | Match sequences of tokens based on phrases. |
🔄 | EntityRuler | Add entity spans to the Doc using token-based rules or exact phrase matches. |
🔄 | Sentencizer | Implement custom sentence boundary detection logic that doesn’t require the dependency parse. |
Legend: 🟢 Completed with tests, 🟡 Completed, 🟠 In progress, 🔄 Planned
Reference: spaCy language pipeline
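To make the first rows of the table concrete, here is a minimal sketch of how a Language pipeline, a Tokenizer, and a Sentencizer could fit together. It is not pySastra's actual API, which this page does not show: the names `Token`, `Doc`, `tokenize`, `sentencizer`, and `Language`, as well as the tokenization regex, are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Token:
    text: str
    start: int  # character offset of the token in the original text
    is_sent_start: bool = False


@dataclass
class Doc:
    text: str
    tokens: List[Token] = field(default_factory=list)


# Hypothetical rule: a word (optionally hyphenated, which keeps reduplicated
# forms such as "anak-anak" together) or a single punctuation character.
_TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenize(text: str) -> Doc:
    """Segment text and store the discovered segment boundaries in a Doc."""
    tokens = [Token(m.group(), m.start()) for m in _TOKEN_RE.finditer(text)]
    return Doc(text, tokens)


def sentencizer(doc: Doc) -> Doc:
    """Mark sentence starts after '.', '!' or '?', without a dependency parse."""
    start = True
    for token in doc.tokens:
        token.is_sent_start = start
        start = token.text in {".", "!", "?"}
    return doc


class Language:
    """A text-processing pipeline: the tokenizer followed by components."""

    def __init__(self, components: List[Callable[[Doc], Doc]]):
        self.components = components

    def __call__(self, text: str) -> Doc:
        doc = tokenize(text)
        for component in self.components:
            doc = component(doc)
        return doc


nlp = Language([sentencizer])
doc = nlp("Anak-anak bermain di taman. Mereka senang!")
print([t.text for t in doc.tokens])
print([t.text for t in doc.tokens if t.is_sent_start])
```

Each component takes a `Doc` and returns it, so later stages from the table (a tagger, a matcher) could be appended to the component list without changing the tokenizer.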
Download files
Source Distribution
pysastra-0.1.0.tar.gz (3.0 kB)
Built Distribution
pysastra-0.1.0-py3-none-any.whl (4.2 kB)
File details
Details for the file pysastra-0.1.0.tar.gz.
File metadata
- Download URL: pysastra-0.1.0.tar.gz
- Upload date:
- Size: 3.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.4
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 9248d73e9c7d4cb2c9935d52255db900d599bca528b3d9d76d04400a5b5d60b4 |
MD5 | 56593daa799840c7fa1616c94c0cf882 |
BLAKE2b-256 | 2af676ae2d2a66f8bcb36533ecfdfc1e827c7b514837b3978757676465f5a334 |
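A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `verify` are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest for pysastra-0.1.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "9248d73e9c7d4cb2c9935d52255db900d599bca528b3d9d76d04400a5b5d60b4"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = Path("pysastra-0.1.0.tar.gz")
    if archive.exists():
        print("digest matches" if verify(archive, EXPECTED_SHA256) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can perform the same check at install time through its hash-checking mode, where each requirement line carries a `--hash=sha256:...` option.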
File details
Details for the file pysastra-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pysastra-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.4
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | bfbb281ab483878a13ed773768873770b67075a14e664220dcdc7d36244a6d99 |
MD5 | 269f95345b7eadc3c702925a162c26b5 |
BLAKE2b-256 | bc0a7439de96534772284994a76648f4891ae781b3743cc02d76f338ef07dccf |