pySastra

Lightweight natural language processing for the Indonesian language.

Design Plan

| Status | Planned Pipeline | Description |
| --- | --- | --- |
| 🟠 | Language | A text-processing pipeline. |
| 🟡 | Tokenizer | Segment text, and create Doc objects with the discovered segment boundaries. |
| 🟠 | Lemmatizer | Determine the base forms of words. |
| 🟡 | Morphology | Assign linguistic features like lemmas, noun case, verb tense etc. based on the word and its part-of-speech tag. |
| 🟠 | Tagger | Annotate part-of-speech tags on Doc objects. |
| 🔄 | DependencyParser | Annotate syntactic dependencies on Doc objects. |
| 🔄 | EntityRecognizer | Annotate named entities, e.g. persons or products, on Doc objects. |
| 🔄 | TextCategorizer | Assign categories or labels to Doc objects. |
| 🔄 | Matcher | Match sequences of tokens, based on pattern rules, similar to regular expressions. |
| 🔄 | PhraseMatcher | Match sequences of tokens based on phrases. |
| 🔄 | EntityRuler | Add entity spans to the Doc using token-based rules or exact phrase matches. |
| 🔄 | Sentencizer | Implement custom sentence boundary detection logic that doesn’t require the dependency parse. |

Legend: 🟢 Completed with tests, 🟡 Completed, 🟠 In progress, 🔄 Planned

Reference: spaCy language pipeline
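
To make the plan more concrete, here is a minimal sketch of how the Language, Tokenizer and Sentencizer rows could fit together. It is not pySastra's actual API: the names `Doc`, `Language`, `add_pipe`, `tokenizer`, `sentencizer` and `TOKEN_RE` are all hypothetical, chosen to mirror the spaCy-style design referenced above, and the tokenization rule is a simple regular expression assumed for illustration only.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


# Hypothetical container; pySastra's real Doc object may look different.
@dataclass
class Doc:
    text: str
    tokens: List[str] = field(default_factory=list)
    # Sentence spans as (start, end) token indices, end exclusive.
    sentences: List[Tuple[int, int]] = field(default_factory=list)


# A pipeline component takes a Doc and returns the (annotated) Doc.
Component = Callable[[Doc], Doc]

# Assumed rule: words (keeping hyphenated reduplications such as
# "kupu-kupu" together) or single punctuation characters.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenizer(doc: Doc) -> Doc:
    """Segment the raw text into tokens."""
    doc.tokens = TOKEN_RE.findall(doc.text)
    return doc


def sentencizer(doc: Doc) -> Doc:
    """Rule-based sentence boundaries; no dependency parse needed."""
    doc.sentences = []
    start = 0
    for i, tok in enumerate(doc.tokens):
        if tok in {".", "!", "?"}:
            doc.sentences.append((start, i + 1))
            start = i + 1
    if start < len(doc.tokens):
        # Trailing tokens without closing punctuation form a last sentence.
        doc.sentences.append((start, len(doc.tokens)))
    return doc


class Language:
    """Runs registered components over a text, in order."""

    def __init__(self) -> None:
        self.pipeline: List[Tuple[str, Component]] = []

    def add_pipe(self, name: str, component: Component) -> None:
        self.pipeline.append((name, component))

    def __call__(self, text: str) -> Doc:
        doc = Doc(text)
        for _name, component in self.pipeline:
            doc = component(doc)
        return doc


nlp = Language()
nlp.add_pipe("tokenizer", tokenizer)
nlp.add_pipe("sentencizer", sentencizer)

doc = nlp("Saya suka kupu-kupu. Apa kabar?")
print(doc.tokens)
print(doc.sentences)
```

Keeping every component as a plain `Doc -> Doc` callable means later rows of the table (Lemmatizer, Tagger and so on) could be added with further `add_pipe` calls without changing the `Language` class.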
