Unsupervised Korean Natural Language Processing Toolkits

Project description

It contains unsupervised word extraction methods, tokenizers, and noun extractors.
These algorithms do not depend on a training corpus; they extract patterns from the data by themselves.

The current version provides the following:
- Word extraction
  - Cohesion score
  - Branching Entropy
  - Accessor Variety
- Tokenizers
  - RegexTokenizer
  - LTokenizer
  - MaxScoreTokenizer
- Noun extractor
  - LRNounExtractor
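
To make the word-extraction scores above concrete, here is a minimal, self-contained sketch in plain Python. It does not use the soynlp API. The function names (`count_prefixes`, `cohesion`, `branching_entropy`, `accessor_variety`, `l_split`) and the toy corpus are hypothetical, and the formulas follow one common formulation of each score (right-side variants only), which may differ from what the package computes internally.

```python
import math
from collections import Counter

# Toy corpus of eojeols (space-delimited units). Illustrative data only.
eojeols = ["파스타가", "파스타는", "파스타를", "파도가", "파일을"]


def count_prefixes(words):
    """Count how often each left-side substring (prefix) occurs."""
    counts = Counter()
    for word in words:
        for end in range(1, len(word) + 1):
            counts[word[:end]] += 1
    return counts


def cohesion(prefix, counts):
    """Assumed forward cohesion: (count(c1..cn) / count(c1)) ** (1 / (n - 1))."""
    n = len(prefix)
    if n < 2 or counts[prefix] == 0:
        return 0.0
    return (counts[prefix] / counts[prefix[0]]) ** (1 / (n - 1))


def next_char_counts(words, prefix):
    """Count the characters that directly follow `prefix` at the start of a word."""
    followers = Counter()
    for word in words:
        if word.startswith(prefix) and len(word) > len(prefix):
            followers[word[len(prefix)]] += 1
    return followers


def branching_entropy(words, prefix):
    """Entropy (natural log) of the next-character distribution after `prefix`."""
    followers = next_char_counts(words, prefix)
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in followers.values())


def accessor_variety(words, prefix):
    """Number of distinct characters observed right after `prefix`."""
    return len(next_char_counts(words, prefix))


def l_split(word, counts):
    """Split a word into (L, R), taking the prefix with the highest cohesion as L."""
    best_end, best_score = len(word), 0.0
    for end in range(2, len(word) + 1):
        score = cohesion(word[:end], counts)
        if score > best_score:
            best_end, best_score = end, score
    return word[:best_end], word[best_end:]


counts = count_prefixes(eojeols)
print(cohesion("파스", counts))            # 0.6
print(accessor_variety(eojeols, "파스타"))  # 3
print(l_split("파스타가", counts))          # ('파스타', '가')
```

In this toy corpus the prefix "파스타" has both the highest cohesion and the most varied set of following characters, which is the kind of signal these scores use to suggest a word boundary without any labeled training data.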


The following packages are also helpful:
- krwordrank: Unsupervised Korean word/keyword extractor
  - https://github.com/lovit/KR-WordRank
  - pip install krwordrank
- soyspacing: Korean spacing error corrector
  - https://github.com/lovit/soyspacing
  - pip install soyspacing


Download files

Source Distribution

soynlp-0.0.2.tar.gz (25.3 kB)

Uploaded Source

Built Distribution

soynlp-0.0.2-py3-none-any.whl (34.8 kB)

Uploaded Python 3

File details

Details for the file soynlp-0.0.2.tar.gz.

File metadata

  • Download URL: soynlp-0.0.2.tar.gz
  • Upload date:
  • Size: 25.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for soynlp-0.0.2.tar.gz
Algorithm Hash digest
SHA256 59cc916fb952e4e8c85aa1631b1f0159a3e4d8981d750b6ab9ba9a248d064a16
MD5 9603c2cac687405d26cd6b25089a439f
BLAKE2b-256 92522681d42f3aea304fc7b36d965579cec484747d8ad8d036219abd8b8972d2
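
The digests above can be used to check a downloaded archive. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest listed above for soynlp-0.0.2.tar.gz
EXPECTED_SHA256 = "59cc916fb952e4e8c85aa1631b1f0159a3e4d8981d750b6ab9ba9a248d064a16"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("soynlp-0.0.2.tar.gz")` should return `True` only if the file in the current directory is byte-for-byte identical to the published source distribution.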

File details

Details for the file soynlp-0.0.2-py3-none-any.whl.

File hashes

Hashes for soynlp-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f0ca0285d232c050b00cbab168e2c1ad81ec8ac067402aa34e5dd38bed13cbe0
MD5 0bb8ada88d5652e0fb53dde967c4620b
BLAKE2b-256 190a17f5dbb7df417d34e07ecd8f8846041203fe7d492ea7e6cf2f66af209508
