
Python port of zemberek-nlp, an open-source text processing library for Turkish

Reason this release was yanked:

Broken release! Normalizer module will not work

Project description

ZEMBEREK-PYTHON

A Python implementation of zemberek-nlp, the natural language processing library for Turkish. It is based on zemberek 0.17.1 and is written entirely in Python, so there is no need to set up a Java development environment to run it.

Source Code

https://github.com/Loodos/zemberek-python

Dependencies

  • antlr4-python3-runtime>=4.8
  • numpy>=1.19.0
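
Before installing, it can help to check whether the environment already satisfies these minimums. The following is a minimal sketch, not part of zemberek-python: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and the version parser is deliberately simple (it only reads leading numeric components, so pre-release suffixes are ignored).

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
REQUIREMENTS = {
    "antlr4-python3-runtime": (4, 8),
    "numpy": (1, 19, 0),
}


def parse_version(text):
    """Turn '1.19.5' into (1, 19, 5); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version tuples after zero-padding them to equal length."""
    width = max(len(installed), len(minimum))
    padded_installed = installed + (0,) * (width - len(installed))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_installed >= padded_minimum


def check_requirements(requirements=REQUIREMENTS):
    """Return {package: 'ok' | 'too old' | 'missing'} for each requirement."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

For anything beyond a quick check, a full version parser such as the one in the `packaging` library is the safer choice, since it handles pre-releases and other non-numeric version segments.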

Supported Modules

The following modules are currently supported:

  • Core (Partially)

  • TurkishMorphology (Partially)

    • Single Word Analysis
    • Diacritics Ignored Analysis
    • Word Generation
  • Tokenization

    • Sentence Boundary Detection
    • Tokenization
  • Normalization (Partially)

    • Spelling Suggestion
    • Noisy Text Normalization

Installation

You can install the package with pip:

pip install zemberek-python

Examples

Usage examples can be found in examples.py.

Notes

The code differs from the original in minor ways wherever the original relies on Java-specific functionality and data structures. We used Python equivalents wherever we could, but in some cases we had to change the implementation, which slightly affects performance and accuracy.

Credits

This project is a Python port of zemberek-nlp.

Download files


Source Distribution

zemberek-python-0.1.1.tar.gz (93.6 MB view details)

Uploaded Source

Built Distribution


zemberek_python-0.1.1-py3-none-any.whl (93.6 MB view details)

Uploaded Python 3
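
The wheel file name above encodes the distribution name, version, and compatibility tags, separated by hyphens. As a rough illustration of how to read it, here is a small sketch that splits a wheel file name into its fields following the standard wheel naming layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The `WheelName` type and `parse_wheel_name` function are hypothetical names introduced here, not part of any library.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]  # optional build tag, usually absent
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename):
    """Split a wheel file name into its hyphen-separated fields."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    print(parse_wheel_name("zemberek_python-0.1.1-py3-none-any.whl"))
```

For this file, the tags `py3`, `none`, and `any` indicate a pure-Python wheel that targets Python 3, has no ABI requirement, and is not tied to a specific platform.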

File details

Details for the file zemberek-python-0.1.1.tar.gz.

File metadata

  • Download URL: zemberek-python-0.1.1.tar.gz
  • Upload date:
  • Size: 93.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.9

File hashes

Hashes for zemberek-python-0.1.1.tar.gz
Algorithm Hash digest
SHA256 2d5a66df8b4dfc58609dd5932b21c94ed7ad635a7730594dc035648e8cbc2b2b
MD5 a2b05d3596acd94e80e5ae971b3fa2b2
BLAKE2b-256 fd00d30709b72d749911cb2504aacdebbdde0e625cd84df3324663c041ee9725
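
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib
import hmac

# SHA256 digest listed above for zemberek-python-0.1.1.tar.gz.
EXPECTED_SHA256 = "2d5a66df8b4dfc58609dd5932b21c94ed7ad635a7730594dc035648e8cbc2b2b"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so a ~94 MB archive is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Example usage (assumes the archive is in the current directory):
#     matches_expected("zemberek-python-0.1.1.tar.gz")
```

pip can perform the same check at install time when hashes are pinned in a requirements file and `--require-hashes` is used.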


File details

Details for the file zemberek_python-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: zemberek_python-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 93.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.9

File hashes

Hashes for zemberek_python-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 108de30d781bd8e696c79bae2d3b1fa7afb349388a06c6a5e4ead98590cb5cb6
MD5 9eb9788f730aca0f93193a3a743e86e0
BLAKE2b-256 d75260250e2ed6972c29e55a6855ddd43461b73b8f52588651c67b3a8b871ec4

