EstNLTK — open source tools for Estonian natural language processing

EstNLTK provides common natural language processing functionality such as paragraph, sentence and word tokenization, morphological analysis, named entity recognition, etc. for the Estonian language.

The project is funded by EKT (Eesti Keeletehnoloogia Riiklik Programm).

This package contains EstNLTK's basic linguistic analysis, system and database tools:

  • Text class with the Estonian NLP pipeline;
  • tokenization tools: word, sentence and paragraph tokenization; clause segmentation;
  • morphology tools: morphological analysis and disambiguation, spelling correction, morphological synthesis and syllabification, HFST based analyser, GT and UD converters;
  • information extraction tools: addresses tagger, named entity recognizer, temporal expression tagger; tools for rule based and grammar based fact extraction;
  • experimental taggers: verb chain detector, noun phrase chunker, adjective phrase tagger, PropBank semantic roles tagger;
  • syntactic analysis tools: preprocessing for syntactic analysis, VislCG3 and MaltParser based syntactic parsers;
  • Estonian Wordnet and Collocation-Net;
  • web taggers -- such as BERT embeddings web tagger, neural named entity recognition web tagger, Stanza syntax web tagger and Stanza ensemble syntax web tagger;
  • corpus importing tools -- tools for importing data from large Estonian corpora, such as the Reference Corpus or the National Corpus of Estonia;
  • system taggers -- regex tagger, disambiguator, atomizer, merge tagger, etc.;
  • utils for downloading additional resources (e.g. model files required by taggers);
  • Postgres database tools.
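
As a rough illustration of the Text class named in the list above, the sketch below tokenizes a short Estonian sentence into words. It assumes the `Text(...).tag_layer(...)` interface and the `words` layer name as they appear in the EstNLTK tutorials; the helper name `tokenize_words` and the sample sentence are made up for this example. The function returns `None` when EstNLTK is not installed, so the snippet can be run safely before installation.

```python
from typing import List, Optional

try:
    from estnltk import Text  # importable only after EstNLTK is installed
except ImportError:
    Text = None


def tokenize_words(raw: str) -> Optional[List[str]]:
    """Return the word tokens of `raw`, or None if EstNLTK is unavailable."""
    if Text is None:
        return None
    text = Text(raw)
    # Assumption: 'words' is the layer produced by word tokenization.
    text.tag_layer('words')
    # Each span of the layer covers one word of the original string.
    return [span.text for span in text['words']]


if __name__ == "__main__":
    print(tokenize_words("Tere, maailm!"))
```

Other layers (sentences, morphological analysis and so on) would be requested the same way by passing a different layer name; check the tutorials for the exact names before relying on them.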

Version 1.7

Installation

EstNLTK is available for osx, windows-64, and linux-64, and for Python versions 3.9 to 3.13. You can install the latest version from PyPI:

pip install estnltk==1.7.4
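
Before running the install command, it can help to confirm that the interpreter falls inside the supported range stated above (Python 3.9 to 3.13). The following is a minimal sketch of such a check; the function name `is_supported_python` and the two constants are illustrative and not part of EstNLTK.

```python
import sys
from typing import Tuple

# Supported range as stated in the installation notes: 3.9 to 3.13 inclusive.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 13)


def is_supported_python(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) lies within the supported range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


if __name__ == "__main__":
    current = sys.version_info[:2]
    status = "supported" if is_supported_python(current) else "not supported"
    print(f"Python {current[0]}.{current[1]} is {status} by EstNLTK 1.7.4")
```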

Alternatively, you can install EstNLTK via Anaconda. Installation steps with conda:

  1. create a conda environment with python 3.10, for instance:
conda create -n py310 python=3.10
  2. activate the environment, for instance:
conda activate py310
  3. install EstNLTK with the command:
conda install -c estnltk -c conda-forge estnltk=1.7.4

Note: some of the tools in EstNLTK also require Java to be installed on your system. We recommend Oracle Java (http://www.oracle.com/technetwork/java/javase/downloads/index.html), although alternatives such as OpenJDK (http://openjdk.java.net/) should also work.
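
One quick way to see whether a Java runtime is reachable is to look for a `java` executable on the PATH. The sketch below does this with the standard library only. It assumes that being on the PATH is sufficient; how individual EstNLTK tools locate Java is not stated here, and the helper name `find_java` is made up for this example.

```python
import shutil
from typing import Optional


def find_java() -> Optional[str]:
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("No `java` executable found on PATH; Java-based tools may not run.")
    else:
        print(f"Found Java at: {java_path}")
```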

Using on Google Colab

You can install EstNLTK in a Google Colab environment with the command:

!pip install estnltk==1.7.4

Documentation

EstNLTK's tutorials come in the form of Jupyter notebooks.

Additional educational materials on EstNLTK are available on web pages of an NLP course taught at the University of Tartu:

Note: if you have trouble viewing Jupyter notebooks on GitHub (you get the error message "Sorry, something went wrong. Reload?" when loading a notebook), try opening the notebooks with https://nbviewer.jupyter.org instead.
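
The workaround in the note above can be scripted by rewriting a GitHub notebook link into an nbviewer link. This sketch assumes nbviewer's GitHub URL layout, in which the repository path is appended after `/github`; the helper name `to_nbviewer_url` and the notebook path in the usage example are hypothetical.

```python
from urllib.parse import urlparse

# Assumed nbviewer prefix for notebooks hosted on GitHub.
NBVIEWER_BASE = "https://nbviewer.jupyter.org/github"


def to_nbviewer_url(github_url: str) -> str:
    """Convert a github.com notebook URL into the matching nbviewer URL."""
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {github_url}")
    # parsed.path looks like /<user>/<repo>/blob/<branch>/<path-to-notebook>
    return NBVIEWER_BASE + parsed.path


if __name__ == "__main__":
    # Hypothetical notebook path, used only to show the URL shape.
    print(to_nbviewer_url(
        "https://github.com/estnltk/estnltk/blob/main/tutorials/example.ipynb"
    ))
```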

Source

The source of the latest release is available on the main branch.

Changelog is available here.

Citation

In case you use EstNLTK in your work, please cite us as follows:

@InProceedings{laur-EtAl:2020:LREC,
  author    = {Laur, Sven  and  Orasmaa, Siim  and  Särg, Dage  and  Tammo, Paul},
  title     = {EstNLTK 1.6: Remastered Estonian NLP Pipeline},
  booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
  month     = {May},
  year      = {2020},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {7154--7162},
  url       = {https://www.aclweb.org/anthology/2020.lrec-1.884}
}

License

EstNLTK is released under a dual license: either the GNU General Public License v2.0 or the Apache 2.0 License.

(C) University of Tartu
