
Wrapper for the TreeTagger text annotation tool from H.Schmid.

Project description

Author:
Laurent Pointal <laurent.pointal@limsi.fr> <laurent.pointal@laposte.net>

Organization:
CNRS - LIMSI

License:
GNU-GPL Version 3 or greater

Version:
2.3

What is it?

This module wraps Helmut Schmid's language-independent statistical part-of-speech tagger in a Python class. It lets you tag several texts one after the other while keeping the connection to the tagger process open, which speeds up processing. It also removes the dependency on external Perl scripts for chunking.

Because taggers are objects, you can run several of them simultaneously, possibly using different languages.
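
As a minimal sketch of running several taggers side by side, the helper below builds one tagger object per language code. It assumes the module exposes a TreeTagger class that takes a TAGLANG keyword, and that TreeTagger itself is installed where the wrapper can find it. The names build_taggers and factory are hypothetical, introduced here only for illustration.

```python
def build_taggers(languages, factory=None):
    """Return a dict mapping each language code to its own tagger object.

    `factory` is a hypothetical hook: any callable that takes a language
    code and returns a tagger. By default it creates real TreeTagger wrappers.
    """
    if factory is None:
        # Imported lazily so this helper can be loaded without the package.
        import treetaggerwrapper

        def factory(lang):
            # Assumption: TAGLANG selects the language of the tagger process.
            return treetaggerwrapper.TreeTagger(TAGLANG=lang)

    return {lang: factory(lang) for lang in languages}
```

With TreeTagger installed, build_taggers(["en", "fr"]) would start two tagger processes. Calling taggers["en"].tag_text("This is a test.") should then tag English text while taggers["fr"] stays available for French.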

Chunking is supported for:

  • English

  • French

  • German

  • Spanish

Tagging is supported for every language that TreeTagger supports, but for languages other than those listed above you have to do the chunking yourself and, if necessary, specify the parameter files via options.

This version has been reworked to run with both Python 2 and Python 3 (thanks to six), with a general overhaul and bug fixes.

Installation

Unless someone built a package for your OS distro, the simplest procedure is to use pip to install the module:

pip install treetaggerwrapper

If you have no admin access to install things on your computer, you can create a virtualenv and run pip inside it, or do a local user installation:

pip install --user treetaggerwrapper

You may need to use pip3 so that the module is installed for your Python 3 installation.
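
To confirm that pip installed the module for the interpreter you actually run, a quick check such as the following sketch can help. It uses only the standard library and works on Python 3 only. The function name is_installed is a hypothetical helper, not part of treetaggerwrapper.

```python
import importlib.util


def is_installed(module_name="treetaggerwrapper"):
    """Return True if `module_name` can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether the wrapper is visible to the Python running this script.
print("treetaggerwrapper installed:", is_installed())
```

If this prints False after a successful pip run, pip probably installed into a different Python installation than the one executing the script.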

You also need to install TreeTagger…

TreeTagger

TreeTagger itself is freely available for research, education and evaluation. See the TreeTagger page.

There is a script-based installation procedure: download the needed files, including the installation script, into the directory where you want to install TreeTagger, then launch the script to unzip the files and install them in the right directories with the right names.

For Windows users, there is a downloadable Windows binary, but no install script. You have to download the TreeTagger parameter files (since TreeTagger moved to UTF-8, they are the same on Linux and Windows), unzip them and install them in the right place (lib/) with the right names (you can see these file names in the treetaggerwrapper.py global dictionary g_langsupport, under the keys tagparfile and abbrevfile).

If you install TreeTagger in a common location, treetaggerwrapper should normally detect it automatically. But if you install it in an unusual place or under an unusual name, you will have to provide the installation directory to the module (see TAGDIR in the doc).
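
Before passing a custom directory to the module, it can be useful to check that the directory looks like a TreeTagger installation. The sketch below is a hypothetical helper (check_tagdir is not part of treetaggerwrapper). It assumes the installation contains bin/ and lib/ subdirectories and that parameter files in lib/ end in .par. It only inspects the file system and does not run TreeTagger.

```python
import os


def check_tagdir(tagdir):
    """Return a list of problems found in a candidate TreeTagger directory.

    An empty list means the layout looks plausible.
    """
    if not os.path.isdir(tagdir):
        return ["not a directory: " + tagdir]

    problems = []
    # Assumption: binaries live in bin/ and parameter files in lib/.
    for sub in ("bin", "lib"):
        if not os.path.isdir(os.path.join(tagdir, sub)):
            problems.append("missing subdirectory: " + sub)

    lib = os.path.join(tagdir, "lib")
    # Assumption: parameter files use the .par extension.
    if os.path.isdir(lib) and not any(
        name.endswith(".par") for name in os.listdir(lib)
    ):
        problems.append("no .par parameter file in lib/")
    return problems
```

If check_tagdir(path) returns an empty list, the same path is a reasonable candidate for TAGDIR, for example as a keyword argument when creating the tagger (assuming the constructor accepts it, as the TAGDIR documentation suggests).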


