Skip to main content

Python wrapper for Loomchild segmenter

Project description

loomchild-segment

A python module for interfacing with Java sentence splitter Loomchild. This package is aimed to be used in Bifixer and/or Bitextor

System dependencies to build and use this package are Maven and Java.

Installation

This package can be installed with pip from pypi:

pip install loomchild-segment

Usage

Splitting a text into sentences:

from loomchild.segmenter import LoomchildSegmenter

segmenter = LoomchildSegmenter(lang)
# segmenting a single line:
segments = segmenter.get_segmentation(input_line)
print("\n".join(segments))

# segmenting a document (i.e. multiple line breaks in the input)
segments = segmenter.get_document_segmentation(input_text)
print("\n".join(segments))

A command line tool is provided to work with base64 encoded documents.

cat b64encoded_input | py-segment -l $LANG > b64encoded_output

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

loomchild-segment-2.0.4.2.tar.gz (2.4 MB view hashes)

Uploaded Source

Built Distribution

loomchild_segment-2.0.4.2-py3-none-any.whl (2.4 MB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page