Obeliks - sentence splitting & tokenization

Installation

Install from PyPI:

pip install obeliks

Usage

Command line:

obeliks -h

Command line parameters:

-if <name*>  read input from one or more files
-sif         read input from a list of files, specified via stdin
-o <name>    write output to file <name>
-tei         produce XML-TEI output
-c           produce CoNLL-U output
-d           pass "newdoc id" to output (implies -c)

Usage examples:

obeliks "To je stavek." "Tudi to je stavek."
echo -e "To je stavek.\nTudi to je stavek." | obeliks
obeliks "To je stavek." "Tudi to je stavek." -o output.txt
echo -e "To je stavek.\nTudi to je stavek." | obeliks > output.txt
obeliks -if text*.txt
cat text*.txt | obeliks
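
When the command line is driven from a script, the flags listed above can be assembled programmatically before handing the result to a process runner such as subprocess.run. The sketch below is only an illustration: build_obeliks_command and its keyword names are hypothetical and not part of obeliks. It uses only the documented flags (-if, -o, -tei, -c, -d) and, as a conservative assumption, refuses to mix literal text with -if because the description above does not say how the two combine.

```python
import shlex


def build_obeliks_command(texts=(), in_files=(), out_file=None,
                          tei=False, conllu=False, newdoc_id=False):
    """Assemble an argv list for the obeliks CLI (hypothetical helper)."""
    if texts and in_files:
        # Assumption: combining both input styles is undocumented, so reject it.
        raise ValueError("pass either literal texts or input files, not both")
    cmd = ["obeliks"]
    cmd.extend(texts)            # positional arguments carry the input text
    if in_files:
        cmd.append("-if")        # -if accepts one or more file names
        cmd.extend(in_files)
    if out_file is not None:
        cmd.extend(["-o", out_file])
    if tei:
        cmd.append("-tei")
    if newdoc_id:
        cmd.append("-d")         # -d implies -c, so -c is not added as well
    elif conllu:
        cmd.append("-c")
    return cmd


cmd = build_obeliks_command(
    texts=["To je stavek.", "Tudi to je stavek."],
    out_file="output.txt",
    conllu=True,
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True) on a machine where obeliks is installed; passing a list avoids shell quoting problems with sentences that contain spaces or quotes.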

As a Python module:

import obeliks

text = 'Hello, world!'

# Store results to string
output = obeliks.run(text, conllu=True)

# Write result to file
obeliks.run(text, out_file='tei.txt', tei=True)

# Write to stdout
obeliks.run(text, to_stdout=True, conllu=True)

# Read input from file(s)
output = obeliks.run(in_file='in.txt', conllu=True)
output = obeliks.run(in_files=['in1.txt', 'in2.txt'], tei=True)
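
A string produced with conllu=True can be post-processed with plain Python. The sketch below groups token forms by sentence, relying only on the general CoNLL-U layout (comment lines start with "#", tokens are tab-separated with the word form in the second column, and a blank line ends a sentence). The function name read_conllu_sentences is hypothetical, and the sample text is hand-written for illustration, not captured obeliks output.

```python
def read_conllu_sentences(conllu_text):
    """Return a list of sentences, each a list of token forms."""
    sentences, current = [], []
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif line.startswith("#"):
            # Comment lines (e.g. "# sent_id = ...") carry metadata only.
            continue
        else:
            columns = line.split("\t")
            # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
            if "-" in columns[0] or "." in columns[0]:
                continue
            current.append(columns[1])  # second column is FORM
    if current:
        sentences.append(current)
    return sentences


# Hand-written CoNLL-U sample (assumption: illustrative only).
sample = (
    "# sent_id = 1\n"
    "1\tTo\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\tje\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "3\tstavek\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "4\t.\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "\n"
    "# sent_id = 2\n"
    "1\tTudi\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\tto\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "3\tje\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "4\tstavek\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "5\t.\t_\t_\t_\t_\t_\t_\t_\t_\n"
)

for tokens in read_conllu_sentences(sample):
    print(len(tokens), tokens)
```

With obeliks installed, the same function could be applied to the string returned by obeliks.run(text, conllu=True), assuming that string follows the standard CoNLL-U layout.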
