pysbd (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box across many languages.

Project description

pySBD: Python Sentence Boundary Disambiguation (SBD)

pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box.

This project is a direct port of the Ruby gem Pragmatic Segmenter, which provides rule-based sentence boundary detection.
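To illustrate why sentence boundary detection benefits from rules, here is a minimal sketch that compares a naive punctuation split with a split that consults an abbreviation list. It is not pySBD's or Pragmatic Segmenter's implementation: the names `naive_split`, `rule_based_split`, and `ABBREVIATIONS` are hypothetical, and the abbreviation list is a toy.

```python
import re

# Hypothetical, tiny abbreviation list for illustration only;
# the real rule set used by pySBD is not reproduced here.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "e.g", "i.e"}

def naive_split(text):
    """Split after every '.', '!' or '?' that is followed by whitespace."""
    return re.split(r"(?<=[.!?])\s+", text.strip())

def rule_based_split(text):
    """Like naive_split, but re-join pieces that end in a known abbreviation."""
    sentences = []
    buffer = ""
    for piece in naive_split(text):
        if not piece:
            continue  # empty input yields one empty piece
        buffer = f"{buffer} {piece}" if buffer else piece
        last_word = buffer.split()[-1].rstrip(".").lower()
        if buffer.endswith(".") and last_word in ABBREVIATIONS:
            continue  # probably not a sentence end; keep accumulating
        sentences.append(buffer)
        buffer = ""
    if buffer:
        sentences.append(buffer)
    return sentences

text = "Dr. Smith arrived at noon. He was late."
print(naive_split(text))
# ['Dr.', 'Smith arrived at noon.', 'He was late.']
print(rule_based_split(text))
# ['Dr. Smith arrived at noon.', 'He was late.']
```

The naive version breaks after "Dr." because it treats every period as a boundary; a single abbreviation rule is enough to repair this case, which suggests why a larger rule set can help on real text.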

Install

Python

pip install pysbd

Usage

  • Currently, pySBD supports only English. Support for more languages will be released soon.

import pysbd
text = "Hello World. My name is Jonas."
seg = pysbd.Segmenter(language="en", clean=False)
print(seg.segment(text))
# ['Hello World.', 'My name is Jonas.']
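
For longer documents, one possible pattern is to split the text into paragraphs first and segment each paragraph separately. The sketch below assumes paragraphs are separated by blank lines; `segment_paragraphs` is a hypothetical helper, not part of pySBD. It takes any callable that maps a string to a list of sentences, so the helper itself does not depend on pySBD.

```python
import re
from typing import Callable, List

def segment_paragraphs(text: str, segment: Callable[[str], List[str]]) -> List[List[str]]:
    """Split text on blank lines and segment each non-empty paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text)]
    return [segment(p) for p in paragraphs if p]

# With pySBD installed, the segmenter from the usage example can be passed in:
#   import pysbd
#   seg = pysbd.Segmenter(language="en", clean=False)
#   segment_paragraphs(document, seg.segment)
```

Creating the Segmenter once and reusing its `segment` method for every paragraph avoids rebuilding it per call.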

Contributing

If you find a text that is incorrectly segmented using pySBD, please submit an issue.

  1. Fork it ( https://github.com/nipunsadvilkar/pySBD/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Credit

This project wouldn't be possible without the great work done by the Pragmatic Segmenter team.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

pysbd-0.1.2-py3-none-any.whl (22.6 kB)

Uploaded Python 3

File details

Details for the file pysbd-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pysbd-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 22.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.6.8

File hashes

Hashes for pysbd-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       603e3a03be9d13fe7b78e1c75d6ed8d8c0d452ff4a514f4d9e3354bb04b67132
MD5          1b03abacf7193b3dec5154cb8b6f6eb1
BLAKE2b-256  61efcb042cabd795827fe450ed97b7358951dfa6e5b32eb66308c8b5fef57f69
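
A downloaded wheel can be checked against the digests listed above. The following is a minimal sketch using only Python's standard `hashlib`; `file_digests` and `verify_sha256` are hypothetical helper names, and it assumes that "BLAKE2b-256" refers to BLAKE2b with a 32-byte digest.

```python
import hashlib

def file_digests(path, chunk_size=65536):
    """Return hex digests of a file, read in chunks to bound memory use."""
    hashers = {
        "SHA256": hashlib.sha256(),
        # Assumption: "BLAKE2b-256" means BLAKE2b with a 32-byte digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

def verify_sha256(path, expected_hex):
    """True if the file's SHA256 digest equals the published hex string."""
    return file_digests(path)["SHA256"] == expected_hex.strip().lower()
```

For example, `verify_sha256("pysbd-0.1.2-py3-none-any.whl", "603e3a03be9d13fe7b78e1c75d6ed8d8c0d452ff4a514f4d9e3354bb04b67132")` should return True for an intact copy of the file saved under that name in the current directory.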
