spacy pipeline component for syllables

Spacy Syllables

A spacy 2+ pipeline component that adds multilingual syllable annotation to tokens.

  • Uses the well-established pyphen library for syllabification.
  • Supports many languages.
  • Easy to use thanks to the pipeline framework in spacy.

Install

$ pip install spacy_syllables

This also installs the following dependencies:

  • spacy = "^2.2.3"
  • pyphen = "^0.9.5"

Usage

The SpacySyllables class autodetects the language from the given spacy nlp instance. You can override the detected language by specifying the lang parameter when the component is created, as in the config={"lang": "en_US"} example in the migration section below.
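The choice between autodetection and an explicit override can be wrapped in a small helper. This is a minimal sketch, not part of the package: add_syllables is a hypothetical name, and it assumes spacy_syllables has already been imported so that the "syllables" factory is registered. It only uses the add_pipe() arguments that appear elsewhere in this README (after and config).

```python
def add_syllables(nlp, lang=None, after="tagger"):
    """Add the "syllables" component, optionally overriding the language.

    Hypothetical helper. Assumes `import spacy_syllables` has already run,
    so the "syllables" factory is registered with spacy.
    """
    kwargs = {}
    if lang is not None:
        # An explicit code such as "en_US" replaces the autodetected language.
        kwargs["config"] = {"lang": lang}
    if after in nlp.pipe_names:
        # Only anchor after a component that exists in this pipeline;
        # otherwise the component is appended at the end.
        kwargs["after"] = after
    return nlp.add_pipe("syllables", **kwargs)
```

With a loaded pipeline, add_syllables(nlp) keeps autodetection, while add_syllables(nlp, lang="en_US") forces the language.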

Normal use case

import spacy
from spacy_syllables import SpacySyllables  # importing registers the "syllables" factory

nlp = spacy.load("en_core_web_sm")

nlp.add_pipe("syllables", after="tagger")

assert nlp.pipe_names == ["tok2vec", "tagger", "syllables", "parser", "ner", "attribute_ruler", "lemmatizer"]

doc = nlp("terribly long")

data = [(token.text, token._.syllables, token._.syllables_count) for token in doc]

assert data == [("terribly", ["ter", "ri", "bly"], 3), ("long", ["long"], 1)]

More examples are in the tests.
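Once a doc has been processed, the token._.syllables and token._.syllables_count attributes shown above can be collected for downstream use. The following is a sketch under stated assumptions: syllable_rows and fake_token are hypothetical names, the stand-in tokens mimic only the attributes the helper reads so the example runs without a model download, and it assumes tokens without syllable data (for example punctuation) carry None.

```python
from types import SimpleNamespace


def syllable_rows(doc):
    """Collect (text, syllables, count) for every token that has syllable data."""
    rows = []
    for token in doc:
        syllables = token._.syllables
        if not syllables:
            # Assumption: tokens without syllable data (e.g. punctuation) carry None.
            continue
        rows.append((token.text, syllables, token._.syllables_count))
    return rows


def fake_token(text, syllables):
    """Hypothetical stand-in exposing only the attributes read above."""
    count = len(syllables) if syllables else None
    extensions = SimpleNamespace(syllables=syllables, syllables_count=count)
    return SimpleNamespace(text=text, _=extensions)


doc = [
    fake_token("terribly", ["ter", "ri", "bly"]),
    fake_token("long", ["long"]),
    fake_token("!", None),
]
rows = syllable_rows(doc)
total = sum(row[2] for row in rows)
print(rows)
print(total)  # 4
```

With a real pipeline, the same helper would be called as syllable_rows(nlp("terribly long")).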

Migrating from spacy 2.x to 3.0

In spacy 2.x, spacy_syllables was added to the pipeline by instantiating a SpacySyllables object with the desired options and passing that object to add_pipe():

from spacy_syllables import SpacySyllables

syllables = SpacySyllables(nlp, "en_US")

nlp.add_pipe(syllables, after="tagger")

In spacy 3.0, you add the component to the pipeline by name and pass any custom configuration through the config parameter of add_pipe():

from spacy_syllables import SpacySyllables  # importing registers the "syllables" factory

nlp.add_pipe("syllables", after="tagger", config={"lang": "en_US"})

In addition, the default pipeline components changed between 2.x and 3.0, so update any asserts that check the pipe names. For example:

spacy 2.x:

assert nlp.pipe_names == ["tagger", "syllables", "parser", "ner"]

spacy 3.0:

assert nlp.pipe_names == ["tok2vec", "tagger", "syllables", "parser", "ner", "attribute_ruler", "lemmatizer"]
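Because the full list of pipe names differs between versions, a check that only looks at the position of "syllables" relative to "tagger" may be less brittle than comparing the whole list. A minimal sketch follows; follows is a hypothetical helper name, and the two lists are copied from the asserts above.

```python
def follows(pipe_names, component, predecessor):
    """Return True if `component` sits directly after `predecessor`."""
    if component not in pipe_names or predecessor not in pipe_names:
        return False
    return pipe_names.index(component) == pipe_names.index(predecessor) + 1


# Pipe names taken from the spacy 2.x and spacy 3.0 asserts in this README.
spacy_2_names = ["tagger", "syllables", "parser", "ner"]
spacy_3_names = [
    "tok2vec", "tagger", "syllables", "parser",
    "ner", "attribute_ruler", "lemmatizer",
]

# The same check passes for both versions.
assert follows(spacy_2_names, "syllables", "tagger")
assert follows(spacy_3_names, "syllables", "tagger")
```

In a test suite this would be used as assert follows(nlp.pipe_names, "syllables", "tagger").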

Dev setup / testing

install

install the dev package and pyenv versions

$ pip install -e ".[dev]"
$ python -m spacy download en_core_web_sm

run tests

$ black .
$ pytest
