Pure-Python implementations of the Snowball stemmers

Project description

The traditional way of using the Snowball stemmers in Python is via the pystemmer package, which provides a Python wrapper around the Snowball C library. However, Python C extensions are problematic in some environments. Therefore, this package provides pure-Python implementations of the Snowball stemming algorithms.

The implementations of the stemming algorithms are translated from the Snowball language to Python via sbl2py.

Installation

Installing purestemmer is easy using pip:

pip install purestemmer

Usage

Prefer the pystemmer module whenever it is available, because it is much faster than purestemmer, and fall back to purestemmer only when the import fails:

try:
    import Stemmer
except ImportError:
    # pystemmer is not available, use purestemmer instead
    import purestemmer as Stemmer

Since purestemmer has the same public API and provides the same algorithms as pystemmer, there should be no need to change any code when switching between pystemmer and purestemmer like this.

Please see the pystemmer documentation for details on how to use the stemming algorithms.
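
As a rough illustration of the shared API, the sketch below loads whichever backend is installed and stems a few words. It assumes the Stemmer("english") constructor and the stemWord and stemWords methods described in the pystemmer documentation; the load_stemmer_module helper is hypothetical and is not part of either package.

```python
import importlib


def load_stemmer_module(candidates=("Stemmer", "purestemmer")):
    """Return the first importable module from candidates, or None.

    Hypothetical helper: "Stemmer" is the module name that pystemmer
    installs, and "purestemmer" is the pure-Python fallback.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


backend = load_stemmer_module()
if backend is None:
    print("Neither pystemmer nor purestemmer is installed")
else:
    # Assumed API, as documented by pystemmer
    stemmer = backend.Stemmer("english")
    print(stemmer.stemWord("cycling"))
    print(stemmer.stemWords(["running", "stems"]))
```

Trying the candidates in order keeps the faster C-backed module as the first choice, which matches the try/except pattern above.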

Differences between purestemmer and pystemmer

  • purestemmer has only been tested on Python 2.7

  • purestemmer.Stemmer instances are thread-safe

  • purestemmer is on average about 100x slower than pystemmer
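
To check the speed difference on your own data, a timing harness along the following lines may help. This is only a sketch: words_per_second and placeholder_stem_words are hypothetical names, and the placeholder merely lowercases words so that the example runs without either package installed. To time a real backend, pass the stemWords method of a Stemmer instance instead.

```python
from timeit import default_timer


def words_per_second(stem_words, words, repeats=3):
    """Best-of-N throughput of a callable that stems a list of words."""
    best = float("inf")
    for _ in range(repeats):
        start = default_timer()
        stem_words(words)
        best = min(best, default_timer() - start)
    # Guard against a zero reading on very coarse timers
    return len(words) / best if best > 0 else float("inf")


def placeholder_stem_words(words):
    # Stand-in for stemmer.stemWords; it does not perform real stemming
    return [w.lower() for w in words]


sample = ["Running", "Cycles", "Stemming"] * 10000
rate = words_per_second(placeholder_stem_words, sample)
print("%.0f words/s" % rate)
```

Running the same harness once per backend and dividing the two rates gives a slowdown factor for your own workload.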

License

purestemmer itself is covered by the MIT License. The underlying Snowball algorithms are covered by the BSD-3 License. Please see the LICENSE file for details.

Source Distribution

purestemmer-0.1.1.tar.gz (78.3 kB)

File details

Details for the file purestemmer-0.1.1.tar.gz.

File metadata

  • Download URL: purestemmer-0.1.1.tar.gz
  • Upload date:
  • Size: 78.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for purestemmer-0.1.1.tar.gz

  • SHA256: 33587f8c3024f0a061a8e10ed6d5dd5594ee7ec27b32fe79abd3b954177ba5cd
  • MD5: 7a144e0b090298a9d145427e19ca9d0f
  • BLAKE2b-256: c4e6609a6154001b2ef37bf20d6137dc3d5d13b48949738c9d2e399ba330be4d
