stopwordsiso

Collection of stopwords for multiple languages, using ISO 639-1 language code.

This Python package is based on the Stopwords ISO project by Gene Diaz. The full stopword list for every available language can be found there, and contributions to the word lists should also be made there.

Comparable packages are also published on npm and Bower.

Installation

pip install stopwordsiso

Usage

import stopwordsiso

stopwordsiso.has_lang("th")  # True if there are stopwords for Thai
stopwordsiso.langs()  # frozenset of all supported ISO 639-1 language codes

from stopwordsiso import stopwords

stopwords("en")  # English stopwords
stopwords(["de", "id", "zh"])  # German, Indonesian, and Chinese stopwords
stopwords("xxx")  # returns an empty set for an unknown language code
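
A common use of the returned set is filtering tokens. The following is a minimal sketch, not part of the package: remove_stopwords is a hypothetical helper, and the lowercase comparison assumes the word lists are stored in lowercase. It uses a tiny hand-written set so that it runs without the package installed.

```python
def remove_stopwords(tokens, stopword_set):
    """Return the tokens that are not in stopword_set.

    Tokens are lowercased before the membership test; this assumes the
    stopword lists themselves are lowercase (an assumption, not verified).
    """
    return [token for token in tokens if token.lower() not in stopword_set]


# Hand-written demo set, NOT the real English list.
# In practice you would pass stopwords("en") from stopwordsiso instead.
demo_stopwords = frozenset({"the", "is", "on"})

tokens = ["The", "cat", "is", "on", "the", "mat"]
filtered = remove_stopwords(tokens, demo_stopwords)
print(filtered)
```

Because the lookup is a set membership test, filtering stays cheap even for long token lists.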

Stopwords data

The entire collection is stored in JSON format in the file stopwords-iso.json, inside the stopwordsiso/ directory of the installed Python package. You are free to use this collection any way you like.

The stopwords for each language are stored as a list, keyed by that language's ISO 639-1 code, like this:

{
    "af": [ "aan", "af", "al", "as" ],
    "ar": [ "آض", "آمينَ", "آه", "آهاً" ]
}
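
Data in this shape can be read with the standard json module. The sketch below is an illustration under stated assumptions, not the package's own code: parse_stopwords and lookup are hypothetical helpers, and SAMPLE_JSON is a one-language excerpt standing in for the real file. The lookup helper mirrors the behaviour described under Usage, accepting one code or several and returning an empty set for unknown codes.

```python
import json

# One-language excerpt in the documented shape; the real file is far larger.
SAMPLE_JSON = '{"af": ["aan", "af", "al", "as"]}'


def parse_stopwords(json_text):
    """Map each language code to a frozenset of its stopwords."""
    raw = json.loads(json_text)
    return {lang: frozenset(words) for lang, words in raw.items()}


def lookup(table, langs):
    """Return the union of stopwords for one code or an iterable of codes.

    Codes missing from the table contribute nothing, so an unknown
    language yields an empty set.
    """
    if isinstance(langs, str):
        langs = [langs]
    result = set()
    for lang in langs:
        result |= table.get(lang, frozenset())
    return result


table = parse_stopwords(SAMPLE_JSON)
af_words = lookup(table, "af")
unknown_words = lookup(table, "xxx")
print(sorted(af_words))
print(len(unknown_words))
```

To run this against the real data, read the contents of stopwords-iso.json from the package directory and pass the text to parse_stopwords.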

If you wish to add, remove, or update any of the stopwords, please do so in the Stopwords ISO project at https://github.com/stopwords-iso.
