Stop word lists in many languages
Project description
Simple Python package that provides a single function for loading sets of stop words for different languages.
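A typical use of such a set is filtering tokens before further text processing. The sketch below is illustrative only: the package's loading function is not named in this description, so a small hand-written set stands in for a loaded English list, and `remove_stop_words` is a hypothetical helper, not part of the package.

```python
import re


def remove_stop_words(text, stop_words):
    """Return the lowercase word tokens of `text` that are not in `stop_words`."""
    # \w+ matches runs of Unicode letters, digits and underscores.
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stop_words]


# Hand-written stand-in for a loaded English stop word set (assumption).
sample_stop_words = {"the", "is", "on", "a"}

filtered = remove_stop_words("The cat is on a mat", sample_stop_words)
print(filtered)
```

Splitting on `\w+` suits space-delimited languages. Chinese or Japanese text would need a dedicated tokenizer before the stop words are removed.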
Stop words in English, French, German, Finnish, Hungarian, Turkish, Russian, Czech, Greek, Arabic, Chinese, Japanese, Korean, Catalan, Polish, Hebrew, Norwegian, Swedish, Italian, Portuguese and Spanish were retrieved from the following sources:
Wiktionary lists of prepositions in the respective languages
Kevin Bouge: https://sites.google.com/site/kevinbouge/stopwords-lists
NLTK
The directory called orig contains the original files used to compile the stop word lists. The directory called not_used contains raw data for creating stop word lists for languages that are not yet available in many_stop_words.available_languages.
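Compiling one list from several source files could look roughly like the sketch below. It is not the script used for this package: the file layout (UTF-8, one word per line, optional `#` comment lines) and the helper name `merge_word_lists` are assumptions made for illustration.

```python
from pathlib import Path


def merge_word_lists(paths):
    """Union the words found in several plain-text files into one set."""
    words = set()
    for path in paths:
        # Assumed layout: UTF-8 text, one word per line.
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            word = line.strip().lower()
            # Skip blank lines and comment lines.
            if word and not word.startswith("#"):
                words.add(word)
    return words
```

Using a set means that a word appearing in more than one source ends up in the compiled list only once.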
Download files
Source Distribution
File details
Details for the file many-stop-words-0.2.2.tar.gz.
File metadata
- Download URL: many-stop-words-0.2.2.tar.gz
- Upload date:
- Size: 25.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | e712f15f00bda7a185843aae921a5ee824527d0e1561491f56c3f204581bebb8
MD5 | 6306874f7e3acc9095aebb154a143ad6
BLAKE2b-256 | db67133043a8557622adc1db4b46c94c3f9b206900cc3988a7e0408f3a83dc81
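The SHA256 digest listed above can be checked against a downloaded copy of the archive with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the path passed in is wherever the file was saved locally.

```python
import hashlib

# SHA256 digest published for many-stop-words-0.2.2.tar.gz.
EXPECTED_SHA256 = "e712f15f00bda7a185843aae921a5ee824527d0e1561491f56c3f204581bebb8"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("many-stop-words-0.2.2.tar.gz")` returns `True` only if the local file has the published digest.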