A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.
German nouns
A comma-separated list of ~100,000 German nouns and their grammatical properties (case, number, gender) as a CSV file, plus a module to look up the data and parse compound words. Compiled from WiktionaryDE, the German-language Wiktionary.
The list can be found here: german_nouns/nouns.csv
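If you only need the raw data, the CSV file can be read with Python's standard `csv` module without installing the package. The sketch below counts the values in one column. The column names `lemma` and `genus` are assumptions borrowed from the lookup output, not confirmed CSV headers, so check the first line of `nouns.csv` before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count how often each value occurs in one column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"no column named {column!r}; found {reader.fieldnames}")
    return Counter(row[column] for row in reader)


# Tiny in-memory stand-in for nouns.csv.
# The headers "lemma" and "genus" are assumed, not taken from the real file.
sample = io.StringIO("lemma,genus\nFahrrad,n\nBildung,f\nVermögen,n\n")
print(count_by_column(sample, "genus"))
```

For the real file, you would pass an open file handle instead of `sample`, for example one opened with `open("german_nouns/nouns.csv", newline="", encoding="utf-8")` (the encoding is an assumption).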
If you want to look up nouns or parse compound words, install this package (for Python 3.8+) and follow the instructions below:
Installation
pip install german-nouns
Look up words
from pprint import pprint
from german_nouns.lookup import Nouns
nouns = Nouns()
# Look up a word (returns a list of matching entries)
word = nouns['Fahrrad']
pprint(word)
# Output:
[{'flexion': {'akkusativ plural': 'Fahrräder',
              'akkusativ singular': 'Fahrrad',
              'dativ plural': 'Fahrrädern',
              'dativ singular': 'Fahrrad',
              'dativ singular*': 'Fahrrade',
              'genitiv plural': 'Fahrräder',
              'genitiv singular': 'Fahrrades',
              'genitiv singular*': 'Fahrrads',
              'nominativ plural': 'Fahrräder',
              'nominativ singular': 'Fahrrad'},
  'genus': 'n',
  'lemma': 'Fahrrad',
  'pos': ['Substantiv']}]
# Parse a compound word into its parts
words = nouns.parse_compound('Vermögensbildung')
print(words)
# Output:
['Vermögen', 'Bildung']  # Now look up nouns['Vermögen'] etc.
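Each lookup result is a list of plain dictionaries, so the inflected forms can be pulled out with ordinary dict access. The following is a minimal sketch that works on an entry shaped like the `Fahrrad` output above. The helper names `forms` and `with_article` are hypothetical and not part of the package. Reading the keys ending in `*` as alternative forms is an assumption, as are the `m` and `f` values for `genus` (only `n` appears in the example).

```python
# Entry copied from the nouns['Fahrrad'] output above (shortened).
entry = {
    "flexion": {
        "nominativ singular": "Fahrrad",
        "genitiv singular": "Fahrrades",
        "genitiv singular*": "Fahrrads",
        "nominativ plural": "Fahrräder",
        "dativ plural": "Fahrrädern",
    },
    "genus": "n",
    "lemma": "Fahrrad",
    "pos": ["Substantiv"],
}

# Nominative singular definite articles keyed by 'genus'.
# Only 'n' is confirmed by the example output; 'm' and 'f' are assumed.
ARTICLES = {"m": "der", "f": "die", "n": "das"}


def forms(entry, case, number):
    """Return all forms for a case/number pair, including any '*' variant."""
    key = f"{case} {number}"
    flexion = entry.get("flexion", {})
    return [flexion[k] for k in (key, key + "*") if k in flexion]


def with_article(entry):
    """Prefix the lemma with its definite article if the genus is known."""
    article = ARTICLES.get(entry.get("genus"))
    return f"{article} {entry['lemma']}" if article else entry["lemma"]


print(forms(entry, "genitiv", "singular"))
print(with_article(entry))
```

The same helpers can be applied to the parts of a compound word: call `nouns.parse_compound(...)`, look up each returned part with `nouns[part]`, and pass each resulting entry to `forms` or `with_article`.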
Compiling the list
To compile the list yourself, you need Python 3.8+ and Poetry installed.
1. Clone the repository and install dependencies with Poetry:
$ git clone https://github.com/gambolputty/german-nouns
$ cd german-nouns
$ poetry install
2. Compile the list of nouns from a Wiktionary XML file:
Find the latest XML dump files here: https://dumps.wikimedia.org/dewiktionary/latest. Download one of them, then execute:
$ poetry run python -m german_nouns.parse_dump /path-to-xml-dump-file.xml.bz2
The CSV file will be saved here: german_nouns/nouns.csv.
Afterwards, remove german_nouns/index.txt so that the script recreates the word index the next time the lookup methods are used.
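If you recompile the list regularly, removing the stale index can be scripted. This is a minimal sketch: the helper name `remove_index` is hypothetical, and it assumes you run it from the repository root so that `german_nouns/index.txt` resolves to the path mentioned above.

```python
from pathlib import Path


def remove_index(package_dir):
    """Delete index.txt inside package_dir; return True if a file was removed."""
    index_file = Path(package_dir) / "index.txt"
    existed = index_file.exists()
    # missing_ok avoids an error when the index was never created (Python 3.8+).
    index_file.unlink(missing_ok=True)
    return existed


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    if remove_index("german_nouns"):
        print("Removed stale index; it will be rebuilt on the next lookup.")
    else:
        print("No index file found; nothing to remove.")
```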