A list of ~98,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.

Project description

German nouns

A comma-separated list of ~100 thousand German nouns and their grammatical properties (case, number, gender) as a CSV file, plus a module to look up the data and parse compound words. Compiled from WiktionaryDE.

The list can be found here: german_nouns/nouns.csv
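If you only need the raw data, the CSV can be read with Python's standard `csv` module. The following is a minimal sketch, not part of the package: `read_nouns_csv` is a hypothetical helper name, and it makes no assumption about column names beyond the file having a header row. Check the header it returns before relying on any particular column.

```python
import csv
from pathlib import Path


def read_nouns_csv(path):
    """Return (header, rows) from a CSV file that starts with a header row.

    Each row is a dict keyed by the column names found in the header.
    """
    with Path(path).open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list
        return list(reader.fieldnames or []), rows
```

For example, `header, rows = read_nouns_csv("german_nouns/nouns.csv")` followed by `print(header)` shows which columns are available in your copy of the list.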

If you want to look up nouns or parse compound words, install this package (for Python 3.8+) and follow the instructions below:

Installation

pip install german-nouns

Look up words

from pprint import pprint
from german_nouns.lookup import Nouns

nouns = Nouns()

# Look up a word
word = nouns['Fahrrad']
pprint(word)

# Output:
[{'flexion': {'akkusativ plural': 'Fahrräder',
              'akkusativ singular': 'Fahrrad',
              'dativ plural': 'Fahrrädern',
              'dativ singular': 'Fahrrad',
              'dativ singular*': 'Fahrrade',
              'genitiv plural': 'Fahrräder',
              'genitiv singular': 'Fahrrades',
              'genitiv singular*': 'Fahrrads',
              'nominativ plural': 'Fahrräder',
              'nominativ singular': 'Fahrrad'},
  'genus': 'n',
  'lemma': 'Fahrrad',
  'pos': ['Substantiv']}]

# Parse a compound word
words = nouns.parse_compound('Vermögensbildung')
print(words)

# Output:
['Vermögen', 'Bildung'] # Now look up nouns['Vermögen'] etc.
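The two calls above can be combined. In German, the last component of a compound determines its gender, so one way to guess the gender of a compound is to split it and look up the final part. The sketch below is an illustration under assumptions, not part of the package: the helper names `first_entry`, `form` and `compound_genus` are hypothetical, and `nouns` is expected to behave like the `Nouns` object shown above (indexing returns a list of entry dicts, `parse_compound` returns a list of strings). How the package reacts to unknown words is not documented here, so the code guards against both a `KeyError` and an empty result.

```python
def first_entry(nouns, word):
    """Return the first entry dict for `word`, or None if nothing is found."""
    try:
        entries = nouns[word]
    except KeyError:
        # Assumption: an unknown word might raise instead of returning []
        return None
    return entries[0] if entries else None


def form(entry, case, number):
    """Return one inflected form, e.g. form(entry, 'dativ', 'plural').

    Keys follow the 'flexion' layout shown in the output above;
    returns None when the form is missing.
    """
    return entry.get("flexion", {}).get(f"{case} {number}")


def compound_genus(nouns, word):
    """Guess the genus of a compound from its last component."""
    parts = nouns.parse_compound(word)
    if not parts:
        return None
    head = first_entry(nouns, parts[-1])
    return head.get("genus") if head else None
```

With the package installed, `compound_genus(Nouns(), 'Vermögensbildung')` would look up the `genus` of the last component returned by `parse_compound`, and `form(first_entry(Nouns(), 'Fahrrad'), 'dativ', 'plural')` would read the corresponding key from the `flexion` dict.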

Compiling the list

To compile the list yourself, you need Python 3.8+ and Poetry installed.

1. Clone the repository and install dependencies with Poetry:

$ git clone https://github.com/gambolputty/german-nouns
$ cd german-nouns
$ poetry install

2. Compile the list of nouns from a Wiktionary XML file:

Find the latest XML dump files at https://dumps.wikimedia.org/dewiktionary/latest and download one of them. Then execute:

$ poetry run python -m german_nouns.parse_dump /path-to-xml-dump-file.xml.bz2

The CSV file will be saved here: german_nouns/nouns.csv.


License: CC BY-SA 4.0

Download files

Source Distribution

german-nouns-1.2.2.tar.gz (3.1 MB view details)

Uploaded Source

Built Distribution

german_nouns-1.2.2-py3-none-any.whl (3.1 MB view details)

Uploaded Python 3

File details

Details for the file german-nouns-1.2.2.tar.gz.

File metadata

  • Download URL: german-nouns-1.2.2.tar.gz
  • Upload date:
  • Size: 3.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.10 Darwin/21.4.0

File hashes

Hashes for german-nouns-1.2.2.tar.gz
Algorithm Hash digest
SHA256 d0366cc23e296df151ac75eb8597b6dadd0ad4f14a69ebadbd79058feaed02a9
MD5 b667090edff9ee64ec57759bac73993a
BLAKE2b-256 58088c81bbdc2e3a680d98c993477a9e78909a83dfd983be95050b8b68758bca

File details

Details for the file german_nouns-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: german_nouns-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 3.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.9.10 Darwin/21.4.0

File hashes

Hashes for german_nouns-1.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 b4ce17120079b901184d0d3e1198ce9b60037a10fa7dbb044ef0b369d371558e
MD5 411848140d9238c449cb42069c08b8dd
BLAKE2b-256 cdcd38d3c12946c4a90df054d565690acdf459f3581b7c45346dee08df406751
