A tool that parses word data from wiktionary.org into a JSON object. Based on the Wiktionary parser by Suyash Behera.

Project description

Wiktionary Parser

A Python project that parses word content from Wiktionary into an easy-to-use JSON format. It currently parses etymologies, definitions, pronunciations, examples, audio links, and related words.

JSON structure

[{
    "pronunciations": {
        "text": ["pronunciation text"],
        "audio": ["pronunciation audio"]
    },
    "definitions": [{
        "relatedWords": [{
            "relationshipType": "word relationship type",
            "words": ["list of related words"]
        }],
        "text": ["list of definitions"],
        "partOfSpeech": "part of speech",
        "examples": ["list of examples"]
    }],
    "etymology": "etymology text"
}]
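
To illustrate how the structure above might be consumed, here is a minimal sketch that walks a hand-written sample mirroring the documented keys. The summarize_entry helper and the sample values are hypothetical and are not part of the package; real parser output may contain more or fewer items per list.

```python
# Hand-written sample that mirrors the documented structure.
# The values are illustrative only, not real parser output.
sample = [{
    "pronunciations": {"text": ["pronunciation text"], "audio": []},
    "definitions": [{
        "relatedWords": [
            {"relationshipType": "synonyms", "words": ["trial", "exam"]}
        ],
        "text": ["test (plural tests)", "A challenge or trial."],
        "partOfSpeech": "noun",
        "examples": ["She passed the test."],
    }],
    "etymology": "etymology text",
}]


def summarize_entry(entry):
    """Return one summary line per definition block of an entry (hypothetical helper)."""
    lines = []
    for definition in entry.get("definitions", []):
        # Flatten the related words of every relationship type into one list.
        related = [
            word
            for group in definition.get("relatedWords", [])
            for word in group.get("words", [])
        ]
        lines.append(
            "{}: {} definition line(s), related: {}".format(
                definition.get("partOfSpeech", "unknown"),
                len(definition.get("text", [])),
                ", ".join(related) or "none",
            )
        )
    return lines


for entry in sample:
    for line in summarize_entry(entry):
        print(line)
```

Using .get() with defaults keeps the helper tolerant of entries that lack a key, which seems prudent for scraped data.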

Installation

Using pip
  • run pip install wiktionaryparser
From Source
  • Clone the repo or download the zip
  • cd to the folder
  • run pip install -r "requirements.txt"

Usage

  • Import the WiktionaryParser class.
  • Initialize an object and call its fetch("word", "language") method.
  • The default language is English; change it with the set_default_language method.
  • Include or exclude parts of speech to be parsed with include_part_of_speech(part_of_speech) and exclude_part_of_speech(part_of_speech).
  • Include or exclude relations to be parsed with include_relation(relation) and exclude_relation(relation).

Examples

>>> from wiktionaryparser import WiktionaryParser
>>> parser = WiktionaryParser()
>>> word = parser.fetch('test')
>>> another_word = parser.fetch('test', 'french')
>>> parser.set_default_language('french')
>>> parser.exclude_part_of_speech('noun')
>>> parser.include_relation('alternative forms')
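
Since the result is described as a JSON-style structure, one possible follow-up is to serialize it with the standard json module. The sketch below assumes that fetch returns plain lists and dicts as shown under "JSON structure"; entries_to_json is a hypothetical helper, not part of the package, and it accepts any object that exposes a compatible fetch method.

```python
import json


def entries_to_json(parser, word, language=None):
    """Fetch `word` and return the result as pretty-printed JSON text.

    `parser` is any object exposing fetch(word) and fetch(word, language),
    such as a WiktionaryParser instance (hypothetical helper).
    """
    if language is None:
        # Fall back to the parser's default language.
        entries = parser.fetch(word)
    else:
        entries = parser.fetch(word, language)
    # ensure_ascii=False keeps IPA symbols and non-Latin scripts readable.
    return json.dumps(entries, indent=2, ensure_ascii=False)


# With the package installed and network access available:
#   from wiktionaryparser import WiktionaryParser
#   print(entries_to_json(WiktionaryParser(), 'test', 'french'))
```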

Requirements

  • requests==2.20.0
  • beautifulsoup4==4.4.0

Contributions

If you want to add features or improvements, or to report issues, feel free to send a pull request!

License

Wiktionary Parser is licensed under the MIT License.

Download files

Source Distribution

wiktionaryparser-ml-0.0.1.tar.gz (17.7 kB)

Uploaded Source

Built Distribution

wiktionaryparser_ml-0.0.1-py3-none-any.whl (20.3 kB)

Uploaded Python 3

File details

Details for the file wiktionaryparser-ml-0.0.1.tar.gz.

File metadata

  • Download URL: wiktionaryparser-ml-0.0.1.tar.gz
  • Upload date:
  • Size: 17.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.20.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.3

File hashes

Hashes for wiktionaryparser-ml-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       0fff833f04c00d819871733444ca3d7839846fa11ad55e3d8d2089492f0e5f4a
MD5          5c546bf5c76ffc345ed1013b9aec3f6a
BLAKE2b-256  fa1612e4ee40694f56fca96cdfa2a741c424ff8a0211c759ba2de41b9e44cc86
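
A downloaded file can be checked against the published SHA256 digest using Python's standard hashlib module. The sketch below is one way to do this; sha256_of and matches_published are hypothetical helper names, and the local file path in the final comment is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for wiktionaryparser-ml-0.0.1.tar.gz.
EXPECTED = "0fff833f04c00d819871733444ca3d7839846fa11ad55e3d8d2089492f0e5f4a"

# Example call; the path is an assumption about where the file was saved:
#   matches_published("wiktionaryparser-ml-0.0.1.tar.gz", EXPECTED)
```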

File details

Details for the file wiktionaryparser_ml-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: wiktionaryparser_ml-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 20.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.20.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.3

File hashes

Hashes for wiktionaryparser_ml-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       0910aca0cc4779b9d22ab714ccfe10a4298980fbda8b70ad1cbea71fc8e18ff0
MD5          3b7653fbb69dcb6728bbc991e9990634
BLAKE2b-256  0b0fb58b655908dd4c547d94561ff24436f9579e5f9a83f26aabec4df0ea57ef
