A simple Wikipedia parser

Project description

SimpleWikiParser

A Simplified Wiki Data Parser

Installation

pip install simple-wikiparser

Usage:

from wikiparser.core import WikiMediaDumpParser

# initialise Parser for a language (say Hindi)
wiki_dump_parser = WikiMediaDumpParser(language="Hindi")

# parse
wiki_dump_parser.parse()

# export
wiki_dump_parser.export_hf_dataset("/path/to/data.jsonl", "dataset_name")
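To inspect the exported file afterwards, a minimal sketch follows. It assumes, based only on the `.jsonl` extension in the example path, that the export writes one JSON object per line; the record fields are not documented here, so the code makes no assumption about them. `read_jsonl` is a hypothetical helper and not part of simple-wikiparser.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file.

    Hypothetical helper, not part of simple-wikiparser. Assumes the file
    is UTF-8 encoded with one JSON object per line.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines between records
                yield json.loads(line)
```

For example, `for record in read_jsonl("/path/to/data.jsonl"): print(record)` would print each exported record in turn, provided the assumption about the file layout holds.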

Download files

Source Distribution

simple-wikiparser-0.0.1a0.tar.gz (7.5 kB view details)

Uploaded Source

File details

Details for the file simple-wikiparser-0.0.1a0.tar.gz.

File metadata

  • Download URL: simple-wikiparser-0.0.1a0.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for simple-wikiparser-0.0.1a0.tar.gz
Algorithm Hash digest
SHA256 57735fa8d18f4ca790972daa0998c2908437950641fd2b078716c5ca51fa0b2a
MD5 049738d064cdb42ef6c56ea5610c2429
BLAKE2b-256 37913b21be4446848d7716d25199508085980afbc69d64742fd3537604a081ef
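One way to check a downloaded copy of the archive against the SHA256 digest listed above is sketched below, using only the standard library. The function names and the local file path are hypothetical; the expected digest is copied from the table above.

```python
import hashlib

# SHA256 digest listed for simple-wikiparser-0.0.1a0.tar.gz
EXPECTED_SHA256 = "57735fa8d18f4ca790972daa0998c2908437950641fd2b078716c5ca51fa0b2a"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("simple-wikiparser-0.0.1a0.tar.gz")` on a local download (the path here is an assumption) returns `True` only if the file's digest equals the listed value.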
