Python library for converting Wikipedia articles to Markdown

Project description

Wiki2Md

An opinionated tool for converting Wikipedia HTML into Markdown suitable for ingestion by LLMs.

  • removes citations (.reference)
  • removes ref list (.reflist)
  • removes JavaScript table headers and footers (.pcs-collapse-table-icon)
  • removes metadata like portal lists (.metadata)
  • removes flag icons
  • optionally removes links
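
The filtering rules above can be illustrated with a minimal standard-library sketch. This is not Wiki2Md's actual implementation or API: the names StripParser and strip_wikipedia_html are hypothetical, the .flagicon class for flag icons is an assumption (the list above does not name a selector), and the input is assumed to be reasonably well-formed HTML. It only shows the idea of dropping elements by class and optionally dropping links.

```python
from html.parser import HTMLParser

# Classes taken from the feature list; "flagicon" is an assumption, since the
# list does not say which selector is used for flag icons.
STRIP_CLASSES = {"reference", "reflist", "pcs-collapse-table-icon", "metadata", "flagicon"}

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class StripParser(HTMLParser):
    """Hypothetical helper: collects text, skipping elements with a stripped class."""

    def __init__(self, keep_links=True):
        super().__init__(convert_charrefs=True)
        self.keep_links = keep_links
        self.parts = []      # output fragments
        self.skip_depth = 0  # > 0 while inside a stripped element
        self.hrefs = []      # hrefs of currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1  # nested tag inside a stripped element
            return
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        if classes & STRIP_CLASSES:
            self.skip_depth = 1
            return
        if tag == "a":
            href = attr_map.get("href")
            self.hrefs.append(href)
            if self.keep_links and href:
                self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        if tag == "a" and self.hrefs:
            href = self.hrefs.pop()
            if self.keep_links and href:
                self.parts.append(f"]({href})")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def strip_wikipedia_html(html, keep_links=True):
    """Return the text of `html` with stripped elements removed (sketch only)."""
    parser = StripParser(keep_links=keep_links)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = ('<p>Paris<sup class="reference"><a href="#cite_note-1">[1]</a></sup>'
              ' is in <a href="/wiki/France">France</a>.</p>')
    print(strip_wikipedia_html(sample))                    # links kept
    print(strip_wikipedia_html(sample, keep_links=False))  # links removed
```

A real converter would also handle headings, lists, and tables; the sketch covers only the removal rules and inline links.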

Install the pre-commit hooks with poetry run pre-commit install, or run the checks manually, e.g. poetry run ruff check.

Download files

Source Distribution

wiki2md-0.1.3.tar.gz (4.8 kB)

Uploaded Source

Built Distribution

wiki2md-0.1.3-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file wiki2md-0.1.3.tar.gz.

File metadata

  • Download URL: wiki2md-0.1.3.tar.gz
  • Upload date:
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.13.0 Darwin/23.4.0

File hashes

Hashes for wiki2md-0.1.3.tar.gz
Algorithm Hash digest
SHA256 83d28d8d432cb4f69cfd21a4518e07d50c60b87ef167f0835ddf4ca12d2da2b9
MD5 62126f578c67b6a60ecb21073ee68c50
BLAKE2b-256 09c729babbeb0761d76eb6015073b5290119baaad1ac23704ec2832c996004ba
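
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The following is a minimal sketch; the helper names sha256_of_file and matches_published_hash are illustrative and not part of Wiki2Md, and the file is assumed to have already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("wiki2md-0.1.3.tar.gz", "83d28d8d432cb4f69cfd21a4518e07d50c60b87ef167f0835ddf4ca12d2da2b9") should return True for an intact copy of the source distribution. The same helper works for the wheel with its own digest.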

File details

Details for the file wiki2md-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: wiki2md-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 5.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.13.0 Darwin/23.4.0

File hashes

Hashes for wiki2md-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 a568873fcf488d2acb18f987f24c8f4a2e92a20cc98f2d459ac3ba16f886bec7
MD5 a57dd9e3909a32f1a69b07004c531e76
BLAKE2b-256 7e019e28369a3e0de55918fc81dbd2a73783b03b98c8d1eb63735245f2610ec7
