Project description

Wikipedia Scraper

A library for scraping data from Wikipedia. It can be useful in natural language processing, text processing, and similar tasks. The library can also perform certain tasks on the scraped text, such as removing punctuation, numbers, and citations, converting text to lower case, and tokenization.

Installation

pip install wiki-scraper

Get Started

How to scrape data from a Wikipedia article using this library:

from wiki_scraper import WikiScraper

scraper = WikiScraper('India')
text = scraper.get_data(remove_punctuations=False, remove_numbers=False, lower_case=False, remove_citations=False, tokenization=False)
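
To give a feel for what the get_data options are likely to do, here is a minimal standard-library sketch of the same kinds of cleaning steps. It is not the library's implementation: the function clean_text, the regular expressions, and the order in which the steps run are all assumptions made for illustration, and the library's actual behavior (for example, how it tokenizes) may differ.

```python
import re
import string


def clean_text(text, remove_punctuations=False, remove_numbers=False,
               lower_case=False, remove_citations=False, tokenization=False):
    """Hypothetical helper mirroring the option names of get_data."""
    # Strip citation markers such as "[1]" first, before the brackets
    # or digits inside them are touched by the later steps.
    if remove_citations:
        text = re.sub(r"\[\d+\]", "", text)
    # Remove every run of digits.
    if remove_numbers:
        text = re.sub(r"\d+", "", text)
    # Delete all ASCII punctuation characters.
    if remove_punctuations:
        text = text.translate(str.maketrans("", "", string.punctuation))
    if lower_case:
        text = text.lower()
    # Assumed tokenization: a plain whitespace split that returns a list.
    if tokenization:
        return text.split()
    return text


sample = "India has 28 states.[1]"
tokens = clean_text(sample, remove_punctuations=True, remove_numbers=True,
                    lower_case=True, remove_citations=True, tokenization=True)
print(tokens)
```

With every option left at False, the sketch returns the text unchanged, which matches the all-False call shown above.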

Download files

Source Distribution

wiki-scraper-0.1.0.tar.gz (2.1 kB view details)

Uploaded Source

Built Distribution

wiki_scraper-0.1.0-py3-none-any.whl (2.4 kB view details)

Uploaded Python 3

File details

Details for the file wiki-scraper-0.1.0.tar.gz.

File metadata

  • Download URL: wiki-scraper-0.1.0.tar.gz
  • Upload date:
  • Size: 2.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.6.0 pkginfo/1.6.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for wiki-scraper-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7b50c32b0648b668ecc62bd2be259949201771e67419798352a0d77b07436182
MD5 51c4a1ff36a0eda75ed68cdb799de61a
BLAKE2b-256 9536ed44f5757b33eedad3ee7234bbbd3221cf23011076f58297c52f80d1523a

File details

Details for the file wiki_scraper-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: wiki_scraper-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 2.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.6.0 pkginfo/1.6.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for wiki_scraper-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8db689174935c6041debb658b448b6cd4df5ded2d7cd0a5a84a8178f28fe29fd
MD5 f08a99a9ee269c2ea25aa6cd078db9c1
BLAKE2b-256 1f444f40bab49aa3f8199917c44f20434411321297bea309e6b7182a15e92378
