An unofficial WikiHow API

WHAPI

An unofficial WikiHow API. Uses BeautifulSoup to scrape WikiHow and return the data you want.

Installation

pip install whapi

Usage

Random How To

Learn random stuff! Returns a random WikiHow article. Sometimes they're weird. Sometimes they're really weird.

from whapi import RandomHowTo

how_to = RandomHowTo()
how_to.print()

Searching

from whapi import WikiHow, search_wikihow


max_results = 1  # default for optional argument is 10
how_tos = search_wikihow("how to learn programming", max_results)
assert len(how_tos) == 1
how_tos[0].print()


# for efficiency, and to get an unlimited number of results, use the generator
for how_to in WikiHow.search("how to learn python"):
    how_to.print()
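
Because WikiHow.search is a generator, results can be capped lazily instead of fetching everything. The sketch below is only an illustration: take is a hypothetical helper (not part of whapi), and fake_search is a stand-in generator so the example runs without network access. It assumes the real generator yields results one at a time, as the comment above suggests.

```python
from itertools import islice


def take(results, n):
    """Return at most the first n items of any iterable, consuming no more than n."""
    return list(islice(results, n))


def fake_search():
    # Stand-in for WikiHow.search(...): an endless generator, used so the
    # example runs without network access.
    i = 0
    while True:
        i += 1
        yield f"result {i}"


first_three = take(fake_search(), 3)
print(first_three)

# With whapi installed, the same helper would presumably be used as:
#     for how_to in take(WikiHow.search("how to learn python"), 3):
#         how_to.print()
```

islice stops pulling from the generator once the limit is reached, so no extra pages should be requested beyond what those results need.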

Parsing

Manipulate HowTo objects

from whapi import HowTo

how_to = HowTo("https://www.wikihow.com/Train-a-Dog")

data = how_to.as_dict()

print(how_to.url)
print(how_to.title)
print(how_to.intro)
print(how_to.n_steps)
print(how_to.summary)

first_step = how_to.steps[0]
first_step.print()
data = first_step.as_dict()

how_to.print(extended=True)
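
Since as_dict() returns a dictionary, one natural use is saving an article as JSON. This is a minimal sketch under assumptions: to_json is a hypothetical helper, and the example dictionary is a made-up stand-in for how_to.as_dict(), whose real keys may differ.

```python
import json


def to_json(data):
    """Serialize a dictionary to pretty-printed JSON text."""
    # default=str is a defensive assumption in case a value is not a
    # JSON-native type; ensure_ascii=False keeps non-ASCII text readable.
    return json.dumps(data, indent=2, ensure_ascii=False, default=str)


# Hypothetical stand-in for how_to.as_dict(); the real keys may differ.
example = {
    "url": "https://www.wikihow.com/Train-a-Dog",
    "title": "Example title",
    "n_steps": 2,
    "steps": [{"number": 1, "summary": "First step"}],
}

text = to_json(example)
print(text)
```

With whapi installed, to_json(how_to.as_dict()) would be the equivalent call, assuming the dictionary holds only plain values.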

ToDo

  • Many WikiHow articles also contain "Parts" which break down further into sub-steps. Write a function to parse these additional divisions.
  • Add parser for tips
  • Add parser for warnings
  • Add function to cycle through useragent strings
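
The last item above could be approached with itertools.cycle. This is a rough sketch, not the project's implementation: the user-agent strings are placeholders, next_headers is a hypothetical name, and how whapi actually sends its requests is not shown here, so passing the result as request headers is an assumption.

```python
from itertools import cycle

# Placeholder strings for illustration only; substitute real user-agent values.
USER_AGENTS = [
    "ExampleAgent/1.0",
    "ExampleAgent/2.0",
    "ExampleAgent/3.0",
]

# cycle() repeats the list endlessly, wrapping back to the first entry.
_agent_pool = cycle(USER_AGENTS)


def next_headers():
    """Return request headers carrying the next user agent in the rotation."""
    return {"User-Agent": next(_agent_pool)}


headers = [next_headers() for _ in range(4)]
print(headers)
```

Each call advances the rotation by one, so the fourth call wraps around to the first string again.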

Files for whapi, version 0.5.3

  • whapi-0.5.3-py3-none-any.whl (4.8 kB), wheel, Python version py3
  • whapi-0.5.3.tar.gz (3.7 kB), source distribution
