A random Wikipedia page generator.

Project description

WikiBot

Welcome to WikiBot! This is a small program that gets a random page from a Wikipedia category and its subcategories (up to a specified depth).

Installation

All you need to do is clone this repo and install the dependencies. Make sure you have pip installed!

git clone https://github.com/ddxtanx/wikiBot
cd wikiBot
pip install -r

To use it as an API, install it with pip instead:

pip install wikiBot

Usage

Running python wikiBot.py -h shows the usage of the program:

usage: wikiBot.py [-h] [--tree_depth [TREE_DEPTH]] [--similarity [SIMILARITY]]
                  [-s] [-r] [-v] [-c]
                  category

Get a random page from a wikipedia category

positional arguments:
  category              The category you wish to get a page from.


optional arguments:
  -h, --help            show this help message and exit
  --tree_depth [TREE_DEPTH]
                        How far down to traverse the subcategory tree
  --similarity [SIMILARITY]
                        What percent of page categories need to be in
                        subcategory array. Must be used with -c/--check
  -s, --save            Save subcategories to a file for quick re-runs
  -r, --regen           Regenerate the subcategory file
  -v, --verbose         Print debug lines
  -c, --check           After finding page check to see that it truly fits in
                        category
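
The help text above could be produced by an argparse declaration along the following lines. This is only a sketch reconstructed from the usage output, not the project's actual source: the function name build_parser and both default values are assumptions (0.5 is the similarity default mentioned under "How It Works"; the tree depth default of 4 is a guess).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose options mirror the usage text (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="wikiBot.py",
        description="Get a random page from a wikipedia category",
    )
    parser.add_argument("category", help="The category you wish to get a page from.")
    # nargs="?" yields the "[TREE_DEPTH]" / "[SIMILARITY]" form seen in the usage line.
    parser.add_argument("--tree_depth", nargs="?", type=int, default=4,  # default is a guess
                        help="How far down to traverse the subcategory tree")
    parser.add_argument("--similarity", nargs="?", type=float, default=0.5,
                        help="What percent of page categories need to be in "
                             "subcategory array. Must be used with -c/--check")
    parser.add_argument("-s", "--save", action="store_true",
                        help="Save subcategories to a file for quick re-runs")
    parser.add_argument("-r", "--regen", action="store_true",
                        help="Regenerate the subcategory file")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print debug lines")
    parser.add_argument("-c", "--check", action="store_true",
                        help="After finding page check to see that it truly fits in category")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["Physics", "--tree_depth", "3", "--similarity", "0.25", "-c"])
print(args.category, args.tree_depth, args.similarity, args.check)
```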

Pro Tips:

Use a tree_depth of 3 or 4; more than 4 will bring loosely related categories into the subcategories.
Use a similarity of .25 or .33. With a higher similarity value you might sacrifice other valid pages in the search for the PERFECT page.

If you're using it in your own Python code, the best way to set it up is:

from wikiBot import WikiBot
wb = WikiBot({{Your preferred tree_depth}}, {{Your preferred similarity_val}})

"""
...
Your Awesome Code
...
"""

randomPage = wb.randomPage(category,...)

You can also change the tree depth and similarity value later by setting wb.td = {{New Tree Depth}} and wb.sv = {{New Similarity Val}}.

More info is available by using help(wikiBot).

How It Works

The most important part of this program is the Wikipedia API. It lets the program gather all of the subcategories of a given category in a fast(ish) and usable manner, and get the pages belonging to a given category. The bulk of my code iteratively fetches the subcategories at each depth of the tree, adds them to an array holding all subcategories of the given 'parent' category, and continues in that fashion until there are no more subcategories or the maximum allowed tree depth has been reached. For example, if a subcategory chain went

Category A -> Category B -> Category C -> Category D -> ...

(-> denotes 'is a supercategory of')

and the maximum tree depth was 3, then the code would stop gathering subcategories for Categories C, D, E, ...
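
The depth-limited traversal described above can be sketched as a level-by-level walk. This is an illustration, not the project's code: gather_subcategories and fetch_children are hypothetical names, the Wikipedia API call is replaced by a lookup in a toy dictionary, and counting the parent category as depth 1 is an assumed convention chosen so that the chain above stops expanding at Category C.

```python
from typing import Callable, Dict, List

# Toy stand-in for the Wikipedia API: category -> direct subcategories.
TOY_TREE: Dict[str, List[str]] = {
    "Category A": ["Category B"],
    "Category B": ["Category C"],
    "Category C": ["Category D"],
    "Category D": ["Category E"],
}


def gather_subcategories(
    parent: str,
    fetch_children: Callable[[str], List[str]],
    max_depth: int,
) -> List[str]:
    """Collect the parent and its subcategories, one tree level at a time."""
    seen = {parent}
    collected = [parent]
    frontier = [parent]
    depth = 1  # assumption: the parent itself sits at depth 1
    while frontier and depth < max_depth:
        next_frontier: List[str] = []
        for category in frontier:
            for child in fetch_children(category):
                if child not in seen:  # a category can be reachable by several paths
                    seen.add(child)
                    collected.append(child)
                    next_frontier.append(child)
        frontier = next_frontier
        depth += 1
    return collected


subcats = gather_subcategories("Category A", lambda c: TOY_TREE.get(c, []), max_depth=3)
print(subcats)
```

With a maximum depth of 3 the walk never asks for the children of Category C, so Category D and Category E are not collected.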

After all subcategories of a given parent category have been amassed in some list L, the program randomly chooses a category C from L, finds the pages belonging to C, chooses a random page P from C, and returns the URL pointing to P. For speed's sake, after gathering all subcategories of a given parent category, the program can optionally save them to a text file so that later runs can load the subcategories faster.
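
A minimal sketch of the selection and caching steps just described, under stated assumptions: load_or_build, random_page_url, and fetch_pages are hypothetical names, the cache is assumed to hold one category per line, and the retry over empty categories is an addition of this sketch rather than something the description specifies.

```python
import os
import random
from typing import Callable, List, Optional
from urllib.parse import quote


def load_or_build(path: str, build: Callable[[], List[str]], regen: bool = False) -> List[str]:
    """Read cached subcategories from a text file, or build and save them."""
    if os.path.exists(path) and not regen:
        with open(path, encoding="utf-8") as fh:
            return [line.rstrip("\n") for line in fh if line.strip()]
    categories = build()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(categories) + "\n")
    return categories


def random_page_url(
    subcategories: List[str],
    fetch_pages: Callable[[str], List[str]],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick a random category C, then a random page P in C, and return P's URL."""
    rng = rng or random.Random()
    candidates = list(subcategories)
    rng.shuffle(candidates)  # random order, so empty categories can be skipped
    for category in candidates:
        pages = fetch_pages(category)
        if pages:
            title = rng.choice(pages)
            # English Wikipedia article URLs use underscores in place of spaces.
            return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))
    return None  # no category had any pages
```

In real use, fetch_pages would wrap a Wikipedia API request; here it is left as a parameter so the logic can be tried with plain dictionaries.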

To determine how similar a page is to a category, the program first enumerates the categories that the selected page belongs to. It then loops through those categories (call each one A), checks whether A is among the subcategories generated from the 'parent' category, and computes a 'score' for the page. If the score is greater than or equal to a prespecified value (the default is .5: half of all A's should be subcategories of the parent category), the page is a valid subpage. If not, the program removes that page from the category list and loops on.
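
The check above can be sketched as a simple fraction. The function names are hypothetical, and treating a page with no categories as scoring 0 is an assumption of this sketch; the score follows the --similarity help text, which describes it as the share of the page's categories found in the subcategory array.

```python
from typing import Iterable, Set


def similarity_score(page_categories: Iterable[str], subcategories: Set[str]) -> float:
    """Fraction of the page's categories that are among the parent's subcategories."""
    categories = list(page_categories)
    if not categories:
        return 0.0  # assumption: a page with no categories never qualifies
    hits = sum(1 for a in categories if a in subcategories)
    return hits / len(categories)


def is_valid_subpage(
    page_categories: Iterable[str],
    subcategories: Set[str],
    threshold: float = 0.5,
) -> bool:
    """Apply the >= threshold rule; .5 is the default stated above."""
    return similarity_score(page_categories, subcategories) >= threshold


# Two of the page's four categories are known subcategories, so the score is 0.5.
known = {"Physics", "Optics", "Mechanics"}
page = ["Physics", "Optics", "1905 works", "German inventions"]
print(similarity_score(page, known), is_valid_subpage(page, known))
```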

Note on types

This project uses type annotations and mypy type checking, so you can be sure you are passing the right types to functions. If you're using Atom to edit your code, I recommend using atom-linter-mypy to do type linting. Have fun!

Contributions

I'm open to anyone contributing, especially if they know of a way to make this faster or take up less drive space for locally stored subcategories. Email me at gcc@ameritech.net and we can talk stuff out.

Project details

Release history

1.3.1 (this version): May 9, 2018
1.2: May 9, 2018
1.1: May 8, 2018
1.0: May 8, 2018

Download files

Source Distribution

wiki_bot-1.3.1.tar.gz (7.4 kB view details)

Uploaded May 9, 2018 Source

Built Distribution

wiki_bot-1.3.1-py2.py3-none-any.whl (6.7 kB view details)

Uploaded May 9, 2018 Python 2, Python 3

File details

Details for the file wiki_bot-1.3.1.tar.gz.

File metadata

Download URL: wiki_bot-1.3.1.tar.gz
Upload date: May 9, 2018
Size: 7.4 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for wiki_bot-1.3.1.tar.gz
Algorithm	Hash digest
SHA256	`ad03ba8265f3a5a425a09cf705b8a2490d554b2e9ab9d93a1345b80bf73c1b16`
MD5	`e5923cd943743e02ddd87df8b6bf8e4a`
BLAKE2b-256	`79e6dc55ef257c28d503e1606ebc3a763e595688a8fd0b03b2b4761d371d7da3`
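
To check a downloaded copy against the SHA256 digest listed above, something like the following should work. It is a sketch using only the standard library; the helper names are hypothetical and the file is assumed to have been downloaded already.

```python
import hashlib

# SHA256 digest listed above for wiki_bot-1.3.1.tar.gz.
EXPECTED_SHA256 = "ad03ba8265f3a5a425a09cf705b8a2490d554b2e9ab9d93a1345b80bf73c1b16"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.lower()
```

For example, matches_expected("wiki_bot-1.3.1.tar.gz") would report whether a copy in the current directory matches, assuming it was saved under that name.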

File details

Details for the file wiki_bot-1.3.1-py2.py3-none-any.whl.

File metadata

Download URL: wiki_bot-1.3.1-py2.py3-none-any.whl
Upload date: May 9, 2018
Size: 6.7 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No

File hashes

Hashes for wiki_bot-1.3.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`4de1e97143dd99c5877273d660070fd2224b1f1a291a53202419209dfe4272e3`
MD5	`1b6d31ffc1f2a620ef9f8f7ec1cf4c1f`
BLAKE2b-256	`8700a76d9f71331210e1067a15de815e8ca4869e7bfabf1aa91a4b348db85c3b`

wiki-bot 1.3.1
