A downloader for named images containing faces from Wiki servers.
Project description
Wiki Faces:
TLDR
This project downloads images that include human faces from a Wiki, specifically images associated with given Wikipedia categories.
Installation
Pip Installation Procedure:
From PIP:
pip install wikifaces
From Repo:
git clone git@github.com:tford9/Wiki-Faces-Downloader.git
cd Wiki-Faces-Downloader
pip install .
Usage
Command-Line Example
python downloader -i "indonesian engineers" -o ../data/ -d
Package Example
from wikifaces.downloader import WikiFace
wikiface_obj = WikiFace()
wikiface_obj.download(categories=['facebook'], depth=2, output_location='../data/')
The following structure is output:
facebook/
    cached_1_people_pages_d2.pkl
    cached_pages_d2.pkl
    alan_rushbridger/
        Alan_Rusbridger_01.jpg-p0.jpg
        ...
    mark_zuckerberg/
        MarkZuckerbergcrop.jpg-p1.jpg
        ...
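Once a download has finished, the output can be inspected with a few lines of standard-library Python. The sketch below is not part of the wikifaces package: `count_images_per_person` and `IMAGE_SUFFIXES` are hypothetical names, and the code assumes the layout shown above (one directory per category, containing one sub-directory per person plus cached `.pkl` files).

```python
from pathlib import Path

# Assumption: downloaded images end in one of these suffixes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_person(category_dir):
    """Map each person sub-directory name to the number of image files in it."""
    counts = {}
    for person_dir in Path(category_dir).iterdir():
        if not person_dir.is_dir():
            continue  # skip the cached_*.pkl files stored at the category level
        counts[person_dir.name] = sum(
            1
            for f in person_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_person("../data/facebook")` would then return a dictionary keyed by person directory, which is a quick way to spot people for whom no usable image was found.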
The process is carried out as follows:
- Given a category from a Wiki, collect n pages that belong to that category and to at least one category with "people" in its title.
- Crawl the categories of those pages and collect y further pages that belong to those categories and to at least one "people" category.
- For each collected Wiki page, download its primary image and use lightweight face detection to determine whether it shows a human face.
- Capture all images from the Wiki whose filenames contain the name of the page (if the page is about a person, the filename contains their name).
- Using the captured name and images, create a dataset for that face.
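The first two steps can be illustrated with a small, self-contained sketch. It is not the package's implementation: the `PAGES` dictionary is a hypothetical in-memory stand-in for a Wiki, the page and category names are invented, and treating `depth` as the number of category hops is an assumption made for illustration.

```python
# Hypothetical stand-in for a Wiki: page title -> categories of that page.
PAGES = {
    "Ada Example": ["facebook", "people in technology"],
    "Example Corp": ["facebook", "companies"],
    "Bob Sample": ["people in technology", "living people"],
    "Cy Placeholder": ["living people"],
}


def is_person_page(categories):
    """A page counts as a person if any of its categories mentions 'people'."""
    return any("people" in c.lower() for c in categories)


def crawl(category, depth, seen=None, found=None):
    """Recursively collect person pages reachable within `depth` category hops."""
    if seen is None:
        seen, found = {}, set()
    # Stop when out of depth, or when this category was already expanded
    # with at least as much remaining depth.
    if depth <= 0 or seen.get(category, 0) >= depth:
        return found
    seen[category] = depth
    for title, categories in PAGES.items():
        if category not in categories or not is_person_page(categories):
            continue
        found.add(title)
        for other in categories:
            crawl(other, depth - 1, seen, found)
    return found


print(sorted(crawl("facebook", 2)))  # ['Ada Example', 'Bob Sample']
```

"Example Corp" is skipped because none of its categories contains "people", while "Bob Sample" is reached through a category shared with "Ada Example".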
TODOs:
- Part of this process currently uses a recursive call structure to get all related pages; there may be a way to linearize or parallelize it.
- We currently only pull images that contain the person's name in the title and have exactly one visible face. All other images are ignored. A voting system should be added to find the most represented face across multiple images.
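One possible way to linearize the crawl from the first TODO is a breadth-first traversal driven by an explicit queue. The sketch below is only an illustration under stated assumptions: `crawl_iterative` and the `pages` mapping (page title to category list) are hypothetical, the data is invented, and `depth` is assumed to count category hops.

```python
from collections import deque


def crawl_iterative(pages, start_category, depth):
    """Breadth-first, non-recursive collection of person pages.

    `pages` maps a page title to the list of categories on that page.
    """
    if depth < 1:
        return set()
    found = set()
    visited = {start_category}
    queue = deque([(start_category, 1)])  # (category, hop level)
    while queue:
        category, level = queue.popleft()
        for title, categories in pages.items():
            in_category = category in categories
            is_person = any("people" in c.lower() for c in categories)
            if not (in_category and is_person):
                continue
            found.add(title)
            if level < depth:
                # Schedule this page's unseen categories for the next hop.
                for other in categories:
                    if other not in visited:
                        visited.add(other)
                        queue.append((other, level + 1))
    return found


# Invented example data, not real Wiki content.
toy_pages = {
    "Ada Example": ["facebook", "people in technology"],
    "Example Corp": ["facebook", "companies"],
    "Bob Sample": ["people in technology", "living people"],
    "Cy Placeholder": ["living people"],
}
print(sorted(crawl_iterative(toy_pages, "facebook", 2)))  # ['Ada Example', 'Bob Sample']
```

Because the queue holds independent category lookups, each hop level could in principle be handed to a worker pool, which is one route to the parallelization mentioned in the TODO.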
Download files
Source Distribution
wikifaces-1.0.7.tar.gz (9.9 kB)
Built Distribution
Hashes for wikifaces-1.0.7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2bf820f949761c76d2a348b577ea82ce2c3d8c24f9503d15828afe9373dd29f0
MD5 | 5bed6938a163d38ce8b3e486b871a9a6
BLAKE2b-256 | 8048d5d25fd21a07fe3cb7f329618dd50c8598492e4e3df7bf8df7951be78bd3