Mathematics Genealogy Project Spider
A web spider for the Mathematics Genealogy Project.
Installation
Use the package manager pip to install the spider.
pip install mathgenproject
Usage
Define a pipeline through which to process each mathematician returned from the spider.
    class MyPipeline(object):
        def open_spider(self, spider):
            ...

        def process_item(self, item, spider):
            print(item['name'])
            return item

        def close_spider(self, spider):
            ...
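As a slightly fuller sketch of such a pipeline, the class below collects each mathematician's name and writes the sorted, de-duplicated list to a text file when the spider closes. It assumes only the `name` field used above; the class name `NameCollectorPipeline` and the default output path `names.txt` are hypothetical and not part of mathgenproject.

```python
class NameCollectorPipeline(object):
    """Collect mathematician names and write them out when the crawl ends."""

    def __init__(self, output_path='names.txt'):
        # output_path is a hypothetical default chosen for this sketch.
        self.output_path = output_path
        self.names = set()

    def open_spider(self, spider):
        # Start each crawl with an empty collection.
        self.names = set()

    def process_item(self, item, spider):
        # 'name' is the only item field this sketch assumes exists.
        self.names.add(item['name'])
        return item

    def close_spider(self, spider):
        # Write one name per line, sorted for stable output.
        with open(self.output_path, 'w', encoding='utf-8') as handle:
            for name in sorted(self.names):
                handle.write(name + '\n')
```

Because the methods only touch `item['name']`, the class can be exercised with plain dictionaries before it is wired into a crawl.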
Run the spider using Scrapy's CrawlerProcess, passing in the mathematician's MGP ID.
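    from scrapy.crawler import CrawlerProcess
    from mathgenproject.spiders import MathGenProjectSpider

    process = CrawlerProcess(settings={
        # Export scraped items to items.json as a JSON feed.
        'FEED_FORMAT': 'json',
        'FEED_URI': 'items.json',
        # Scrapy loads pipelines by import path, so replace this key with
        # the full dotted path to your pipeline class.
        'ITEM_PIPELINES': {
            'MyPipeline': 300,
        },
    })

    process.crawl(MathGenProjectSpider, mgp_id='216087')
    process.start()  # blocks until the crawl has finished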
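Once the crawl has finished, the exported feed can be read back with the standard library. The following is a minimal sketch, assuming the `json` feed format produced a single JSON array in `items.json` and that items carry the `name` field used in the pipeline example; the helper `load_names` is hypothetical and not provided by mathgenproject.

```python
import json


def load_names(path='items.json'):
    """Return the 'name' field of every item in a JSON feed file."""
    with open(path, encoding='utf-8') as handle:
        # Assumes the feed is one JSON array of item objects.
        items = json.load(handle)
    # Skip any item that lacks a 'name' field rather than raising KeyError.
    return [item['name'] for item in items if 'name' in item]
```

Calling `load_names()` after `process.start()` returns would give the names in the order they were written to the feed.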
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
License