Data scraper infrastructure for OpenBlock (hyperlocal news for Django)
ebdata
Code to help write scripts that import/crawl/parse data from the web into ebpub, as well as extract (US) street addresses from (English) text.
This package is part of OpenBlock. Originally developed for EveryBlock.com.
For more information, see the documentation or the project website.
Problems can be reported to the issue tracker.
Discussion takes place on the ebcode Google group or in the #openblock channel on freenode.
Installation
Do not just try to easy_install or pip install ebdata. It has a lot of specific dependencies which can’t/shouldn’t be captured by setup.py.
Instead, see the full documentation at http://openblockproject.org/docs/install/index.html which includes links to pip requirements files and instructions on preparing your system.
OpenBlock
OpenBlock is a web application that allows users to browse and search their local area for “hyper-local news” - to see what’s going on recently in the immediate geographic area.
For installation instructions and other documentation, see http://openblockproject.org/docs/ (or the .rst files in the docs/ directory).
For help, you can try the ebcode group: http://groups.google.com/group/ebcode or look for us in the #openblock IRC channel on irc.freenode.net.
About the Project
OpenBlock began life as the open-source code released by EveryBlock.com in June 2009. Originally created by Adrian Holovaty and the EveryBlock team, it is now developed as an open-source (GPL) project at http://openblockproject.org.
Funding for the initial creation of EveryBlock and the ongoing development of OpenBlock has been provided by the Knight Foundation (http://www.knightfoundation.org/).
OpenBlock 1.0.1 (Sept 7, 2011)
This is a minor bugfix (and docs) release, and is mostly identical to 1.0.0.
Bug fixes
The georss scraper now gets coordinates in the right order on the first try, and populates location_name if it falls back to geocoding.
Fix date formatting on newsitem-detail page. (ticket #201)
The import_blocks_tiger and import_blocks_esri scripts had a circular import.
Fix a broken doctest in bootstrap.py.
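The coordinate-order fix above touches a common pitfall: a GeoRSS-Simple point is written as "latitude longitude", while most geometry code expects (x, y), that is (longitude, latitude). The following is a minimal sketch of that conversion, not the actual ebdata scraper code; the function name parse_georss_point is hypothetical and used only for illustration.

```python
def parse_georss_point(text):
    """Convert a GeoRSS-Simple point string ("lat lon") to an (x, y) tuple.

    Hypothetical helper for illustration; not part of ebdata.
    """
    parts = text.split()
    if len(parts) != 2:
        raise ValueError("expected 'lat lon', got %r" % (text,))
    # GeoRSS-Simple lists latitude first, then longitude.
    lat, lon = float(parts[0]), float(parts[1])
    if not (-90.0 <= lat <= 90.0) or not (-180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range: %r" % (text,))
    # Return in (x, y) order, i.e. (longitude, latitude).
    return (lon, lat)


# Example: the content of a <georss:point> element.
x, y = parse_georss_point("45.256 -71.92")
```

The range check catches many swapped pairs early, since a longitude beyond ±90 in the latitude slot raises an error instead of silently producing a wrong location.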
Documentation
Added docs for cloning an EC2 instance from our Amazon AMI.
Removed nonexistent --city option from geodata docs.