Skip to main content

Local version of scraperwiki scraperlibs

Project description

Local ScraperWiki Python Library
====
This library aims to be a drop-in replacement for the Python `scraperwiki` library
for use locally. That is, functions will work the same way, and data will go
into a local SQLite database; a targeted bombing of ScraperWiki's servers
will not stop this local library from working, unless you happen to be running
it on one of ScraperWiki's servers.

## Installing

This will soon be in PyPI, but for now you can just install from the git repository.

## Documentation

Read the standard ScraperWiki Python library's [documentation](https://scraperwiki.com/docs/python/python_help_documentation/),
then look below for some quirks about the local version.

## Quirks

The local library aims to be a drop-in replacement.
In reality, the local version sometimes works better,
though not all of the features have been implemented.

## Differences

### Datastore differences

The local `scraperwiki.sqlite` is powered by
[DumpTruck](http://dumptruck.io), so some things
work a bit differently.

Data is stored to a local sqlite database named `scraperwiki.sqlite`.

Bizarre table and column names are supported.

`scraperwiki.sqlite.execute` will return an empty list of keys on an
empty select statement result.

`scraperwiki.sqlite.attach` downloads the whole datastore from ScraperWiki,
So you might not want to use this too often on large databases.

### Other Differences

## Status of implementation
In general, features that have not been implemented raise a `NotImplementedError`.

### Datastore
`scraperwiki.sqlite` is missing the following features.

* All of the `verbose` keyword arguments (These control what is printed on the ScraperWiki code editor)

### Geo
The UK geocoding helpers (`scraperwiki.geo`) documented on scraperwiki.com have been implemented. They partially depend on scraperwiki.com being available.

<!-- They have also been released as a separate library (ukgeo). Not true yet. -->

### Utils
`scraperwiki.utils` is implemented, as well as the following functions.

* `scraperwiki.log`
* `scraperwiki.scrape`
* `scraperwiki.pdftoxml`
* `scraperwiki.swimport`

### Deprecated
These submodules are deprecated and thus will not be implemented.

* `scraperwiki.apiwrapper`
* `scraperwiki.datastore`
* `scraperwiki.jsqlite`
* `scraperwiki.metadata`
* `scraperwiki.newsql`

### Development

Run tests with `./runtests`; this small wrapper cleans up after itself.

### Specs
Here are some ScraperWiki scrapers that demonstrate the non-local library's quirks.

https://scraperwiki.com/scrapers/scraperwiki_local/
https://scraperwiki.com/scrapers/cast/
https://scraperwiki.com/scrapers/things_happen_when_you_do_not_commit/
https://scraperwiki.com/scrapers/what_does_show_tables_return/
https://scraperwiki.com/scrapers/on_conflict/
https://scraperwiki.com/scrapers/spaces_in_table_names/
https://scraperwiki.com/scrapers/spaces_in_table_names_1/

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for scraperwiki_local, version 0.0.5
Filename, size File type Python version Upload date Hashes
Filename, size scraperwiki_local-0.0.5.tar.gz (15.1 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page