read google sheets, use them for sites
Project description
# sheetsite: sheets for sites
[![Build Status](https://travis-ci.org/paulfitz/sheetsite.svg?branch=master)](https://travis-ci.org/paulfitz/sheetsite)
[![PyPI version](https://badge.fury.io/py/sheetsite.svg)](http://badge.fury.io/py/sheetsite)
Keep a website or directory in sync with a google sheet.
Features:
* Copy a google spreadsheet locally, as json or excel format.
* Can strip specified tabs, columns, or cells from the spreadsheet,
in case not all of it should be copied along.
* Can push a filtered json copy out to a git repository, handy for
maintaining a website based on a private shared spreadsheet.
* Can augment the sheet with geocoding, adding latitude and longitude based
on address fields for example.
* Can notify people by email with a summary of updates.
## Installation
For the basics:
```
pip install sheetsite
```
For all bells and whistles, when automating a sheet-to-site workflow:
```
pip install sheetsite[queue]
```
## Specifying the source and destination
The `sheetsite` utility, when run without any arguments, will expect
to find all necessary options in a `_sheetsite.yml` file. A simple
example of such a file is:
```yaml
source:
name: google-sheets
key: 15Vs_VGpupeGkljceEow7q1ig447FJIxqNS1Dd0dZpFc
credential_file: service.json
destination:
file: sheet.xlsx
```
The file should have two stanzas, `source` specifying where to get
data from, and `destination` specifying where to put it. This
examples reads a private google spreadsheet and saves it as
`sheet.xlsx`. The key comes from the url of the spreadsheet.
The credentials file is something you [get from google](https://pygsheets.readthedocs.io/en/latest/authorizing.html).
Here's an example that outputs json:
```yaml
source:
name: google-sheets
key: 15Vs_VGpupeGkljceEow7q1ig447FJIxqNS1Dd0dZpFc
credential_file: service.json
destination:
file: _data/directory.json
```
You could now build a static website from that `.json`, see
http://jekyllrb.com/docs/datafiles/ for how, or see an example
at https://github.com/datacommons/commoners
Here's an example that adds some geocoded fields and directly
updates a git repository:
```yaml
source:
name: google-sheets
key: 19UaXhqPQ0QHEfSWS_adDEtPwYstq8llK2YijpvFZcKA
credential_file: service.json
flags:
add:
directory:
- LAT
- LNG
- COUNTRY
- STREET
- REGION
- LOCALITY
destination:
name: git
repo: git@github.com:datacommons/commoners
file: _data/directory.json
```
## Strip private sheets, columns, or cells
By default, sheetsite will strip:
* Any columns whose name is in parentheses, e.g. `(Private Notes)`
* Any cells or text within cells surrounded by double parentheses, e.g. `((private@email.address))`
* Any sheets whose name is in double parentheses, e.g. `((secret sheet))`
## Geocoding
If you have a table with a column called `address`, sheetsite can geocode it for
you and pass along the results. Just add the following in your yaml:
```
flags:
add:
table_name_goes_here:
- latitude
- longitude
- country
- state
- city
- street
- zip
```
You can add just the columns you want. Geocoding results are cached in a `_cache`
directory by default so they do not need to be repeated in future calls to sheetsite.
The full list of columns (with synonyms) available is:
* latitude / lat
* longitude / lng
* latlng
* country
* state / province / region
* city / locality
* street
* zip / postal_code
Normally you won't actually have a stand-alone `address` column. More usually,
information will be spread over multiple columns, or some will be implicit (e.g.
the state/province and country). You can tell sheetsite how to construct addresses
for geocoding by listing columns and constants to build it from. For example:
```
flags:
address:
table_name_goes_here:
- street_address1
- street_address2
- city
- Manitoba
- Canada
add:
table_name_goes_here:
- postal_code
```
This tells sheetsite to produce addresses of the form:
```
<street_address1> <street_address2> <city> Manitoba Canada
```
And add a `postal_code` column populated by geocoding.
It is possible to request columns directly in the spreadsheet. Just
wrap the column name in square brackets, like `[state]` or `[zip]`.
Any blank cells in such columns will be filled using geocoding based
on the address given in that row. If the address columns have not been
configured in `flags` then the address must be present in a single column
literally called `address`.
## Getting credentials
[Obtain credentials for accessing sheets from the Google Developers Console](https://pygsheets.readthedocs.io/en/latest/authorizing.html).
Make sure you share the sheet with the email address in the credentials file. Read-only permission is fine.
## Examples
For example, the map at http://datacommons.coop/tap/ is a visualization
of data pulled from a google spreadsheet, styled using
https://github.com/datacommons/tap via github pages.
## sheetwatch
It can be useful to automate and forget `sheetsite`, so that updates
to a google spreadsheet propagate automatically to their final
destination. The `sheetwatch` utility does this. It requires a queue
server to operate. To install, do:
```
pip install sheetsite[queue]
```
Install any queue server supported by `celery`. For example, `redis`:
```
sudo apt-get install redis-server
redis-server
```
We need to set some environment variables to let `sheetwatch` know
where to find the queue server:
```
export SHEETSITE_BROKER_URL=redis://localhost
export SHEETSITE_RESULT_BACKEND=redis://localhost
```
The `sheetwatch` program needs a cache directory for its operations.
```
export SHEETSITE_CACHE=$HOME/cache/sites
```
Finally, it needs to know where there is a directory full of `yml`
files describing any sheets to monitor and their corresponding sites:
```
export SHEETSITE_LAYOUT=$PWD/sites/enabled
```
We now start a worker:
```
sheetwatch worker
```
The last thing we need to do is check a mailbox from time to time
for sheet change notifications from Google, and kick off site updates
as needed:
```
export GMAIL_USERNAME=*****
export GMAIL_PASSWORD=*****
sheetwatch ping --delay 60
```
## License
sheetsite is distributed under the MIT License.
[![Build Status](https://travis-ci.org/paulfitz/sheetsite.svg?branch=master)](https://travis-ci.org/paulfitz/sheetsite)
[![PyPI version](https://badge.fury.io/py/sheetsite.svg)](http://badge.fury.io/py/sheetsite)
Keep a website or directory in sync with a google sheet.
Features:
* Copy a google spreadsheet locally, as json or excel format.
* Can strip specified tabs, columns, or cells from the spreadsheet,
in case not all of it should be copied along.
* Can push a filtered json copy out to a git repository, handy for
maintaining a website based on a private shared spreadsheet.
* Can augment the sheet with geocoding, adding latitude and longitude based
on address fields for example.
* Can notify people by email with a summary of updates.
## Installation
For the basics:
```
pip install sheetsite
```
For all bells and whistles, when automating a sheet-to-site workflow:
```
pip install sheetsite[queue]
```
## Specifying the source and destination
The `sheetsite` utility, when run without any arguments, will expect
to find all necessary options in a `_sheetsite.yml` file. A simple
example of such a file is:
```yaml
source:
name: google-sheets
key: 15Vs_VGpupeGkljceEow7q1ig447FJIxqNS1Dd0dZpFc
credential_file: service.json
destination:
file: sheet.xlsx
```
The file should have two stanzas, `source` specifying where to get
data from, and `destination` specifying where to put it. This
examples reads a private google spreadsheet and saves it as
`sheet.xlsx`. The key comes from the url of the spreadsheet.
The credentials file is something you [get from google](https://pygsheets.readthedocs.io/en/latest/authorizing.html).
Here's an example that outputs json:
```yaml
source:
name: google-sheets
key: 15Vs_VGpupeGkljceEow7q1ig447FJIxqNS1Dd0dZpFc
credential_file: service.json
destination:
file: _data/directory.json
```
You could now build a static website from that `.json`, see
http://jekyllrb.com/docs/datafiles/ for how, or see an example
at https://github.com/datacommons/commoners
Here's an example that adds some geocoded fields and directly
updates a git repository:
```yaml
source:
name: google-sheets
key: 19UaXhqPQ0QHEfSWS_adDEtPwYstq8llK2YijpvFZcKA
credential_file: service.json
flags:
add:
directory:
- LAT
- LNG
- COUNTRY
- STREET
- REGION
- LOCALITY
destination:
name: git
repo: git@github.com:datacommons/commoners
file: _data/directory.json
```
## Strip private sheets, columns, or cells
By default, sheetsite will strip:
* Any columns whose name is in parentheses, e.g. `(Private Notes)`
* Any cells or text within cells surrounded by double parentheses, e.g. `((private@email.address))`
* Any sheets whose name is in double parentheses, e.g. `((secret sheet))`
## Geocoding
If you have a table with a column called `address`, sheetsite can geocode it for
you and pass along the results. Just add the following in your yaml:
```
flags:
add:
table_name_goes_here:
- latitude
- longitude
- country
- state
- city
- street
- zip
```
You can add just the columns you want. Geocoding results are cached in a `_cache`
directory by default so they do not need to be repeated in future calls to sheetsite.
The full list of columns (with synonyms) available is:
* latitude / lat
* longitude / lng
* latlng
* country
* state / province / region
* city / locality
* street
* zip / postal_code
Normally you won't actually have a stand-alone `address` column. More usually,
information will be spread over multiple columns, or some will be implicit (e.g.
the state/province and country). You can tell sheetsite how to construct addresses
for geocoding by listing columns and constants to build it from. For example:
```
flags:
address:
table_name_goes_here:
- street_address1
- street_address2
- city
- Manitoba
- Canada
add:
table_name_goes_here:
- postal_code
```
This tells sheetsite to produce addresses of the form:
```
<street_address1> <street_address2> <city> Manitoba Canada
```
And add a `postal_code` column populated by geocoding.
It is possible to request columns directly in the spreadsheet. Just
wrap the column name in square brackets, like `[state]` or `[zip]`.
Any blank cells in such columns will be filled using geocoding based
on the address given in that row. If the address columns have not been
configured in `flags` then the address must be present in a single column
literally called `address`.
## Getting credentials
[Obtain credentials for accessing sheets from the Google Developers Console](https://pygsheets.readthedocs.io/en/latest/authorizing.html).
Make sure you share the sheet with the email address in the credentials file. Read-only permission is fine.
## Examples
For example, the map at http://datacommons.coop/tap/ is a visualization
of data pulled from a google spreadsheet, styled using
https://github.com/datacommons/tap via github pages.
## sheetwatch
It can be useful to automate and forget `sheetsite`, so that updates
to a google spreadsheet propagate automatically to their final
destination. The `sheetwatch` utility does this. It requires a queue
server to operate. To install, do:
```
pip install sheetsite[queue]
```
Install any queue server supported by `celery`. For example, `redis`:
```
sudo apt-get install redis-server
redis-server
```
We need to set some environment variables to let `sheetwatch` know
where to find the queue server:
```
export SHEETSITE_BROKER_URL=redis://localhost
export SHEETSITE_RESULT_BACKEND=redis://localhost
```
The `sheetwatch` program needs a cache directory for its operations.
```
export SHEETSITE_CACHE=$HOME/cache/sites
```
Finally, it needs to know where there is a directory full of `yml`
files describing any sheets to monitor and their corresponding sites:
```
export SHEETSITE_LAYOUT=$PWD/sites/enabled
```
We now start a worker:
```
sheetwatch worker
```
The last thing we need to do is check a mailbox from time to time
for sheet change notifications from Google, and kick off site updates
as needed:
```
export GMAIL_USERNAME=*****
export GMAIL_PASSWORD=*****
sheetwatch ping --delay 60
```
## License
sheetsite is distributed under the MIT License.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
sheetsite-0.2.1.tar.gz
(21.4 kB
view hashes)