Skip to main content

Making it easy to query APIs via SQL

Project description

https://coveralls.io/repos/github/betodealmeida/shillelagh/badge.svg?branch=master Documentation Status https://badge.fury.io/py/shillelagh.svg PyPI - Python Version docs/logo.png

Shillelagh (ʃɪˈleɪlɪ) is a Python library and CLI that allows you to query many resources (APIs, files, in memory objects) using SQL. It’s both user and developer friendly, making it trivial to access resources and easy to add support for new ones.

Learn more on the documentation.

The library is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library):

from shillelagh.backends.apsw.db import connect

connection = connect(":memory:")
cursor = connection.cursor()

query = "SELECT * FROM a_table"
for row in cursor.execute(query):
    print(row)

There is also a SQLAlchemy dialect:

from sqlalchemy import text
from sqlalchemy.engine import create_engine

engine = create_engine("shillelagh://")
connection = engine.connect()

query = "SELECT * FROM a_table"
for row in connection.execute(text(query)):
    print(row)

And a command-line utility:

$ shillelagh
🍀> SELECT * FROM a_table

There is also an experimental backend that uses Postgres with the Multicorn2 extension, and another one using sqlglot.

To run the Postgres backend, first install the additional dependencies:

$ pip install 'shillelagh[multicorn]'
$ pip install 'multicorn @ git+https://github.com/pgsql-io/multicorn2.git@v2.5'

Then run:

from shillelagh.backends.multicorn.db import connect

connection = connect(
    user="username",
    password="password",
    host="localhost",
    port=5432,
    database="examples",
)

Or:

from sqlalchemy import create_engine
engine = create_engine("shillelagh+multicorn2://username:password@localhost:5432/examples")

For the sqlglot backend you can use shillelagh+sqlglot:// as the connection string, or import connect() from shillelagh.backends.sqlglot.db.

Why SQL?

Sharks have been around for a long time. They’re older than trees and the rings of Saturn, actually! The reason they haven’t changed that much in hundreds of millions of years is because they’re really good at what they do.

SQL has been around for some 50 years for the same reason: it’s really good at what it does.

Why “Shillelagh”?

Picture a leprechaun hitting APIs with a big stick so that they accept SQL.

How is it different?

Shillelagh allows you to easily query non-SQL resources. For example, if you have a Google Spreadsheet you can query it directly as if it were a table in a database:

SELECT country, SUM(cnt)
FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
WHERE cnt > 0
GROUP BY country

You can even run INSERT/DELETE/UPDATE queries against the spreadsheet:

UPDATE "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
SET cnt = cnt + 1
WHERE country != 'BR'

Queries like this are supported by adapters. Currently Shillelagh has the following adapters:

Name

Type

URI pattern

Example URI

CSV

File/API

/path/to/file.csv; http(s)://*

/home/user/sample_data.csv

Datasette

API

http(s)://*

https://global-power-plants.datasettes.com/global-power-plants/global-power-plants

Generic JSON

API

http(s)://*

https://api.stlouisfed.org/fred/series?series_id=GNPCA&api_key=XXX&file_type=json#$.seriess[*]

Generic XML

API

http(s)://*

https://api.congress.gov/v3/bill/118?format=xml&offset=0&limit=2&api_key=XXX#.//bill

GitHub

API

https://api.github.com/repos/${owner}/{$repo}/pulls

https://api.github.com/repos/apache/superset/pulls

GSheets

API

https://docs.google.com/spreadsheets/d/${id}/edit#gid=${sheet_id}

https://docs.google.com/spreadsheets/d/1LcWZMsdCl92g7nA-D6qGRqg1T5TiHyuKJUY1u9XAnsk/edit#gid=0

HTML table

API

http(s)://*

https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population

Pandas

In memory

Any variable name (local or global)

my_df

S3

API

s3://bucket/path/to/file

s3://shillelagh/sample_data.csv

Socrata

API

https://${domain}/resource/${dataset-id}.json

https://data.cdc.gov/resource/unsk-b7fc.json

System

API

system://${resource}

system://cpu?interval=2

WeatherAPI

API

https://api.weatherapi.com/v1/history.json?key=${key}&q=${location}

https://api.weatherapi.com/v1/history.json?key=XXX&q=London

There are also 3rd-party adapters:

A query can combine data from multiple adapters:

INSERT INTO "/tmp/file.csv"
SELECT time, chance_of_rain
FROM "https://api.weatherapi.com/v1/history.json?q=London"
WHERE time IN (
  SELECT datetime
  FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=1648320094"
)

The query above reads timestamps from a Google sheet, uses them to filter weather data from WeatherAPI, and writes the chance of rain into a (pre-existing) CSV file.

New adapters are relatively easy to implement. There’s a step-by-step tutorial that explains how to create a new adapter to an API or filetype.

Installation

Install Shillelagh with pip:

$ pip install 'shillelagh'

You also need to install optional dependencies, depending on the adapter you want to use:

$ pip install 'shillelagh[console]'        # to use the CLI
$ pip install 'shillelagh[genericjsonapi]' # for Generic JSON
$ pip install 'shillelagh[genericxmlapi]'  # for Generic XML
$ pip install 'shillelagh[githubapi]'      # for GitHub
$ pip install 'shillelagh[gsheetsapi]'     # for GSheets
$ pip install 'shillelagh[htmltableapi]'   # for HTML tables
$ pip install 'shillelagh[pandasmemory]'   # for Pandas in memory
$ pip install 'shillelagh[s3selectapi]'    # for S3 files
$ pip install 'shillelagh[systemapi]'      # for CPU information

Alternatively, you can install everything with:

$ pip install 'shillelagh[all]'

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

shillelagh-1.4.3.tar.gz (104.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

shillelagh-1.4.3-py2.py3-none-any.whl (128.7 kB view details)

Uploaded Python 2Python 3

File details

Details for the file shillelagh-1.4.3.tar.gz.

File metadata

  • Download URL: shillelagh-1.4.3.tar.gz
  • Upload date:
  • Size: 104.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.24

File hashes

Hashes for shillelagh-1.4.3.tar.gz
Algorithm Hash digest
SHA256 d78b7c8044bb11d4fb926392a495907c34cec71d6e438d5d102e1bfd842c31f7
MD5 d09624bbe62f5a99fb832f4bd1b2c755
BLAKE2b-256 c401d04a03d0de301b6aeee53cb981cc8fb21200b9bd8b46c29cb233eb67b0cd

See more details on using hashes here.

File details

Details for the file shillelagh-1.4.3-py2.py3-none-any.whl.

File metadata

  • Download URL: shillelagh-1.4.3-py2.py3-none-any.whl
  • Upload date:
  • Size: 128.7 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.24

File hashes

Hashes for shillelagh-1.4.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 8f0c5e66994f0ab1242606d469f45904165fa017e3760860670db64e53649487
MD5 5b6a64f0ee396034f76e06a60941ab53
BLAKE2b-256 7475c4a33def5ed3ca982d44fb0b95f345c0e087206756e2f02b935a590d24e1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page