Data7 streams CSV/Parquet datasets over HTTP from SQL queries.
Data7 - Dynamic datasets the easy way
Pronounced data·set (7 like sept in French).
The idea 💡
TL;DR: Data7 is a high-performance web server that generates dynamic datasets (in CSV or Parquet format) from existing databases and streams them over HTTP 🎉
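To make the idea concrete, here is a minimal sketch of how the result of a SQL query can be turned into CSV incrementally instead of being built in memory first. It is only an illustration, not Data7's actual implementation: the `stream_query_as_csv` function is hypothetical, and SQLite from the Python standard library stands in for a real database.

```python
import csv
import io
import sqlite3


def stream_query_as_csv(connection, query):
    """Yield the result of `query` as CSV text, one line at a time.

    Hypothetical helper for illustration; not part of Data7.
    """
    cursor = connection.execute(query)
    buffer = io.StringIO()
    writer = csv.writer(buffer)

    def flush():
        # Hand back what was just written, then empty the buffer for reuse.
        chunk = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return chunk

    # Header row, taken from the cursor's column metadata.
    writer.writerow([column[0] for column in cursor.description])
    yield flush()

    # Rows are fetched lazily, so the full result set is never held in memory.
    for row in cursor:
        writer.writerow(row)
        yield flush()


if __name__ == "__main__":
    # Illustrative in-memory table (assumed schema, not from the project).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE restaurant (name TEXT, city TEXT)")
    db.execute("INSERT INTO restaurant VALUES ('Chez Sept', 'Paris')")
    for line in stream_query_as_csv(db, "SELECT * FROM restaurant"):
        print(line, end="")
```

A web framework's streaming response could consume such a generator directly, which is what keeps memory usage flat for large tables.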
Example usage
Let's say you have a restaurant table in your wonderful-places PostgreSQL database, and you want to make this table an always-up-to-date dataset that can be easily used by the rest of the world. All you have to do is edit the Data7 configuration as follows:
#
# Data7 configuration file
#
# config.yaml
#
datasets:
  - basename: restaurants
    query: "SELECT * FROM restaurant"
Fire up the data7 server:
data7 start
And 💥 your dataset is available at:
- https://data7.wonderful-places.org/d/restaurants.csv
- https://data7.wonderful-places.org/d/restaurants.parquet
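On the consumer side, the CSV endpoint can be read row by row with nothing but the Python standard library. The sketch below assumes the `/d/<basename>.<format>` URL pattern shown above and UTF-8 encoded output with a header row; the helper names (`dataset_url`, `iter_csv_rows`, `fetch_rows`) are hypothetical and not part of Data7.

```python
import csv
import io
import urllib.request


def dataset_url(base_url, basename, fmt="csv"):
    """Build a dataset URL following the /d/<basename>.<format> pattern."""
    return f"{base_url.rstrip('/')}/d/{basename}.{fmt}"


def iter_csv_rows(text_stream):
    """Yield each CSV record as a dict keyed by the header row."""
    yield from csv.DictReader(text_stream)


def fetch_rows(url):
    """Stream a CSV dataset over HTTP and yield its rows one by one."""
    with urllib.request.urlopen(url) as response:
        # Decode bytes lazily; UTF-8 is an assumption about the server output.
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        yield from iter_csv_rows(text)
```

For example, `for row in fetch_rows(dataset_url("https://data7.wonderful-places.org", "restaurants")): print(row)` would print one dictionary per restaurant as the response arrives.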
Getting started
To quickly start contributing to this project, we've got you covered! Once you've cloned the project, use GNU Make to ease your life (make and curl are required).
# Clone the project somewhere on your system
git clone git@github.com:jmaupetit/data7.git
# Enter the project's root directory
cd data7
# Prepare your working environment
make bootstrap
You can now start the development server:
make run
Test development endpoints:
# CSV format (displayed in the terminal)
curl http://localhost:8000/d/invoices.csv
# Parquet format (downloaded locally)
curl -O http://localhost:8000/d/invoices.parquet
# Check that the file exists
ls invoices.parquet
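Beyond checking that the file exists, a quick sanity check is to confirm that the download looks like Parquet at all. Parquet files with a plaintext footer start and end with the 4-byte magic `PAR1`, so the sketch below tests just that, without any third-party library. The `looks_like_parquet` helper is hypothetical, and this check does not validate the file's contents.

```python
import os

# Magic bytes found at both ends of a Parquet file with a plaintext footer.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as handle:
        head = handle.read(len(PARQUET_MAGIC))
        # Jump to the last four bytes of the file.
        handle.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = handle.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Calling `looks_like_parquet("invoices.parquet")` after the download helps catch the case where an error page was saved under the `.parquet` name.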
You can run quality checks using dedicated GNU Make rules:
# Run the test suite
make test
# Linters!
make lint
Happy hacking 😻
License
This work is released under the MIT License (see LICENSE).