Data7 streams CSV/Parquet datasets over HTTP from SQL queries.

Data7 - Dynamic datasets the easy way

Pronounced data·set (7 like sept in French).

The idea 💡

TL;DR Data7 is a high-performance web server that generates dynamic datasets (in CSV or Parquet format) from existing databases and streams them over HTTP 🎉

Example usage

Let's say you have a restaurant table in your wonderful-places PostgreSQL database, and you want to turn this table into an always-up-to-date dataset that the rest of the world can easily use. All you have to do is edit the Data7 configuration as follows:

#
# Data7 configuration file
#
# config.yaml
#
datasets:
  - basename: restaurants
    query: "SELECT * FROM restaurant"

Fire up the data7 server:

data7 start

And 💥 your dataset is available at:

Getting started

Want to start contributing to this project quickly? We've got you covered! Once you've cloned the project, use GNU Make to ease your life (make and curl are required).

# Clone the project somewhere on your system
git clone git@github.com:jmaupetit/data7.git

# Enter the project's root directory
cd data7

# Prepare your working environment
make bootstrap

You can now start the development server:

make run

Test development endpoints:

# CSV format (displayed in the terminal)
curl http://localhost:8000/d/invoices.csv

# Parquet format (downloaded locally)
curl -O http://localhost:8000/d/invoices.parquet

# Check that the file exists
ls invoices.parquet
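
The same endpoints can also be consumed from a script. The following is a minimal sketch using only the Python standard library, not part of Data7 itself: the helper names (`parse_csv`, `fetch_csv`, `has_parquet_magic`) are hypothetical, and it assumes the CSV response is UTF-8 encoded with a header row. The Parquet check only looks for the `PAR1` magic bytes that open and close every Parquet file; it does not validate the contents.

```python
import csv
import io
from urllib.request import urlopen

# Every Parquet file starts and ends with these four bytes.
PARQUET_MAGIC = b"PAR1"


def parse_csv(payload: bytes, encoding: str = "utf-8") -> list[dict[str, str]]:
    """Parse CSV bytes into row dictionaries keyed by the header names.

    Assumption: the payload has a header row and is UTF-8 encoded.
    """
    text = payload.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


def fetch_csv(url: str, timeout: float = 10.0) -> list[dict[str, str]]:
    """Download a CSV dataset over HTTP and parse it into rows."""
    with urlopen(url, timeout=timeout) as response:
        return parse_csv(response.read())


def has_parquet_magic(payload: bytes) -> bool:
    """Return True if the payload starts and ends with the Parquet magic bytes."""
    return (
        len(payload) >= 2 * len(PARQUET_MAGIC)
        and payload.startswith(PARQUET_MAGIC)
        and payload.endswith(PARQUET_MAGIC)
    )
```

With the development server running, `fetch_csv("http://localhost:8000/d/invoices.csv")` should return one dictionary per row, and `has_parquet_magic(pathlib.Path("invoices.parquet").read_bytes())` gives a quick sanity check on the downloaded file.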

You can run quality checks using dedicated GNU Make rules:

# Run the test suite
make test

# Linters!
make lint

Happy hacking 😻

License

This work is released under the MIT License (see LICENSE).
