datasette-build

Build a directory full of files into a SQLite database

⚠️ Early alpha preview. Everything about this tool is likely to change.

Installation

Install this tool using pip or pipx:

pipx install datasette-build

This will provide the datasette-build CLI application.

You can also install it as a Datasette plugin. First install Datasette, then run:

datasette install datasette-build

This will provide a datasette build ... command that works the same as the datasette-build CLI application.

Or you can install it as a plugin for sqlite-utils. With sqlite-utils installed, run this:

sqlite-utils install datasette-build

Now you can access the tool as sqlite-utils build ...

Usage

The datasette-build (or datasette build or sqlite-utils build) command takes two arguments: a path to a SQLite database file and a path to a directory containing files to be loaded into that database:

datasette-build mydatabase.db myfiles/

The myfiles/ folder can contain a mixture of CSV, TSV and JSON files. Each file will be loaded into a table in the mydatabase.db SQLite database.

The database file will be created if it does not already exist.
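The file-to-table mapping can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the `plan_tables` helper and the `FORMATS` mapping are hypothetical, and it assumes that the table name is the file name without its extension and that files with other extensions are skipped.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to format. The README only states
# that CSV, TSV and JSON files are supported.
FORMATS = {".csv": "csv", ".tsv": "tsv", ".json": "json"}


def plan_tables(filenames):
    """Return {table_name: format} for the files that would be loaded.

    Assumption: the table name is the file name without its extension,
    and files with unrecognized extensions are ignored.
    """
    plan = {}
    for name in filenames:
        path = Path(name)
        fmt = FORMATS.get(path.suffix.lower())
        if fmt is not None:
            plan[path.stem] = fmt
    return plan


print(plan_tables(["cities.csv", "countries.tsv", "museums.json", "notes.txt"]))
# {'cities': 'csv', 'countries': 'tsv', 'museums': 'json'}
```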

Consider a myfiles/cities.csv file like this:

id,name,latitude,longitude,country
nyc,New York City,40.7128,-74.006,US
lon,London,51.5074,-0.1278,GB
tok,Tokyo,35.6895,139.6917,JP
par,Paris,48.8566,2.3522,FR
ber,Berlin,52.52,13.405,DE
syd,Sydney,-33.8688,151.2093,AU
cai,Cairo,30.0444,31.2357,EG
rio,Rio de Janeiro,-22.9068,-43.1729,BR
mos,Moscow,55.7558,37.6173,RU
mum,Mumbai,19.076,72.8777,IN

Since this file has an id column, the primary key for the table will be set to id. Without an id column, no primary key will be defined.
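The primary key rule can be illustrated with Python's built-in sqlite3 module. This is a rough sketch rather than the tool's actual code: `create_table_sql` and `primary_key_columns` are hypothetical helpers, and every column is created as TEXT for simplicity, whereas the real tool also detects numeric types.

```python
import csv
import io
import sqlite3

CITIES_CSV = """id,name,latitude,longitude,country
nyc,New York City,40.7128,-74.006,US
lon,London,51.5074,-0.1278,GB
"""


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement, using id as the primary key if present.

    Simplification: every column is TEXT. The real tool also detects
    INTEGER and FLOAT columns.
    """
    defs = []
    for column in columns:
        suffix = " PRIMARY KEY" if column == "id" else ""
        defs.append(f"[{column}] TEXT{suffix}")
    return f"CREATE TABLE [{table}] ({', '.join(defs)})"


def primary_key_columns(conn, table):
    """Return the names of the primary key columns SQLite reports for a table."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk > 0 marks key columns.
    return [row[1] for row in rows if row[5] > 0]


# Read only the header row to learn the column names.
header = next(csv.reader(io.StringIO(CITIES_CSV)))
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("cities", header))
conn.execute(create_table_sql("no_id", ["name", "country"]))

print(primary_key_columns(conn, "cities"))  # ['id']
print(primary_key_columns(conn, "no_id"))   # []
```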

A myfiles/countries.tsv file could look like this:

id	name	population
US	United States	331002651
GB	United Kingdom	67886011
JP	Japan	126476461
FR	France	65273511
DE	Germany	83783942
AU	Australia	25499884
EG	Egypt	102334404
BR	Brazil	212559417
RU	Russia	145934462
IN	India	1380004385

And a myfiles/museums.json file like this:

[
  {
    "id": 1,
    "name": "Metropolitan Museum of Art",
    "city_id": "nyc"
  },
  {
    "id": 2,
    "name": "British Museum",
    "city_id": "lon"
  }
]

Running datasette-build mydatabase.db myfiles/ will create a SQLite database file containing three tables: cities, countries and museums. The schema will look like this:

CREATE TABLE [museums] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [city_id] TEXT
);
CREATE TABLE "cities" (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [latitude] FLOAT,
   [longitude] FLOAT,
   [country] TEXT
);
CREATE TABLE "countries" (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [population] INTEGER
);
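Once the tables exist, they can be queried with any SQLite client. The sketch below uses Python's built-in sqlite3 module to join the three tables on their matching ID values. To stay self-contained it recreates the schema shown above in an in-memory database and inserts a few of the sample rows by hand; in practice you would connect to mydatabase.db instead. Note that the schema declares no foreign keys, so the joins rely only on the values lining up.

```python
import sqlite3

# Schema copied from the README output above.
SCHEMA = """
CREATE TABLE [museums] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [city_id] TEXT
);
CREATE TABLE "cities" (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [latitude] FLOAT,
   [longitude] FLOAT,
   [country] TEXT
);
CREATE TABLE "countries" (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [population] INTEGER
);
"""

conn = sqlite3.connect(":memory:")  # use sqlite3.connect("mydatabase.db") for a real build
conn.executescript(SCHEMA)

# A subset of the sample rows from the example files.
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?, ?, ?)",
    [
        ("nyc", "New York City", 40.7128, -74.006, "US"),
        ("lon", "London", 51.5074, -0.1278, "GB"),
    ],
)
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [("US", "United States", 331002651), ("GB", "United Kingdom", 67886011)],
)
conn.executemany(
    "INSERT INTO museums VALUES (?, ?, ?)",
    [(1, "Metropolitan Museum of Art", "nyc"), (2, "British Museum", "lon")],
)

# Join museums to their city and country by matching ID values.
rows = conn.execute(
    """
    SELECT museums.name, cities.name, countries.name
    FROM museums
    JOIN cities ON museums.city_id = cities.id
    JOIN countries ON cities.country = countries.id
    ORDER BY museums.id
    """
).fetchall()

for row in rows:
    print(row)
# ('Metropolitan Museum of Art', 'New York City', 'United States')
# ('British Museum', 'London', 'United Kingdom')
```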

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd datasette-build
python3 -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
