Convert CSV files into a SQLite database

These details have been verified by PyPI

Project links

Homepage

GitHub Statistics

Maintainers

simonw

These details have not been verified by PyPI

Project description

csvs-to-sqlite

Convert CSV files into a SQLite database. Browse and publish that SQLite database with Datasette.

[!NOTE] This tool is infrequently maintained. I suggest using sqlite-utils for importing CSV and TSV to SQLite instead for most cases.

Basic usage:

csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called mydatabase.db containing a single table, myfile, containing the CSV content.

You can provide multiple CSV files:

csvs-to-sqlite one.csv two.csv bundle.db

The bundle.db database will contain two tables, one and two.

This means you can use wildcards:

csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively search those directories for CSV files and create tables for each one.

csvs-to-sqlite ~/path/to/directory all-my-csvs.db

Handling TSV (tab-separated values)

You can use the -s option to specify a different delimiter. If you want to use a tab character you'll need to apply shell escaping like so:

csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

county,precinct,office,district,party,candidate,votes
Clark,1,President,,REP,John R. Kasich,5
Clark,2,President,,REP,John R. Kasich,0
Clark,3,President,,REP,John R. Kasich,7

(Real example taken from the Open Elections project)

You can now convert selected columns into separate lookup tables using the new --extract-column option (shortname: -c) - for example:

csvs-to-sqlite openelections-data-*/*.csv \
    -c county:County:name \
    -c precinct:Precinct:name \
    -c office -c district -c party -c candidate \
    openelections.db

The format is as follows:

column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name e.g. -c office, the following table will be created:

CREATE TABLE "office" (
    "id" INTEGER PRIMARY KEY,
    "value" TEXT
);

If you specify all three options, e.g. -c precinct:Precinct:name the table will look like this:

CREATE TABLE "Precinct" (
    "id" INTEGER PRIMARY KEY,
    "name" TEXT
);

The original tables will be created like this:

CREATE TABLE "ca__primary__san_francisco__precinct" (
    "county" INTEGER,
    "precinct" INTEGER,
    "office" INTEGER,
    "district" INTEGER,
    "party" INTEGER,
    "candidate" INTEGER,
    "votes" INTEGER,
    FOREIGN KEY (county) REFERENCES County(id),
    FOREIGN KEY (party) REFERENCES party(id),
    FOREIGN KEY (precinct) REFERENCES Precinct(id),
    FOREIGN KEY (office) REFERENCES office(id),
    FOREIGN KEY (candidate) REFERENCES candidate(id)
);

They will be populated with IDs that reference the new derived tables.

Installation

pip install csvs-to-sqlite

csvs-to-sqlite now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

pip install csvs-to-sqlite==0.9.2

csvs-to-sqlite --help

Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT            Field separator in input .csv
  -q, --quoting INTEGER           Control field quoting behavior per csv.QUOTE_*
                                  constants. Use one of QUOTE_MINIMAL (0),
                                  QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                                  QUOTE_NONE (3).
  --skip-errors                   Skip lines with too many fields instead of
                                  stopping the import
  --replace-tables                Replace tables if they already exist
  -t, --table TEXT                Table to use (instead of using CSV filename)
  -c, --extract-column TEXT       One or more columns to 'extract' into a
                                  separate lookup table. If you pass a simple
                                  column name that column will be replaced with
                                  integer foreign key references to a new table
                                  of that name. You can customize the name of
                                  the table like so:     state:States:state_name
                                  
                                  This will pull unique values from the 'state'
                                  column and use them to populate a new 'States'
                                  table, with an id column primary key and a
                                  state_name column containing the strings from
                                  the original column.
  -d, --date TEXT                 One or more columns to parse into ISO
                                  formatted dates
  -dt, --datetime TEXT            One or more columns to parse into ISO
                                  formatted datetimes
  -df, --datetime-format TEXT     One or more custom date format strings to try
                                  when parsing dates/datetimes
  -pk, --primary-key TEXT         One or more columns to use as the primary key
  -f, --fts TEXT                  One or more columns to use to populate a full-
                                  text index
  -i, --index TEXT                Add index on this column (or a compound index
                                  with -i col1,col2)
  --shape TEXT                    Custom shape for the DB table - format is
                                  csvcol:dbcol(TYPE),...
  --filename-column TEXT          Add a column with this name and populate with
                                  CSV file name
  --fixed-column <TEXT TEXT>...   Populate column with a fixed string
  --fixed-column-int <TEXT INTEGER>...
                                  Populate column with a fixed integer
  --fixed-column-float <TEXT FLOAT>...
                                  Populate column with a fixed float
  --no-index-fks                  Skip adding index to foreign key columns
                                  created using --extract-column (default is to
                                  add them)
  --no-fulltext-fks               Skip adding full-text index on values
                                  extracted using --extract-column (default is
                                  to add them)
  --just-strings                  Import all columns as text strings by default
                                  (and, if specified, still obey --shape,
                                  --date/datetime, and --datetime-format)
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Project details

These details have been verified by PyPI

Project links

Homepage

GitHub Statistics

Maintainers

simonw

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

1.3.1

Apr 10, 2025

1.3

Nov 18, 2021

1.2

Nov 3, 2020

1.1

Aug 9, 2020

1.0

Aug 3, 2019

0.9.2

Jul 3, 2019

0.9.1

Jun 24, 2019

0.9

Jan 17, 2019

0.8.1

Apr 24, 2018

0.8

Apr 24, 2018

0.7

Nov 26, 2017

0.6.1

Nov 25, 2017

0.6

Nov 24, 2017

0.5

Nov 19, 2017

0.3

Nov 17, 2017

0.2

Nov 13, 2017

0.1

Nov 13, 2017

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

csvs_to_sqlite-1.3.1.tar.gz (22.7 kB view details)

Uploaded Apr 10, 2025 Source

Built Distribution

csvs_to_sqlite-1.3.1-py2.py3-none-any.whl (16.9 kB view details)

Uploaded Apr 10, 2025 Python 2Python 3

File details

Details for the file csvs_to_sqlite-1.3.1.tar.gz.

File metadata

Download URL: csvs_to_sqlite-1.3.1.tar.gz
Upload date: Apr 10, 2025
Size: 22.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for csvs_to_sqlite-1.3.1.tar.gz
Algorithm	Hash digest
SHA256	`6a54e29168f35074639700b66be01c66e5aff5e771607fe62d588161b22f517a`
MD5	`26acbeb0c8cda3421b6903cf9927a70f`
BLAKE2b-256	`90121e3d87cf57a0eac7da70c949c3a02211752cda4f8786ee9e3dce8c0dfb46`

See more details on using hashes here.

Provenance

The following attestation bundles were made for csvs_to_sqlite-1.3.1.tar.gz:

Publisher: publish.yml on simonw/csvs-to-sqlite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: csvs_to_sqlite-1.3.1.tar.gz
- Subject digest: 6a54e29168f35074639700b66be01c66e5aff5e771607fe62d588161b22f517a
- Sigstore transparency entry: 195100198
- Sigstore integration time: Apr 10, 2025
Source repository:
- Permalink: simonw/csvs-to-sqlite@33b79ed80a688ceae4486ebdcca28ae689f54c03
- Branch / Tag: refs/tags/1.3.1
- Owner: https://github.com/simonw
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@33b79ed80a688ceae4486ebdcca28ae689f54c03
- Trigger Event: release

File details

Details for the file csvs_to_sqlite-1.3.1-py2.py3-none-any.whl.

File metadata

Download URL: csvs_to_sqlite-1.3.1-py2.py3-none-any.whl
Upload date: Apr 10, 2025
Size: 16.9 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for csvs_to_sqlite-1.3.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`0f92a8239a659b15bbaf0e213be40de3ff06a30cb909c0b56b91b0047b8c6329`
MD5	`9b08ea740f99c7be0c99b625b21006b3`
BLAKE2b-256	`700db07e2b4269e47d1ee12f0c269d2450fc7529a48fd43a99bfa39a6d228f3f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for csvs_to_sqlite-1.3.1-py2.py3-none-any.whl:

Publisher: publish.yml on simonw/csvs-to-sqlite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: csvs_to_sqlite-1.3.1-py2.py3-none-any.whl
- Subject digest: 0f92a8239a659b15bbaf0e213be40de3ff06a30cb909c0b56b91b0047b8c6329
- Sigstore transparency entry: 195100204
- Sigstore integration time: Apr 10, 2025
Source repository:
- Permalink: simonw/csvs-to-sqlite@33b79ed80a688ceae4486ebdcca28ae689f54c03
- Branch / Tag: refs/tags/1.3.1
- Owner: https://github.com/simonw
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@33b79ed80a688ceae4486ebdcca28ae689f54c03
- Trigger Event: release

csvs-to-sqlite 1.3.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

csvs-to-sqlite

Handling TSV (tab-separated values)

Refactoring columns into separate lookup tables

Installation

csvs-to-sqlite --help

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance