Skip to main content

A package to generate SQL statements from CSV files

Project description

Generate Temp Table SQL

Generate Temp Table SQL is a Python package that generates SQL statements for creating a temporary table and inserting data from a CSV file. It's useful when you need to move data between disconnected databases and data warehouses. You can now simply unload a CSV, turn that CSV into SQL statements creating a temp table and inserting data with a CLI command, and copy those SQL statements into your query editor so you can start using the data in a different warehouse.

Why did I build this? I often need to move data between disconnected warehouses. E.g. when I need to join data in my finance warehouse with data in my operational warehouse to see how a feature is linked to customer spend. I can use excel to convert a CSV to SQL statements...but it's a PITA.

With generate_temp_table_sql, I can now move the data in seconds. I simply download the CSV, run generate-tt-sql csv_name.csv in my terminal, and copy it into my query editor. It's immediately available to query so I can go straight to analysis.

If you have the same problem, this will save you a ton of time!

Also available in PyPi.

Features

  • Load data from a CSV file
  • Generate a CREATE TEMP TABLE SQL statement
  • Generate INSERT INTO SQL statements for the data
  • Command-line interface (CLI) for easy usage

Installation

Prerequisites

  • Python 3.6 or higher
  • pandas library

Installing

  1. Clone the repository:

    git clone https://github.com/rywaldor/generate_temp_table_sql.git
    cd generate_temp_table_sql
    
  2. Install the package locally:

    pip install -e .
    

Usage

Command-Line Interface (CLI)

After installing the package, you can use the generate-tt-sql command to generate SQL statements from a CSV file.

Basic Usage

To generate SQL statements and save them to a file:

generate-tt-sql path/to/your/file.csv

Additional Options

--o The path to the output SQL file.  Defaults to the director you call the command in.
--overwrite: Allow overwriting the output file if it exists.
--table_name: Specify the name of the temporary table to create.
--column_type: Specify the data type of the columns in the temporary table. Defaults to TEXT which works for Redshift and Snowflake. Use STRING for BigQuery.
--batch_size: Specify the number of rows inserted per insert statement.  If present, it creates multiple insert statements based on the batch size specified.

Example

Assume you have a CSV file example.csv with the following content:

name,age,city
John,30,New York
Jane,25,Los Angeles

Run the following command to generate SQL statements:

generate-tt-sql example.csv -o output.sql --table_name my_temp_table --column_type STRING

The output.sql file will contain:

CREATE TEMP TABLE my_temp_table (
    name STRING,
    age STRING,
    city STRING
);

--Insert Data SQL:
INSERT INTO my_temp_table (name, age, city) VALUES 
    ('John', '30', 'New York'),
    ('Jane', '25', 'Los Angeles');

All arguments are optional except the csv

Running Tests

To run the tests, use the following command:

python -m unittest discover -s tests

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Commit your changes (git commit -am 'Add new feature')
  4. Push to the branch (git push origin feature-branch)
  5. Create a new Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ryan Waldorf - ryan@ryanwaldorf.com
GitHub LinkedIn

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

generate_temp_table_sql-0.4.5.tar.gz (6.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

generate_temp_table_sql-0.4.5-py3-none-any.whl (7.9 kB view details)

Uploaded Python 3

File details

Details for the file generate_temp_table_sql-0.4.5.tar.gz.

File metadata

  • Download URL: generate_temp_table_sql-0.4.5.tar.gz
  • Upload date:
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.6

File hashes

Hashes for generate_temp_table_sql-0.4.5.tar.gz
Algorithm Hash digest
SHA256 69f0ae7366ac1536f799e100de4bd30d3f3536a95ad0be4d5884d20abcb905f7
MD5 def10096d151a39f9094ae5abec93720
BLAKE2b-256 4fe2f9fd28dd9edbdf6fb46f3b49ba2c38ab1857cba6ded6ea0bc4a4f2cd535c

See more details on using hashes here.

File details

Details for the file generate_temp_table_sql-0.4.5-py3-none-any.whl.

File metadata

File hashes

Hashes for generate_temp_table_sql-0.4.5-py3-none-any.whl
Algorithm Hash digest
SHA256 7579a60fb55ac0713b0898e3015d858dda693569f5b8866344e25eb7a800fd66
MD5 fea1c07ce0b2e3248c4e74e3deab43a0
BLAKE2b-256 ca4024435ad67dee12d56340bcd7412ac22186c66054337427aef8bdce31abaa

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page