A package to generate SQL statements from CSV files

Generate Temp Table SQL

Generate Temp Table SQL is a Python package that generates SQL statements for creating a temporary table and inserting data from a CSV file. It's useful when you need to move data between disconnected databases and data warehouses. Unload a CSV, turn it into CREATE TEMP TABLE and INSERT statements with a single CLI command, and paste those statements into your query editor to start using the data in a different warehouse.

Why did I build this? I often want to know how a product I'm working on is linked to customer spend. However, my operational warehouse, which holds data such as the dates and times customers used a product, is separate from my financial warehouse, which holds the billing data. I need to move data from one to the other so I can join the finance and product data for analysis, and turning the CSV into SQL by hand in Excel is a pain. With this package, I can move the data in seconds: I download the CSV, run generate-tt-sql csv_name.csv in my terminal, and copy the output into my query editor. The data is immediately available to query, so I can go straight to analysis.

Features

  • Load data from a CSV file
  • Generate a CREATE TEMP TABLE SQL statement
  • Generate INSERT INTO SQL statements for the data
  • Command-line interface (CLI) for easy usage
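
To make the features above concrete, here is a minimal sketch of the CSV-to-SQL idea using only the standard library. It is not the package's own code: the function name csv_to_temp_table_sql, the default table name temp_table, and the quoting rule are assumptions made for illustration (the package itself relies on pandas).

```python
import csv
import io


def csv_to_temp_table_sql(csv_text, table_name="temp_table", column_type="TEXT"):
    """Return CREATE TEMP TABLE and INSERT statements for CSV text.

    Hypothetical helper for illustration; not part of generate_temp_table_sql.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    # Every column gets the same type, mirroring the idea of a single column type.
    column_defs = ",\n".join(f"    {name} {column_type}" for name in header)
    create_sql = f"CREATE TEMP TABLE {table_name} (\n{column_defs}\n);"

    # Quote every value as a string. Doubling single quotes is the standard SQL
    # convention; check your warehouse's escaping rules before relying on it.
    def quote(value):
        return "'" + value.replace("'", "''") + "'"

    value_rows = ",\n".join(
        "    (" + ", ".join(quote(v) for v in row) + ")" for row in data
    )
    insert_sql = f"INSERT INTO {table_name} ({', '.join(header)}) VALUES\n{value_rows};"
    return create_sql + "\n\n" + insert_sql


sample = "name,age,city\nJohn,30,New York\nJane,25,Los Angeles\n"
print(csv_to_temp_table_sql(sample, table_name="my_temp_table", column_type="STRING"))
```

Treating every column as one string type keeps the sketch simple, since no type inference is needed; values can be cast in the query that reads the temp table.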

Installation

Prerequisites

  • Python 3.6 or higher
  • pandas library

Installing

  1. Clone the repository:

    git clone https://github.com/rywaldor/generate_temp_table_sql.git
    cd generate_temp_table_sql
    
  2. Install the package locally:

    pip install -e .
    

Usage

Command-Line Interface (CLI)

After installing the package, you can use the generate-tt-sql command to generate SQL statements from a CSV file.

Basic Usage

To generate SQL statements and save them to a file:

generate-tt-sql path/to/your/file.csv

Additional Options

-o: The path to the output SQL file. Defaults to the directory you call the command in.
--overwrite: Allow overwriting the output file if it exists.
--table_name: Specify the name of the temporary table to create.
--column_type: Specify the data type of the columns in the temporary table. Defaults to TEXT which works for Redshift and Snowflake. Use STRING for BigQuery.
--batch_size: Specify the number of rows inserted per INSERT statement. If present, the data is split across multiple INSERT statements based on the batch size specified.
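
As a rough illustration of what batching means, the sketch below splits rows into groups and emits one INSERT statement per group. The function batched_inserts and its parameters are hypothetical and written only for this example; the package's actual implementation may differ.

```python
def batched_inserts(table_name, columns, rows, batch_size):
    """Yield one INSERT statement per batch of at most batch_size rows.

    Hypothetical helper for illustration; not part of generate_temp_table_sql.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    column_list = ", ".join(columns)
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # Quote each value as a string, doubling embedded single quotes.
        values = ",\n".join(
            "    (" + ", ".join("'" + v.replace("'", "''") + "'" for v in row) + ")"
            for row in batch
        )
        yield f"INSERT INTO {table_name} ({column_list}) VALUES\n{values};"


# Illustrative data: three rows in batches of two produce two statements.
rows = [["John", "30"], ["Jane", "25"], ["Ann", "41"]]
statements = list(batched_inserts("my_temp_table", ["name", "age"], rows, batch_size=2))
print("\n\n".join(statements))
```

Smaller batches can help when a query editor or warehouse limits the size of a single statement.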

Example

Assume you have a CSV file example.csv with the following content:

name,age,city
John,30,New York
Jane,25,Los Angeles

Run the following command to generate SQL statements:

generate-tt-sql example.csv -o output.sql --table_name my_temp_table --column_type STRING

The output.sql file will contain:

CREATE TEMP TABLE my_temp_table (
    name STRING,
    age STRING,
    city STRING
);

--Insert Data SQL:
INSERT INTO my_temp_table (name, age, city) VALUES 
    ('John', '30', 'New York'),
    ('Jane', '25', 'Los Angeles');

All arguments are optional except the path to the CSV file.

Running Tests

To run the tests, use the following command:

python -m unittest discover -s tests

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Commit your changes (git commit -am 'Add new feature')
  4. Push to the branch (git push origin feature-branch)
  5. Create a new Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ryan Waldorf - ryan@ryanwaldorf.com
