Skip to main content

A simple CLI to query cloud storage objects with SQL.

Project description

S3SQL

Overview

S3SQL is a lightweight command line utility for querying data stored in s3.

Features

  • Return S3 data straight to your terminal
  • Write SQL queries to filter and manipulate S3 data
  • List S3 objects available to current scoped credentials

Demo

s3sql

Installation

  1. Install with:
    pip install s3sql
    

Usage

  1. Open a new terminal window

  2. Test the CLI is properly configured with:

    s3sql --help
    
  3. Set the AWS access key with:

    s3sql set-key --api-key AKIA*******
    

    This command returns the following message:

    API secret saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. Set the AWS access secret with:

    s3sql set-secret --api-secret Gy096********
    

    This command returns the following message:

    API key saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. List objects available. For example, to list the objects in the bucket named: osg-repo-scan-data:

    s3sql ls --bucket "s3sql-demo"
    

    This command returns a formatted table of the objects available in the bucket:

    Query executed in 0.3216 seconds
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | Key                                      | LastModified              | ETag                               | ChecksumAlgorithm   |   Size |StorageClass   |
    +==========================================+===========================+====================================+=====================+========+================+
    | folder_example/                          | 2025-06-13 01:22:05+00:00 | "d41d8cd98f00b204e9800998ecf8427e" | ['CRC64NVME']       |      0 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | folder_example/sql_database_releases.csv | 2025-06-13 01:57:02+00:00 | "90089983c8e30097002094756b5f7478" | ['CRC64NVME']       |    438 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_database_features.csv                | 2025-06-13 01:45:38+00:00 | "8a46b42d596d9310bcd4ac9db14df718" | ['CRC64NVME']       |   1991 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_engines.csv                          | 2025-06-13 01:21:49+00:00 | "586f00530a11fcbb46c244210f625292" | ['CRC64NVME']       |    407 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_user_base.csv                        | 2025-06-13 01:45:58+00:00 | "ff5e96d2ca7f3475dc38c537cc1f6c36" | ['CRC64NVME']       |    597 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    
  2. Query an object with the following command:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.4116 seconds
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |   Id | engine_name   |   version | license_type       | developer                           | primary_use_case        |
    +======+===============+===========+====================+=====================================+=========================+
    |    1 | SQLite        |      3.46 | Public Domain      | D. Richard Hipp                     | Embedded systems        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    2 | PostgreSQL    |     16.4  | PostgreSQL License | PostgreSQL Global Development Group | General-purpose OLTP    |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    3 | MySQL         |      8.4  | GPLv2              | Oracle Corporation                  | Web applications        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    4 | SQLServer     |   2022    | Proprietary        | Microsoft Corporation               | Enterprise applications |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    5 | DuckDB        |      1    | MIT License        | DuckDB Labs                         | Analytical queries      |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    

    Apply a LIMIT 1 to the previous query:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3426 seconds
    +------+---------------+-----------+----------------+-----------------+--------------------+
    |   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
    +======+===============+===========+================+=================+====================+
    |    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
    +------+---------------+-----------+----------------+-----------------+--------------------+
    
  3. Query an object and output the query results to a file with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data with an additional message specifying the filename the data was written to:

    Query executed in 0.3414 seconds
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    | engine_name   |   initial_release_year | designer        | organization          | current_stable_version   |
    +===============+========================+=================+=======================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp | Hipp Wyrick & Company | 3.50.1                   |
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    Data successfully written to file: output.csv
    
  4. Query an object within the folder_example folder with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3591 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    
  5. Query an object and save the output to your local file system using the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.csv:

    Query executed in 0.3391 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    Data successfully written to file: output.csv
    
  6. Query a .csv object and convert it to a .parquet file with the following command:

s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.parquet"

This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.parquet:

Query executed in 0.3399 seconds
+------+---------------+-----------+----------------+-----------------+--------------------+
|   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
+======+===============+===========+================+=================+====================+
|    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
+------+---------------+-----------+----------------+-----------------+--------------------+
Data successfully written to file: output.parquet

Directory Structure

project/
├── .github/
│   └── workflows/
│       └── publish.yml
├── dist/
│   └── s3sql-*.*.*-py3-none-any.whl
│   └── s3sql-*.*.*.tar.gz
├── s3sql/
│   └── cli.py
│   └── test_cli.py
├── .env
├── .gitignore
├── mycli.spec
├── poetry.lock
└── README.md

Testing

  • Use the PyTest VS code extension or the following command to run all tests pytest -s s3sql/test_cli.py:

    image

Github

https://github.com/Ian-Fogelman/s3sql

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Rebase your current branch (git pull --rebase)
  4. Add changes to your commit (git add .)
  5. Commit your changes (git commit -m 'Add your feature')
  6. Push to the branch (git push origin feature/your-feature)
  7. Create a Pull Request

PyPI

https://pypi.org/project/s3sql/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

s3sql-0.0.6.tar.gz (6.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

s3sql-0.0.6-py3-none-any.whl (7.1 kB view details)

Uploaded Python 3

File details

Details for the file s3sql-0.0.6.tar.gz.

File metadata

  • Download URL: s3sql-0.0.6.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.6.tar.gz
Algorithm Hash digest
SHA256 4cb85cb22c3a5adbda0048b5f6bd21157fe1a82ad8c58add1fcd4c07fee42dd4
MD5 7eee41591f4fcf94b3b00f385949e754
BLAKE2b-256 6b502b9442264a54ed00c49f249b514818a93610546811f857f5cb05ba89c396

See more details on using hashes here.

File details

Details for the file s3sql-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: s3sql-0.0.6-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 660ab62215d76405a706c38de170591e676ee7f596c6665ba29e69ad6cbee597
MD5 d71a0a83867fd00b199564f497071658
BLAKE2b-256 25c0449338f4dd6710eed66746ea0d36544d11c8377ea6cf1ae35fd51733ad96

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page