Skip to main content

A simple CLI to query cloud storage objects with SQL.

Project description

S3SQL

Overview

S3SQL is a lightweight command line utility for querying data stored in s3.

Features

  • Return S3 data straight to your terminal
  • Write SQL queries to filter and manipulate S3 data
  • List S3 objects available to current scoped credentials

Demo

s3sql

Installation

  1. Install with:
    pip install s3sql
    

Usage

  1. Open a new terminal window

  2. Test the CLI is properly configured with:

    s3sql --help
    
  3. Set the AWS access key with:

    s3sql set-key --api-key AKIA*******
    

    This command returns the following message:

    API secret saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. Set the AWS access secret with:

    s3sql set-secret --api-secret Gy096********
    

    This command returns the following message:

    API key saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. List objects available. For example, to list the objects in the bucket named: osg-repo-scan-data:

    s3sql ls --bucket "s3sql-demo"
    

    This command returns a formatted table of the objects available in the bucket:

    Query executed in 0.3216 seconds
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | Key                                      | LastModified              | ETag                               | ChecksumAlgorithm   |   Size |StorageClass   |
    +==========================================+===========================+====================================+=====================+========+================+
    | folder_example/                          | 2025-06-13 01:22:05+00:00 | "d41d8cd98f00b204e9800998ecf8427e" | ['CRC64NVME']       |      0 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | folder_example/sql_database_releases.csv | 2025-06-13 01:57:02+00:00 | "90089983c8e30097002094756b5f7478" | ['CRC64NVME']       |    438 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_database_features.csv                | 2025-06-13 01:45:38+00:00 | "8a46b42d596d9310bcd4ac9db14df718" | ['CRC64NVME']       |   1991 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_engines.csv                          | 2025-06-13 01:21:49+00:00 | "586f00530a11fcbb46c244210f625292" | ['CRC64NVME']       |    407 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_user_base.csv                        | 2025-06-13 01:45:58+00:00 | "ff5e96d2ca7f3475dc38c537cc1f6c36" | ['CRC64NVME']       |    597 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    
  2. Query an object with the following command:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.4116 seconds
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |   Id | engine_name   |   version | license_type       | developer                           | primary_use_case        |
    +======+===============+===========+====================+=====================================+=========================+
    |    1 | SQLite        |      3.46 | Public Domain      | D. Richard Hipp                     | Embedded systems        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    2 | PostgreSQL    |     16.4  | PostgreSQL License | PostgreSQL Global Development Group | General-purpose OLTP    |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    3 | MySQL         |      8.4  | GPLv2              | Oracle Corporation                  | Web applications        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    4 | SQLServer     |   2022    | Proprietary        | Microsoft Corporation               | Enterprise applications |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    5 | DuckDB        |      1    | MIT License        | DuckDB Labs                         | Analytical queries      |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    

    Apply a LIMIT 1 to the previous query:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3426 seconds
    +------+---------------+-----------+----------------+-----------------+--------------------+
    |   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
    +======+===============+===========+================+=================+====================+
    |    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
    +------+---------------+-----------+----------------+-----------------+--------------------+
    
  3. Query an object and output the query results to a file with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data with an additional message specifying the filename the data was written to:

    Query executed in 0.3414 seconds
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    | engine_name   |   initial_release_year | designer        | organization          | current_stable_version   |
    +===============+========================+=================+=======================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp | Hipp Wyrick & Company | 3.50.1                   |
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    Data successfully written to file: output.csv
    
  4. Query an object within the folder_example folder with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3591 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    
  5. Query an object and save the output to your local file system using the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.csv:

    Query executed in 0.3391 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    Data successfully written to file: output.csv
    
  6. Query a .csv object and convert it to a .parquet file with the following command:

s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.parquet"

This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.parquet:

Query executed in 0.3399 seconds
+------+---------------+-----------+----------------+-----------------+--------------------+
|   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
+======+===============+===========+================+=================+====================+
|    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
+------+---------------+-----------+----------------+-----------------+--------------------+
Data successfully written to file: output.parquet

Directory Structure

project/
├── .github/
│   └── workflows/
│       └── publish.yml
├── dist/
│   └── s3sql-*.*.*-py3-none-any.whl
│   └── s3sql-*.*.*.tar.gz
├── s3sql/
│   └── cli.py
│   └── test_cli.py
├── .env
├── .gitignore
├── mycli.spec
├── poetry.lock
└── README.md

Testing

  • Use the PyTest VS code extension or the following command to run all tests pytest -s s3sql/test_cli.py:

    image

Github

https://github.com/Ian-Fogelman/s3sql

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Rebase your current branch (git pull --rebase)
  4. Add changes to your commit (git add .)
  5. Commit your changes (git commit -m 'Add your feature')
  6. Push to the branch (git push origin feature/your-feature)
  7. Create a Pull Request

PyPI

https://pypi.org/project/s3sql/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

s3sql-0.0.3.tar.gz (6.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

s3sql-0.0.3-py3-none-any.whl (6.7 kB view details)

Uploaded Python 3

File details

Details for the file s3sql-0.0.3.tar.gz.

File metadata

  • Download URL: s3sql-0.0.3.tar.gz
  • Upload date:
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.3.tar.gz
Algorithm Hash digest
SHA256 b6cab8b6e8021cc5caa691feac1e636f4861ec3e0bfbf9a24b35d62b1f3b83af
MD5 f5ffbe853a94f277c1fdf00824e9d2f4
BLAKE2b-256 8312127667ae36a29b458acae9bd6205024bbc8a398fed68d99a86dedab91da2

See more details on using hashes here.

File details

Details for the file s3sql-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: s3sql-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 756c9d1e3f713471452182925880aada4ba4d4121411923c3f9f5e1ad82fb207
MD5 0983524e32da163631b2d530ce989b88
BLAKE2b-256 27c8c25e4dc0431075974001a7b1bb65d5e6ce03ad05ef60fcc72e65bd37a76b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page