Skip to main content

A simple CLI to query cloud storage objects with SQL.

Project description

S3SQL

Overview

S3SQL is a lightweight command line utility for querying data stored in s3.

Features

  • Return S3 data straight to your terminal
  • Write SQL queries to filter and manipulate S3 data
  • List S3 objects available to current scoped credentials

Demo

s3sql

Installation

  1. Install with:
    pip install s3sql
    

Usage

  1. Open a new terminal window

  2. Test the CLI is properly configured with:

    s3sql --help
    
  3. Set the AWS access key with:

    s3sql set-key --api-key AKIA*******
    

    This command returns the following message:

    API secret saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. Set the AWS access secret with:

    s3sql set-secret --api-secret Gy096********
    

    This command returns the following message:

    API key saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. List objects available. For example, to list the objects in the bucket named: osg-repo-scan-data:

    s3sql ls --bucket "s3sql-demo"
    

    This command returns a formatted table of the objects available in the bucket:

    Query executed in 0.3216 seconds
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | Key                                      | LastModified              | ETag                               | ChecksumAlgorithm   |   Size |StorageClass   |
    +==========================================+===========================+====================================+=====================+========+================+
    | folder_example/                          | 2025-06-13 01:22:05+00:00 | "d41d8cd98f00b204e9800998ecf8427e" | ['CRC64NVME']       |      0 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | folder_example/sql_database_releases.csv | 2025-06-13 01:57:02+00:00 | "90089983c8e30097002094756b5f7478" | ['CRC64NVME']       |    438 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_database_features.csv                | 2025-06-13 01:45:38+00:00 | "8a46b42d596d9310bcd4ac9db14df718" | ['CRC64NVME']       |   1991 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_engines.csv                          | 2025-06-13 01:21:49+00:00 | "586f00530a11fcbb46c244210f625292" | ['CRC64NVME']       |    407 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_user_base.csv                        | 2025-06-13 01:45:58+00:00 | "ff5e96d2ca7f3475dc38c537cc1f6c36" | ['CRC64NVME']       |    597 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    
  2. Query an object with the following command:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.4116 seconds
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |   Id | engine_name   |   version | license_type       | developer                           | primary_use_case        |
    +======+===============+===========+====================+=====================================+=========================+
    |    1 | SQLite        |      3.46 | Public Domain      | D. Richard Hipp                     | Embedded systems        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    2 | PostgreSQL    |     16.4  | PostgreSQL License | PostgreSQL Global Development Group | General-purpose OLTP    |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    3 | MySQL         |      8.4  | GPLv2              | Oracle Corporation                  | Web applications        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    4 | SQLServer     |   2022    | Proprietary        | Microsoft Corporation               | Enterprise applications |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    5 | DuckDB        |      1    | MIT License        | DuckDB Labs                         | Analytical queries      |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    

    Apply a LIMIT 1 to the previous query:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3426 seconds
    +------+---------------+-----------+----------------+-----------------+--------------------+
    |   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
    +======+===============+===========+================+=================+====================+
    |    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
    +------+---------------+-----------+----------------+-----------------+--------------------+
    
  3. Query an object and output the query results to a file with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data with an additional message specifying the filename the data was written to:

    Query executed in 0.3414 seconds
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    | engine_name   |   initial_release_year | designer        | organization          | current_stable_version   |
    +===============+========================+=================+=======================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp | Hipp Wyrick & Company | 3.50.1                   |
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    Data successfully written to file: output.csv
    
  4. Query an object within the folder_example folder with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3591 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    
  5. Query an object and save the output to your local file system using the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.csv:

    Query executed in 0.3391 seconds
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    Data successfully written to file: output.csv
    
  6. Query a .csv object and convert it to a .parquet file with the following command:

s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --sql "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.parquet"

This command returns a formatted table of the data from the specified object and an additional Data successfully written to file: output.parquet:

Query executed in 0.3399 seconds
+------+---------------+-----------+----------------+-----------------+--------------------+
|   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
+======+===============+===========+================+=================+====================+
|    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
+------+---------------+-----------+----------------+-----------------+--------------------+
Data successfully written to file: output.parquet

Directory Structure

project/
├── .github/
│   └── workflows/
│       └── publish.yml
├── dist/
│   └── s3sql-*.*.*-py3-none-any.whl
│   └── s3sql-*.*.*.tar.gz
├── s3sql/
│   └── cli.py
│   └── test_cli.py
├── .env
├── .gitignore
├── mycli.spec
├── poetry.lock
└── README.md

Testing

  • Use the PyTest VS code extension or the following command to run all tests pytest -s s3sql/test_cli.py:

    image

Github

https://github.com/Ian-Fogelman/s3sql

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Rebase your current branch (git pull --rebase)
  4. Add changes to your commit (git add .)
  5. Commit your changes (git commit -m 'Add your feature')
  6. Push to the branch (git push origin feature/your-feature)
  7. Create a Pull Request

PyPI

https://pypi.org/project/s3sql/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

s3sql-0.0.4.tar.gz (6.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

s3sql-0.0.4-py3-none-any.whl (6.9 kB view details)

Uploaded Python 3

File details

Details for the file s3sql-0.0.4.tar.gz.

File metadata

  • Download URL: s3sql-0.0.4.tar.gz
  • Upload date:
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.4.tar.gz
Algorithm Hash digest
SHA256 51fb733afbcc0094568eea38478485a6a65686fad516c252b207344deecbb4c0
MD5 1f58e0062ca7a93c905157e572e562dc
BLAKE2b-256 2911e70a26d880ff353ab9786784b25470873b639b2e70cf19aede826d44530e

See more details on using hashes here.

File details

Details for the file s3sql-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: s3sql-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 6.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 6c6574ae5347ea4576965caca807926e09b02aa3ba3b7af8c7a3b4a02ac128e1
MD5 4156981b664e72c8a82ec7977b97f743
BLAKE2b-256 8fd6ab9389a9833eb9cdb4300d2c7d54185756910b24ed64ebf8c69ed7428136

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page