Skip to main content

A simple CLI to query cloud storage objects with SQL.

Project description

S3SQL

Overview

S3SQL is a lightweight command line utility for querying data stored in s3.

Features

  • Return S3 data straight to your terminal
  • Write SQL queries to filter and manipulate S3 data
  • List S3 objects available to current scoped credentials

Demo

s3sql

Installation

  1. Install with:
    pip install s3sql
    

Usage

  1. Open a new terminal window

  2. Test the CLI is properly configured with:

    s3sql --help
    
  3. Set the AWS access key with:

    s3sql set-key --api-key AKIA*******
    

    This command returns the following message:

    API secret saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. Set the AWS access secret with:

    s3sql set-secret --api-secret Gy096********
    

    This command returns the following message:

    API key saved successfully!
    

[!TIP] Credentials are stored in the ~/s3sql directory. I.E. Windows = C:\Users\Ian\s3sql\credentials MacOS = /Users/Ian/s3sql

  1. List objects available. For example, to list the objects in the bucket named: osg-repo-scan-data:

    s3sql ls --bucket "s3sql-demo"
    

    This command returns a formatted table of the objects available in the bucket:

    Query executed in 0.3216 seconds
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | Key                                      | LastModified              | ETag                               | ChecksumAlgorithm   |   Size |StorageClass   |
    +==========================================+===========================+====================================+=====================+========+================+
    | folder_example/                          | 2025-06-13 01:22:05+00:00 | "d41d8cd98f00b204e9800998ecf8427e" | ['CRC64NVME']       |      0 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | folder_example/sql_database_releases.csv | 2025-06-13 01:57:02+00:00 | "90089983c8e30097002094756b5f7478" | ['CRC64NVME']       |    438 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_database_features.csv                | 2025-06-13 01:45:38+00:00 | "8a46b42d596d9310bcd4ac9db14df718" | ['CRC64NVME']       |   1991 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_engines.csv                          | 2025-06-13 01:21:49+00:00 | "586f00530a11fcbb46c244210f625292" | ['CRC64NVME']       |    407 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    | sql_user_base.csv                        | 2025-06-13 01:45:58+00:00 | "ff5e96d2ca7f3475dc38c537cc1f6c36" | ['CRC64NVME']       |    597 | STANDARD       |
    +------------------------------------------+---------------------------+------------------------------------+---------------------+--------+----------------+
    
  2. Query an object with the following command:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --query "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.4116 seconds
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |   Id | engine_name   |   version | license_type       | developer                           | primary_use_case        |
    +======+===============+===========+====================+=====================================+=========================+
    |    1 | SQLite        |      3.46 | Public Domain      | D. Richard Hipp                     | Embedded systems        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    2 | PostgreSQL    |     16.4  | PostgreSQL License | PostgreSQL Global Development Group | General-purpose OLTP    |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    3 | MySQL         |      8.4  | GPLv2              | Oracle Corporation                  | Web applications        |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    4 | SQLServer     |   2022    | Proprietary        | Microsoft Corporation               | Enterprise applications |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    |    5 | DuckDB        |      1    | MIT License        | DuckDB Labs                         | Analytical queries      |
    +------+---------------+-----------+--------------------+-------------------------------------+-------------------------+
    

    Apply a LIMIT 1 to the previous query:

    s3sql query --uri "s3://s3sql-demo/sql_engines.csv" --query "SELECT * FROM df WHERE 1=1 LIMIT 1"
    

    This command returns a formatted table of the data from the specified object:

    Query executed in 0.3426 seconds
    +------+---------------+-----------+----------------+-----------------+--------------------+
    |   Id | engine_name   |   version | license_type   | developer       | primary_use_case   |
    +======+===============+===========+================+=================+====================+
    |    1 | SQLite        |      3.46 | Public Domain  | D. Richard Hipp | Embedded systems   |
    +------+---------------+-----------+----------------+-----------------+--------------------+
    
  3. Query an object and output the query results to a file with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --query "SELECT * FROM df WHERE 1=1 LIMIT 1" --out "output.csv"
    

    This command returns a formatted table of the data with an additional message specifying the filename the data was written to:

    Query executed in 0.3414 seconds
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    | engine_name   |   initial_release_year | designer        | organization          | current_stable_version   |
    +===============+========================+=================+=======================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp | Hipp Wyrick & Company | 3.50.1                   |
    +---------------+------------------------+-----------------+-----------------------+--------------------------+
    Data successfully written to file: output.csv
    
  4. Query an object within the folder_example folder with the following command:

    s3sql query --uri "s3://s3sql-demo/folder_example/sql_database_releases.csv" --query "SELECT * FROM df WHERE 1=1"
    

    This command returns a formatted table of the data from the specified object:

    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | engine_name   |   initial_release_year | designer                            | organization                       | current_stable_version   |
    +===============+========================+=====================================+====================================+==========================+
    | SQLite        |                   2000 | D. Richard Hipp                     | Hipp Wyrick & Company              | 3.50.1                   |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | PostgreSQL    |                   1986 | Michael Stonebraker                 | University of California  Berkeley | 16.4                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | MySQL         |                   1995 | Michael Widenius and David Axmark   | MySQL AB                           | 8.4                      |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | SQLServer     |                   1989 | Donald Chamberlin and Raymond Boyce | Microsoft Corporation              | 2022                     |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    | DuckDB        |                   2019 | Mark Raasveldt and Hannes Muhleisen | Centrum Wiskunde & Informatica     | 1.0.0                    |
    +---------------+------------------------+-------------------------------------+------------------------------------+--------------------------+
    

Directory Structure

project/
├── dist/
│   └── s3sql-*.*.*-py3-none-any.whl
│   └── s3sql-*.*.*.tar.gz
├── s3sql/
│   └── cli.py
├── .env
├── .gitignore
├── mycli.spec
├── poetry.lock
└── README.md

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Rebase your current branch (git pull --rebase)
  4. Add changes to your commit (git add .)
  5. Commit your changes (git commit -m 'Add your feature')
  6. Push to the branch (git push origin feature/your-feature)
  7. Create a Pull Request

PyPI

https://pypi.org/project/s3sql/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

s3sql-0.0.2.tar.gz (5.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

s3sql-0.0.2-py3-none-any.whl (5.4 kB view details)

Uploaded Python 3

File details

Details for the file s3sql-0.0.2.tar.gz.

File metadata

  • Download URL: s3sql-0.0.2.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.2.tar.gz
Algorithm Hash digest
SHA256 c2e782146b9746e81ec416ba0fecf516153c46e3c7c9b18572f2cc60b0147419
MD5 1fc6cfd27637420a6a1f8c7bf77e5847
BLAKE2b-256 31c7cb00f3e9a9f73f92eb82415cab4aa9f23d618f07c99e32b5e457c35d8f93

See more details on using hashes here.

File details

Details for the file s3sql-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: s3sql-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 5.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.5 Windows/10

File hashes

Hashes for s3sql-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 45ecee52547d7d037ee3ab9e989296cda27c84b848d654c09302799bcef579d2
MD5 a28628ef92756a8471d871065b1dc563
BLAKE2b-256 83ff6510a295d9e4629b8a661321efa3924e911e96a433eaaa1bddbb2782b752

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page