No project description provided
Project description
Columnq
Simple CLI to help you query tabular data with support for a rich set of growing formats and data sources.
It supports JSON, CSV, Parquet, Arrow and all other formats that are supported by ROAPI, which is documented at here.
It also supports querying datasets from remote locations like S3 and HTTPs, see ROAPI's blob store documentation for more info.
Installation
Pre-built binary
The pre-built binaries hosted on GitHub releases. These binaries are self-contained so you can just drop them into your PATH.
The same set of binaries are also distributed through PyPI:
pip install columnq-cli
Build from source
cargo install --locked --git https://github.com/roapi/roapi --branch main --bins columnq-cli
Usage
One off query
The sql
subcommand executes a provided SQL query against specificed static
dataset and returns the result in stdout on exit. This is usually useful for
script automation tasks.
$ columnq sql --table test_data/spacex_launches.json \
"SELECT COUNT(id), DATE_TRUNC('year', CAST(date_utc AS TIMESTAMP)) as d FROM spacex_launches WHERE success = true GROUP BY d ORDER BY d DESC"
+-----------+---------------------+
| COUNT(id) | d |
+-----------+---------------------+
| 4 | 2021-01-01 00:00:00 |
| 26 | 2020-01-01 00:00:00 |
| 13 | 2019-01-01 00:00:00 |
| 21 | 2018-01-01 00:00:00 |
| 18 | 2017-01-01 00:00:00 |
| 8 | 2016-01-01 00:00:00 |
| 6 | 2015-01-01 00:00:00 |
| 6 | 2014-01-01 00:00:00 |
| 3 | 2013-01-01 00:00:00 |
| 2 | 2012-01-01 00:00:00 |
| 2 | 2010-01-01 00:00:00 |
| 1 | 2009-01-01 00:00:00 |
| 1 | 2008-01-01 00:00:00 |
+-----------+---------------------+
By default, the sql
subcommand outputs results in human friendly table
format. You can change the output format using --output
option to make it
more friendly for automations.
$ columnq sql --table test_data/spacex_launches.json --output json "SELECT COUNT(id) AS total_launches FROM spacex_launches"
[{"total_launches":132}]
Automate with UNIX pipes
Just like other UNIX tools, columnq supports consuming data stream from stdin to integrate with other CLI tools using UNIX pipe:
find . -printf "%M|%n|%u|%s|%P\n" | columnq sql \
--table 't=stdin,format=csv,has_header=false,delimiter=|' \
"SELECT SUM(column_4) as total_size FROM t"
+------------+
| total_size |
+------------+
| 9875017987 |
+------------+
Format conversion
The Columnq CLI can also be used as a handy utility to convert tabular data
between various formats: json
, parquet
, csv
, yaml
, arrow
, etc.
$ columnq sql --table 't=test_data/uk_cities_with_headers.csv' 'SELECT * FROM t' --output json
$ cat test_data/blogs.parquet | columnq sql --table 't=stdin,format=parquet' 'SELECT * FROM t' --output json
Interactive console
For dataset exploration, you can use the console
subcommand to query multiple
datasets in an interactive console environment:
$ columnq console \
--table "uk_cities=test_data/uk_cities_with_headers.csv" \
--table "test_data/spacex_launches.json"
columnq(sql)> SELECT * FROM uk_cities WHERE lat > 57;
+-----------------------------+-----------+-----------+
| city | lat | lng |
+-----------------------------+-----------+-----------+
| Elgin, Scotland, the UK | 57.653484 | -3.335724 |
| Aberdeen, Aberdeen City, UK | 57.149651 | -2.099075 |
| Inverness, the UK | 57.477772 | -4.224721 |
+-----------------------------+-----------+-----------+
columnq(sql)> SELECT COUNT(*) FROM spacex_launches WHERE success=true AND upcoming=false;
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 111 |
+-----------------+
Explore in memory catalog and table schemas:
columnq(sql)> SHOW TABLES;
+---------------+--------------------+-----------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+-----------------+------------+
| datafusion | public | uk_cities | BASE TABLE |
| datafusion | public | spacex_launches | BASE TABLE |
| datafusion | information_schema | tables | VIEW |
| datafusion | information_schema | columns | VIEW |
+---------------+--------------------+-----------------+------------+
columnq(sql)> SHOW COLUMNS FROM uk_cities;
+---------------+--------------+------------+-------------+-----------+-------------+
| table_catalog | table_schema | table_name | column_name | data_type | is_nullable |
+---------------+--------------+------------+-------------+-----------+-------------+
| datafusion | public | uk_cities | city | Utf8 | NO |
| datafusion | public | uk_cities | lat | Float64 | NO |
| datafusion | public | uk_cities | lng | Float64 | NO |
+---------------+--------------+------------+-------------+-----------+-------------+
Development
Debug mode
Set the RUST_LOG
environment variable to info,columnq=debug
to run columnq
in verbose debug logging.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distributions
File details
Details for the file columnq_cli-0.5.2-py3-none-win_amd64.whl
.
File metadata
- Download URL: columnq_cli-0.5.2-py3-none-win_amd64.whl
- Upload date:
- Size: 19.0 MB
- Tags: Python 3, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.14
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5ab20384c7296c2d55cd8b98be4ead067c616e3d7901571b0f58ba9ef526ee8d |
|
MD5 | 4b8f536b3d368106f1ab67d6f28e4310 |
|
BLAKE2b-256 | f000ccb577aa4f5d7ebfc752f9aaa5692457538a491853c413100b8d4ae57382 |
File details
Details for the file columnq_cli-0.5.2-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
.
File metadata
- Download URL: columnq_cli-0.5.2-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 19.6 MB
- Tags: Python 3, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.14
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | f86d558f7027f0e151c939d60a3d748f886e2e60fdd31e52394d3c0ea17b1d69 |
|
MD5 | 7a96981cec65ec6f6217698db496c477 |
|
BLAKE2b-256 | 22395e047a6d408c25306c6aec1d033aa4afec6f2236472026e6959b85cee400 |
File details
Details for the file columnq_cli-0.5.2-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
.
File metadata
- Download URL: columnq_cli-0.5.2-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
- Upload date:
- Size: 21.3 MB
- Tags: Python 3, manylinux: glibc 2.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.14
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 83b7c08bb4bf04c402167eaab06cb63d9daa51524a07295d0b1e715760c870ce |
|
MD5 | 28d603a0a9326e67a52ef22400d60054 |
|
BLAKE2b-256 | 29c4fd3963c5e4cf2bf1c370c0d698a7af9924944fbf4c244500e6f06e0afe3d |
File details
Details for the file columnq_cli-0.5.2-py3-none-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
.
File metadata
- Download URL: columnq_cli-0.5.2-py3-none-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
- Upload date:
- Size: 37.1 MB
- Tags: Python 3, macOS 10.9+ universal2 (ARM64, x86-64), macOS 10.9+ x86-64, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.14
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2396cf2256d628dec275bcebbf759da62f4f33ac48a78d5630701d3e6141d370 |
|
MD5 | 15eb652ddba762180fc1720742e8fb5a |
|
BLAKE2b-256 | 90cad0c3911b0d62b419213937cd89619ed045bea970ab272439e46ca2c5cdc7 |
File details
Details for the file columnq_cli-0.5.2-py3-none-macosx_10_7_x86_64.whl
.
File metadata
- Download URL: columnq_cli-0.5.2-py3-none-macosx_10_7_x86_64.whl
- Upload date:
- Size: 19.3 MB
- Tags: Python 3, macOS 10.7+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.14
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 7e04de6c600787d36aa01aee9691abb60da17c221cc404b53529fdd2029d9a04 |
|
MD5 | 2a8a1f98d879d925228b75c39180859b |
|
BLAKE2b-256 | c36ac5f6d7edc86d3785268e59cba2934790d899c413838b48ffa5614b86bd47 |