Skip to main content

Analytical sql cell for Jupyter

Project description

Analytical SQL Cell

For the last decade, the data ecosystem has mainly focused on the technologies to store and process big datasets — the bigger the better. Later on, Modern Data Stack emerged as a cloud-native suite of products used for data integration and data analytics by the more technology-forward companies. Data warehouses are now a default piece of the Modern Data Stack and Snowflake’s rapid rise has been the poster child of this trend.

But in real life, most analytical workloads aren’t massive. Users prefer easy and fast answers to their questions instead of waiting for the cloud to spin. Instead of a distributed database in the cloud, most analyses can be handled with an optimized engine on our laptop and the cloud can be leveraged when needed.

To fill the gap, we built Analytical SQL Cell, a free and open-source Jupyter Widget that offers a Personal Data Lake experience for Jupyter Notebook users.

Analytical SQL has the following features:

  • Pain-free setup (pip install asqlcell and you are in).
  • Unleash the power of SQL for analytical workloads.
  • Lightning speed for data loading.
  • Built-in visualization.
  • Low footprint for analytical processing.
  • Interleave Python and SQL programming.
  • Free and open source.

Installation

You can install using pip:

pip install asqlcell

Quick Start

Here's a quick example to get you started with Analytical SQL Cell in Jupyter:

%%sql

SELECT
    continent,
    AVG(life_exp) AS health,
    AVG(gdp_cap) AS wealth,
    SUM(population) AS population
FROM 'gapminder.csv.gz'
WHERE year=2007
GROUP BY continent

In the above sample, %%sql is a cell magic indicating that the cell block will be executed as a SQL statement.

Screenshot

Data Load

You can query from Pandas DataFrame, CSV files, compressed (e.g. compressed with gzip) CSV files, as well as Parquet files.

SQL

DuckDB is the default engine of Analytical SQL Cell. Please find more details at the SQL Introduction of DuckDB.

Result Table

With the SQL query being executed in an Analytical SQL Cell, the result set is presented as a table.

In case of multiple SQL statements being executed in an Analytical SQL Cell, only the result set of the last SQL statement will be presented.

If the last SQL statement didn't have any result set, then only the count of executed data rows is shown.

sample result table

Tutorial

Development

This widget is developed with conda to ensure a consistent developer experience. The project is developed in both Mac OS and Windows Subsystem for Linux.

Please run the following commands to create a conda environment:

conda create -n asqlcell -c conda-forge nodejs=18.15 python=3.8 jupyterlab=3.6 jupyter_packaging=0.12
conda activate asqlcell

Python

Install the python. This will also build the TypeScript package.

pip install -e ".[test, examples, docs]"

Jupyter

When developing your extensions, you need to manually enable your extensions with the notebook / lab frontend.

For lab, this is done by the command:

jupyter labextension develop --overwrite .
jlpm run build

Typescript

You must start watching the change of the widget:

jlpm run watch

Then in another terminal you can run Jupyter Lab (without launching a browser):

jupyter lab --no-browser

Now you can open Google Chrome and navigate to http://localhost:8888 to play with the widget. After a change wait for the build to finish and then refresh your browser and the changes should take effect.

If you make a change to the python code then you will need to restart the notebook kernel to have it take effect.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

asqlcell-0.2.4.1-py2.py3-none-any.whl (7.3 MB view details)

Uploaded Python 2 Python 3

File details

Details for the file asqlcell-0.2.4.1-py2.py3-none-any.whl.

File metadata

  • Download URL: asqlcell-0.2.4.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 7.3 MB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.17

File hashes

Hashes for asqlcell-0.2.4.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 ab1d34f8c2537375bd8c450d6a20bebc266423675964903e268e4a6a50f6cbef
MD5 30070f248d880999c1ba5acbe76536c4
BLAKE2b-256 dd0385d053ede84fe8e8b8ca8887729c46e9c88327ceb2f46d04f1e2b1c025f6

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page