Python wrapper for Data Explorer
dx
This package provides convenient formatting and IPython display formatter registration for tabular data and DEX media types.
A Pythonic Data Explorer, open sourced with ❤️ by Noteable, a collaborative notebook platform that enables teams to use and visualize data, together.
Requirements
Python 3.8+
Installation
Poetry
poetry add dx
Then import the package:
import dx
Pip
pip install dx
Then import the package:
import dx
Usage
The dx library currently enables DEX media type visualization of pandas DataFrame and Series objects, as well as numpy ndarray objects. This can be handled in two ways:
- explicit dx.display() calls
- setting the display_mode to update the IPython display formatter for a session
With dx.display()
dx.display() will display a single dataset using the DEX media type. It currently supports:
- pandas DataFrame objects

    import pandas as pd
    import random

    df = pd.DataFrame({
        'random_ints': [random.randint(0, 100) for _ in range(500)],
        'random_floats': [random.random() for _ in range(500)],
    })
    dx.display(df)

- tabular data as dict or list types

    dx.display([
        [1, 5, 10, 20, 500],
        [1, 2, 3, 4, 5],
        [0, 0, 0, 0, 1]
    ])

- .csv or .json filepaths

    df = dx.random_dataframe()
    df.to_csv("dx_docs_sample.csv", index=False)
    dx.display("dx_docs_sample.csv")
With dx.set_display_mode()
With either the "simple" or "enhanced" display mode, dx will update the current IPython display formatters to allow DEX media type visualization of pandas DataFrame objects for an entire notebook / kernel session, replacing the default DataFrame display output.
Details
This will adjust pandas options by:
- increasing the number of rows displayed to 50000 from the pandas default of 60
- increasing the number of columns displayed to 50 from the pandas default of 20
- enabling html.table_schema (False by default in pandas)
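The option changes listed above can be tried by hand with plain pandas. This is a minimal sketch, assuming the three settings correspond to the pandas option keys display.max_rows, display.max_columns, and display.html.table_schema; how dx applies them internally is not shown here. It uses pd.option_context so the previous values are restored on exit.

```python
import pandas as pd

# Values taken from the list above; the option keys are assumed equivalents.
DX_LIKE_OPTIONS = {
    "display.max_rows": 50000,
    "display.max_columns": 50,
    "display.html.table_schema": True,
}

# Flatten {key: value} into the (key, value, key, value, ...) form
# that pd.option_context expects.
flat_args = [item for pair in DX_LIKE_OPTIONS.items() for item in pair]

before = {key: pd.get_option(key) for key in DX_LIKE_OPTIONS}

with pd.option_context(*flat_args):
    # Inside the block the dx-like values are active.
    inside = {key: pd.get_option(key) for key in DX_LIKE_OPTIONS}

# After the block pandas is back to whatever it was using before.
after = {key: pd.get_option(key) for key in DX_LIKE_OPTIONS}
```

Scoping the change with a context manager keeps the rest of the session on its existing pandas settings, which mirrors the idea of reverting to "plain" output later.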
This will also handle some basic column cleaning and generate a schema for the DataFrame using pandas.io.json.build_table_schema. Depending on the display mode, the data will be transformed into either a list of dictionaries or a list of lists of columnar values.
- "simple": list of dictionaries
- "enhanced": list of lists
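To make the two data shapes concrete, here is a small sketch using only pandas. It calls pandas.io.json.build_table_schema for the schema and then builds a row-oriented list of dictionaries and a column-oriented list of lists. The variable names are illustrative, and reading "list of lists of columnar values" as one list per column is an assumption; any column cleaning dx performs is not reproduced.

```python
import pandas as pd
from pandas.io.json import build_table_schema

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Table Schema description of the columns (index left out for brevity).
schema = build_table_schema(df, index=False)
field_names = [field["name"] for field in schema["fields"]]

# "simple"-style shape: one dictionary per row.
rows_as_dicts = df.to_dict(orient="records")

# "enhanced"-style shape (assumed): one list of values per column.
columns_as_lists = [df[name].tolist() for name in df.columns]
```

The row-oriented form repeats every column name in every row, while the column-oriented form stores each name once in the schema, which is one plausible reason to prefer it for larger tables.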
NOTE: Unlike dx.display(), this only affects pandas DataFrames (or any types set in settings.RENDERABLE_TYPES); it does not affect the display of .csv/.json file data, or dict/list outputs.
dx.set_display_mode("simple")

    import dx
    import numpy as np
    import pandas as pd

    # enable DEX display outputs from now on
    dx.set_display_mode("simple")

    df = pd.read_csv("dx_docs_sample.csv")
    df

    df2 = pd.DataFrame(
        [
            [1, 5, 10, 20, 500],
            [1, 2, 3, np.nan, 5],
            [0, 0, 0, np.nan, 1]
        ],
        columns=['a', 'b', 'c', 'd', 'e']
    )
    df2
If, at any point, you want to go back to the default display formatting (vanilla pandas output), use the "plain" display mode. This will revert the IPython display format update to its original state and put the pandas options back to their default values.
dx.set_display_mode("plain")

    # revert to original pandas display outputs from now on
    dx.set_display_mode("plain")

    df = pd.read_csv("dx_docs_sample.csv")
    df

    df2 = pd.DataFrame(
        [
            [1, 5, 10, 20, 500],
            [1, 2, 3, np.nan, 5],
            [0, 0, 0, np.nan, 1]
        ],
        columns=['a', 'b', 'c', 'd', 'e']
    )
    df2
Custom Settings
Default settings for dx can be found by calling dx.settings.
Each can be set using dx.set_option(), for example setting DISPLAY_MAX_ROWS to 3 for the current session.
Settings can also be changed with the dx.settings_context() context manager, for example setting DISPLAY_MAX_ROWS to 3 within the current context, leaving options for the rest of the session alone.
Generating Sample Data
Documentation coming soon!
Usage Outside of Noteable
If using this package in a notebook environment outside of Noteable, the frontend should support the following media types:
- application/vnd.dataresource+json for "simple" display mode
- application/vnd.dex.v1+json for "enhanced" display mode
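For frontend authors, it may help to see how a notebook object advertises one of these media types. The sketch below defines a hypothetical TablePreview class (not part of dx) whose _repr_mimebundle_ method returns a MIME bundle keyed by application/vnd.dataresource+json. The payload layout, a "schema" plus "data" pair, follows the pandas table-schema convention and is an assumption about what a frontend would receive; the payload for application/vnd.dex.v1+json is not covered.

```python
class TablePreview:
    """Hypothetical object that advertises a table through a MIME bundle."""

    MEDIA_TYPE = "application/vnd.dataresource+json"

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows

    def _repr_mimebundle_(self, include=None, exclude=None):
        # Assumed payload layout: field names in "schema", row dicts in "data".
        payload = {
            "schema": {"fields": [{"name": name} for name in self.columns]},
            "data": [dict(zip(self.columns, row)) for row in self.rows],
        }
        # text/plain is the fallback for frontends without a table renderer.
        return {self.MEDIA_TYPE: payload, "text/plain": repr(self.rows)}


# Call the hook directly to inspect what a frontend would be handed.
bundle = TablePreview(["a", "b"], [[1, 2], [3, 4]])._repr_mimebundle_()
```

A frontend that recognizes the media type renders the structured payload; one that does not falls back to the text/plain entry.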
Contributing
See CONTRIBUTING.md.
Code of Conduct
We follow the noteable.io code of conduct.
LICENSE
See LICENSE.md.
Open sourced with ❤️ by Noteable for the community.