Python wrapper for Data Explorer
dx
This package provides convenient formatting and IPython display formatter registration for tabular data and DEX media types.
A Pythonic Data Explorer, open sourced with ❤️ by Noteable, a collaborative notebook platform that enables teams to use and visualize data, together.
Requirements
Python 3.8+
Installation
Poetry
poetry add dx
Then import the package:
import dx
Pip
pip install dx
Then import the package:
import dx
Usage
The dx library currently enables DEX media type visualization of pandas DataFrame and Series objects, as well as numpy ndarray objects. This can be handled in two ways:
- explicit dx.display() calls
- setting the display_mode to update the IPython display formatter for a session
With dx.display()
dx.display() will display a single dataset using the DEX media type. It currently supports:
- pandas DataFrame objects

      import pandas as pd
      import random

      df = pd.DataFrame({
          'random_ints': [random.randint(0, 100) for _ in range(500)],
          'random_floats': [random.random() for _ in range(500)],
      })
      dx.display(df)

- tabular data as dict or list types

      dx.display([
          [1, 5, 10, 20, 500],
          [1, 2, 3, 4, 5],
          [0, 0, 0, 0, 1]
      ])

- .csv or .json filepaths

      df = dx.random_dataframe()
      df.to_csv("dx_docs_sample.csv", index=False)
      dx.display("dx_docs_sample.csv")
With dx.set_display_mode()
Using either the "simple" or "enhanced" display mode makes dx update the current IPython display formatters so that pandas DataFrame objects are rendered with the DEX media type for an entire notebook / kernel session, instead of the default DataFrame display output.
Details
This will adjust pandas options by:
- increasing the number of rows displayed to 50000 from the pandas default of 60
- increasing the number of columns displayed to 50 from the pandas default of 20
- enabling html.table_schema (False by default in pandas)
This will also handle some basic column cleaning and generate a schema for the DataFrame using pandas.io.json.build_table_schema. Depending on the display mode, the data will be transformed into either a list of dictionaries or a list of lists of columnar values:
- "simple" - list of dictionaries
- "enhanced" - list of lists
NOTE: Unlike dx.display(), this only affects pandas DataFrames (or any types set in settings.RENDERABLE_TYPES); it does not affect the display of .csv/.json file data, or dict/list outputs.
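The difference between the two data shapes described above can be illustrated without dx or pandas. The following is only a sketch on hypothetical sample data: it reads "list of lists of columnar values" as one inner list per column, which is an interpretation rather than something confirmed by dx's source, and it does not reproduce the schema generation or column cleaning.

```python
# Column-oriented sample data (hypothetical; any tabular source would do).
columns = {"a": [1, 1, 0], "b": [5, 2, 0]}

names = list(columns)
n_rows = len(columns[names[0]])

# "simple"-style shape: one dictionary per row, keyed by column name.
records = [{name: columns[name][i] for name in names} for i in range(n_rows)]

# "enhanced"-style shape (as interpreted here): one inner list per column.
columnar = [columns[name] for name in names]

print(records)
print(columnar)
```

The row-wise shape repeats every column name in every record, while the columnar shape stores only values and relies on the separately generated schema to name them.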
- dx.set_display_mode("simple")

      import dx
      import numpy as np
      import pandas as pd

      # enable DEX display outputs from now on
      dx.set_display_mode("simple")

      df = pd.read_csv("dx_docs_sample.csv")
      df

      df2 = pd.DataFrame(
          [
              [1, 5, 10, 20, 500],
              [1, 2, 3, np.nan, 5],
              [0, 0, 0, np.nan, 1]
          ],
          columns=['a', 'b', 'c', 'd', 'e']
      )
      df2
If, at any point, you want to go back to the default display formatting (vanilla pandas output), use the "plain" display mode. This will revert the IPython display format update to its original state and put the pandas options back to their default values.
dx.set_display_mode("plain")

    # revert to original pandas display outputs from now on
    dx.set_display_mode("plain")

    df = pd.read_csv("dx_docs_sample.csv")
    df

    df2 = pd.DataFrame(
        [
            [1, 5, 10, 20, 500],
            [1, 2, 3, np.nan, 5],
            [0, 0, 0, np.nan, 1]
        ],
        columns=['a', 'b', 'c', 'd', 'e']
    )
    df2
Custom Settings
Default settings for dx can be found by calling dx.settings:
Each can be set using dx.set_option():
Setting DISPLAY_MAX_ROWS to 3 for the current session
...or with the dx.settings_context() context manager:
Setting DISPLAY_MAX_ROWS to 3 within the current context, leaving options for the rest of the session alone
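The exact call signatures of dx.set_option() and dx.settings_context() are not shown in this text, so the sketch below does not use dx at all. It is a standard-library illustration of the two behaviors described above: a session-wide change versus a temporary override that is undone when the block exits. The _settings store, both function bodies, and the default value of 100 are hypothetical stand-ins.

```python
from contextlib import contextmanager

# Hypothetical settings store; 100 is a placeholder, not dx's real default.
_settings = {"DISPLAY_MAX_ROWS": 100}


def set_option(key, value):
    """Change a setting for the rest of the session."""
    if key not in _settings:
        raise KeyError(f"unknown setting: {key}")
    _settings[key] = value


@contextmanager
def settings_context(**overrides):
    """Apply overrides inside a `with` block, then restore the old values."""
    previous = {key: _settings[key] for key in overrides}
    try:
        for key, value in overrides.items():
            set_option(key, value)
        yield _settings
    finally:
        # Restore even if the block raised an exception.
        _settings.update(previous)


with settings_context(DISPLAY_MAX_ROWS=3):
    inside = _settings["DISPLAY_MAX_ROWS"]  # overridden inside the block
after = _settings["DISPLAY_MAX_ROWS"]       # restored once the block exits

set_option("DISPLAY_MAX_ROWS", 3)           # session-wide change
session_wide = _settings["DISPLAY_MAX_ROWS"]
```

Restoring in a finally clause is what keeps a failed cell from leaving the temporary value in place for the rest of the session.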
Generating Sample Data
Documentation coming soon!
Usage Outside of Noteable
If using this package in a notebook environment outside of Noteable, the frontend should support the following media types:
- application/vnd.dataresource+json for "simple" display mode
- application/vnd.dex.v1+json for "enhanced" display mode
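For a frontend, supporting a media type means recognizing its key in the MIME bundle that the kernel sends with each display output. As a rough sketch of where such a key comes from, the hypothetical TinyTable class below uses IPython's _repr_mimebundle_ hook to advertise the "simple" media type alongside a plain-text fallback. The schema/data payload layout is an assumption modeled on the Table Schema layout, with field descriptors reduced to names only; it is not taken from dx's source.

```python
import json

SIMPLE_MEDIA_TYPE = "application/vnd.dataresource+json"


class TinyTable:
    """Hypothetical wrapper that advertises a custom media type to IPython."""

    def __init__(self, records):
        self.records = records

    def _repr_mimebundle_(self, include=None, exclude=None):
        # IPython calls this hook when displaying the object and forwards
        # the returned dict to the frontend, which picks a key it supports.
        fields = [{"name": name} for name in self.records[0]]
        payload = {"schema": {"fields": fields}, "data": self.records}
        return {
            SIMPLE_MEDIA_TYPE: payload,
            # Fallback for frontends that do not know the custom type.
            "text/plain": json.dumps(self.records),
        }


bundle = TinyTable([{"a": 1, "b": 5}])._repr_mimebundle_()
print(sorted(bundle))
```

A frontend without a renderer for the custom key would fall back to the text/plain entry, which is why the bundle carries both.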
Contributing
See CONTRIBUTING.md.
Code of Conduct
We follow the noteable.io code of conduct.
LICENSE
See LICENSE.md.
Open sourced with ❤️ by Noteable for the community.