canica

canica is an interactive tool to visualize embeddings. Its current main goal is to explore text datasets by representing the input embeddings in a 2D t-SNE plot.

How to install

Install it with pip:

pip install canica

Then start using the CanicaTSNE and CanicaUMAP classes in your notebooks.

How to use

canica is designed mainly as a data exploration tool, embedded as a widget inside a Jupyter notebook. The steps below show how to explore a dataset in a notebook (the tutorial notebook provides more information).

In a notebook, load a pandas DataFrame and make sure that at least one column contains the embeddings you want to plot. In a cell, run:

from canica.widget import CanicaTSNE
CanicaTSNE(df, embedding_col="embedding_col", text_col="text_col", hue_col="some_score")

Here df is the pandas DataFrame, embedding_col names the column in df that contains the embeddings, text_col names the column with the corresponding text, and hue_col names another column that is represented using colours (currently it has to be a numerical column with values between 0 and 1). You can also use CanicaUMAP instead of CanicaTSNE to use UMAP.
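As a rough illustration of the input described above, the following sketch builds a small DataFrame and rescales a score into the 0 to 1 range required for the hue column. The toy texts, the random 4-dimensional embeddings, the raw scores, and the choice of min-max scaling are all assumptions made for this example; storing one vector per cell in the embedding column is also an assumption, so check the tutorial notebook for the exact format canica expects.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three short texts with made-up 4-dimensional embeddings.
rng = np.random.default_rng(0)
texts = ["first document", "second document", "third document"]
embeddings = [rng.normal(size=4) for _ in texts]  # assumed: one vector per row
raw_scores = np.array([2.0, 5.0, 11.0])  # any numerical score, not yet in [0, 1]

# Min-max scale the score so that every value lies between 0 and 1.
span = raw_scores.max() - raw_scores.min()
if span > 0:
    scaled = (raw_scores - raw_scores.min()) / span
else:
    scaled = np.zeros_like(raw_scores)  # constant column: map everything to 0

df = pd.DataFrame(
    {
        "embedding_col": embeddings,
        "text_col": texts,
        "some_score": scaled,
    }
)

# In a notebook with canica installed, the widget call shown earlier would follow:
# from canica.widget import CanicaTSNE
# CanicaTSNE(df, embedding_col="embedding_col", text_col="text_col", hue_col="some_score")
```

Min-max scaling is only one way to satisfy the range constraint; a score that is already a probability or a similarity in [0, 1] can be passed through unchanged.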

This will show the canica embedding explorer and will enable interactive exploration of your dataset. Have a look at the tutorial notebook to see it working.

How to contribute

We welcome contributions of all kinds. For more information on how to contribute, see the CONTRIBUTING.md file.


