Share and Edit Pandas/Polars Dataframes with a Link!

share-df: Instantly Share and Modify Dataframes With a Web Interface From Anywhere

https://github.com/user-attachments/assets/752e00c1-b7ce-488c-ae4e-1ae90a8e0fec

Problem

Data scientists use Pandas/Polars dataframes to work with their data, but nontechnical contributors need a GUI like Excel. This is frustrating because:

  1. Developers have to manage the back and forth: converting to Excel, downloading the file, uploading it back into their IDE, and then reformatting the data.
  2. Code previously written to work with the dataframe may fail if the formatting changes significantly during the conversion.
  3. Managing a large number of edits and different versions of the Excel files becomes difficult, especially if the developers use dataframe versioning tools like Weights and Biases Artifact logging.

Goal

Developers generate a free URL that they can send to nontechnical contributors, who can then modify the dataframe in a web app.

Technical Contributor Features

  • One function call to generate a link to send, accessible anywhere
  • Changes made by the client are received back as a dataframe for seamless development
  • Compatible with both pandas and polars dataframes

Nontechnical Contributor Features

  • Easy Google OAuth login
  • Seamless UI to modify the dataframe
    • Rename columns (Shift Click)
    • Drag around columns
    • Modify values
    • Add new columns and rows
  • Send the results back with the click of a button
  • Work with large amounts of data quickly
  • Multiple collaborator support
    • See which cells other collaborators are editing
    • Sync all changes from other collaborators live

How to Run

  1. pip install share-df
  2. If you do not already have one, generate an auth token for free in less than a minute with ngrok
  3. Create a .env file in your directory with NGROK_AUTHTOKEN=
  4. Import the function and call it on any dataframe!
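
Before generating a link, it can help to confirm that the token from steps 2 and 3 is actually visible. The following is a minimal sketch using only the standard library; the helper names read_env_value and ngrok_token_configured are hypothetical and are not part of share-df, and how the package itself reads the .env file is not shown here.

```python
import os
from pathlib import Path


def read_env_value(path, key):
    """Return the value for `key` from a simple KEY=VALUE .env file, or None."""
    env_file = Path(path)
    if not env_file.is_file():
        return None
    for raw_line in env_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Strip optional surrounding quotes; an empty value counts as missing.
            return value.strip().strip("'\"") or None
    return None


def ngrok_token_configured(env_path=".env"):
    """True if NGROK_AUTHTOKEN is set in the environment or in the .env file."""
    if os.environ.get("NGROK_AUTHTOKEN"):
        return True
    return read_env_value(env_path, "NGROK_AUTHTOKEN") is not None
```

Calling ngrok_token_configured() from the directory that holds the .env file returns False when the variable is missing or left empty, which is a common cause of a tunnel failing to start.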

Example Code

import pandas as pd
from share_df import pandaBear

df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob', 'Carol'],
    'City': ['New York', 'London', 'Paris', 'Tokyo'],
    'Salary': [50000, 60000, 75000, 65000]
})

# pandaBear generates the shareable link; the edited dataframe is returned
df = pandaBear(df)
print(df)

Handling Big Data

  • In the demo, the site currently takes 6 seconds to load a million rows.
  • After loading, it handles cell changes, row additions, column sorting, new columns, fast scrolling, and sending the data back smoothly.
  • That said, I can optimize this further if there is interest.

Google Colab

  • This code works by starting a server on localhost and then tunneling traffic to it so that other people can access it.
  • Because Google Colab code runs on a VM, this is an interesting challenge to handle.
  • As of 0.1.7, the package supports creating a Google-generated link for dataframes, but this link is not shareable.
  • For Google Colab, instead of using a .env file, I recommend putting your NGROK_AUTHTOKEN into the Google Colab secrets manager (key icon on the left side of the screen). That way your secrets can also be synced to other notebooks, and you don't have to upload the .env file each time.
  • I initially aimed for full functionality (link sharing) in Google Colab, but it seems impossible because Colab ties the link to Colab session authentication.
  • Google has also stated that it may deprecate its serve_kernel_port_as_window function in the future, in which case the package will switch to serve_kernel_port_as_iframe. The functionality will remain the same, except that the editor will appear in an IFrame.
  • For now, an optional parameter lets you use the editor in IFrame mode.
  • Check out a demo notebook here.
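
As a rough illustration of the secrets-manager recommendation above, here is a minimal sketch that reads NGROK_AUTHTOKEN from Colab secrets when running in Colab and falls back to the environment elsewhere. The helper name load_ngrok_token is hypothetical, and exporting the value to os.environ is an assumption about what downstream tools expect, not a documented requirement of share-df.

```python
import os


def load_ngrok_token():
    """Return the ngrok token from Colab secrets if available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get("NGROK_AUTHTOKEN")
    # Requires the secret to exist and notebook access to be enabled for it.
    return userdata.get("NGROK_AUTHTOKEN")


token = load_ngrok_token()
if token:
    # Assumption: tools that need the token read it from the environment.
    os.environ["NGROK_AUTHTOKEN"] = token
```

In Colab, the secret must be named NGROK_AUTHTOKEN and notebook access must be toggled on for it in the secrets panel, otherwise userdata.get raises an error.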

Community Requested Features (e.g., from the Reddit thread)

☑ 3rd option for discarding changes (completed as of 1.1.0)

☑ FastAPI template for easier maintenance

☑ Multiple authenticated users
