Visualize S3 data

A data browser based on S3

Vis3 is a visualization tool for large language model and machine learning data. It supports S3-compatible cloud storage (AWS, Aliyun OSS, Tencent Cloud) and a range of data formats (json, jsonl.gz, warc.gz, md, mobi, epub, etc.), and offers interactive JSON, HTML, Markdown, and image views for efficient data analysis.

Features

  • Supports JSON, JSONL, WARC, and more, automatically recognizing data structures for clear, visual insights.
  • One-click field previews with seamless switching between HTML, Markdown, and image views for intuitive operation.
  • Integrates with S3-compatible cloud storage (Aliyun OSS, AWS, Tencent Cloud) and local file parsing for easy data access.
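
To try the JSONL views on local data, a small gzip-compressed JSON Lines file is enough. The sketch below uses only the Python standard library and is not part of Vis3; the helper names and the sample fields (id, content) are hypothetical and chosen only so that one record holds Markdown and the other HTML.

```python
import gzip
import json


def write_jsonl_gz(path, records):
    """Write an iterable of dicts as gzip-compressed JSON Lines (one object per line)."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl_gz(path):
    """Yield one parsed object per non-empty line of a .jsonl.gz file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


# Hypothetical sample records; the field names are illustrative only.
SAMPLE_RECORDS = [
    {"id": 1, "content": "# Title\n\nSome *Markdown* text."},
    {"id": 2, "content": "<p>Some <b>HTML</b> text.</p>"},
]
```

Calling write_jsonl_gz("sample.jsonl.gz", SAMPLE_RECORDS) produces a file that can then be opened through local file parsing.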

https://github.com/user-attachments/assets/aa8ee5e8-c6d3-4b20-ae9d-2ceeb2eb2c41

Screenshots

  • File list
  • LLM chat
  • JSONL / JSON
  • Parquet
  • PDF
  • HTML
  • Video

Getting Started

# python >= 3.11
pip install vis3

Or create a Python environment using conda:

Install Miniconda first, then run:

# 1. Create Python 3.11 environment using conda
conda create -n vis3 python=3.11

# 2. Activate environment
conda activate vis3

# 3. Install vis3
pip install vis3

# 4. Launch (no authentication)
vis3 --open

Upgrade to the latest version

pip install vis3 -U
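
After installing or upgrading, it can be useful to confirm which version is actually present in the active environment. A minimal sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of Vis3.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the vis3 version in the current environment, or None if it is missing.
    print(installed_version("vis3"))
```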

Variables

ENABLE_AUTH

Enable authentication.

ENABLE_AUTH=1 vis3

BASE_DATA_DIR

Specify the directory for the SQLite database.

BASE_DATA_DIR=your/database/path vis3

BASE_URL

Specify the base URL for API calls.

BASE_URL=/a/b/c vis3
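
The variables above are ordinary environment variables, so they can also be set from a script before starting the vis3 command. The following is a sketch under two assumptions: that the variables can be combined in one launch, and that vis3 is on PATH. The function names are hypothetical; only the variable names and the ENABLE_AUTH=1 value come from this README.

```python
import os
import subprocess


def build_vis3_env(enable_auth=False, data_dir=None, base_url=None, base_env=None):
    """Return a copy of the environment with the chosen Vis3 variables set."""
    env = dict(os.environ if base_env is None else base_env)
    if enable_auth:
        env["ENABLE_AUTH"] = "1"  # same value as in the ENABLE_AUTH example
    if data_dir is not None:
        env["BASE_DATA_DIR"] = str(data_dir)  # SQLite database directory
    if base_url is not None:
        env["BASE_URL"] = base_url  # base URL for API calls
    return env


def launch_vis3(**options):
    """Run the `vis3` command with the variables built above (assumes it is on PATH)."""
    return subprocess.run(["vis3"], env=build_vis3_env(**options), check=True)
```

For example, launch_vis3(enable_auth=True, data_dir="your/database/path") would mirror running the two shell examples together.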

Local Development

conda create -n vis3-dev python=3.11

# Activate virtual environment
conda activate vis3-dev

# Install poetry
# https://python-poetry.org/docs/#installing-with-the-official-installer

# Install Python dependencies
poetry install

# Install frontend dependencies (install pnpm: https://pnpm.io/installation)
cd web && pnpm install

# Build frontend assets (in web directory)
pnpm build

# Start vis3
uvicorn vis3.main:app --reload

React Component

We provide a React component via npm for customizing your data preview UI.

npm i @vis3/kit

Community

Join the official OpenDataLab WeChat group!

Related Projects

  • LabelU: Image / Video / Audio annotation tool
  • LabelU-kit: Web frontend annotation kit (LabelU is developed based on this kit)
  • LabelLLM: Open-source LLM dialogue annotation platform
  • MinerU: One-stop high-quality data extraction tool

License

This project is licensed under the Apache 2.0 license.
