Organize unstructured data
🌸 Lilac
NEW: Try the Lilac hosted demo with pre-loaded datasets
👋 Welcome
Lilac is an open-source product that helps you analyze, structure, and clean unstructured data with AI.
Lilac can be used from our UI or from Python.
https://github.com/lilacai/lilac/assets/2294279/cb1378f8-92c1-4f2a-9524-ce5ddd8e0c53
💻 Install
To install Lilac on your machine:
pip install lilac
You can also use Lilac with no installation by forking our public HuggingFace Spaces demo.
🔥 Getting started
Start a Lilac webserver from the CLI:
lilac start ~/my_project
Or start the Lilac webserver from Python:
import lilac as ll
ll.start_server(project_dir='~/my_project')
This starts a webserver at http://localhost:5432/.
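If you launch the server from a script, it can help to confirm that it is accepting connections before you open the browser. The following is a minimal sketch using only the Python standard library. It is not part of the Lilac API: `wait_for_server` is a hypothetical helper, and the default host and port are assumptions taken from the address above.

```python
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 5432,
                    timeout_s: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper, not provided by Lilac. It only checks that something
    is listening on the port, not that the listener is a Lilac webserver.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Try to open (and immediately close) a TCP connection.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry.
            time.sleep(0.2)
    return False
```

For example, after starting the server you could call `wait_for_server()` and open http://localhost:5432/ only if it returns `True`.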
📁 Documentation
Visit our website: lilacml.com
💻 Why Lilac?
Lilac is a visual tool and a Python API that helps you:
- Explore datasets that contain natural language (e.g. documents)
- Enrich your dataset with metadata (e.g. PII detection, profanity, text statistics, etc.)
- Conceptually search and tag your data (e.g. find paragraphs about injury)
- Remove unwanted or problematic data based on your own criteria
- Analyze patterns in your data
Lilac runs completely on device using powerful open-source LLM technologies.
💬 Contact
For bugs and feature requests, please file an issue on GitHub.
For general questions, please visit our Discord.