Sycamore is an LLM-powered semantic data preparation system for building search applications.


Sycamore is a conversational search and analytics platform for complex unstructured data, such as documents, presentations, transcripts, embedded tables, and internal knowledge repositories. It retrieves and synthesizes high-quality answers by bringing AI to data preparation, indexing, and retrieval. Sycamore makes it easy to prepare unstructured data for search and analytics, providing a toolkit for data cleaning, information extraction, enrichment, summarization, and generation of vector embeddings that encapsulate the semantics of data. Sycamore uses your choice of generative AI models to make these operations simple and effective, and it enables quick experimentation and iteration. Additionally, Sycamore uses OpenSearch for indexing, enabling hybrid (vector + keyword) search, retrieval-augmented generation (RAG) pipelining, filtering, analytical functions, conversational memory, and other features to improve information retrieval.
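
As a rough illustration of the data preparation side, the sketch below reads PDFs, segments them, computes embeddings, and writes the result to OpenSearch using Sycamore's DocSet API. The source path, embedding model, index name, and OpenSearch connection settings are placeholder assumptions, and transform names can differ between releases, so treat this as a sketch to check against the Sycamore documentation rather than a definitive recipe.

# Minimal data preparation sketch. Paths, model, index name, and OpenSearch
# settings are assumptions; transform names may differ by Sycamore version.
import sycamore
from sycamore.transforms.embed import SentenceTransformerEmbedder
from sycamore.transforms.partition import UnstructuredPdfPartitioner

context = sycamore.init()
(
    context.read.binary("./data/", binary_format="pdf")        # hypothetical local folder of PDFs
    .partition(partitioner=UnstructuredPdfPartitioner())       # segment each PDF into elements
    .explode()                                                  # promote elements to standalone documents
    .embed(embedder=SentenceTransformerEmbedder(
        model_name="sentence-transformers/all-MiniLM-L6-v2",   # assumed embedding model
        batch_size=100,
    ))
    .write.opensearch(
        os_client_args={"hosts": [{"host": "localhost", "port": 9200}]},  # assumed local endpoint
        index_name="demo_index",                                          # hypothetical index name
        # Depending on the version, index settings/mappings for the vector
        # field may also need to be supplied here; see the documentation.
    )
)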


Features

  • Natural language, conversational interface to ask complex questions on unstructured data. Includes citations to source passages and conversational memory.
  • Offers a variety of query operations over unstructured data, including hybrid search, retrieval-augmented generation (RAG), and analytical functions (see the retrieval sketch after this list).
  • Prepares and enriches complex unstructured data for search and analytics through advanced data segmentation, LLM-powered UDFs for data enrichment, performant data manipulation with Python, and vector embeddings using a variety of AI models.
  • Helpful features like automatic data crawlers (Amazon S3 and HTTP) and Jupyter notebook support to create and iterate on data preparation scripts.
  • Scalable, secure, and customizable OpenSearch backend for indexing and data retrieval.
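
To make the hybrid search idea concrete, here is one way to combine keyword and vector scoring against the index that Sycamore populates, using the standard opensearch-py client. The host, index name, field names, and embedding model are assumptions for illustration; the demo UI layers RAG and conversational memory on top of OpenSearch rather than issuing a raw query like this, and your deployment's security settings may require additional client options.

# Illustrative hybrid (vector + keyword) lookup. Host, index name, field
# names, and the embedding model are assumptions; adjust for your deployment.
from opensearchpy import OpenSearch
from sentence_transformers import SentenceTransformer

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

question = "Who won the most recent sort benchmark?"
query_vector = model.encode(question).tolist()

response = client.search(
    index="demo_index",  # hypothetical index name
    body={
        "size": 5,
        "query": {
            "bool": {
                "should": [
                    {"match": {"text_representation": question}},              # keyword relevance
                    {"knn": {"embedding": {"vector": query_vector, "k": 5}}},  # vector similarity
                ]
            }
        },
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("text_representation", "")[:120])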

Demo

A demo video is hosted on Loom.

Get Started

You can easily deploy Sycamore locally or on a virtual machine using Docker.

With Docker installed:

  1. Clone the Sycamore repo:

git clone https://github.com/aryn-ai/sycamore

  2. Set your OpenAI API key:

export OPENAI_API_KEY=YOUR-KEY

  3. Go to the Sycamore directory:

cd sycamore

  4. Launch Sycamore. Containers will be pulled from Docker Hub:

docker compose up --pull=always

  5. The Sycamore demo query UI will be at localhost:3000

You can next choose to run a demo that prepares and ingests data from the Sort Benchmark website, crawl data from a public website, or write your own data preparation script.
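
If you go the custom-script route, data preparation is ordinary Python over Sycamore's DocSet abstraction, so you can mix built-in transforms (partitioning, LLM-powered entity extraction, embedding) with your own functions. The snippet below is a hedged sketch of that pattern: the property name, tagging rule, and local path are invented for illustration, and the transform names should be checked against the documentation for your installed version.

# Sketch of a custom data preparation step: a plain Python function applied
# with DocSet.map(). Property name, threshold, and path are illustrative.
import sycamore
from sycamore.data import Document
from sycamore.transforms.partition import UnstructuredPdfPartitioner

def tag_long_documents(doc: Document) -> Document:
    # Attach a custom property that later filters or queries can use.
    text = doc.text_representation or ""
    doc.properties["is_long"] = len(text) > 2000
    return doc

context = sycamore.init()
docset = (
    context.read.binary("./data/", binary_format="pdf")   # hypothetical local path
    .partition(partitioner=UnstructuredPdfPartitioner())
    .map(tag_long_documents)                               # custom Python enrichment
)
docset.show()  # inspect a few documents before embedding and indexing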

For more info about Sycamore’s data ingestion and preparation feature set, visit the Sycamore documentation.

Resources

Contributing

Check out our Contributing Guide for more information about how to contribute to Sycamore and set up your environment for development.

