Sycamore is an LLM-powered semantic data preparation system for building search applications.

Project description

Sycamore is a conversational search and analytics platform for complex unstructured data, such as documents, presentations, transcripts, embedded tables, and internal knowledge repositories. It retrieves and synthesizes high-quality answers by bringing AI to data preparation, indexing, and retrieval. Sycamore makes it easy to prepare unstructured data for search and analytics, providing a toolkit for data cleaning, information extraction, enrichment, summarization, and generation of vector embeddings that capture the semantics of the data. Sycamore uses your choice of generative AI models to make these operations simple and effective, and it enables quick experimentation and iteration. Sycamore also uses OpenSearch for indexing, enabling hybrid (vector + keyword) search, retrieval-augmented generation (RAG) pipelines, filtering, analytical functions, conversational memory, and other features that improve information retrieval.

Features

  • Natural-language, conversational interface for asking complex questions about unstructured data, with citations to source passages and conversational memory.
  • A variety of query operations over unstructured data, including hybrid search, retrieval-augmented generation (RAG), and analytical functions.
  • Preparation and enrichment of complex unstructured data for search and analytics through advanced data segmentation, LLM-powered UDFs for data enrichment, performant data manipulation with Python, and vector embeddings using a variety of AI models.
  • Automatic data crawlers (Amazon S3 and HTTP) and Jupyter notebook support for creating and iterating on data preparation scripts.
  • Scalable, secure, and customizable OpenSearch backend for indexing and data retrieval.
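
To make the idea of hybrid (vector + keyword) search concrete, here is a minimal, self-contained sketch in plain Python. It is only an illustration of blending a vector-similarity score with a keyword-match score; it is not how Sycamore or OpenSearch implements hybrid search, and the names `keyword_score`, `cosine`, `hybrid_rank`, and the weight `alpha` are hypothetical.

```python
import math


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms found in the text (a crude keyword signal)."""
    query_terms = query.lower().split()
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(1 for term in query_terms if term in text_terms) / len(query_terms)


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score.

    `alpha` is an assumed blending weight, not a Sycamore setting.
    """
    scored = []
    for doc in docs:
        score = alpha * cosine(query_vec, doc["vector"]) + (
            1 - alpha
        ) * keyword_score(query, doc["text"])
        scored.append((score, doc["id"]))
    # Highest combined score first.
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]
```

In a real deployment the vectors would come from an embedding model and the scoring would be done by the search backend; the toy two-dimensional vectors used to exercise this sketch stand in for those embeddings.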

Get Started

You can easily deploy Sycamore locally or on a virtual machine using Docker.

With Docker installed:

  1. Clone the Sycamore repo:

git clone https://github.com/aryn-ai/sycamore

  2. Set your OpenAI API key:

export OPENAI_API_KEY=YOUR-KEY

  3. Change into the cloned repository directory:

cd sycamore

  4. Launch Sycamore. Containers will be pulled from Docker Hub:

docker compose up --pull=always

  5. The Sycamore demo query UI will be available at localhost:3000.
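
The containers can take a while to start, so it may help to poll the demo UI address until it responds. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_ui`, the timeout values, and the assumption that the UI answers a plain HTTP GET with status 200 are illustrative and not part of Sycamore.

```python
import time
import urllib.request


def wait_for_ui(
    url="http://localhost:3000",
    timeout=120.0,
    interval=2.0,
    opener=urllib.request.urlopen,
    sleep=time.sleep,
):
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse.

    `opener` and `sleep` are injectable so the helper can be exercised
    without a running server.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError, and connection errors all derive from OSError.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    ready = wait_for_ui()
    print("UI is reachable" if ready else "UI did not respond in time")
```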

Next, you can run a demo that prepares and ingests data from the Sort Benchmark website, crawl data from a public website, or write your own data preparation script.

For more info about Sycamore’s data ingestion and preparation feature set, visit the Sycamore documentation.

Contributing

Check out our Contributing Guide for more information about how to contribute to Sycamore and set up your environment for development.

Download files

Source Distribution

sycamore_ai-0.1.15.tar.gz (11.1 MB view details)

Uploaded Source

Built Distribution

sycamore_ai-0.1.15-py3-none-any.whl (11.2 MB view details)

Uploaded Python 3

File details

Details for the file sycamore_ai-0.1.15.tar.gz.

File metadata

  • Download URL: sycamore_ai-0.1.15.tar.gz
  • Upload date:
  • Size: 11.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for sycamore_ai-0.1.15.tar.gz
Algorithm Hash digest
SHA256 1cd62b430cdc3b3aec601f23a08287c642b869ad8c0e1934a39b1566e4374e35
MD5 20d831d6c4192e53b4209e73f080ad48
BLAKE2b-256 6afaca5a87048caadd3dc8d550fbb44c86437c722805432031b252758d9e54f2

File details

Details for the file sycamore_ai-0.1.15-py3-none-any.whl.

File metadata

  • Download URL: sycamore_ai-0.1.15-py3-none-any.whl
  • Upload date:
  • Size: 11.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for sycamore_ai-0.1.15-py3-none-any.whl
Algorithm Hash digest
SHA256 f25b6462e79b3ae6d5b8e806312c9b183c59c7f0cd0b4c9d2f2f14ceceb8eba0
MD5 5fc2dbf8f8a14529835c3cb07c582180
BLAKE2b-256 47b79272ddb8985d2edacd92f2a502703437f1a23c6fd8658578f3a961e27bb2
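
The SHA256 digests listed above can be used to check a downloaded file before installing it. Below is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `verify` are hypothetical, and the file paths assume the archives were downloaded into the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "sycamore_ai-0.1.15.tar.gz": (
        "1cd62b430cdc3b3aec601f23a08287c642b869ad8c0e1934a39b1566e4374e35"
    ),
    "sycamore_ai-0.1.15-py3-none-any.whl": (
        "f25b6462e79b3ae6d5b8e806312c9b183c59c7f0cd0b4c9d2f2f14ceceb8eba0"
    ),
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_SHA256.items():
        if Path(name).exists():
            print(name, "OK" if verify(name, expected) else "MISMATCH")
        else:
            print(name, "not found, skipping")
```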
