The Superlinked vector computing library
Superlinked
Superlinked is a declarative Python SDK that enables you to turn complex data into vectors, in a way that fits the modern data stack and works with your favorite Vector Databases.
3 key areas of focus:
- Custom embedding model creation that fits your complex data entities.
- ETL for your vector index for both streaming and batch use-cases.
- Vector-native query language that helps you convert hybrid search queries to pure vector queries.
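The third point can be illustrated with a small, library-free sketch. This is a conceptual illustration only, not Superlinked's implementation: the helper names (`normalize`, `combine`, `dot`), the toy two-dimensional vectors and the weights are all assumptions made for the example. It relies on a simple property: if every part of a concatenated vector has unit length and the query side scales its parts by weights, a single dot product over the concatenated vectors equals the weighted sum of the per-part cosine similarities.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine(parts, weights=None):
    """Concatenate unit-normalized parts, scaling each part by its weight."""
    weights = weights or [1.0] * len(parts)
    combined = []
    for part, weight in zip(parts, weights):
        combined.extend(weight * x for x in normalize(part))
    return combined


def dot(a, b):
    """Plain dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical stored item made of two embedding parts.
item = combine([[0.6, 0.8], [1.0, 0.0]])

# Hypothetical query: the first part counts twice as much as the second.
query = combine([[0.8, 0.6], [0.0, 1.0]], weights=[2.0, 1.0])

# One vector comparison over the concatenated vectors...
score = dot(query, item)

# ...matches the weighted sum of the per-part similarities.
first_sim = dot(normalize([0.8, 0.6]), normalize([0.6, 0.8]))
second_sim = dot(normalize([0.0, 1.0]), normalize([1.0, 0.0]))
weighted_sum = 2.0 * first_sim + 1.0 * second_sim
```

Because the weights are applied only on the query side, the stored item vectors do not need to be recomputed when the weights change.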
Visit Superlinked for more information about the company behind this product and our other initiatives.
Use-cases
- RAG: HR Knowledgebase
- Semantic Search: Movie Recommendations, Business News
- Recommendation Systems: E-commerce
- Analytics: User Acquisition
You can check a full list of examples here.
Reference
- Describe your data using Python classes with the @schema decorator.
- Describe your vector embeddings from building blocks with Spaces.
- Combine your embeddings into a queryable Index.
- Define your search with dynamic parameters and weights as a Query.
- Load your data using a Source.
- Define your transformations with a Parser (e.g.: from pd.DataFrame).
- Run your configuration with an Executor.
You can check a list of our features or head to our documentation.
Try it out
The following example shows how to use Superlinked to experiment with the semantic search use-case.
Prerequisites
In a notebook
Install the superlinked library:
%pip install superlinked
As a script
Ensure your Python version is 3.10.x.
$> python -V
Python 3.10.9
If your Python version is not 3.10.x, you can use pyenv to install it.
Upgrade pip and install the superlinked library:
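The same version check can also be done from inside Python, which is handy at the top of a script. This is a minimal sketch: the `REQUIRED` constant and the `is_supported` helper are hypothetical names introduced here, and the 3.10 requirement is taken from the instructions above.

```python
import sys

REQUIRED = (3, 10)  # major and minor version required by the instructions above


def is_supported(version_info=sys.version_info):
    """Return True when the interpreter's major.minor version matches REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} found, but 3.10.x is required.")
```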
$> python -m pip install --upgrade pip
$> python -m pip install superlinked
Run the example
The first run takes slightly longer because it has to download the embedding model.
from superlinked.framework.common.schema.schema import schema
from superlinked.framework.common.schema.schema_object import String
from superlinked.framework.common.schema.id_schema_object import IdField
from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace
from superlinked.framework.dsl.index.index import Index
from superlinked.framework.dsl.query.param import Param
from superlinked.framework.dsl.query.query import Query
from superlinked.framework.dsl.source.in_memory_source import InMemorySource
from superlinked.framework.dsl.executor.in_memory.in_memory_executor import InMemoryExecutor
@schema # Describe your schemas.
class Document:
    id: IdField # Each schema should have exactly one `IdField`.
    body: String # Use `String` for text fields.
document = Document()
relevance_space = TextSimilaritySpace(text=document.body, model="sentence-transformers/all-mpnet-base-v2") # Select your semantic embedding model.
document_index = Index([relevance_space]) # Combine your spaces to a queryable index.
query = Query(document_index).find(document).similar(relevance_space.text, Param("query_text")) # Define your query with dynamic parameters.
source: InMemorySource = InMemorySource(document) # Connect a data source to your schema.
executor = InMemoryExecutor(sources=[source], indices=[document_index]) # Tie it all together to run your configuration.
app = executor.run()
source.put([{"id": "happy_dog", "body": "That is a happy dog"}])
source.put([{"id": "happy_person", "body": "That is a very happy person"}])
source.put([{"id": "sunny_day", "body": "Today is a sunny day"}])
print(app.query(query, query_text="Who is a positive friend?")) # Run your query.
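For batch loading, records usually come from a file rather than from hand-written dictionaries. The sketch below uses only the standard library to turn CSV text into a list of dictionaries with the same `id` and `body` keys used in the `source.put` calls above. The `csv_to_records` helper, its parameters and the sample data are hypothetical, and passing several records in one `source.put` call is an assumption based on the fact that the example already passes a list.

```python
import csv
import io


def csv_to_records(csv_text, id_column="id", body_column="body"):
    """Parse CSV text into {"id": ..., "body": ...} dicts, skipping rows with an empty id or body."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        doc_id = (row.get(id_column) or "").strip()
        body = (row.get(body_column) or "").strip()
        if doc_id and body:
            records.append({"id": doc_id, "body": body})
    return records


# Hypothetical sample data; the last row has no body and is skipped.
sample = """id,body
happy_dog,That is a happy dog
sunny_day,Today is a sunny day
empty_row,
"""

records = csv_to_records(sample)
# With the example above already running, the records could then be loaded with:
# source.put(records)
```

Filtering out incomplete rows before loading keeps the step independent of how the library handles missing fields.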
Ready to go to production? We are launching our first Vector DB connectors soon! Tell us which Vector DB we should support!
Articles
- Vector DB Comparison: Open-source collaborative comparison of vector databases by Superlinked.
- Vector Hub: VectorHub is a free and open-source learning hub for people interested in adding vector retrieval to their ML stack.
Support
If you encounter any challenges during your experiments, feel free to create an issue, request a feature or start a discussion. Please group your feedback into separate issues and discussions by topic. Thank you for your feedback!
Hashes for superlinked-3.9.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 021ddd1a57da45310339dde3d6c810f7c0a1c3532f1f162386a4e8bbb5c1f046 |
MD5 | af966d7029c7d8fd4294ffb7080a0337 |
BLAKE2b-256 | bd79d1c16e7fb434436fab3d4d3bb58a671f26e58f021164d3a1667cc94c6480 |