linkml-store
An AI-ready data management and integration platform. LinkML-Store provides an abstraction layer over multiple different backends (including DuckDB, MongoDB, Neo4j, and local filesystems), allowing for common query, index, and storage operations.
For full documentation, see https://linkml.io/linkml-store/
See these slides for a high-level overview.
Warning: LinkML-Store is still undergoing changes and refactoring. APIs and command-line options are subject to change!
Quick Start
Install, add data, query it:
pip install "linkml-store[all]"
linkml-store -d duckdb:///db/my.db -c persons insert data/*.json
linkml-store -d duckdb:///db/my.db -c persons query -w "occupation: Bricklayer"
Index it, search it:
linkml-store -d duckdb:///db/my.db -c persons index -t llm
linkml-store -d duckdb:///db/my.db -c persons search "all persons employed in construction"
Validate it:
linkml-store -d duckdb:///db/my.db -c persons validate
Basic usage
- Command Line
- Python
- API
- Streamlit applications
The CRUDSI pattern
Most database APIs implement the CRUD pattern: Create, Read, Update, Delete. LinkML-Store adds Search and Inference to this pattern, making it CRUDSI.
The notion of "Search" and "Inference" is intended to be flexible and extensible, including:
- Search
- Traditional keyword search
- Search using LLM Vector embeddings (without a dedicated vector database)
- Pluggable specialized search, e.g. genomic sequence (not yet implemented)
- Inference (encompassing validation, repair, and inference of missing data)
- Classic rule-based inference
- Inference using LLM Retrieval Augmented Generation (RAG)
- Statistical/ML inference
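To make the six CRUDSI operations concrete, here is a minimal in-memory sketch. It is not the LinkML-Store API: the `ToyCollection` class, its method names, and the sample records are all hypothetical, chosen only to show how keyword search and rule-based inference can sit next to the four classic CRUD operations.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def _matches(row: Record, where: Record) -> bool:
    """True if every key/value pair in `where` is present in `row`."""
    return all(row.get(k) == v for k, v in where.items())


class ToyCollection:
    """Hypothetical in-memory collection; NOT the linkml-store API."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def insert(self, objs: List[Record]) -> None:  # Create
        self.rows.extend(dict(o) for o in objs)

    def find(self, where: Record) -> List[Record]:  # Read
        return [r for r in self.rows if _matches(r, where)]

    def update(self, where: Record, changes: Record) -> int:  # Update
        matched = self.find(where)
        for r in matched:
            r.update(changes)
        return len(matched)

    def delete(self, where: Record) -> int:  # Delete
        before = len(self.rows)
        self.rows = [r for r in self.rows if not _matches(r, where)]
        return before - len(self.rows)

    def search(self, text: str) -> List[Record]:  # Search (keyword only)
        terms = text.lower().split()
        return [
            r for r in self.rows
            if any(t in str(v).lower() for v in r.values() for t in terms)
        ]

    def infer(self, rule: Callable[[Record], Record]) -> None:  # Inference
        # A rule returns derived values; only missing fields are filled in.
        for r in self.rows:
            for k, v in rule(r).items():
                r.setdefault(k, v)


people = ToyCollection()
people.insert([
    {"id": "P1", "name": "Ann", "occupation": "Bricklayer"},
    {"id": "P2", "name": "Bob", "occupation": "Welder"},
])
# Hypothetical rule: bricklayers work in the construction sector.
people.infer(
    lambda r: {"sector": "construction"} if r.get("occupation") == "Bricklayer" else {}
)
```

In this toy version, inference only fills in fields that are absent, so asserted data is never overwritten by a rule. That is a design choice of the sketch, not a statement about how LinkML-Store behaves.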
Features
Multiple Adapters
LinkML-Store is designed to work with multiple backends, giving a common abstraction layer.
Coming soon: any RDBMS, any triplestore, Neo4J, HDF5-based stores, ChromaDB/Vector dbs ...
The intent is to give a union of all features of each backend. For example, analytic faceted queries are provided for all backends, not just Solr.
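As an illustration of what a faceted query computes, the sketch below counts distinct values per column over plain Python dictionaries. The `facet_counts` function and the sample rows are hypothetical and stand in for whatever a backend adapter would do natively; this is not the LinkML-Store implementation.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def facet_counts(rows: Iterable[Record], facet_columns: List[str]) -> Dict[str, Counter]:
    """Count how often each value occurs in each requested column."""
    counts: Dict[str, Counter] = {col: Counter() for col in facet_columns}
    for row in rows:
        for col in facet_columns:
            value = row.get(col)
            if value is not None:  # skip missing values
                counts[col][value] += 1
    return counts


# Hypothetical sample data
rows = [
    {"name": "Ann", "occupation": "Bricklayer", "city": "Leeds"},
    {"name": "Bob", "occupation": "Welder", "city": "Leeds"},
    {"name": "Cat", "occupation": "Bricklayer"},
]
facets = facet_counts(rows, ["occupation", "city"])
```

Because the function only needs an iterable of records, the same logic can serve as a fallback for any backend that lacks native faceting.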
Composable indexes
Many backends come with their own indexing and search schemes. Classically these were Lucene-based indexes; now they are often semantic search using LLM embeddings.
LinkML-Store treats indexing as an orthogonal concern - you can compose different indexing schemes with different backends. You don't need to have a vector database to run embedding search!
See How to Use Semantic Search
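The idea of an index that is independent of the storage backend can be sketched as follows. Everything here is an assumption made for illustration: `ToyIndex` is hypothetical, and a bag-of-words vector stands in for a real LLM embedding so the example runs without any model or vector database. Swapping in a different `embed` function changes the indexing scheme without touching the records or where they are stored.

```python
import math
from collections import Counter
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]
Vector = Dict[str, float]


def bag_of_words(text: str) -> Vector:
    """Stand-in for an embedding model: token counts as a sparse vector."""
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """Hypothetical index over records from any backend; vectors live in memory."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self.embed = embed
        self.entries: List[Tuple[Record, Vector]] = []

    def add(self, rows: Iterable[Record]) -> None:
        for row in rows:
            text = " ".join(str(v) for v in row.values())
            self.entries.append((row, self.embed(text)))

    def search(self, query: str, limit: int = 3) -> List[Tuple[float, Record]]:
        q = self.embed(query)
        scored = [(cosine(q, vec), row) for row, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:limit]


index = ToyIndex(embed=bag_of_words)
index.add([
    {"name": "Ann", "occupation": "bricklayer in construction"},
    {"name": "Bob", "occupation": "pianist in an orchestra"},
])
hits = index.search("construction worker")
```

With a real embedding function, a query such as "all persons employed in construction" could match a bricklayer even without shared keywords; the bag-of-words stand-in only matches on overlapping tokens.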
Use with LLMs
TODO - docs
Validation
LinkML-Store is backed by LinkML, which allows for powerful expressive structural and semantic constraints.
See Indexing JSON
Web API
There is a preliminary API following HATEOAS principles implemented using FastAPI.
To start, you should first create a config file, e.g. db/conf.yaml.
Then run:
export LINKML_STORE_CONFIG=./db/conf.yaml
make api
The API returns links as well as data objects, so a JSON-viewing browser plugin (e.g. for Chrome) is recommended for exploring it. TODO: add docs here.
The main endpoints are:
- http://localhost:8000/ - the root of the API
- http://localhost:8000/pages/ - browse the API via HTML
- http://localhost:8000/docs - the Swagger UI
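A HATEOAS-style client discovers where to go next from the links embedded in each response rather than from hard-coded paths. The sketch below shows that pattern on a made-up payload. The response shape (a `links` list of objects with `rel` and `href` keys) and the sample values are assumptions for illustration only; check the Swagger UI for the actual response format.

```python
from typing import Any, Dict
from urllib.parse import urljoin


def resolve_links(base_url: str, payload: Dict[str, Any]) -> Dict[str, str]:
    """Map each link relation in a response to an absolute URL.

    Assumes (hypothetically) that links look like {"rel": ..., "href": ...}.
    """
    resolved: Dict[str, str] = {}
    for link in payload.get("links", []):
        rel, href = link.get("rel"), link.get("href")
        if rel and href:
            resolved[rel] = urljoin(base_url, href)
    return resolved


# Hypothetical response body, not captured from a running server
sample_response = {
    "data": {"name": "root"},
    "links": [
        {"rel": "pages", "href": "/pages/"},
        {"rel": "docs", "href": "docs"},
    ],
}
next_urls = resolve_links("http://localhost:8000/", sample_response)
```

A client would then fetch one of the resolved URLs and repeat the process on the next response, which keeps it decoupled from the server's URL layout.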
Streamlit app
make app
Background
See these slides for more details