Skip to main content

A kvcache memory pool

Project description

Run pre-commit checks Slack Docs

What's InfiniStore

InfiniStore is an open-source high-performance KV store. It's designed to support LLM Inference clusters, whether the cluster is in prefill-decoding disaggregation mode or not. InfiniStore provides high-performance and low-latency KV cache transfer and KV cache reuse among inference nodes in the cluster.

In addition to inference clusters, InfiniStore can also be used as a standalone KV store to integrate with any other LLM training or inference services.

Usage

There are two major scenarios how InfiniStore supports :

  • Prefill-Decoding disaggregation clusters: in such mode inference workloads are separated into two node pools: prefill nodes and decoding nodes. InfiniStore enables KV cache transfer among these two types of nodes, and also KV cache reuse.
  • Non-disaggregated clusters: in such mode prefill and decoding workloads are mixed on every node. Infinistore serves as an extra large KV cache pool in addition to GPU cache and local CPU cache, and also enables cross-node KV cache reuse.

InfiniStore Usage

Currently InfiniStore has been integrated with vLLM. The integration is done via LMCache for the flexibility purpose.

Integration with SGLang and other inference engines are in progress.

Installation

Install from PIP

Most users just need to deploy and run InfiniStore, and they don't need to understand how InfiniStore works internally. For these users, PIP is the recommended way to install:

pip install InfiniStore

Install from Source Code

For users who need to understand how InfiniStore code works or make code contributions to InfiniStore, it's recommended to install from source code:

apt install libuv1-dev
apt install libflatbuffers-dev
apt install libspdlog-dev libfmt-dev
apt install ibverbs-utils libibverbs-dev
apt install libboost-dev libboost-stacktrace-dev
pip install -e .
pip install pre-commit
pre-commit install

Verify Your Installation

After installation, either from PIP or from source code, run the following command to verify your installation is successful:

InfiniStore --manage-port 8088
curl http://127.0.0.1:8088/selftest

Run InfiniStore

Run As a Standalone Service

  1. Start InfiniStore Server

The first step is to start an InfiniStore server. The server can be running on a GPU machine or a CPU machine.

Your server machine may be equipped with TCP network or RDMA network. The command line to start a server varies depending on the network configurations:

For TCP/IP Network:

InfiniStore --service-port 12345

For RDMA(RoCE):

InfiniStore --service-port 12345 --dev-name mlx5_0 --link-type Ethernet

For RDMA(Infiniband):

InfiniStore --service-port 12345 --dev-name mlx5_0 --link-type IB
  1. Run InfiniStore Client

Check the following example code to run an InfiniStore client:

  • InfiniStore/example/client.py
  • InfiniStore/example/client_async.py
  • InfiniStore/example/client_async_single.py

Run Within a vLLM Cluster

As illustrated in the previous section, InfiniStore enables different functionalities in a vLLM cluster: KV cache transfer between prefill nodes and decoding nodes, extended KV cache pool, cross-node KV cache reuse, etc.

The setup will varies depending on the specific vLLM cluster configurations. But usually it requires the following installations:

  • Install vLLM on all nodes
  • Install LMCache on all nodes
  • Install InfiniStore on all nodes

Because this setup is a complicated process, we've made a separate demo repo for the PD disaggregation setup.

Please refer to the repo for the details: https://github.com/bytedance-iaas/splitwise-demos

Contribute to InfiniStore

InfiniStore is an open-source project and community. We welcome anyone who is interested in helping improve InfiniStore, whether code contributions, document contributions or any other contributions.

If you are submitting a code change, run the following unit test and pre-commit check to ensure your code change doesn't break existing features before submitting the PR:

  1. Make your code changes

    Just clone this repo, make code changes according to your feature design.

  2. Run Unit Tests

pytest InfiniStore/test_InfiniStore.py
  1. Run Pre-commit Checks
pre-commit run --all-files
  1. Submit PR

If you code change passes both unit tests and pre-commit checks, submit the PR.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

infinistore-0.2.35-cp312-cp312-manylinux_2_28_x86_64.whl (9.4 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ x86-64

infinistore-0.2.35-cp311-cp311-manylinux_2_28_x86_64.whl (9.4 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

infinistore-0.2.35-cp310-cp310-manylinux_2_28_x86_64.whl (9.4 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

File details

Details for the file infinistore-0.2.35-cp312-cp312-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for infinistore-0.2.35-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0d8fcc3b2a3fd9d9bd47ab1371af5cb86a2861f839127a0d5b1ca0dc97c49a9c
MD5 bc255d671df94d443720c2059da273a0
BLAKE2b-256 ff0f8f91c150be8cd36d8d95469061ba375017104b1e5f67bf4da617ff090f91

See more details on using hashes here.

File details

Details for the file infinistore-0.2.35-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for infinistore-0.2.35-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 7ae67278c281829e3054015b7ff5d9a54c7c868e785b981a8655aede606c0e41
MD5 141b5072b8a4f365ccfd989f3161087b
BLAKE2b-256 dee24c493482abe8e96c9f69b7983a0eb66723eaaaac1e042cc21e4171c29334

See more details on using hashes here.

File details

Details for the file infinistore-0.2.35-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for infinistore-0.2.35-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 d73e53270bea82251e4e36484cb1664db9fbe218cd5f780072b0a56e22549cde
MD5 b7deaedb6c902f046b50587f0de06eb4
BLAKE2b-256 668289afb55078e18bda5164ae6418ac1e07db1f359442951a8e3a9519c00b65

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page