Easily distribute language models across multiple systems

Project description

Language Pipes (Beta)

A privacy-focused distributed algorithm for LLM inference

Language Pipes is an open-source distributed network application designed to increase access to local language models by enabling privacy-protected computation between peer-to-peer nodes.

Disclaimer: This software is currently in Beta. Please be patient, and if you encounter an error, please open a GitHub issue!


Features

  • Quick Setup
  • OpenAI compatible API
  • Privacy-focused architecture
  • Decentralized peer-to-peer network
  • Download and use models by Hugging Face ID

What Does Language Pipes Do?

Large language models work by passing information through many layers. At each layer, several matrix multiplications between the layer weights and the system state are performed, and the result is passed to the next layer. Language Pipes hosts different layers on different machines to split the RAM cost across the system. This project contrasts with existing programs like vLLM by focusing on decentralization and privacy.
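
The idea of splitting layers across machines can be illustrated with a toy sketch. This is not the Language Pipes implementation: the Node class, the matvec helper, and the 2x2 "layers" are hypothetical stand-ins, and real transformer layers also include attention and nonlinearities. The sketch only shows that running the layers on one machine and running consecutive slices of them on two machines produce the same result.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, state: Vector) -> Vector:
    """Multiply a weight matrix by the hidden-state vector."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


class Node:
    """Hypothetical stand-in for one machine hosting a consecutive slice of layers."""

    def __init__(self, name: str, layers: List[Matrix]) -> None:
        self.name = name
        self.layers = layers

    def forward(self, state: Vector) -> Vector:
        # Apply only the layers this node holds, then return the state.
        for weights in self.layers:
            state = matvec(weights, state)
        return state


def run_pipeline(nodes: List[Node], state: Vector) -> Vector:
    """Pass the hidden state through every node in order."""
    for node in nodes:
        # In a real deployment this hop would cross the network.
        state = node.forward(state)
    return state


# Four toy 2x2 "layers" (illustrative values only).
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[1.0, 1.0], [0.0, 1.0]],
]

single = run_pipeline([Node("all", layers)], [1.0, 3.0])
split = run_pipeline(
    [Node("node-1", layers[:1]), Node("node-2", layers[1:])], [1.0, 3.0]
)
print(single == split)
```

Each node only needs enough RAM for its own slice of the weights; only the small hidden-state vector travels between nodes.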

Installation

Ensure that you have Python 3.10 installed (for example 3.10.18; any 3.10 release works). pyenv is an easy-to-use Python version manager. Python 3.10 is necessary for the transformers library to work properly.

If you need GPU support, first make sure you have the correct PyTorch version installed for your GPU's CUDA compatibility using this link:
https://pytorch.org/get-started/locally/

To download models from Hugging Face, ensure that you have Git and Git LFS installed.

To start using the application, install the latest version of the package from PyPI.

Using pip:

pip install language-pipes
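
Before installing, it can help to confirm the prerequisites listed above. The following is a minimal sketch using only the Python standard library; check_prerequisites is a hypothetical helper, not part of Language Pipes, and it only checks that the executables are on PATH, not that they work.

```python
import shutil
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report whether the prerequisites from the installation notes appear present."""
    return {
        # The installation notes ask for a Python 3.10 release.
        "python_3_10": sys.version_info[:2] == (3, 10),
        # Git must be on PATH to download models.
        "git": shutil.which("git") is not None,
        # Git LFS installs a `git-lfs` executable that Git runs as `git lfs`.
        "git_lfs": shutil.which("git-lfs") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```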

Quick Start

The easiest way to get started is with the interactive setup wizard:

language-pipes

This launches a menu where you can create, view, and load configurations. Select Create Config to walk through the setup wizard, which guides you through your first configuration. After creating a config, select Load Config to start the server.

We also support loading TOML files directly!
If you need help loading them, see the CLI documentation.


Two Node Example

This example shows how to distribute a model across two computers using the interactive wizard.

Node 1 (First Computer)

Start Language Pipes:

language-pipes

Enter the following values at the wizard prompts:

Prompt                        Value            Description
Node ID                       node-1           Unique identifier for this node on the network
Model ID                      Qwen/Qwen3-1.7B  HuggingFace model to load
Device                        cpu              Hardware to run inference on
Max memory                    1                GB of RAM to use (loads part of the model)
Load embedding/output layers  Y                Required for the first node to handle input/output
Enable OpenAI API             Y                Exposes the OpenAI-compatible endpoint
API port                      8000             Port for the API server
First node in network         Y                This node starts the network
Encrypt network traffic       N                Disable encryption for simplicity

Node 2 (Second Computer)

Start Language Pipes with this command:

language-pipes

Enter the following values at the wizard prompts:

Prompt                        Value            Description
Node ID                       node-2           Unique identifier for this node on the network
Model ID                      Qwen/Qwen3-1.7B  Must match the model on node-1
Device                        cpu              Hardware to run inference on
Max memory                    3                GB of RAM to use (loads remaining layers)
Load embedding/output layers  N                Node-1 already handles these
Enable OpenAI API             N                Only node-1 needs the API
First node in network         N                This node joins an existing network
Bootstrap node IP             192.168.0.10     Node-1's local IP address
Bootstrap port                5000             Node-1's network port
Encrypt network traffic       N                Must match node-1's setting

Node-2 connects to node-1 and loads the remaining model layers. The model is now ready for inference!
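
To make the effect of the Max memory prompt concrete, here is a rough sketch of how a memory budget could translate into a split of layers between the two nodes. It is an illustration under stated assumptions, not the project's actual placement logic: assign_layers is a hypothetical helper, the layer count (28) and per-layer size (100 MB) are made-up numbers, and the memory used by the embedding/output layers on node-1 is ignored.

```python
from typing import Dict, List, Tuple


def assign_layers(
    num_layers: int, layer_mb: int, budgets_mb: List[Tuple[str, int]]
) -> Dict[str, range]:
    """Greedily give each node as many consecutive layers as its budget allows."""
    assignment: Dict[str, range] = {}
    start = 0
    for node_id, budget in budgets_mb:
        fit = budget // layer_mb  # whole layers that fit in this node's budget
        end = min(start + fit, num_layers)
        assignment[node_id] = range(start, end)
        start = end
    if start < num_layers:
        raise ValueError(f"{num_layers - start} layers do not fit in the given budgets")
    return assignment


# Hypothetical sizes; the budgets mirror the 1 GB and 3 GB values from the example.
plan = assign_layers(28, 100, [("node-1", 1024), ("node-2", 3072)])
for node_id, layer_range in plan.items():
    print(node_id, list(layer_range))
```

With these made-up numbers, node-1 holds the first ten layers and node-2 holds the rest, which matches the "loads part of the model" and "loads remaining layers" descriptions in the example.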

Test the API

The model is accessible via an OpenAI-compatible API. Using the OpenAI Python library:

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",  # node-1 IP address
    api_key="not-needed"  # API key not required for Language Pipes
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-1.7B",
    max_completion_tokens=100,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about distributed systems."}
    ]
)

print(response.choices[0].message.content)

Install the OpenAI library with: pip install openai

To learn more about working with the OpenAI-compatible server, see the project documentation.
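
If you would rather not install the OpenAI library, the same endpoint can be called with the Python standard library. This is a minimal sketch: build_chat_request and send_chat_request are hypothetical helper names, and the response parsing assumes the server returns the standard chat completions shape, consistent with the client example above.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_chat_request(
    base_url: str,
    model: str,
    messages: List[Dict[str, str]],
    max_completion_tokens: int = 100,
) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": messages,
        "max_completion_tokens": max_completion_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the first choice's text (needs a running node-1)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body: Dict[str, Any] = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://127.0.0.1:8000/v1",  # node-1 address and API port from the example
    "Qwen/Qwen3-1.7B",
    [{"role": "user", "content": "Write a haiku about distributed systems."}],
)
# print(send_chat_request(request))  # uncomment once both nodes are running
```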

Model choice

Currently, Language Pipes targets the Qwen3 and Qwen3-MoE architectures.

Future Updates

There are plans to update the project in the future if it gets enough traction. These improvements include:

  • More models supported
  • 8-bit and 4-bit quantization support (currently everything is run in fp16)
  • GGUF support (currently everything needs to be in safetensors format)
  • Responses endpoint (currently only /v1/chat/completions is supported)
  • Hugging Face library support for downloading models that require authentication (currently Git LFS is used)

So please star the repo if you find it useful :)

Dependencies

Documentation

Project details


Download files

Source Distribution

language_pipes-0.19.2.tar.gz (89.6 kB)

Uploaded Source

Built Distribution

language_pipes-0.19.2-py3-none-any.whl (53.0 kB)

Uploaded Python 3

File details

Details for the file language_pipes-0.19.2.tar.gz.

File metadata

  • Download URL: language_pipes-0.19.2.tar.gz
  • Upload date:
  • Size: 89.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for language_pipes-0.19.2.tar.gz
Algorithm    Hash digest
SHA256       fa36b95a76637fdbbcf979602d2893adb90e735b59a924c8a8d92780dd0ebce5
MD5          a2925dc87e050f7e69b5350be748735c
BLAKE2b-256  e7255a05c49873416ef7a98783fcfe249c1d5395d8d8d9251c9b44df2317e1ca
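
A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a minimal sketch: sha256_of and matches_published_hash are hypothetical helper names, and the file path in the usage note is wherever you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash(Path("language_pipes-0.19.2.tar.gz"), "fa36b95a76637fdbbcf979602d2893adb90e735b59a924c8a8d92780dd0ebce5") should return True for an intact copy of the source distribution.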

Provenance

The following attestation bundles were made for language_pipes-0.19.2.tar.gz:

Publisher: publish.yml on erinclemmer/language-pipes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file language_pipes-0.19.2-py3-none-any.whl.

File metadata

File hashes

Hashes for language_pipes-0.19.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       7ea31f7fcb67c7358f64ea0a85e6511fca5384af2c4dd64f1c6519eaea36d9e5
MD5          f87aca2c0d743340806095eeacb5b1da
BLAKE2b-256  22e9cd3293d94b229d68b26e4f147bf0660919709a86ef5ab4af384063e86ce4

Provenance

The following attestation bundles were made for language_pipes-0.19.2-py3-none-any.whl:

Publisher: publish.yml on erinclemmer/language-pipes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
