Continuously learning memory layer for LLM applications: signals, stacks, decay, two-tier retrieval.
Street AI
Continuously learning memory layer for LLM applications. Your AI's memory grows forever. Your token bill doesn't.
Street AI sits between your application and the LLM API. It stores conversation as signals organized into stacks, decays old data automatically, and retrieves only what's relevant on each turn — so you send a tiny prompt instead of the full conversation history.
Status
Alpha (0.1.0). API will change. Pin a version if you depend on it.
Install
pip install streetai-memory
The PyPI name is streetai-memory; the import path is streetai:
from streetai import Memory, MemoryRegistry, Config
First use downloads a ~25MB embedding model (all-MiniLM-L6-v2) into a local cache.
To install with provider adapters:
pip install "streetai-memory[anthropic]" # Anthropic
pip install "streetai-memory[openai]" # OpenAI (also DeepSeek, Together, Groq)
pip install "streetai-memory[gemini]" # Google Gemini
pip install "streetai-memory[all]" # all of the above
Quickstart
from streetai import MemoryRegistry
registry = MemoryRegistry("./memory.db")
mem = registry.get("user_123")
mem.add_message("Hi, I'm planning a trip to Japan.", role="user")
mem.add_message("Great! Which cities?", role="assistant")
prompt = mem.build_prompt("What did I say about Japan?")
# prompt.messages -> list of {role, content} ready for any LLM API
# prompt.retrieved -> signals that were pulled in (pass to post_process)
# prompt.inspector -> debug info (stacks activated, scores, etc.)
# After your LLM responds:
# response_text = your_llm(messages=prompt.messages)
# mem.post_process(prompt.retrieved, response_text)
# mem.add_message("What did I say about Japan?", role="user")
# mem.add_message(response_text, role="assistant")
For a fully runnable version, see examples/quickstart.py.
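The commented steps above can be folded into a single helper. This is a minimal sketch that assumes only the calls shown in the quickstart (build_prompt, post_process, add_message); chat_turn and the llm callable are hypothetical names for illustration, not part of the streetai API.

```python
from typing import Callable, Dict, List


def chat_turn(mem, user_text: str, llm: Callable[[List[Dict[str, str]]], str]) -> str:
    """Run one conversation turn against a Street AI memory object.

    `mem` is any object exposing build_prompt / post_process / add_message
    as in the quickstart. `llm` is a hypothetical callable that takes a list
    of {role, content} messages and returns the reply text.
    """
    # Build the small prompt: retrieved context plus the recent turns.
    prompt = mem.build_prompt(user_text)

    # Call whichever provider you use.
    response_text = llm(prompt.messages)

    # Feed the outcome back so the retrieved signals can be boosted or demoted.
    mem.post_process(prompt.retrieved, response_text)

    # Store the turn itself, in the same order as the quickstart.
    mem.add_message(user_text, role="user")
    mem.add_message(response_text, role="assistant")
    return response_text
```

With a real memory this would be called as chat_turn(registry.get("user_123"), "What did I say about Japan?", your_llm), where your_llm wraps your provider client.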
Drop-in adapters
The adapters wrap a real provider client. You use the same SDK API you already know; memory is read and written transparently on every call.
Anthropic
from anthropic import Anthropic
from streetai.adapters.anthropic import with_memory
client = with_memory(Anthropic(), memory_id="user_123")
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are helpful.",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
print(response.content[0].text)
Full example: examples/anthropic_chat.py.
OpenAI
from openai import OpenAI
from streetai.adapters.openai import with_memory
client = with_memory(OpenAI(), memory_id="user_123")
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
print(response.choices[0].message.content)
Full example: examples/openai_chat.py.
DeepSeek (uses the OpenAI adapter)
DeepSeek is OpenAI-API-compatible. Use the OpenAI adapter with base_url:
import os
from openai import OpenAI
from streetai.adapters.openai import with_memory
deepseek = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com/v1",
)
client = with_memory(deepseek, memory_id="user_123")
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
The same pattern works for Together, Anyscale, Groq, and any other
OpenAI-compatible endpoint. Full example: examples/deepseek_chat.py.
Google Gemini
from google import genai
from streetai.adapters.gemini import with_memory
client = with_memory(genai.Client(api_key="..."), memory_id="user_123")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What did I mention earlier?",
)
print(response.text)
Full example: examples/gemini_chat.py.
How it works
your message
|
v
[1] split into chunks (sentence-sized signals)
|
v
[2] embed each chunk to a 384-dim vector
|
v
[3] assign to a stack (cluster of related signals) by cosine similarity
|
v
[4] when a new query arrives:
    - find top-K most relevant stacks (FAISS)
    - within those stacks, surface signals that pass the activation threshold
    - drop signals whose effective weight has decayed below the death threshold
|
v
[5] build a small prompt:
    [retrieved context] + [last N messages verbatim] + [new query]
|
v
[6] after the LLM responds:
    - boost signals that matched the response (they helped)
    - demote signals that didn't (they were noise)
    - decay continues until the signal is used again
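Step [3] above can be sketched in a few lines. This is an illustration, not the library's implementation: it assumes each stack is represented by the running mean of its members (a centroid), uses toy 3-dimensional vectors in place of 384-dimensional embeddings, and the names Stack and assign are hypothetical. The 0.55 threshold mirrors the stack_threshold default quoted under Configuration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Stack:
    """A cluster of related signal vectors, summarized by their mean."""

    def __init__(self, first_vector: Sequence[float]) -> None:
        self.members: List[List[float]] = [list(first_vector)]
        self.centroid: List[float] = list(first_vector)

    def add(self, vector: Sequence[float]) -> None:
        self.members.append(list(vector))
        n = len(self.members)
        # Incremental mean update: c_new = c_old + (v - c_old) / n
        self.centroid = [c + (v - c) / n for c, v in zip(self.centroid, vector)]


def assign(stacks: List[Stack], vector: Sequence[float], threshold: float = 0.55) -> int:
    """Add `vector` to the most similar stack, or open a new one.

    Returns the index of the stack that received the vector.
    """
    best_index, best_score = -1, -1.0
    for i, stack in enumerate(stacks):
        score = cosine(stack.centroid, vector)
        if score > best_score:
            best_index, best_score = i, score

    if best_index >= 0 and best_score >= threshold:
        stacks[best_index].add(vector)
        return best_index

    # Nothing is similar enough: start a new stack with this signal.
    stacks.append(Stack(vector))
    return len(stacks) - 1
```

Raising the threshold produces more, tighter stacks; lowering it merges loosely related signals into fewer stacks.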
Signals refresh their age clock every time they're retrieved — frequently useful data stays sharp; unused data fades. No retraining, no manual pruning.
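The decay-and-refresh behaviour can be sketched as follows. This assumes plain exponential decay of a signal's weight over the time since it was last retrieved; the README does not state the actual formula. The Signal class, the 0.05 death threshold, and the 1.2 / 0.8 boost and demote factors are hypothetical values chosen for illustration, and relevance scoring is left out so that only the time dimension is shown.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Signal:
    text: str
    weight: float = 1.0      # base weight, adjusted by boost/demote
    last_used: float = 0.0   # timestamp (seconds) of the last retrieval

    def effective_weight(self, now: float, decay_rate: float) -> float:
        """Base weight discounted by the time since the last retrieval."""
        age = max(0.0, now - self.last_used)
        return self.weight * math.exp(-decay_rate * age)


def retrieve(signals: List[Signal], now: float, decay_rate: float,
             death: float = 0.05) -> List[Signal]:
    """Return signals still above the death threshold and reset their age clock."""
    alive = [s for s in signals if s.effective_weight(now, decay_rate) >= death]
    for s in alive:
        s.last_used = now  # being retrieved refreshes the signal
    return alive


def feedback(signal: Signal, helped: bool,
             boost: float = 1.2, demote: float = 0.8) -> None:
    """Raise the base weight of a signal that helped, lower it otherwise."""
    signal.weight *= boost if helped else demote
```

In this sketch a signal retrieved once a day never drops below the threshold, while one left untouched for a few days does, which is the "frequently useful data stays sharp" behaviour described above.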
Compared to alternatives
| | Plain chat history | RAG (vector DB) | Street AI |
|---|---|---|---|
| Prompt grows with conversation | Yes — linear | No (replaces history) | No (compresses history) |
| Recent context kept verbatim | Yes | No — replaced by retrieval | Yes — recency window |
| Time-aware (decay) | No | No | Yes — built in |
| Learns from outcomes | No | No | Yes — boost/demote |
| Self-organizing | N/A | Manual chunking | Yes — auto-stacks |
| Cross-provider | Yes | Sometimes | Yes |
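The first two rows of the table come down to how the prompt is assembled. The sketch below shows the shape described in step [5] of "How it works": retrieved context, then a fixed recency window, then the new query. How Street AI actually formats the retrieved context is not stated in this README, so the system-message layout and the assemble_prompt name are assumptions.

```python
from typing import Dict, List

Message = Dict[str, str]


def assemble_prompt(history: List[Message], retrieved: List[str], query: str,
                    recency_turns: int = 3) -> List[Message]:
    """Build a bounded prompt: retrieved context + last N messages + new query."""
    messages: List[Message] = []

    # Retrieved signals go in as one context message (assumed layout).
    if retrieved:
        context = "\n".join(f"- {text}" for text in retrieved)
        messages.append({"role": "system", "content": "Relevant memory:\n" + context})

    # Keep only the most recent messages verbatim.
    # (history[-0:] would return everything, so zero is handled explicitly.)
    recent = history[-recency_turns:] if recency_turns > 0 else []
    messages.extend(recent)

    messages.append({"role": "user", "content": query})
    return messages
```

With 100 stored messages and the default window of 3, the prompt contains five messages; plain chat history would send all 100 plus the query.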
Configuration
Override defaults with Config:
from streetai import MemoryRegistry, Config
cfg = Config(
    recency_turns=5,           # last 5 messages verbatim (default 3)
    decay_rate=1.0/86400,      # 1-day half-life (default ~ 1 week)
    stack_threshold=0.65,      # tighter stack assignment (default 0.55)
    activation_threshold=0.1,  # min score for a signal to surface (default 0.15)
)
registry = MemoryRegistry("./memory.db", config=cfg)
All tunables: see streetai/config.py.
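How a decay_rate value maps to a half-life depends on the decay formula, which this README does not spell out. The helpers below assume plain exponential decay, weight(t) = exp(-decay_rate * t) with t in seconds; both function names are hypothetical. Under that assumption a rate of 1.0/86400 gives a time constant of one day and a half-life of about 0.69 days, so check streetai/config.py for the definition the library actually uses.

```python
import math

SECONDS_PER_DAY = 86400.0


def rate_from_half_life(half_life_seconds: float) -> float:
    """Decay rate (per second) at which a weight halves every `half_life_seconds`."""
    return math.log(2) / half_life_seconds


def half_life_from_rate(decay_rate: float) -> float:
    """Half-life in seconds implied by an exponential decay rate (per second)."""
    return math.log(2) / decay_rate


# Half-life, in days, implied by the rate used in the Config example above.
example_half_life_days = half_life_from_rate(1.0 / SECONDS_PER_DAY) / SECONDS_PER_DAY

# Rate that would give an exact one-week half-life under this assumption.
one_week_rate = rate_from_half_life(7 * SECONDS_PER_DAY)
```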
Limitations (v0.1)
- Sync clients only. Async wrappers come later.
- Non-streaming only. stream=True raises NotImplementedError.
- English-tuned defaults. Chunking and thresholds may need tuning for other languages.
- fastembed is required. Pluggable encoders come in a future version.
Development
git clone https://github.com/Tem-Degu/streetai-memory.git
cd streetai-memory
pip install -e ".[dev]"
pytest
License
Apache 2.0