Personal feed radar built on clawsqlite interest clusters

clawfeedradar

Personal "reading radar" built on top of clawsqlite: it pulls articles from Hacker News and generic RSS feeds (arXiv and other sources can be added later), ranks them against your existing knowledge base, and generates a personalized RSS feed with optional bilingual summaries.

Goal: simple, controllable, auditable, and loosely coupled with clawsqlite. This document describes the current v1 implementation (branch bot/20260402-embedding).

When running inside OpenClaw

clawfeedradar expects an existing clawsqlite knowledge base with interest clusters.

In an OpenClaw workspace, the recommended way to set this up is:

1. Install the clawsqlite-knowledge skill (if not already installed):
```
openclaw skills add clawsqlite-knowledge
```
   Or via the web catalog: https://clawhub.ai/skills/clawsqlite-knowledge
2. Initialize and build interest clusters using that skill (see the skill README for the exact commands).

Once clawsqlite-knowledge is installed and has built interest_clusters, clawfeedradar can attach to the same DB and reuse the interest space for scoring.

Overview

clawfeedradar does three things:

1. Fetch candidates from external sources. Currently HN RSS and generic RSS are supported; arXiv and others can be added later.
2. Score candidates using clawsqlite interest clusters. It uses embeddings and interest clusters to estimate how well each candidate matches your long-term interests, then applies a light recency / popularity bias.
3. Generate an RSS feed with optional bilingual content. For selected items it scrapes the full text, calls a small LLM to produce a preview summary and a bilingual body, and writes an XML + JSON pair.

In short: clawsqlite knows what you like; clawfeedradar goes out, finds similar content, and feeds it to your RSS reader.
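As a rough illustration of the interest-matching step, the sketch below scores a candidate embedding against a set of cluster centroids and reports the closest cluster. The function names, the dict-of-centroids layout, and the choice of cosine similarity with a maximum over clusters are assumptions made for this example; the actual scoring code in clawfeedradar and clawsqlite may work differently.

```python
import math
from typing import Mapping, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_cluster(
    candidate: Sequence[float],
    centroids: Mapping[int, Sequence[float]],
) -> Tuple[int, float]:
    """Return (cluster_id, similarity) for the centroid closest to the candidate.

    `centroids` maps a hypothetical cluster id to its centroid vector.
    """
    if not centroids:
        raise ValueError("no interest clusters available")
    scored = ((cid, cosine(candidate, vec)) for cid, vec in centroids.items())
    return max(scored, key=lambda pair: pair[1])
```

A candidate whose embedding points mostly along one centroid is assigned to that cluster, and the similarity can then serve as a raw interest score before any recency or popularity adjustment.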

Quick start

1. Prepare environment

Assume a workspace like:

~/.openclaw/workspace/
  ├── clawsqlite          # clawsqlite repo
  ├── knowledge_data      # your clawsqlite-knowledge DB
  ├── clawfeedradar       # this repo
  └── clawfetch / ...     # fulltext scraper + wrapper

Make sure clawsqlite has built interest clusters:

cd ~/.openclaw/workspace/clawsqlite
clawsqlite knowledge build-interest-clusters \
  --root ~/.openclaw/workspace/knowledge_data

2. Configure clawfeedradar

cd ~/.openclaw/workspace/clawfeedradar
cp ENV.example .env
# edit .env for your local setup

You need to configure at least:

clawsqlite knowledge base
- CLAWSQLITE_ROOT pointing to knowledge_data
- CLAWSQLITE_DB if you want an explicit sqlite path
Embedding service (shared with clawsqlite)
- EMBEDDING_BASE_URL / EMBEDDING_MODEL / EMBEDDING_API_KEY
- CLAWSQLITE_VEC_DIM must match the embedding model dimension
Output directory
- CLAWFEEDRADAR_OUTPUT_DIR for XML/JSON output
Fulltext scraping
- CLAWFEEDRADAR_SCRAPE_CMD pointing to a wrapper that calls clawfetch
- CLAWFEEDRADAR_SCRAPE_WORKERS controlling parallelism (per-host serialization still applies)
Small LLM (summaries + bilingual body)
- SMALL_LLM_BASE_URL / SMALL_LLM_MODEL / SMALL_LLM_API_KEY
- CLAWFEEDRADAR_LLM_CONTEXT_TOKENS (approx token budget, internally converted to char budget)
- CLAWFEEDRADAR_LLM_MAX_PARAGRAPH_CHARS (screen-sized bilingual segments)
- CLAWFEEDRADAR_LLM_SLEEP_BETWEEN_MS
- CLAWFEEDRADAR_LLM_SOURCE_LANG / CLAWFEEDRADAR_LLM_TARGET_LANG
Scoring weights
- CLAWFEEDRADAR_W_RECENCY
- CLAWFEEDRADAR_W_POPULARITY
- CLAWFEEDRADAR_RECENCY_HALF_LIFE_DAYS
- CLAWFEEDRADAR_INTEREST_SIGMOID_K (steepness of S-shaped stretching around 0.5)
Default item count
- CLAWFEEDRADAR_MAX_ITEMS as the default when --max-items is omitted
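To make the scoring weights above more concrete, here is a minimal sketch of how they might interact. It assumes a logistic curve for the S-shaped stretch around 0.5, an exponential half-life decay for recency, and a simple additive combination; none of these formulas are confirmed by this README, and the function names are hypothetical.

```python
import math


def stretch(raw: float, k: float) -> float:
    """S-shaped stretch centered on 0.5; larger k gives a steeper curve.

    Assumed to mirror CLAWFEEDRADAR_INTEREST_SIGMOID_K.
    """
    return 1.0 / (1.0 + math.exp(-k * (raw - 0.5)))


def recency(age_days: float, half_life_days: float) -> float:
    """Exponential decay: 1.0 for a new item, 0.5 after one half-life.

    Assumed to mirror CLAWFEEDRADAR_RECENCY_HALF_LIFE_DAYS.
    """
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def final_score(
    interest_raw: float,
    age_days: float,
    popularity: float,
    *,
    k: float,
    w_recency: float,
    w_popularity: float,
    half_life_days: float,
) -> float:
    """Stretched interest plus weighted recency and popularity (assumed additive)."""
    interest = stretch(interest_raw, k)
    return (
        interest
        + w_recency * recency(age_days, half_life_days)
        + w_popularity * popularity
    )
```

Under these assumptions, keeping the recency and popularity weights small relative to 1.0 lets the interest term dominate, which matches the "light bias" described in the overview.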

3. Run a single feed (debug)

cd ~/.openclaw/workspace/clawfeedradar
python -m clawfeedradar.cli run \
  --root ~/.openclaw/workspace/knowledge_data \
  --url https://feeds.bbci.co.uk/news/technology/rss.xml \
  --output ./feeds/bbc-tech.xml \
  --max-source-items 15 \
  --score-threshold 0.4 \
  --max-items 12 \
  --source-lang en \
  --target-lang zh

This produces:

./feeds/bbc-tech.xml
RSS feed where <description> combines:
- summary_preview (short summary in target language) and
- body_bilingual (full bilingual body, screen-sized segments).
./feeds/bbc-tech.json
Sidecar JSON containing fulltext, summary_preview, body_bilingual, and scoring details including:
- interest_score (S-shaped stretched score)
- interest_score_raw (linear interest score)
- final_score
- best_cluster_id / best_cluster_weight
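The "screen-sized segments" of body_bilingual are bounded by CLAWFEEDRADAR_LLM_MAX_PARAGRAPH_CHARS. One plausible way to produce such segments is a greedy splitter that packs paragraphs up to the limit and hard-splits any paragraph that is too long on its own. The sketch below is an assumption about the general approach, not the project's actual segmentation code, and `split_segments` is a hypothetical name.

```python
def split_segments(text: str, max_chars: int) -> list[str]:
    """Pack blank-line-separated paragraphs into segments of at most max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    segments: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split a paragraph that cannot fit into any single segment.
        while len(para) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(para[:max_chars])
            para = para[max_chars:]
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The paragraph does not fit next to the current segment: start a new one.
            segments.append(current)
            current = para
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be sent to the small LLM on its own, which keeps every request within a predictable size.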

Publishing feeds via Git (GitHub Pages / Gitee Pages)

clawfeedradar can optionally push the generated XML/JSON to a git repository, so that GitHub Pages or Gitee Pages can host your feed at a stable HTTPS URL.

1) GitHub Pages example

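Create a public repository, for example github.com/yourname/clawfeedradar-feed, and enable Pages under Settings → Pages with:
- Branch: gh-pages
- Folder: / (root). GitHub Pages only offers / (root) or /docs for branch deployments; the example below writes files into a feeds/ subdirectory of the root.
On the machine that runs clawfeedradar, configure git access to GitHub (SSH key or HTTPS + PAT).

Add the following to .env: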

CLAWFEEDRADAR_PUBLISH_GIT_REPO=git@github.com:yourname/clawfeedradar-feed.git
CLAWFEEDRADAR_PUBLISH_GIT_BRANCH=gh-pages
CLAWFEEDRADAR_PUBLISH_GIT_PATH=feeds

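After that, on every clawfeedradar run / schedule, clawfeedradar:
- maintains a local clone under ./.publish/yourname-clawfeedradar-feed/;
- copies the generated *.xml / *.json files into the feeds/ directory of that clone;
- runs git add / git commit / git push automatically.

The resulting subscription URL looks like: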

https://yourname.github.io/clawfeedradar-feed/feeds/bbc-tech.xml
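The local clone directory and the subscription URL in this example both follow from the configured remote. The sketch below derives them from an SSH or HTTPS remote string. The helper names are hypothetical, and the naming pattern is inferred only from the example values in this section, so treat it as an illustration and not as the tool's actual logic.

```python
import re
from typing import Tuple


def parse_repo(remote: str) -> Tuple[str, str]:
    """Extract (owner, repo) from a remote such as git@host:owner/repo.git."""
    match = re.search(r"[:/]([^/:]+)/([^/]+?)(?:\.git)?$", remote)
    if match is None:
        raise ValueError(f"unrecognized git remote: {remote!r}")
    return match.group(1), match.group(2)


def clone_dir_name(remote: str) -> str:
    """Directory name under ./.publish/, following the owner-repo pattern above."""
    owner, repo = parse_repo(remote)
    return f"{owner}-{repo}"


def pages_url(
    remote: str,
    publish_path: str,
    filename: str,
    pages_domain: str = "github.io",
) -> str:
    """Build the project Pages URL for a file published under publish_path."""
    owner, repo = parse_repo(remote)
    parts = [p for p in (publish_path.strip("/"), filename) if p]
    return f"https://{owner}.{pages_domain}/{repo}/" + "/".join(parts)
```

For the configuration shown above, `pages_url(remote, "feeds", "bbc-tech.xml")` yields the subscription URL given in this section, and passing `pages_domain="gitee.io"` covers the Gitee variant.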

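2) Gitee Pages example (better suited to networks in mainland China)

The steps mirror the GitHub ones, with Gitee as the remote:

Create a Gitee repository, for example gitee.com/yourname/clawfeedradar-feed, and enable Gitee Pages in its Pages settings (choose the matching branch and directory).
In the runtime environment, configure an SSH key that can access git@gitee.com:... .

Add the following to .env: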

CLAWFEEDRADAR_PUBLISH_GIT_REPO=git@gitee.com:yourname/clawfeedradar-feed.git
CLAWFEEDRADAR_PUBLISH_GIT_BRANCH=gh-pages
CLAWFEEDRADAR_PUBLISH_GIT_PATH=feeds

clawfeedradar then behaves exactly as in the GitHub case: after each XML/JSON generation it runs git add / commit / push against the Gitee repository.

The Gitee Pages subscription URL usually looks like:

https://yourname.gitee.io/clawfeedradar-feed/feeds/bbc-tech.xml

If CLAWFEEDRADAR_PUBLISH_GIT_REPO is not set, clawfeedradar only writes XML/JSON locally and never attempts to push to a remote repository.
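The publish flow described in this section (copy the generated files into a local clone, then add, commit, and push, skipping everything when no repository is configured) could be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the `publish` function is hypothetical, the clone is assumed to exist already, and the fallback values for branch and path are guesses.

```python
import os
import shutil
import subprocess
from pathlib import Path
from typing import Iterable, Mapping


def publish(
    files: Iterable[Path],
    clone_dir: Path,
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Copy files into an existing clone and push them.

    Returns False without touching git when no publish repo is configured,
    and True once the publish steps have run (even if nothing changed).
    """
    if not env.get("CLAWFEEDRADAR_PUBLISH_GIT_REPO"):
        return False  # local-only mode: nothing is pushed

    # Fallback values here are assumptions, not documented defaults.
    branch = env.get("CLAWFEEDRADAR_PUBLISH_GIT_BRANCH", "gh-pages")
    subdir = env.get("CLAWFEEDRADAR_PUBLISH_GIT_PATH", "").strip("/")

    target = clone_dir / subdir if subdir else clone_dir
    target.mkdir(parents=True, exist_ok=True)
    for src in files:
        shutil.copy2(src, target / Path(src).name)

    git = ["git", "-C", str(clone_dir)]
    subprocess.run(git + ["add", "--all", "--", subdir or "."], check=True)
    # `git diff --cached --quiet` exits with status 1 when changes are staged.
    staged = subprocess.run(git + ["diff", "--cached", "--quiet"]).returncode != 0
    if staged:
        subprocess.run(git + ["commit", "-m", "Update feeds"], check=True)
        subprocess.run(git + ["push", "origin", f"HEAD:{branch}"], check=True)
    return True
```

Checking for staged changes before committing avoids a failing `git commit` on runs where the feed content did not change.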

Configuration (env overview)

(The remaining sections are unchanged from the previous version and are omitted here.)


This version: 0.1.0 (Apr 8, 2026)
