Parse Feedland OPML and extract article content from RSS/Atom feeds

yonglelaoren-feedland-parser

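A tool that parses Feedland OPML and extracts article content from RSS/Atom feeds.

Features

  • Parses the Feedland OPML endpoint and extracts every subscribed feed
  • Supports a mix of RSS and Atom feeds
  • Extracts at most the 5 most recent articles per feed
  • Uses Newspaper3k and BeautifulSoup to extract article content
  • Timestamp-based deduplication, so the same article is not extracted twice
  • Parallel processing for higher throughput
  • Outputs the extraction results as JSON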
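
To illustrate the first step, here is a minimal sketch of pulling feed URLs out of an OPML document. It is not the tool's own code: the function name extract_feed_urls and the sample document are hypothetical, and it assumes feeds are listed as outline elements carrying an xmlUrl attribute, as is conventional in OPML.

```python
import xml.etree.ElementTree as ET


def extract_feed_urls(opml_text: str) -> list[str]:
    """Return the xmlUrl of every <outline> element, in document order.

    Hypothetical helper for illustration; outlines without an xmlUrl
    attribute (for example folders) are skipped.
    """
    root = ET.fromstring(opml_text)
    urls = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            urls.append(url)
    return urls


# A made-up OPML document with one top-level feed and one nested feed.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>Sample subscriptions</title></head>
  <body>
    <outline text="Example A" type="rss" xmlUrl="https://example.com/a.xml"/>
    <outline text="Folder">
      <outline text="Example B" type="rss" xmlUrl="https://example.com/b.atom"/>
    </outline>
  </body>
</opml>"""

if __name__ == "__main__":
    for feed_url in extract_feed_urls(SAMPLE_OPML):
        print(feed_url)
```

Iterating with root.iter() rather than only the direct children of body means nested outlines are found as well.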

Installation

Requirements

  • Python 3.11 or later
  • uv is the recommended package manager

Install with uv (recommended)

# Clone the repository
git clone https://github.com/yonglelaoren/yonglelaoren-feedland-parser.git
cd yonglelaoren-feedland-parser

# Create a virtual environment with uv (automatically uses Python 3.11+)
uv venv

# Install dependencies
uv pip install -e ".[dev]"

# Activate the virtual environment
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate  # Windows

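Advantages of uv

  • ⚡️ Very fast dependency resolution and installation
  • 🎯 Automatic Python version management
  • 🔒 Exact dependency locking (uv.lock)
  • 📦 A unified package management experience

Install from PyPI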

pip install yonglelaoren-feedland-parser

Install from source

git clone https://github.com/yonglelaoren/yonglelaoren-feedland-parser.git
cd yonglelaoren-feedland-parser
pip install -e .

Configuration

Create a config.json configuration file:

{
  "url": "https://feedland.com/opml?screenname=yonglelaoren",
  "threads": 10,
  "his": {}
}

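Configuration options

  • url: URL of the Feedland OPML endpoint (required)
  • threads: number of threads used for parallel processing (optional; default: min(10, cpu_count() * 2 + 1))
  • his: mapping from each feed to its last extraction time (maintained automatically)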
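
As a rough sketch of how these options and their defaults could be applied, the snippet below fills in missing keys. The helpers default_threads and load_config are hypothetical names, not the package's API, and treating a missing his key as an empty mapping is an assumption.

```python
import json
import os


def default_threads() -> int:
    """Compute the documented default: min(10, cpu_count() * 2 + 1)."""
    # os.cpu_count() can return None; falling back to 1 is an assumption.
    cpus = os.cpu_count() or 1
    return min(10, cpus * 2 + 1)


def load_config(text: str) -> dict:
    """Parse config JSON and fill in defaults (hypothetical helper)."""
    raw = json.loads(text)
    if "url" not in raw:
        raise ValueError("config is missing the required 'url' key")
    return {
        "url": raw["url"],
        "threads": raw.get("threads", default_threads()),
        # Assumed default: no feeds have been extracted yet.
        "his": raw.get("his", {}),
    }


if __name__ == "__main__":
    cfg = load_config('{"url": "https://feedland.com/opml?screenname=yonglelaoren"}')
    print(cfg["threads"], cfg["his"])
```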

Configuration file precedence

  1. The path given by the --config command-line option
  2. config.json in the current directory
  3. The user configuration directory: ~/.config/yonglelaoren-feedland-parser/config.json
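
The lookup order above could be implemented along these lines. This is a sketch under assumptions, not the tool's actual logic: resolve_config_path is a hypothetical name, and falling through to the next candidate when a path does not exist (including a missing --config path) is a guess about the behavior.

```python
from pathlib import Path
from typing import Optional


def resolve_config_path(
    cli_path: Optional[str] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first existing config file in precedence order, or None.

    cwd and home are injectable so the lookup can be exercised without
    touching the real working or home directory.
    """
    candidates = []
    if cli_path:
        candidates.append(Path(cli_path))  # 1. --config argument
    candidates.append((cwd or Path.cwd()) / "config.json")  # 2. current directory
    candidates.append(  # 3. user configuration directory
        (home or Path.home()) / ".config" / "yonglelaoren-feedland-parser" / "config.json"
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(resolve_config_path())
```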

Usage

Basic usage

yonglelaoren-feedland-parser

Specify a configuration file

yonglelaoren-feedland-parser --config /path/to/config.json

Show the version

yonglelaoren-feedland-parser --version

Show help

yonglelaoren-feedland-parser --help

Output format

The tool outputs the extraction results as JSON, containing only the articles that were extracted successfully:

[
  {
    "feed_url": "https://example.com/feed.xml",
    "feed_title": "Example Feed",
    "articles": [
      {
        "title": "Article title",
        "url": "https://example.com/article1",
        "published": "2025-02-09T10:00:00Z",
        "author": "Author",
        "content": "Main article content..."
      }
    ]
  }
]

Extraction failures are written to the log and do not affect the JSON output.
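
For downstream processing, a short sketch of reading this structure is shown below. The count_articles helper and the sample string are illustrative only; the field names are taken from the example output above.

```python
import json


def count_articles(output_text: str) -> dict[str, int]:
    """Map each feed_url in the tool's JSON output to its article count."""
    feeds = json.loads(output_text)
    return {feed["feed_url"]: len(feed["articles"]) for feed in feeds}


# Sample data shaped like the documented output (values are made up).
SAMPLE_OUTPUT = """
[
  {
    "feed_url": "https://example.com/feed.xml",
    "feed_title": "Example Feed",
    "articles": [
      {
        "title": "Article title",
        "url": "https://example.com/article1",
        "published": "2025-02-09T10:00:00Z",
        "author": "Author",
        "content": "Main article content..."
      }
    ]
  }
]
"""

if __name__ == "__main__":
    for url, count in count_articles(SAMPLE_OUTPUT).items():
        print(f"{url}: {count} article(s)")
```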

Dependencies

  • Python 3.11+
  • feedparser >= 6.0.10
  • newspaper3k >= 0.2.8
  • beautifulsoup4 >= 4.12.0
  • requests >= 2.31.0
  • lxml >= 4.9.0
  • python-dateutil >= 2.8.2

Development

Install development dependencies

# Using uv (recommended)
uv pip install -e ".[dev]"

# Or using pip
pip install -e ".[dev]"

Run the tests

# Using uv
uv run pytest

# Or run pytest directly
pytest

Code formatting

# Using black
black src/ tests/

Code checks

# Type checking
mypy src/

# Style checking
flake8 src/ tests/

# Test coverage
pytest --cov=src --cov-report=html

Release

Build the distribution packages

python -m build

Publish to PyPI

twine upload dist/*

Package as an executable

pip install -e ".[exe]"
pyinstaller cli.spec

Docker deployment

docker build -t yonglelaoren-feedland-parser .
docker run -v ./config.json:/app/config.json yonglelaoren-feedland-parser

License

MIT License

Contributing

Issues and pull requests are welcome!

Author

yonglelaoren

Acknowledgements
