A professional Weibo crawler library

Crawl4Weibo

An out-of-the-box Weibo crawler library for Python, based on an approach verified in real-world testing. No cookies required.

Features

  • 🚀 Works out of the box: no cookies needed, one-line initialization
  • 🛡️ Anti-blocking: automatically handles 432 errors and request rate limits
  • 📱 Realistic simulation: uses a real mobile browser User-Agent
  • 🔄 Smart retry: automatic retry mechanism
  • 📊 Structured data: clean, well-defined data models
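The anti-blocking behavior described above (432 handling plus automatic retries with randomized intervals) can be sketched generically. The `fetch_with_retry` helper and its parameters below are illustrative stand-ins, not the library's actual internals:

```python
import random
import time

def fetch_with_retry(fetch, max_retries=3, base_delay=1.0):
    """Call fetch(); on a 432 rate-limit response, wait a randomized,
    growing interval and retry. Raises after max_retries failures."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status != 432:
            return body
        if attempt == max_retries:
            raise RuntimeError("still rate-limited after retries")
        # randomized exponential backoff: ~1-2s, ~2-4s, ~4-8s, ...
        time.sleep(base_delay * (2 ** attempt) * random.uniform(1.0, 2.0))

# Stand-in fetch that fails twice with 432, then succeeds
calls = {"n": 0}
def fake_fetch():
    calls["n"] += 1
    return (432, None) if calls["n"] < 3 else (200, "ok")

result = fetch_with_retry(fake_fetch, base_delay=0.01)
print(result)  # ok
```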

Installation

pip install crawl4weibo

Quick Start

from crawl4weibo import WeiboClient

# Initialize (no cookies required)
client = WeiboClient()
test_uid = "2656274875"

# Fetch user info
user = client.get_user_by_uid(test_uid)
print(f"Screen name: {user.screen_name}")
print(f"Followers: {user.followers_count}")
print(f"Posts: {user.posts_count}")

# Fetch the user's posts
posts_page1 = client.get_user_posts(test_uid, page=1)
posts_page2 = client.get_user_posts(test_uid, page=2)
posts = (posts_page1 or []) + (posts_page2 or [])
print(f"Fetched {len(posts)} posts")
for i, post in enumerate(posts[:3], 1):
    print(f"  {i}. {post.text[:50]}...")
    print(f"     Likes: {post.attitudes_count} | Comments: {post.comments_count}")

# Fetch a single post by its ID (bid)
post = client.get_post_by_bid("Q6FyDtbQc")
print(f"Post text: {post.text[:50]}")
# print(f"Created at: {post.created_at}")
# print(f"Pictures: {len(post.pic_urls)}")

# Search for users ("新浪" = Sina)
users = client.search_users("新浪")
for user in users:
    print(f"  - {user.screen_name} (followers: {user.followers_count})")

# Search for posts ("人工智能" = artificial intelligence)
posts = client.search_posts("人工智能", page=1)
for post in posts:
    print(f"  - {post.text[:50]}...")

API Reference

WeiboClient

Initialization

WeiboClient(cookies=None, log_level="INFO", log_file=None)

Main methods

  • get_user_by_uid(uid) - fetch user info
  • get_user_posts(uid, page=1) - fetch a user's posts
  • get_post_by_bid(bid) - fetch a single post by its ID
  • search_users(query, page=1, count=10) - search for users
  • search_posts(query, page=1) - search for posts
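The listing methods above are paginated via their page parameter. One simple way to walk pages until exhaustion is sketched below; fetch_page is a stand-in for any page-based method (e.g. `lambda p: client.get_user_posts(uid, page=p)`) and is not part of the library:

```python
def collect_all_pages(fetch_page, max_pages=10):
    """Accumulate results from a page-based fetcher until a page
    comes back empty (or None), or max_pages is reached."""
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break
        items.extend(batch)
    return items

# Stand-in fetcher: three non-empty pages, then nothing
pages = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}
result = collect_all_pages(lambda p: pages.get(p, []))
print(result)  # ['a', 'b', 'c', 'd', 'e']
```

Capping max_pages keeps a misbehaving endpoint from turning the loop into an unbounded crawl.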

Running the Example

python examples/simple_example.py

Technical Details

Built on an approach verified in practice:

# Core tech stack
- Android Chrome UA simulation
- Mobile API endpoints
- Automatic session management
- Smart retry on 432 errors
- Randomized request intervals
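As an illustration of the UA-simulation idea, a request can be given a mobile browser identity as below. The UA string and endpoint are examples only; the exact values the library sends may differ:

```python
from urllib.request import Request

# Example Android Chrome UA string (illustrative, not the library's actual UA)
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
)

# Build (but don't send) a request against Weibo's mobile API
req = Request(
    "https://m.weibo.cn/api/container/getIndex",
    headers={"User-Agent": MOBILE_UA},
)
print(req.get_header("User-agent")[:30])  # Mozilla/5.0 (Linux; Android 13
```

Note that urllib normalizes header names with `str.capitalize()`, which is why the lookup key is "User-agent".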

License

MIT License
