Skip to main content

WEIBO\_SCRAPY是一个PYTHON实现的,使用多线程抓取WEIBO信息的框架。WEIBO\_SCRAPY框架给用户提供WEIBO的模拟登录和多线程抓取微博信息的接口,让用户只需关心抓取的业务逻辑,而不用处理棘手的WEIBO模拟登录和多线程编程.

Project description

WEIBO_SCRAPY

WEIBO_SCRAPY是一个PYTHON实现的,使用多线程抓取WEIBO信息的框架。WEIBO_SCRAPY框架给用户提供WEIBO的模拟登录和多线程抓取微博信息的接口,让用户只需关心抓取的业务逻辑,而不用处理棘手的WEIBO模拟登录和多线程编程。

WEIBO_SCRAPY is a Multi-Threading SINA WEIBO data extraction Framework in Python. WEIBO_SCRAPY provides WEIBO login simulator and interface for WEIBO data extraction with multi-threading, it saves users a lot of time by getting users out of writing WEIBO login simulator from scratch and multi-threading programming, users now can focus on their own extraction logic.

=======

###WEIBO_SCRAPY的功能 1. 微博模拟登录

2. 多线程抓取框架

3. 抓取任务接口

4. 抓取参数配置

###WEIBO_SCRAPY Provides 1. WEIBO Login Simulator

2. Multi-Threading Extraction Framework

3. Extraction Task Interface

4. Easy Way of Parameters Configuration

###How to Use WEIBO_SCRAPY #!/usr/bin/env python #coding=utf8

from weibo_scrapy import scrapy

class my_scrapy(scrapy):
	
	def scrapy_do_task(self, uid=None):
	     '''
	    User needs to overwrite this method to perform uid-based scrapy task.
	    @param uid: weibo uid
	    @return: a list of uids gained from this task, optional
	    '''
	     super(my_scrapy, self).__init__(**kwds)
	     
	     #do what you want with uid here, note that this scrapy is uid based, so make sure there are uids in task queue, 
	     #or gain new uids from this function
	     print 'WOW...'
	     return 'replace this string with uid list which gained from this task'
	 
if __name__ == '__main__':
	
	s = my_scrapy(uids_file = 'uids_all.txt', config = 'my.ini')
	s.scrapy()

###相关阅读(Readings) 基于UID的WEIBO信息抓取框架WEIBO_SCRAPY

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

weibo_scrapy-2.2.5.tar.gz (4.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

weibo_scrapy-2.2.5-py3-none-any.whl (3.3 kB view details)

Uploaded Python 3

File details

Details for the file weibo_scrapy-2.2.5.tar.gz.

File metadata

  • Download URL: weibo_scrapy-2.2.5.tar.gz
  • Upload date:
  • Size: 4.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.2

File hashes

Hashes for weibo_scrapy-2.2.5.tar.gz
Algorithm Hash digest
SHA256 4af2de4a6f1da5e26f2e724f9537be9c79c520a4bbd8044499f5f60f2b87da19
MD5 053fb6e5a734fe423bb092ccfca5a606
BLAKE2b-256 ea7f6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950

See more details on using hashes here.

File details

Details for the file weibo_scrapy-2.2.5-py3-none-any.whl.

File metadata

  • Download URL: weibo_scrapy-2.2.5-py3-none-any.whl
  • Upload date:
  • Size: 3.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.2

File hashes

Hashes for weibo_scrapy-2.2.5-py3-none-any.whl
Algorithm Hash digest
SHA256 d9d2754bc0ebc489a8c511bf8337e258db2d1994af71be579fde05731ff08dad
MD5 84d696824ba3eed7161229fd42142000
BLAKE2b-256 4cb435712e2d263f01951473aeb482d3c84f5cfc480ac47caebeeaf3eb32e4cb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page