
A crawler for scraping Chinese novel websites

Project description

Python environment

  • Python 2.7

  • MySQL 5.7.9

Installation

pip install novelSpider

MySQL configuration

# Add to mysqld.cnf
# * Encode
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
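
As a quick sanity check, the settings above can be verified against the text of a config file before starting the crawler. The following is a minimal sketch using only the standard library; `parse_cnf`, `missing_settings`, and `REQUIRED` are hypothetical helper names introduced here for illustration and are not part of novelSpider or MySQL.

```python
# Sketch only: these helpers are illustrative and not part of novelSpider.

def parse_cnf(text):
    """Return (key, value) pairs from mysqld.cnf-style text.

    Comments and section headers are skipped. Flag-style options with no
    '=' get a value of None. Duplicate keys are kept in file order.
    """
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in '#;[':
            continue
        if '=' in line:
            key, value = line.split('=', 1)
            pairs.append((key.strip(), value.strip()))
        else:
            pairs.append((line, None))
    return pairs


# The server-side encoding options listed in this README.
REQUIRED = [
    ('character-set-server', 'utf8'),
    ('collation-server', 'utf8_unicode_ci'),
    ('skip-character-set-client-handshake', None),
]


def missing_settings(text):
    """Return the REQUIRED entries that do not appear in the config text."""
    found = parse_cnf(text)
    return [item for item in REQUIRED if item not in found]


SAMPLE = """\
# * Encode
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
"""

print(missing_settings(SAMPLE))  # prints []
```

An empty list means every listed option was found. The check only inspects the file text; it does not confirm that the running server has picked the settings up.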

Running the project

from novelSpider.task import Task

class Config(object):
    '''
    @desc: database configuration
    @param: username  database user name
    @param: password  database password
    @param: database  database name
    '''
    def __init__(self):
        self.username = 'root'
        self.password = 'root'
        self.database = 'novel'

# Create a task instance
task = Task()

# Create a spider instance
spider = task.createDownloader(Config)

# Download novel catalog information and chapter lists
spider.getCharptList(novelNum=1)

# Download chapter content
spider.getCharptContent(novelId=0, charptNum=1)
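
The Config class above hard-codes the database credentials. A minimal sketch of an alternative, assuming the library only reads the `username`, `password`, and `database` attributes, is to fill them from environment variables. The class name `EnvConfig` and the `NOVEL_DB_*` variable names are hypothetical; the defaults mirror the values used above.

```python
import os


class EnvConfig(object):
    '''
    @desc: database configuration read from the environment
    The NOVEL_DB_* variable names are illustrative, not defined by novelSpider.
    '''
    def __init__(self):
        # Fall back to the README defaults when a variable is not set.
        self.username = os.environ.get('NOVEL_DB_USER', 'root')
        self.password = os.environ.get('NOVEL_DB_PASSWORD', 'root')
        self.database = os.environ.get('NOVEL_DB_NAME', 'novel')


config = EnvConfig()
print(config.username, config.database)
```

If that assumption holds, the class would be passed in the same way as Config, for example `task.createDownloader(EnvConfig)`.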

Changelog

Version 0.0.13

# Supports downloading novel catalog information, chapter lists, and chapter content

Download files


Source Distribution

novelSpider-0.0.13.tar.gz (4.7 kB view details)

Uploaded Source

Built Distribution

novelSpider-0.0.13-py2-none-any.whl (8.2 kB view details)

Uploaded Python 2

File details

Details for the file novelSpider-0.0.13.tar.gz.

File metadata

  • Download URL: novelSpider-0.0.13.tar.gz
  • Size: 4.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/20.7.0 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/2.7.12

File hashes

Hashes for novelSpider-0.0.13.tar.gz
Algorithm Hash digest
SHA256 e7d74f4e35b79a1573eeb1680ab747ea33a45d953c9344d991593c7a932c3451
MD5 b14c11dbca6fa23a77a134ec3757888f
BLAKE2b-256 8763ac2914d0947a837f731ae0cd16c255d3e9882147773aedb6bf82da1bb51d


File details

Details for the file novelSpider-0.0.13-py2-none-any.whl.

File metadata

  • Download URL: novelSpider-0.0.13-py2-none-any.whl
  • Size: 8.2 kB
  • Tags: Python 2
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/20.7.0 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/2.7.12

File hashes

Hashes for novelSpider-0.0.13-py2-none-any.whl
Algorithm Hash digest
SHA256 f4fc76ad52984c8b80940d7815583ce42e748abec420afdc3d83cf125fbce854
MD5 23db2b06d0d5c07e83ebeecb63e9f0fc
BLAKE2b-256 1334191f5361baba89f011e2cedc41a27c189834d4d96f3f358bc4d5f2ecdb25
