A novel-grabbing crawler module using Python 3 and lxml

This version combines multiprocessing with multithreading.

winxos, AISTLAB Since 2017-02-19
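
The description mentions multiprocessing combined with multithreading. The package's own implementation is not shown here, so the following is only a minimal sketch of that general pattern using the standard library. The names fetch_chapter, fetch_batch and grab_all are hypothetical, and fetch_chapter is a placeholder that performs no network access.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fetch_chapter(url):
    """Placeholder for a real HTTP download; returns a fake page body."""
    return f"content of {url}"


def fetch_batch(urls):
    """Runs inside one worker process and fetches its URLs with a thread pool."""
    with ThreadPoolExecutor(max_workers=4) as threads:
        # map() preserves the input order of the URLs
        return list(threads.map(fetch_chapter, urls))


def grab_all(urls, processes=2, batch_size=3):
    """Split the URLs into batches and hand each batch to a process pool."""
    batches = [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]
    with ProcessPoolExecutor(max_workers=processes) as pool:
        results = pool.map(fetch_batch, batches)
        # Flatten the per-batch lists back into one ordered list
        return [page for batch in results for page in batch]


if __name__ == "__main__":
    # Hypothetical chapter URLs, for illustration only
    chapter_urls = [f"http://example.com/chapter/{n}" for n in range(1, 8)]
    for page in grab_all(chapter_urls):
        print(page)
```

Threads suit the I/O-bound downloading inside each batch, while separate processes let CPU-bound work such as HTML parsing run in parallel.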

INSTALL:

pip3 install aistlab_novel_grab

1. USAGE:

RUN COMMAND IN CONSOLE:

novel_grab http://the_url_of_novel_chapters_page

EXAMPLE:

novel_grab http://book.zongheng.com/showchapter/654086.html

SUPPORTED SITES:

* http://book.zongheng.com
* http://www.aoyuge.com
* http://www.quanshu.net
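
A quick way to check in advance whether a chapter-list URL belongs to one of the supported sites is to compare its host name. This is only a sketch: the helper is_supported is hypothetical and not part of the package, and the host set is copied from the sites listed here.

```python
from urllib.parse import urlparse

# Hosts taken from the supported sites list; the helper itself is hypothetical.
SUPPORTED_HOSTS = {"book.zongheng.com", "www.aoyuge.com", "www.quanshu.net"}


def is_supported(url):
    """Return True if the URL's host is one of the listed sites."""
    host = urlparse(url).hostname  # None when the string has no network location
    return host in SUPPORTED_HOSTS


if __name__ == "__main__":
    print(is_supported("http://book.zongheng.com/showchapter/654086.html"))
```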

2. USAGE AS PYTHON MODULE:

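    from novel_grab.novel_grab import Downloader

    d = Downloader()
    print(d.get_info())  # before a URL is set, this shows the supported sites
    # set_url() returns whether the URL is valid
    if d.set_url('http://book.zongheng.com/showchapter/221579.html'):
        d.start()  # when finished, a zip file is created in the current path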

**TIPS**

* Right after d = Downloader(), d.get_info() returns information about the supported sites.
* d.set_url(url) returns whether the URL is valid.
* You can call d.get_info() at any time to check the state of d.
* When the download finishes, a <novel_name>.zip file is created in the current path. The default compression method is zipfile.ZIP_DEFLATED.
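
To illustrate the last tip, here is a minimal sketch of writing chapter texts into a zip archive with zipfile.ZIP_DEFLATED. It uses only the standard library. The function save_chapters and the file naming scheme are assumptions for illustration, not the package's actual code.

```python
import zipfile


def save_chapters(zip_path, chapters):
    """Write (title, text) pairs into a zip archive, one .txt file per chapter."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for index, (title, text) in enumerate(chapters, start=1):
            # A zero-padded index keeps the chapters sorted inside the archive
            zf.writestr(f"{index:04d}_{title}.txt", text)


if __name__ == "__main__":
    # Hypothetical chapter data, for illustration only
    save_chapters("demo_novel.zip", [
        ("chapter_one", "First chapter text."),
        ("chapter_two", "Second chapter text."),
    ])
```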

For educational purposes only; use at your own risk.

