Webnovel scraper using Selenium

Webnovel Bot

A Webnovel bot and scraper in one package, optimized for speed.

Most tasks can be performed through more than one access route, such as Selenium or the API.

Install

pip install webnovelbot

Alternatively, install directly from GitHub.

Warning: this method requires browser-cookie3 to be installed beforehand.

pip install git+https://github.com/mHaisham/webnovelbot.git

Sample

Follow the link for example usage.

Signin

There are a few hiccups you may encounter when signing in to Webnovel:

  • Captcha: during sign-in, the user may be asked to solve a Google captcha.

  • Guard: after the sign-in button is clicked, the form may redirect the user to a guard website.

You can handle these in different ways. The signin method takes a manual argument, which defaults to False, and its behaviour changes depending on that value.

manual=False

When manual is False, signin raises an exception corresponding to the situation it ran into.

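try:
    # manual defaults to False, so signin raises instead of waiting for input
    webnovel.signin(USER_EMAIL, USER_PASS)
except CaptchaException:
    pass  # a Google captcha was requested
except GuardException:
    pass  # the form redirected to a guard website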

Read more on handling Guard

manual=True

When manual is True, signin waits for the user to resolve the captcha or guard by hand.

By default it waits 10 minutes for user input before raising a TimeoutException.

You can define a custom wait time by setting webnovel.user_timeout.
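The two modes above can be combined: try an unattended sign-in first, and fall back to manual=True only when a captcha or guard appears. The following is a minimal sketch of that pattern, not the library's own code. The exception classes and FakeBot are local stand-ins so the block runs without Webnovelbot or a browser; signin_with_fallback is a hypothetical helper name, and the unit of user_timeout is assumed to be seconds.

```python
class CaptchaException(Exception):
    """Local stand-in for the library's captcha exception."""


class GuardException(Exception):
    """Local stand-in for the library's guard exception."""


def signin_with_fallback(bot, email, password, user_timeout=None):
    """Try an unattended sign-in; on captcha or guard, retry with manual=True.

    Returns "automatic" or "manual" depending on which attempt succeeded.
    """
    try:
        bot.signin(email, password)  # manual defaults to False
        return "automatic"
    except (CaptchaException, GuardException):
        if user_timeout is not None:
            # Assumption: user_timeout is expressed in seconds.
            bot.user_timeout = user_timeout
        bot.signin(email, password, manual=True)
        return "manual"


class FakeBot:
    """Hypothetical test double: asks for a captcha unless manual=True."""

    user_timeout = 600

    def signin(self, email, password, manual=False):
        if not manual:
            raise CaptchaException("captcha requested")


bot = FakeBot()
mode = signin_with_fallback(bot, "user@example.com", "secret", user_timeout=120)
print(mode, bot.user_timeout)  # prints: manual 120
```

With the real package, the bot argument would be a WebnovelBot instance and the two exception classes would be imported from the library instead of being defined locally.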

Cookies

Webnovelbot can reuse cookies from other web browsers, in both Selenium and the API, through the Cookies class.

It currently supports every browser that browser_cookie3 supports:

  • chrome
  • firefox
  • opera
  • edge
  • chromium

from webnovel import WebnovelBot, Cookies
from webnovel.api import ParsedApi

webnovel = WebnovelBot(timeout=360)

cookiejar = Cookies.from_browser('chrome')

# this will load the cookie jar into selenium
# depending on what you want to do after, you may want to reload the page
webnovel.add_cookiejar(cookiejar)

# this will create the api with the cookie jar
api = ParsedApi(cookiejar)

Cookies extends RequestsCookieJar, so it can be used as a replacement for it and vice versa.

Conversion tools

from webnovel.tools import UrlTools

UrlTools provides methods to convert novel_id, chapter_id, and profile_id to their respective URLs and back.
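To illustrate what such conversions involve, here is a minimal standalone sketch. It does not use UrlTools and its function names are hypothetical, not the library's API. The URL layouts (/book/<novel_id>, /book/<novel_id>/<chapter_id>, /profile/<profile_id>) are assumptions about the site and may not match what UrlTools produces.

```python
import re
from urllib.parse import urlparse

BASE_URL = "https://www.webnovel.com"  # assumed site root


def novel_url(novel_id):
    """Build a novel URL from its id (assumed layout)."""
    return f"{BASE_URL}/book/{novel_id}"


def chapter_url(novel_id, chapter_id):
    """Build a chapter URL from the novel and chapter ids (assumed layout)."""
    return f"{BASE_URL}/book/{novel_id}/{chapter_id}"


def profile_url(profile_id):
    """Build a profile URL from its id (assumed layout)."""
    return f"{BASE_URL}/profile/{profile_id}"


def ids_from_url(url):
    """Return the numeric id that ends each path segment, in order.

    Segments such as "some-title_123" yield 123; segments without a
    trailing number (for example "book") are skipped.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    ids = []
    for segment in segments:
        match = re.search(r"(\d+)$", segment)
        if match:
            ids.append(int(match.group(1)))
    return ids


print(chapter_url(123, 456))                # https://www.webnovel.com/book/123/456
print(ids_from_url(chapter_url(123, 456)))  # [123, 456]
```

Parsing by trailing digits keeps the round trip working even when a URL segment carries a title slug in front of the id.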

Analytics

Supports multiple analytics tools through an easily extensible interface.

Read more

Goals

The primary focus of development is to reduce reliance on Selenium.
