A tool to download pixiv pictures

Project description

pixiv-crawler

pixiv image crawler

GitHub: https://github.com/Akaisorani/pixiv-crawler

How to install

pip install pixiv_crawler

To pass the CAPTCHA at login, the crawler uses selenium + PhantomJS, so you need to install selenium and PhantomJS (or Chrome/Firefox in headless mode).
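
For example, selenium installs from PyPI (PhantomJS itself is a separate binary, distributed from phantomjs.org):

pip install selenium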

Functions

Download images by

  • ranking lists such as the daily ranking
  • tags
  • an illustrator's illustration list
  • your bookmark list
  • DIY (custom) URLs

or fetch a random image by

  • ranking list
  • tags

How to use

Example

import pixiv_crawler as pc

# Account credentials used for the simulated login.
pc.set_value('username', 'your account name')
pc.set_value('password', 'your account password')
# pc.set_value('socks', '127.0.0.1:8388')         # SOCKS proxy, if you need one
# pc.set_value('local_save_root', './%y.%m.%d')   # where to save images (date-formatted path)
# pc.set_value('cookies_file', './cookies.txt')   # cookies in JSON format
# pc.set_value('garage_file', './garage.txt')     # record of already-downloaded image ids
pc.set_value('phantomjs', '/usr/local/bin/phantomjs')  # used to simulate the login; on Windows the path ends in phantomjs.exe
pc.login()

pc.dl_rank_daily(20)                                      # 20 images from the daily ranking
pc.dl_bookmark(20)                                        # 20 images from your bookmark list
pc.dl_artist(4187518, pic_num=-1, deep_into_manga=False)  # every illustration by artist 4187518
pc.dl_tag('azur lane', 20)                                # 20 images tagged 'azur lane'
pc.dl_diy_urls(['https://www.pixiv.net/ranking.php?mode=weekly'], 20)  # 20 images from a custom URL

Features

  • It downloads images with up to 8 threads (the maximum number of threads is adjustable; see the sketch after this list) to speed up the process.
  • In most cases it downloads only the first picture of a manga-type illustration, but for an illustrator's illustration list it downloads the full manga (you can adjust the condition that decides when the full set is downloaded).
  • It can log in automatically with your account name and password.
  • After the first login it saves cookies locally so you don't have to log in every time.
  • It can keep a garage file, a list of the image ids you have already downloaded, to avoid downloading the same images repeatedly (some ranking lists change little from one day to the next).
  • It can also synchronize the garage file with your remote server (if you have one) so that different computers don't download the same images twice.
  • For an illustrator's illustration list the artist id must be provided; if the artist name is set to "?" it is looked up on the website, and if the number of pages to download is set to -1, all of the artist's pages are downloaded.
  • For well-known reasons, visiting pixiv.net requires a proxy in some regions, so proxies can be set in config.properties.
  • config.properties contains most settings, so you don't need to edit the source files.
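
A minimal sketch of tuning these options from code rather than through config.properties, using attribute names from the Attribute List below (the values shown are illustrative, not defaults):

pc.set_value('max_thread_num', 8)        # cap the number of download threads
pc.set_value('socks', '127.0.0.1:1080')  # SOCKS proxy; set None if you don't need one
pc.save_garage()                         # persist the list of already-downloaded image ids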

Function List

login (save_cookies=True)
set_value (value_name,value)
get_value (value_name,value)
save_garage (garage_file = None)
dl_tag (tag,pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_artist (artist_id,pic_num,deep_into_manga=True,add_classname_in_path=True)
dl_bookmark (pic_num,deep_into_manga=True,add_classname_in_path=True)
dl_rank_global (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_daily (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_weekly (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_original (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_daily_r18(pic_num,deep_into_manga=False,add_classname_in_path=True)
...
dl_diy_urls (urls,pic_num,deep_into_manga=False,add_classname_in_path=True)
random_one_by_classfi (classi,label="")
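
random_one_by_classfi is the "random image" entry point from the Functions section above. The accepted classi strings are not documented on this page, so the values below are an assumption based on the "by ranking list / by tags" description:

# Assumption: 'rank' and 'tag' are valid classi values; check the
# source on GitHub for the exact accepted strings.
pc.random_one_by_classfi('rank')
pc.random_one_by_classfi('tag', label='azur lane')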

Attribute List

username
password
local_save_root
garage_file
cookies_file
max_thread_num
socks (set to None if no proxy is used)
phantomjs
firefox
chrome
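
All of these attributes are read and written through set_value/get_value, for example (treating get_value's second parameter as a fallback value is an assumption; the signature above does not say):

pc.set_value('local_save_root', './downloads')
root = pc.get_value('local_save_root', None)  # assumed: second argument is a fallback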

Tips

  1. If the login process fails (because of reCAPTCHA), you can copy the cookies from your browser into cookies.txt in JSON format; pixiv_crawler will then log in with those cookies.
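
The exact JSON layout expected in cookies.txt is not specified here; a selenium-style cookie export (an assumption, with PHPSESSID being pixiv's session cookie name) would look like:

[
  {"name": "PHPSESSID", "value": "your-session-id", "domain": ".pixiv.net"}
]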

Download files

Download the file for your platform.

Source Distribution

pixiv_crawler-0.1.2.tar.gz (11.9 kB)

Built Distribution

pixiv_crawler-0.1.2-py3-none-any.whl (12.2 kB)

File details

Details for the file pixiv_crawler-0.1.2.tar.gz.

File metadata

  • Download URL: pixiv_crawler-0.1.2.tar.gz
  • Upload date:
  • Size: 11.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.9

File hashes

Hashes for pixiv_crawler-0.1.2.tar.gz
  • SHA256: 0ab3877c16dbddd8c8a03b3b6e932574f8a625c765b5feed6200fd7f98cfa101
  • MD5: 40d0f7d33a918dd9d9eb62fe28538ea2
  • BLAKE2b-256: f820058276a0b185ff1854421824b07898febf3e9317e6c3d908bcd06821a943

File details

Details for the file pixiv_crawler-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pixiv_crawler-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 12.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.9

File hashes

Hashes for pixiv_crawler-0.1.2-py3-none-any.whl
  • SHA256: 42908874f88bb0defd2ea4e49bd4610ea19ac34e3bcb611f9b8fd178b616be3c
  • MD5: 148033173fd18d0ac4c83574ef57b593
  • BLAKE2b-256: e529507f429ab51d6e26e3c506191ae1e18e09348bc3fb2743272c3397e41051
