A tool to download pixiv pictures
pixiv-crawler
pixiv image crawler
GitHub: https://github.com/Akaisorani/pixiv-crawler
How to install
pip install pixiv_crawler
To pass the captcha during login, the crawler uses selenium with phantomjs, so you need to install selenium and phantomjs (or chrome/firefox in headless mode).
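Before configuring the crawler, it can help to check which of these binaries is actually on your PATH. The following is a minimal sketch using only the standard library. The helper name find_driver and the candidate names (in particular chromedriver and geckodriver) are assumptions for illustration and are not part of pixiv_crawler.

```python
import shutil

# Candidate executable names (assumed); adjust to what is installed on your system.
CANDIDATES = ("phantomjs", "chromedriver", "geckodriver")


def find_driver(names=CANDIDATES):
    """Return (name, full_path) for the first executable found on PATH, else None."""
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return name, path
    return None


# Prints a (name, path) tuple, or None if nothing was found.
print(find_driver())
```

If phantomjs is found, its path is the kind of value the "phantomjs" setting in the example further down expects.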
Functions
Download image by
- ranklist such as dailyrank
- tags
- illustrator's illustration list
- your bookmark list
- DIY urls
or pick a random image by
- ranklist
- tags
How to use
Example
import pixiv_crawler as pc
pc.set_value('username','your account name')
pc.set_value('password','your account password')
# pc.set_value('socks','127.0.0.1:8388')
# pc.set_value("local_save_root","./%y.%m.%d")
# pc.set_value("cookies_file","./cookies.txt")
# pc.set_value("garage_file","./garage.txt")
pc.set_value("phantomjs","/usr/local/bin/phantomjs") # used to simulate the login process; on Windows the path will be (bala...)/phantomjs.exe
pc.login()
pc.dl_rank_daily(20)
pc.dl_bookmark(20)
pc.dl_artist(4187518,pic_num=-1,deep_into_manga=False)
pc.dl_tag('azur lane',20)
pc.dl_diy_urls(['https://www.pixiv.net/ranking.php?mode=weekly'],20)
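The commented-out local_save_root value "./%y.%m.%d" looks like a strftime pattern. Assuming the library expands it with the current date (the text here does not confirm this), the resulting folder name can be previewed with the standard library. expand_save_root is a hypothetical helper, not a pixiv_crawler function.

```python
from datetime import date


def expand_save_root(pattern, day=None):
    """Expand a strftime-style pattern for the given day (default: today)."""
    day = day or date.today()
    return day.strftime(pattern)


# A fixed date keeps the result reproducible.
print(expand_save_root("./%y.%m.%d", date(2020, 1, 5)))  # ./20.01.05
```

Under that assumption, each day's downloads would land in their own dated folder.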
Features
- It downloads images with up to 8 threads (the maximum number of threads can be adjusted) to speed up downloading.
- In most cases it downloads only the first picture of a manga-type illustration, but for an illustrator's illustration list it downloads the full manga (you can adjust the condition that decides when the full manga is downloaded).
- It can log in automatically with your account name and password.
- After the first login it saves cookies locally so that you do not have to log in every time.
- It can save a garage file listing the ids of the images you have already downloaded, so that they are not downloaded again (some rank lists change little from one day to the next).
- It can also synchronize your garage file with a remote server (if you have one) so that images are not downloaded repeatedly on different computers.
- For an illustrator's illustration list, the artist id must be provided. If the artist name is set to "?", it is looked up on the website. If the number of pages to download is set to -1, all pages from this artist are downloaded.
- In some regions a proxy is needed to reach pixiv.net, so you can set proxies in config.properties.
- config.properties contains most settings, so you do not need to edit the source code.
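The garage file and thread limit described above can be sketched as follows. This is not the library's implementation. The one-id-per-line file format, the helper names read_garage, write_garage and download_new, and the stand-in fetch callable are all assumptions made for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_garage(path):
    """Read already-downloaded image ids, one per line (assumed format)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def write_garage(path, ids):
    """Write the id set back to disk in sorted order."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sorted(ids)) + "\n")


def download_new(ids, garage_path, fetch, max_threads=8):
    """Fetch only ids missing from the garage, using a bounded thread pool."""
    garage = read_garage(garage_path)
    # dict.fromkeys drops duplicate ids while keeping their order
    todo = [i for i in dict.fromkeys(ids) if i not in garage]
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # list() waits for every fetch and re-raises the first error, if any
        list(pool.map(fetch, todo))
    garage.update(todo)
    write_garage(garage_path, garage)
    return todo


# Demo with a fake fetch function, so nothing touches the network.
demo_path = os.path.join(tempfile.mkdtemp(), "garage.txt")
first = download_new(["101", "102", "103"], demo_path, fetch=lambda i: None)
second = download_new(["102", "103", "104"], demo_path, fetch=lambda i: None)
print(first, second)  # ['101', '102', '103'] ['104']
```

The second call skips ids 102 and 103 because the first call recorded them, which is the behaviour the garage file is meant to provide when a rank list barely changes between days.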
Function List
login (save_cookies=True)
set_value (value_name,value)
get_value (value_name,value)
save_garage (garage_file = None)
dl_tag (tag,pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_artist (artist_id,pic_num,deep_into_manga=True,add_classname_in_path=True)
dl_bookmark (pic_num,deep_into_manga=True,add_classname_in_path=True)
dl_rank_global (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_daily (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_weekly (pic_num,deep_into_manga=False,add_classname_in_path=True)
dl_rank_original (pic_num,deep_into_manga=False,add_classname_in_path=True)
...
dl_diy_urls (urls,pic_num,deep_into_manga=False,add_classname_in_path=True)
random_one_by_classfi (classi,label="")
Attribute List
username
password
local_save_root
garage_file
cookies_file
max_thread_num
socks: set to None if not used
phantomjs
firefox
chrome
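The features mention a config.properties file that holds most of these settings. Its exact format is not shown here. Assuming plain key=value lines with # comments, a reader for such a file might look like the sketch below. parse_properties and the sample content are hypothetical and only reuse attribute names and values that appear in this document.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that have no '='
        config[key.strip()] = value.strip()
    return config


# Hypothetical file content, for illustration only.
sample = """
# proxy and storage settings
max_thread_num = 8
socks = 127.0.0.1:8388
local_save_root = ./%y.%m.%d
"""
print(parse_properties(sample))
```

All values come back as strings, so a numeric setting such as max_thread_num would still need converting with int() before use.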