A tool to download pixiv pictures
Project description
pixiv-crawler
pixiv image crawler
GitHub: https://github.com/Akaisorani/pixiv-crawler
How to install
pip install pixiv_crawler
To pass the CAPTCHA during login, the crawler uses selenium with PhantomJS, so you need to install selenium and PhantomJS (or Chrome/Firefox in headless mode).
Functions
Download images by
- ranklist such as dailyrank
- tags
- illustrator's illustration list
- your bookmark list
- DIY urls
or pick a random image by
- ranklist
- tags
How to use
Example
import pixiv_crawler as pc
pc.set_value('username', 'your account name')
pc.set_value('password', 'your account password')
# pc.set_value('socks', '127.0.0.1:8388')
# pc.set_value('local_save_root', './%y.%m.%d')
# pc.set_value('cookies_file', './cookies.txt')  # cookies in JSON format
# pc.set_value('garage_file', './garage.txt')
pc.set_value('phantomjs', '/usr/local/bin/phantomjs')  # used to simulate the login process; on Windows the path will be (...)/phantomjs.exe
pc.login()
pc.dl_rank_daily(20)
pc.dl_bookmark(20)
pc.dl_artist(4187518, pic_num=-1, deep_into_manga=False)
pc.dl_tag('azur lane', 20)
pc.dl_diy_urls(['https://www.pixiv.net/ranking.php?mode=weekly'], 20)
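The commented-out `local_save_root` value above (`./%y.%m.%d`) looks like a strftime-style date template. A minimal sketch, assuming that interpretation (the actual expansion is done inside pixiv_crawler and may differ), of how such a pattern would map to a per-day folder:

```python
from datetime import date

def expand_save_root(pattern: str, d: date) -> str:
    """Expand a strftime-style save-root pattern into a concrete path.

    Assumes local_save_root accepts strftime codes such as %y, %m, %d.
    """
    return d.strftime(pattern)

# e.g. images downloaded on 2019-11-24 would land in ./19.11.24
print(expand_save_root("./%y.%m.%d", date(2019, 11, 24)))  # → ./19.11.24
```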
Features
- downloads images with up to 8 threads (the maximum thread count is adjustable) to speed things up
- in most cases it downloads only the first page of a manga-type illustration, but for an illustrator's illustration list it downloads the full manga (you can adjust the condition that decides when to download the full set)
- logs in automatically with your account name and password
- after the first login it saves cookies locally so you don't have to log in every time
- can save a garage file, a list of the image IDs you have already downloaded, to avoid downloading images repeatedly (some ranklists don't change much from one day to the next)
- can also synchronize your garage file with your remote server (if you have one) so you don't download duplicates across different computers
- for an illustrator's illustration list, the artist ID must be provided; if the artist name is set to "?" it will be looked up on the website, and if the download page number is set to -1, all pages from that artist will be downloaded
- in some regions you need proxies to visit pixiv.net, so you can set proxies in config.properties
- config.properties contains most settings, so you don't need to edit the source code
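The exact keys in config.properties are not documented here; a plausible sketch, with key names assumed from the attribute list further down, might look like:

```properties
# Hypothetical config.properties sketch -- key names assumed from the attribute list
username=your account name
password=your account password
local_save_root=./%y.%m.%d
cookies_file=./cookies.txt
garage_file=./garage.txt
max_thread_num=8
# set to None (or leave unset) if you don't use a proxy
socks=127.0.0.1:8388
phantomjs=/usr/local/bin/phantomjs
```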
Function List
login(save_cookies=True)
set_value(value_name, value)
get_value(value_name, value)
save_garage(garage_file=None)
dl_tag(tag, pic_num, deep_into_manga=False, add_classname_in_path=True)
dl_artist(artist_id, pic_num, deep_into_manga=True, add_classname_in_path=True)
dl_bookmark(pic_num, deep_into_manga=True, add_classname_in_path=True)
dl_rank_global(pic_num, deep_into_manga=False, add_classname_in_path=True)
dl_rank_daily(pic_num, deep_into_manga=False, add_classname_in_path=True)
dl_rank_weekly(pic_num, deep_into_manga=False, add_classname_in_path=True)
dl_rank_original(pic_num, deep_into_manga=False, add_classname_in_path=True)
dl_rank_daily_r18(pic_num, deep_into_manga=False, add_classname_in_path=True)
...
dl_diy_urls(urls, pic_num, deep_into_manga=False, add_classname_in_path=True)
random_one_by_classfi(classi, label="")
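The garage file mentioned in the features list is a record of already-downloaded image IDs. A minimal, hypothetical sketch of the dedup logic it enables (the real file format and the internals of `save_garage` may differ):

```python
def load_garage(path):
    """Read downloaded image IDs, one per line (assumed format)."""
    try:
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        return set()

def filter_new(image_ids, garage):
    """Keep only the IDs that have not been downloaded before."""
    return [i for i in image_ids if i not in garage]

garage = {"111", "222"}
print(filter_new(["111", "333", "222", "444"], garage))  # → ['333', '444']
```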
Attribute List
username
password
local_save_root
garage_file
cookies_file
max_thread_num
socks: set to None if you don't use a proxy
phantomjs
firefox
chrome
Tips
- If the login process fails (because of reCAPTCHA), you can copy the cookies from your browser into cookies.txt in JSON format; pixiv_crawler will then log in with those cookies.
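The exact JSON schema pixiv_crawler expects in cookies.txt is not documented here; a common browser-export shape is a list of objects with `name`, `value`, and `domain` fields. A sketch, under that assumption, of writing and reading such a file (the cookie names and values are placeholders):

```python
import json
import os
import tempfile

# Hypothetical cookies.txt in JSON format -- mirrors a typical browser export;
# the real schema pixiv_crawler expects may differ.
cookies = [
    {"name": "PHPSESSID", "value": "your-session-id", "domain": ".pixiv.net"},
    {"name": "device_token", "value": "your-device-token", "domain": ".pixiv.net"},
]

path = os.path.join(tempfile.gettempdir(), "cookies.txt")
with open(path, "w") as f:
    json.dump(cookies, f, indent=2)

# Reading it back gives the same list of cookie objects.
with open(path) as f:
    loaded = json.load(f)
print(loaded[0]["name"])  # → PHPSESSID
```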
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
pixiv_crawler-0.1.2.tar.gz (11.9 kB)
Built Distribution
pixiv_crawler-0.1.2-py3-none-any.whl (12.2 kB)
File details
Details for the file pixiv_crawler-0.1.2.tar.gz
File metadata
- Download URL: pixiv_crawler-0.1.2.tar.gz
- Upload date:
- Size: 11.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0ab3877c16dbddd8c8a03b3b6e932574f8a625c765b5feed6200fd7f98cfa101
MD5 | 40d0f7d33a918dd9d9eb62fe28538ea2
BLAKE2b-256 | f820058276a0b185ff1854421824b07898febf3e9317e6c3d908bcd06821a943
File details
Details for the file pixiv_crawler-0.1.2-py3-none-any.whl
File metadata
- Download URL: pixiv_crawler-0.1.2-py3-none-any.whl
- Upload date:
- Size: 12.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 42908874f88bb0defd2ea4e49bd4610ea19ac34e3bcb611f9b8fd178b616be3c
MD5 | 148033173fd18d0ac4c83574ef57b593
BLAKE2b-256 | e529507f429ab51d6e26e3c506191ae1e18e09348bc3fb2743272c3397e41051