
Scrape and download from Piccoma.

Pyccoma

Directly scrape images from Piccoma.

Prerequisites

  • Python 3.8+

Setup

You can install Pyccoma with pip or manually from source.

Install from PyPI

$ pip install pyccoma

Install from source

  • First, get a copy of this project on your local machine by either downloading the zip file or cloning the repository:
$ git clone https://github.com/catsital/pyccoma.git
  • cd into the pyccoma directory.
  • Run python setup.py install to install the package.

Quickstart

Using the command-line utility

To download a single episode, simply use:

$ pyccoma https://piccoma.com/web/viewer/8195/1185884

You can also pass multiple links (separated by whitespace) to download in one go:

$ pyccoma https://piccoma.com/web/viewer/60171/1575237 https://piccoma.com/web/viewer/5796/332058 https://piccoma.com/web/viewer/13034/623225

See the Usage section below for more examples of how to aggregate and batch download using the command-line utility.

Using Python shell

Create a Scraper instance and use fetch to scrape and download images from a viewer page.

>>> from pyccoma import Scraper
>>> pyc = Scraper()
>>> pyc.fetch('https://piccoma.com/web/viewer/8195/1185884')

Title: ひげを剃るそして女子高生を拾う(しめさば ぶーた 足立いまる)
Episode: 第1話 失恋と女子高生 (1)
  |███████████████████████████████████████████████████████████| 100.0% (14/14)
Elapsed time: 00:00:17
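
The multi-link command-line call can be mirrored from the Python shell by looping over URLs. The following is a minimal sketch and not part of Pyccoma: fetch_all is a hypothetical helper that accepts any callable shaped like fetch(url) and, assuming that callable raises an exception on failure, records the failing URLs instead of stopping at the first one.

```python
from typing import Callable, Iterable, List, Tuple


def fetch_all(fetch: Callable[[str], object],
              urls: Iterable[str]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Call fetch(url) for every URL and return (succeeded, failed).

    Hypothetical helper, not part of Pyccoma. `fetch` is assumed to behave
    like Scraper.fetch: it takes a viewer URL and raises on failure.
    """
    succeeded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            fetch(url)
        except Exception as error:
            # Keep going so that one bad link does not abort the batch.
            failed.append((url, str(error)))
        else:
            succeeded.append(url)
    return succeeded, failed
```

With the Scraper instance from above, this would be called as fetch_all(pyc.fetch, [first_url, second_url]).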

You can use login to access rental or paywalled episodes from your own library. Only the email login method is supported at the moment.

>>> pyc.login(email, password)
>>> pyc.fetch('https://piccoma.com/web/viewer/s/4995/1217972')

Title: かぐや様は告らせたい天才たちの恋愛頭脳戦(赤坂アカ)
Episode: 第135話
  |███████████████████████████████████████████████████████████| 100.0% (20/20)
Elapsed time: 00:00:23

Images are stored under the path given in fetch(url, path). A fixed directory tree like the one below is created, with subfolders for product_name and episode_name. A relative path is converted to an absolute one. If no path is supplied, an extract folder is created automatically in the current working directory.

.
└───<extract>
    └───<product_name>
        └───<episode_name>
            ├───1.png
            ├───2.png
            └───...
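
The layout above can be sketched with the standard library. This is an illustration under assumptions, not Pyccoma's own code: episode_dir and image_path are hypothetical helpers, and the real implementation may sanitize names or resolve paths differently.

```python
import os
from pathlib import Path


def episode_dir(product_name: str, episode_name: str,
                path: str = "extract") -> Path:
    """Return the absolute folder an episode's images would be saved in.

    Hypothetical helper mirroring the tree above. The default "extract"
    matches the folder name described in the text.
    """
    # os.path.abspath anchors a relative path at the current working
    # directory and leaves an absolute path unchanged.
    base = Path(os.path.abspath(path))
    return base / product_name / episode_name


def image_path(folder: Path, page: int, fmt: str = "png") -> Path:
    """Build the file path for one page; pages are numbered from 1."""
    return folder / f"{page}.{fmt}"
```

For example, image_path(episode_dir("product", "episode"), 1) ends in extract/product/episode/1.png under the current working directory.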

Usage

Using filter to aggregate and download in batch

  • Downloading all free episodes in a single product page:
$ pyccoma https://piccoma.com/web/product/67171/episodes?etype=E --filter all --include is_free
  • Downloading a custom range of episodes:
$ pyccoma https://piccoma.com/web/product/16070/episodes?etype=E --filter custom --range 1 5
  • Downloading all free episodes across multiple products:
$ pyccoma https://piccoma.com/web/product/5523/episodes?etype=E https://piccoma.com/web/product/23019/episodes?etype=E --filter all --include is_free --exclude is_limited_free
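
To make the four --filter modes concrete, here is a minimal sketch of how they could select items from an ordered episode list. apply_filter is a hypothetical function, not Pyccoma's implementation. It follows the descriptions in the Options section (min is the first item, max the last, all everything, custom an index range, and a given range overrides the filter), and it assumes Python slice semantics for the range (start inclusive, end exclusive), which may differ from how the tool actually counts.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def apply_filter(items: Sequence[T], mode: str = "all",
                 index_range: Optional[Tuple[int, int]] = None) -> List[T]:
    """Select items according to a --filter style mode (hypothetical)."""
    if index_range is not None:
        # A supplied range takes precedence over the chosen mode.
        mode = "custom"
    if mode == "min":
        return list(items[:1])    # first item, or [] if there are none
    if mode == "max":
        return list(items[-1:])   # last item, or [] if there are none
    if mode == "all":
        return list(items)
    if mode == "custom":
        if index_range is None:
            raise ValueError("custom filter needs a (start, end) range")
        start, end = index_range
        # ASSUMPTION: slice semantics, start inclusive and end exclusive.
        return list(items[start:end])
    raise ValueError(f"unknown filter: {mode}")
```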

Accessing your library

  • Downloading all your purchased items:
$ pyccoma purchase --filter all --include is_purchased --email foo@bar.com
>>> purchase = pyc.get_purchase().values()
>>> purchase = [episode['url'] for title in purchase for episode in pyc.get_list(title).values() if episode['is_purchased']]
>>> for episode in purchase:
...     pyc.fetch(episode)

Title: かぐや様は告らせたい天才たちの恋愛頭脳戦(赤坂アカ)
Episode: 第135話
  |███████████████████████████████████████████████████████████| 100.0% (20/20)
Elapsed time: 00:00:23

Title: かぐや様は告らせたい天才たちの恋愛頭脳戦(赤坂アカ)
Episode: 第136話
  |███████████████████████████████████████████████████████████| 100.0% (22/22)
Elapsed time: 00:00:25
  • Downloading the most recent episodes you have read from your history:
$ pyccoma history --filter max --include "is_limited_read|(is_already_read&is_free)" --email foo@bar.com
>>> bookmark = pyc.get_bookmark().values()
>>> bookmark = [[episode['url'] for episode in pyc.get_list(title).values()
...     if episode['is_limited_read'] or (episode['is_already_read'] and episode['is_free'])] for title in bookmark]
>>> bookmark = [episode[-1] for episode in bookmark if episode]
>>> for episode in bookmark:
...     pyc.fetch(episode)

Title: ひげを剃るそして女子高生を拾う(しめさば ぶーた 足立いまる)
Episode: 第2話 生活と遠慮 (1)
  |███████████████████████████████████████████████████████████| 100.0% (16/16)
Elapsed time: 00:00:18

Title: 悪女はマリオネット(Manggle hanirim)
Episode: 第29話
  |███████████████████████████████████████████████████████████| 100.0% (98/98)
Elapsed time: 00:01:38
  • Downloading latest unread episodes using free pass (if available) from your bookmarks:
$ pyccoma bookmark --filter min --include is_limited_free --exclude is_already_read --email foo@bar.com
>>> bookmark = pyc.get_bookmark().values()
>>> next_free_episodes = [[episode['url'] for episode in pyc.get_list(title).values()
...     if episode['is_limited_free'] and not episode['is_already_read']] for title in bookmark]
>>> next_free_episodes = [episode[0] for episode in next_free_episodes if episode]
>>> for episode in next_free_episodes:
...     pyc.fetch(episode, 'piccoma')

Title: 悪女はマリオネット(Manggle hanirim)
Episode: 第30話
  |███████████████████████████████████████████████████████████| 100.0% (95/95)
Elapsed time: 00:01:45

Narrowing down the results

You can narrow the results using the include and exclude options in the command-line utility. Both options take statuses as arguments. For a clearer picture of what these statuses correspond to, see the figure below.

[Figure: s_piccoma_web_episode_etypeE]

{
  1437090: {'title': '第1話 (1)', 'url': 'https://piccoma.com/web/viewer/54715/1437090', 'is_free': True, 'is_limited_read': False, 'is_already_read': True, 'is_limited_free': False, 'is_purchased': False},
  1437091: {'title': '第1話 (2)', 'url': 'https://piccoma.com/web/viewer/54715/1437091', 'is_free': True, 'is_limited_read': False, 'is_already_read': True, 'is_limited_free': False, 'is_purchased': False},
  1437092: {'title': '第2話 (1)', 'url': 'https://piccoma.com/web/viewer/54715/1437092', 'is_free': True, 'is_limited_read': False, 'is_already_read': True, 'is_limited_free': False, 'is_purchased': False},
  1437093: {'title': '第2話 (2)', 'url': 'https://piccoma.com/web/viewer/54715/1437093', 'is_free': False, 'is_limited_read': False, 'is_already_read': True, 'is_limited_free': True, 'is_purchased': False},
  1437094: {'title': '第3話 (1)', 'url': 'https://piccoma.com/web/viewer/54715/1437094', 'is_free': False, 'is_limited_read': False, 'is_already_read': True, 'is_limited_free': True, 'is_purchased': False},
  1437095: {'title': '第3話 (2)', 'url': 'https://piccoma.com/web/viewer/54715/1437095', 'is_free': False, 'is_limited_read': True, 'is_already_read': True, 'is_limited_free': False, 'is_purchased': False},
  1437096: {'title': '第4話 (1)', 'url': 'https://piccoma.com/web/viewer/54715/1437096', 'is_free': False, 'is_limited_read': True, 'is_already_read': True, 'is_limited_free': False, 'is_purchased': False},
  1437097: {'title': '第4話 (2)', 'url': 'https://piccoma.com/web/viewer/54715/1437097', 'is_free': False, 'is_limited_read': False, 'is_already_read': False, 'is_limited_free': True, 'is_purchased': False}
}
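
As an illustration of how a single include status and a single exclude status narrow such a dictionary, here is a small sketch. select is a hypothetical function, not part of Pyccoma, and the sample data is a trimmed copy of three entries from the listing above with only the status keys kept.

```python
from typing import Dict, Optional

Episode = Dict[str, bool]


def select(episodes: Dict[int, Episode],
           include: Optional[str] = None,
           exclude: Optional[str] = None) -> Dict[int, Episode]:
    """Keep episodes whose include status is True and exclude status is False."""
    return {
        episode_id: episode
        for episode_id, episode in episodes.items()
        if (include is None or episode[include])
        and (exclude is None or not episode[exclude])
    }


# Trimmed copy of three entries from the listing above.
episodes = {
    1437092: {'is_free': True, 'is_limited_read': False,
              'is_already_read': True, 'is_limited_free': False},
    1437093: {'is_free': False, 'is_limited_read': False,
              'is_already_read': True, 'is_limited_free': True},
    1437097: {'is_free': False, 'is_limited_read': False,
              'is_already_read': False, 'is_limited_free': True},
}

# Same conditions as: --include is_limited_free --exclude is_already_read
unread_free = select(episodes, include='is_limited_free',
                     exclude='is_already_read')
print(list(unread_free))  # prints [1437097]
```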

Options

Required

Option | Description | Examples
url | Must be a valid Piccoma URL or one of the library keywords | https://piccoma.com/web/product/4995/episodes?etype=V, https://piccoma.com/web/product/12482/episodes?etype=E, https://piccoma.com/web/viewer/12482/631201, bookmark, history, purchase

Optional

Option | Description | Examples
-o, --output | Local directory to save downloaded images | D:/piccoma/ (absolute path), /piccoma/download/ (relative path)
-f, --format | Image format | jpeg, jpg, bmp, png (default)
--omit-author | Omit author names from titles |

Login

Option | Description | Examples
--email | Your registered email address; this does not support OAuth authentication | foo@bar.com

Filter

Option | Description | Examples
--etype | Preferred episode type to scrape for manga and smartoon when scraping history, bookmark, or purchase; takes two arguments, the first for manga and the second for smartoon | volume to scrape volumes, episode to scrape episodes
--filter | Filter to use when scraping manga and smartoon from a product page or your library | min, max, all, or custom by defining --range. Use min to scrape the first item, max the last item, all every item, and custom a specific index range
--range | Range to use when scraping manga and smartoon episodes; takes two arguments, start and end; always sets --filter to custom, whether or not --filter is given | 0 10 will scrape the first up to the ninth episode
--include | Status arguments to include when parsing a library or product; accepts the | and & operators as conditionals, see use cases above | is_purchased, is_free, is_already_read, is_limited_read, is_limited_free
--exclude | Status arguments to exclude when parsing a library or product; accepts the | and & operators as conditionals, see use cases above | is_purchased, is_free, is_already_read, is_limited_read, is_limited_free
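
The | and & conditionals can be read as "or" and "and" over an episode's statuses. The sketch below shows one way such an expression could be evaluated safely; it is not how Pyccoma necessarily parses it. matches is a hypothetical function that reuses Python's own expression grammar through the ast module, so & binds tighter than | (Python precedence, an assumption about the tool) and anything other than names, |, & and parentheses is rejected.

```python
import ast
from typing import Mapping


def matches(expression: str, statuses: Mapping[str, bool]) -> bool:
    """Evaluate a status expression such as 'a|(b&c)' for one episode.

    Hypothetical helper: only names, |, & and parentheses are accepted.
    """
    def evaluate(node: ast.AST) -> bool:
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Name):
            # Look the status up in the episode; unknown names raise KeyError.
            return bool(statuses[node.id])
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
            return evaluate(node.left) or evaluate(node.right)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitAnd):
            return evaluate(node.left) and evaluate(node.right)
        raise ValueError(f"unsupported syntax in {expression!r}")

    return evaluate(ast.parse(expression, mode="eval"))


episode = {'is_limited_read': False, 'is_already_read': True, 'is_free': True}
print(matches('is_limited_read|(is_already_read&is_free)', episode))  # prints True
```

Walking the syntax tree instead of calling eval keeps arbitrary code in the expression from being executed.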

License

See LICENSE for details.

Disclaimer

Pyccoma was made for the sole purpose of helping users download media from Piccoma for offline consumption. It is for private use only; do not use this tool to promote piracy.
