shopify-scrape
Python module to scrape Shopify store URLs
Project description
Installation
pip install shopify_scrape
Usage
Extracts JSON data for a given URL.
python -m shopify_scrape.extract url -h
usage: extract.py url [-h] [-d DEST_PATH] [-p PAGE_RANGE [PAGE_RANGE ...]]
                      [-c] [-f FILE_PATH]
                      url

positional arguments:
  url                   URL to extract. An attempt will be made to fix
                        improperly formatted URLs.

optional arguments:
  -h, --help            show this help message and exit
  -d DEST_PATH, --dest_path DEST_PATH
                        Destination folder for extracted files. If
                        subdirectories are present, they will be created if
                        they do not exist. Defaults to the current
                        directory './'
  -p PAGE_RANGE [PAGE_RANGE ...], --page_range PAGE_RANGE [PAGE_RANGE ...]
                        Inclusive page range to extract, given as two
                        integers. There are 30 items per page. If not
                        provided, all pages with products will be taken.
  -c, --collections     If set, extracts '/collections.json' instead of
                        '/products.json'
  -f FILE_PATH, --file_path FILE_PATH
                        File path to write. Defaults to
                        '[dest_path]/[url].products' or
                        '[dest_path]/[url].collections'
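Under the hood, Shopify stores expose paginated product data at the '/products.json' endpoint, which is what the extractor pulls. A minimal sketch of that request pattern (the store domain and helper names here are illustrative, not part of this package's API):

```python
import json
import urllib.request


def products_url(store, page=1):
    """Build the paginated products endpoint for a store domain."""
    return f"https://{store}/products.json?page={page}"


def fetch_products(store, page=1):
    """Fetch one page of products and return the decoded 'products' list."""
    with urllib.request.urlopen(products_url(store, page)) as resp:
        return json.load(resp).get("products", [])


# Example (hypothetical store domain; requires network access):
# fetch_products("example-store.myshopify.com", page=1)
```

Each page holds up to 30 items, which is why the `-p/--page_range` option works in page units.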
Extracts JSON data for URLs listed in a specified column of a CSV file.
python -m shopify_scrape.extract batch -h
usage: extract.py batch [-h] [-d DEST_PATH] [-p PAGE_RANGE [PAGE_RANGE ...]]
                        [-c] [-r ROW_RANGE [ROW_RANGE ...]] [-l [LOG]]
                        urls_file_path url_column

positional arguments:
  urls_file_path        File path of the CSV file containing URLs to extract.
  url_column            Name of the unique column with URLs.

optional arguments:
  -h, --help            show this help message and exit
  -d DEST_PATH, --dest_path DEST_PATH
                        Destination folder for extracted files. If
                        subdirectories are present, they will be created if
                        they do not exist. Defaults to the current
                        directory './'
  -p PAGE_RANGE [PAGE_RANGE ...], --page_range PAGE_RANGE [PAGE_RANGE ...]
                        Inclusive page range to extract, given as two
                        integers. There are 30 items per page. If not
                        provided, all pages with products will be taken.
  -c, --collections     If set, extracts '/collections.json' instead of
                        '/products.json'
  -r ROW_RANGE [ROW_RANGE ...], --row_range ROW_RANGE [ROW_RANGE ...]
                        Inclusive row range, given as two positive integers
                        with the second greater than or equal to the first.
  -l [LOG], --log [LOG]
                        File path of the log file. If not provided, the log
                        file is named logs/[unix_time_in_seconds]_log.csv;
                        the 'logs' folder is created if it does not exist.
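Batch mode pulls its URLs out of one named column of the CSV file. The reading pattern is roughly the following sketch (file contents and column names here are made up for illustration; this is not the module's internal code):

```python
import csv
import io


def read_url_column(csv_text, url_column):
    """Collect the values of the named column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[url_column] for row in reader]


# A tiny CSV with a 'url' column, as batch mode expects:
sample = "name,url\nShop A,a.myshopify.com\nShop B,b.myshopify.com\n"
urls = read_url_column(sample, "url")
# urls == ["a.myshopify.com", "b.myshopify.com"]
```

The `-r/--row_range` option then restricts extraction to a slice of those rows.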
Project details
Download files
Source Distribution
  shopify-scrape-0.0.5.tar.gz (8.0 kB)

Built Distribution
  shopify_scrape-0.0.5-py3-none-any.whl (10.6 kB)
File details

Details for the file shopify-scrape-0.0.5.tar.gz

File metadata
- Download URL: shopify-scrape-0.0.5.tar.gz
- Upload date:
- Size: 8.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.7.5

File hashes
Algorithm | Hash digest
--- | ---
SHA256 | 557179732777efcccb086a472df0c0d9d0b4a4e3b5af259e9c3fcd95cb5d7e2d
MD5 | 2ca25f088cb06c1724816e8cde012e5a
BLAKE2b-256 | 1dfc75df2176d4acb2a7b6ebb2eebf84ee3521661576fba6c890a5e0f73c2059
File details

Details for the file shopify_scrape-0.0.5-py3-none-any.whl

File metadata
- Download URL: shopify_scrape-0.0.5-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.7.5

File hashes
Algorithm | Hash digest
--- | ---
SHA256 | b7ae78b4efd183bb9be5ecda53f465399b2e374518d3632539a278e4bcdc665c
MD5 | 9a5caf5c4c80d4f3fce30031b304a5c4
BLAKE2b-256 | c7933aec28c50dbd9da329835c7f0880039c853ee92effe0525d7593d9bdd237
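The digests above can be checked against a downloaded archive with Python's standard library. A minimal sketch (the file path is illustrative):

```python
import hashlib


def sha256_of(path):
    """Compute the SHA256 hex digest of a file, streaming it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare the result against the published SHA256 digest, e.g.:
# sha256_of("shopify-scrape-0.0.5.tar.gz") == "557179..."
```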