Fetch public domain artwork from https://www.artvee.com
artvee-scraper
artvee-scraper is an easy-to-use command-line utility for fetching public domain artwork from https://www.artvee.com.
Installation
Using PyPI
$ python -m pip install artvee-scraper
Python 3.8+ is officially supported.
Synopsis
artvee-scraper <command> [optional arguments] [positional arguments]
Examples
View help
$ artvee-scraper -h
usage: artvee-scraper [-h] {log-json,file-json,file-multi} ...

Scrape artwork from https://www.artvee.com

positional arguments:
  {log-json,file-json,file-multi}
    log-json            Artwork is output to the log as a JSON object
    file-json           Artwork is represented as a JSON object and written to a file
    file-multi          Artwork image and metadata are written as separate files

optional arguments:
  -h, --help            show this help message and exit
View help for the file-json command
$ artvee-scraper file-json -h
usage: artvee-scraper file-json [-h] [-t [1-16]] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                                [-c {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}]
                                [--log-dir LOG_DIR] [--log-max-size [1-10240]] [--log-max-backups [0-100]]
                                [--space-level [2-6]] [--sort-keys] [--overwrite-existing]
                                dir_path

positional arguments:
  dir_path              JSON file output directory

optional arguments:
  -h, --help            show this help message and exit
  -t [1-16], --worker-threads [1-16]
                        Number of worker threads (1-16)
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the application log level
  -c {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}, --category {abstract,figurative,landscape,religion,mythology,posters,animals,illustration,still-life,botanical,drawings,asian-art}
                        Category of artwork to scrape
  --space-level [2-6]   Enable pretty-printing; number of spaces to indent (2-6)
  --sort-keys           Sort JSON keys in alphabetical order
  --overwrite-existing  Overwrite existing files

optional log file arguments:
  --log-dir LOG_DIR     Log file output directory
  --log-max-size [1-10240]
                        Maximum log file size in MB (1-10,240)
  --log-max-backups [0-100]
                        Maximum number of log files to keep (0-100)
Download artwork from artvee.com and save each piece as an individual file (JSON format) in the directory ~/artvee/downloads
$ artvee-scraper file-json ~/artvee/downloads
Available Commands
log-json
Download artwork and output each piece to the log as a JSON object. Note: This command is intended for development and testing; dumping the data to the log is usually not desirable.
$ artvee-scraper log-json [optional arguments]
Optional arguments
-h | --help
(boolean) Display help message.
-t | --worker-threads
(integer) The number of worker threads used for processing. Range of values is [1-16]. The default value is 3.
-l | --log-level
(string) Application log level. One of: DEBUG, INFO, WARNING, ERROR, CRITICAL. The default value is INFO.
-c | --category
(string) Category of artwork to fetch. One of: abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art. May be used repeatedly to specify multiple categories (-c animals -c drawings). The default value is ALL categories.
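When the scraper is driven from a script, the repeated -c flag described above can be assembled programmatically. The following is a minimal sketch, not part of artvee-scraper itself: the helper name build_log_json_command is hypothetical, and only flags documented in this section are used.

```python
import shlex
from typing import Iterable, List


def build_log_json_command(categories: Iterable[str], worker_threads: int = 3) -> List[str]:
    """Build an argv list for `artvee-scraper log-json` (hypothetical helper)."""
    argv = ["artvee-scraper", "log-json", "--worker-threads", str(worker_threads)]
    for category in categories:
        # One -c flag per category; the values are separate arguments, not comma-joined.
        argv += ["-c", category]
    return argv


command = build_log_json_command(["animals", "drawings"])
# Print the shell-quoted form of the command for inspection.
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) on a machine where artvee-scraper is installed.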
Optional log file arguments
--log-dir
(string) Path to existing directory used to store artvee_scraper.log log files. Disabled by default.
--log-max-size
(integer) Maximum size in MB the log file should reach before triggering a rollover. Only applies if --log-dir has been specified. Range of values is [1-10240]. The default value is 1024 MB (1 GB).
--log-max-backups
(integer) Maximum number of log file archives to keep. Only applies if --log-dir has been specified. The actively written file is artvee_scraper.log. Backup files will have an incrementing numerical suffix; artvee_scraper.log.1 ... artvee_scraper.log.N. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is 10.
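The rollover behavior described above (numbered backups, rollover disabled when the backup count is zero) matches Python's standard logging.handlers.RotatingFileHandler. Whether artvee-scraper uses that handler internally is an assumption; the sketch below only demonstrates the naming scheme, with a tiny byte limit standing in for the CLI's megabyte limit. The function name demo_rollover and the logger name are hypothetical.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path
from typing import List


def demo_rollover(log_dir: Path, max_bytes: int, backups: int) -> List[str]:
    """Write enough records to force rollovers; return the resulting file names."""
    handler = RotatingFileHandler(
        log_dir / "artvee_scraper.log", maxBytes=max_bytes, backupCount=backups
    )
    logger = logging.getLogger("rollover_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for i in range(50):
            logger.info("record %03d %s", i, "x" * 40)
    finally:
        # Detach and close the handler so the files can be removed afterwards.
        logger.removeHandler(handler)
        handler.close()
    return sorted(path.name for path in log_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    rotated_files = demo_rollover(Path(tmp), max_bytes=200, backups=2)
print(rotated_files)
```

With two backups allowed, the directory ends up holding the active file plus the .1 and .2 archives; older archives are discarded.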
Optional writer arguments
--space-level
(integer) Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.
--sort-keys
(boolean) Sort JSON keys in alphabetical order. Disabled by default.
--include-image
(boolean) Image will be included in output. Excessive output warning! Disabled by default.
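The effect of --space-level and --sort-keys can be previewed with Python's json module, whose indent and sort_keys parameters behave the way these options are described. That the tool maps its flags onto json.dumps is an assumption; the record below reuses the field values from the examples in this section.

```python
import json

record = {
    "url": "https://artvee.com/dl/study-for-old-canal-red-green/",
    "title": "Study for Old Canal (Red & Green)",
    "category": "Abstract",
    "artist": "Oscar Bluemner",
    "date": "1916",
    "origin": "American, 1867-1938",
}

# Default: a single line, keys in insertion order.
compact = json.dumps(record)

# Comparable to --space-level 2 --sort-keys: indented, keys alphabetical.
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```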
Basic Example
$ artvee-scraper log-json
Output:
...
2038-01-19 03:14:07.988 DEBUG [ThreadPoolExecutor-0_0] scraper._image_link_from(120) | Retrieving image download link from URL https://artvee.com/dl/study-for-old-canal-red-green/
2038-01-19 03:14:07.989 DEBUG [ThreadPoolExecutor-0_0] connectionpool._new_conn(1001) | Starting new HTTPS connection (1): artvee.com:443
2038-01-19 03:14:07.999 INFO [ThreadPoolExecutor-0_0] log_writer.write(44) | {"url": "https://artvee.com/dl/study-for-old-canal-red-green/", "title": "Study for Old Canal (Red & Green)", "category": "Abstract", "artist": "Oscar Bluemner", "date": "1916", "origin": "American, 1867-1938"}
...
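Log lines in the shape shown above can be split into fields with a regular expression. This is a minimal sketch: the pattern is inferred from the sample output only, not from the tool's logging configuration, and the names LOG_LINE and parse_log_line are hypothetical. The sample line is abbreviated to two JSON fields.

```python
import json
import re
from typing import Optional

# Pattern inferred from the sample output:
# timestamp, level, [thread], module.function(line), " | ", message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<location>\S+) \| "
    r"(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if the line does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    entry = match.groupdict()
    entry["payload"] = None
    if entry["message"].startswith("{"):
        try:
            # Single-line artwork records are JSON objects.
            entry["payload"] = json.loads(entry["message"])
        except json.JSONDecodeError:
            pass  # e.g. the first line of a pretty-printed record
    return entry


sample = (
    '2038-01-19 03:14:07.999 INFO [ThreadPoolExecutor-0_0] log_writer.write(44) | '
    '{"title": "Study for Old Canal (Red & Green)", "artist": "Oscar Bluemner"}'
)
entry = parse_log_line(sample)
print(entry["level"], entry["payload"]["artist"])
```

Records written with --space-level span several lines, so a line-by-line parser like this one only recovers the payload from single-line output.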
Advanced Example
$ artvee-scraper log-json --worker-threads 2 --log-level DEBUG --category abstract --log-dir /var/log/artvee --log-max-size 2048 --log-max-backups 10 --space-level 2 --sort-keys --include-image
Output:
$ cat /var/log/artvee/artvee_scraper.log
...
2038-01-19 03:14:07.988 DEBUG [ThreadPoolExecutor-0_0] scraper._image_link_from(120) | Retrieving image download link from URL https://artvee.com/dl/study-for-old-canal-red-green/
2038-01-19 03:14:07.989 DEBUG [ThreadPoolExecutor-0_0] connectionpool._new_conn(1001) | Starting new HTTPS connection (1): artvee.com:443
2038-01-19 03:14:07.999 INFO [ThreadPoolExecutor-0_0] log_writer.write(44) | {
  "artist": "Oscar Bluemner",
  "category": "Abstract",
  "date": "1916",
  "image": "/9j/4AAQSkZJRgABA ... o4xSSSVkumh//9k=",
  "origin": "American, 1867-1938",
  "title": "Study for Old Canal (Red & Green)",
  "url": "https://artvee.com/dl/study-for-old-canal-red-green/"
}
...
file-json
Download artwork and write each to the filesystem. Each artwork is stored as a JSON object.
$ artvee-scraper file-json [optional arguments] <dir_path>
Positional arguments
dir_path
(string) Position 0. Path to existing directory used to store output files.
Optional arguments
-h | --help
(boolean) Display help message.
-t | --worker-threads
(integer) The number of worker threads used for processing. Range of values is [1-16]. The default value is 3.
-l | --log-level
(string) Application log level. One of: DEBUG, INFO, WARNING, ERROR, CRITICAL. The default value is INFO.
-c | --category
(string) Category of artwork to fetch. One of: abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art. May be used repeatedly to specify multiple categories (-c animals -c drawings). The default value is ALL categories.
Optional log file arguments
--log-dir
(string) Path to existing directory used to store artvee_scraper.log log files. Disabled by default.
--log-max-size
(integer) Maximum size in MB the log file should reach before triggering a rollover. Only enabled if --log-dir has been specified. Range of values is [1-10240]. The default value is 1024 MB (1 GB).
--log-max-backups
(integer) Maximum number of log file archives to keep. Only enabled if --log-dir has been specified. The actively written file is artvee_scraper.log. Backup files will have an incrementing numerical suffix; artvee_scraper.log.1 ... artvee_scraper.log.N. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is 10.
Optional writer arguments
--space-level
(integer) Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.
--sort-keys
(boolean) Sort JSON keys in alphabetical order. Disabled by default.
--overwrite-existing
(boolean) Allow existing duplicate files to be overwritten. Disabled by default.
Basic Example
$ artvee-scraper file-json ~/artvee/downloads
Output:
$ cat ~/artvee/downloads/peter-nicolai-arbo-the-valkyrie.json
{"url": "https://artvee.com/dl/the-valkyrie-2/", "title": "The Valkyrie", "category": "Mythology", "artist": "Peter Nicolai Arbo", "date": "1869", "origin": "Norwegian, 1831–1892", "image": "/9j/4AAQSkZJRgABA ... o4xSSSVkumh//9k="}
Advanced Example
$ artvee-scraper file-json --worker-threads 1 --log-level INFO --category mythology --log-dir /var/log/artvee --log-max-size 512 --log-max-backups 10 --space-level 4 --sort-keys --overwrite-existing ~/artvee/downloads
Output:
$ cat ~/artvee/downloads/peter-nicolai-arbo-the-valkyrie.json
{
    "artist": "Peter Nicolai Arbo",
    "category": "Mythology",
    "date": "1869",
    "image": "/9j/4AAQSkZJRgABA ... o4xSSSVkumh//9k=",
    "origin": "Norwegian, 1831–1892",
    "title": "The Valkyrie",
    "url": "https://artvee.com/dl/the-valkyrie-2/"
}
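The "image" value in the file-json output appears to be a base64-encoded JPEG: the prefix "/9j/" decodes to the bytes FF D8 FF, the JPEG start marker. Treating the field that way is an inference from the sample output, not something the documentation states. Under that assumption, a record's image can be recovered as sketched below; decode_image is a hypothetical helper, and the demo record uses a few stand-in bytes rather than real image data.

```python
import base64

JPEG_MAGIC = b"\xff\xd8\xff"  # bytes at the start of every JPEG file


def decode_image(record: dict) -> bytes:
    """Decode the base64 "image" field of a record and check the JPEG marker."""
    image_bytes = base64.b64decode(record["image"])
    if not image_bytes.startswith(JPEG_MAGIC):
        raise ValueError("decoded data does not start with the JPEG marker")
    return image_bytes


# Stand-in record: a fake payload that merely starts with the JPEG marker.
fake_payload = JPEG_MAGIC + b"\xe0 not a real image"
demo_record = {
    "title": "The Valkyrie",
    "image": base64.b64encode(fake_payload).decode("ascii"),
}
image_bytes = decode_image(demo_record)
print(len(image_bytes), demo_record["image"][:4])
```

For a real record, load the file with json.load and write the returned bytes to a .jpg file opened in binary mode.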
file-multi
Download artwork and write each to the filesystem. Each artwork is stored as two files: metadata (JSON) & image (JPG).
$ artvee-scraper file-multi [optional arguments] <metadata_dir_path> <image_dir_path>
Positional arguments
metadata_dir_path
(string) Position 0. Path to existing directory used to store output metadata files.
image_dir_path
(string) Position 1. Path to existing directory used to store output image files.
Optional arguments
-h | --help
(boolean) Display help message.
-t | --worker-threads
(integer) The number of worker threads used for processing. Range of values is [1-16]. The default value is 3.
-l | --log-level
(string) Application log level. One of: DEBUG, INFO, WARNING, ERROR, CRITICAL. The default value is INFO.
-c | --category
(string) Category of artwork to fetch. One of: abstract, figurative, landscape, religion, mythology, posters, animals, illustration, still-life, botanical, drawings, asian-art. May be used repeatedly to specify multiple categories (-c animals -c drawings). The default value is ALL categories.
Optional log file arguments
--log-dir
(string) Path to existing directory used to store artvee_scraper.log log files. Disabled by default.
--log-max-size
(integer) Maximum size in MB the log file should reach before triggering a rollover. Only enabled if --log-dir has been specified. Range of values is [1-10240]. The default value is 1024 MB (1 GB).
--log-max-backups
(integer) Maximum number of log file archives to keep. Only enabled if --log-dir has been specified. The actively written file is artvee_scraper.log. Backup files will have an incrementing numerical suffix; artvee_scraper.log.1 ... artvee_scraper.log.N. If this value is zero, rollovers will be disabled. Range of values is [0-100]. The default value is 10.
Optional writer arguments
--space-level
(integer) Pretty print JSON; number of spaces to indent. Range of values is [2-6]. Disabled by default.
--sort-keys
(boolean) Sort JSON keys in alphabetical order. Disabled by default.
--overwrite-existing
(boolean) Allow existing duplicate files to be overwritten. Disabled by default.
Basic Example
$ artvee-scraper file-multi ~/artvee/downloads/metadata ~/artvee/downloads/images
Output:
$ cat ~/artvee/downloads/metadata/peter-nicolai-arbo-the-valkyrie.json
{"url": "https://artvee.com/dl/the-valkyrie-2/", "title": "The Valkyrie", "category": "Mythology", "artist": "Peter Nicolai Arbo", "date": "1869", "origin": "Norwegian, 1831–1892"}
$ cat ~/artvee/downloads/images/peter-nicolai-arbo-the-valkyrie.jpg
<FF><D8><FF><E0>^@^PJFIF^@^A^A^A^A,^A,^@^@<FF><E1><D5>$Exif^@^@II*^@^
...
^<X-nA2_vއ%6gS`QErVOOqk;R,u{w9~onDbsEWQ㿟xyr
Advanced Example
$ artvee-scraper file-multi --worker-threads 1 --log-level INFO --category mythology --log-dir /var/log/artvee --log-max-size 512 --log-max-backups 10 --space-level 2 --sort-keys --overwrite-existing ~/artvee/downloads/metadata ~/artvee/downloads/images
Output:
$ cat ~/artvee/downloads/metadata/peter-nicolai-arbo-the-valkyrie.json
{
  "artist": "Peter Nicolai Arbo",
  "category": "Mythology",
  "date": "1869",
  "origin": "Norwegian, 1831–1892",
  "title": "The Valkyrie",
  "url": "https://artvee.com/dl/the-valkyrie-2/"
}
$ cat ~/artvee/downloads/images/peter-nicolai-arbo-the-valkyrie.jpg
<FF><D8><FF><E0>^@^PJFIF^@^A^A^A^A,^A,^@^@<FF><E1><D5>$Exif^@^@II*^@^
...
^<X-nA2_vއ%6gS`QErVOOqk;R,u{w9~onDbsEWQ㿟xyr
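In the file-multi examples above, the metadata file and the image file share a base name and differ only in extension. Assuming that naming holds for every artwork (the examples show it for a single piece only), the two output directories can be cross-checked as sketched below. The helper match_by_stem and the unmatched file name are hypothetical.

```python
from pathlib import PurePath
from typing import Dict, Iterable, List, Tuple


def match_by_stem(
    metadata_names: Iterable[str], image_names: Iterable[str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Pair each .json name with the .jpg of the same stem; list unmatched metadata."""
    images: Dict[str, str] = {
        PurePath(name).stem: name
        for name in image_names
        if name.lower().endswith(".jpg")
    }
    pairs: List[Tuple[str, str]] = []
    missing: List[str] = []
    for name in sorted(metadata_names):
        if not name.lower().endswith(".json"):
            continue  # ignore unrelated files
        image = images.get(PurePath(name).stem)
        if image is None:
            missing.append(name)
        else:
            pairs.append((name, image))
    return pairs, missing


pairs, missing = match_by_stem(
    ["peter-nicolai-arbo-the-valkyrie.json", "hypothetical-unmatched.json"],
    ["peter-nicolai-arbo-the-valkyrie.jpg"],
)
print(pairs)
print(missing)
```

In practice the two name lists would come from os.listdir on the metadata and image directories passed to file-multi.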
File details
Details for the file artvee-scraper-3.0.1.tar.gz.
File metadata
- Download URL: artvee-scraper-3.0.1.tar.gz
- Upload date:
- Size: 17.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3b22ef2718813b8ec810b853ac1b03f4b612f22d4a4099842de140a29f4e8b98
MD5 | 4c0fa82aa46833a81c24a6745553f6a4
BLAKE2b-256 | 4b4fe245f6e5ff7d1b319b392f4a5f27b7649281e552dbbaee93dcd4cb2b370a
File details
Details for the file artvee_scraper-3.0.1-py3-none-any.whl.
File metadata
- Download URL: artvee_scraper-3.0.1-py3-none-any.whl
- Upload date:
- Size: 20.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | e179630322c21a25552c71eec42d922f655d395d49d708afa9812dbcd1d1c386
MD5 | ff0a5fcbfc903c42de1fa9fe8df2eef2
BLAKE2b-256 | 1f3a450e847729e28a99c45157c884cace3fb585bb5db183556dd10155863226