Structured data on cloud servers collected by sparecores-crawler.

Spare Cores Data

Project Status: Beta | Maintenance Status: Active | License: CC-BY-SA 4.0 | NGI Search Open Call 3 beneficiary

SC Data is a Python package with related tooling that uses sparecores-crawler to pull and standardize data on cloud compute resources. This repository runs the crawler every 5 minutes to update spot prices, and every hour to update all cloud resources, writing the results to both an internal SCD table and a public SQLite snapshot.

Installation

Stable version from PyPI:

pip install sparecores-data

Most recent version from GitHub:

pip install "sparecores-data @ git+https://git@github.com/SpareCores/sc-data.git"

Usage

For easy access to the most recent version of the SQLite database file, import the db object of the sc_data Python package, which runs an updater thread in the background to keep the SQLite file up-to-date:

from sc_data import db
print(db.path)
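
Since db.path points at an ordinary SQLite file, it can be opened with Python's standard sqlite3 module. The following is a minimal sketch: list_tables is a hypothetical helper, not part of sc_data, and it makes no assumption about which tables the database contains.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at db_path."""
    # Open read-only through a file URI so the file is never modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

With the package installed, list_tables(db.path) would be the natural call.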

The database is cached locally in a persistent directory and automatically updated when needed. On import, the package:

  1. Checks the local cache for a valid (non-stale) database
  2. If cached and fresh, uses it immediately
  3. Otherwise, downloads the latest version from our public S3 bucket
  4. Falls back to a limited version bundled with the package (without pricing information) if download fails
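
The four steps above can be sketched as follows. This is an illustration of the described logic, not the package's actual code: is_fresh, resolve_database, and the download callable are hypothetical names, and judging staleness by file modification time is an assumption.

```python
import os
import time


def is_fresh(path, ttl_seconds=86400, now=None):
    """Return True if path exists and was modified less than ttl_seconds ago."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return False  # a missing or unreadable file is treated as stale
    return (now - mtime) < ttl_seconds


def resolve_database(cache_path, download, bundled_path, ttl_seconds=86400):
    """Pick the database file to use, following the four documented steps."""
    # Steps 1-2: use the cached copy if it is present and not stale.
    if is_fresh(cache_path, ttl_seconds):
        return cache_path
    # Step 3: otherwise try to fetch a new copy into the cache
    # (download is a hypothetical callable taking the target path).
    try:
        download(cache_path)
        return cache_path
    except OSError:
        # Step 4: fall back to the limited bundled database.
        return bundled_path
```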

The cache is stored in a platform-specific location:

  • Linux: $XDG_CACHE_HOME/sparecores-data/ or ~/.cache/sparecores-data/
  • macOS: ~/Library/Caches/sparecores-data/
  • Windows: %LOCALAPPDATA%/sparecores-data/
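
To find the cache by hand (for example, to delete it), the locations listed above can be computed with a small function. This is a sketch under stated assumptions: cache_dir is a hypothetical helper, and the platform strings follow the values of sys.platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

APP_NAME = "sparecores-data"


def cache_dir(platform, env, home):
    """Return the documented cache directory for the given platform.

    platform: a sys.platform-style string ("linux", "darwin", "win32")
    env: a mapping of environment variables
    home: the user's home directory
    """
    if platform == "win32":
        return str(PureWindowsPath(env["LOCALAPPDATA"]) / APP_NAME)
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Caches" / APP_NAME)
    # Linux: honor XDG_CACHE_HOME when set, else fall back to ~/.cache
    base = env.get("XDG_CACHE_HOME") or str(PurePosixPath(home) / ".cache")
    return str(PurePosixPath(base) / APP_NAME)
```

On a real system this could be called as cache_dir(sys.platform, os.environ, os.path.expanduser("~")).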

To block until the update has completed, wait on the updated event:

db.updated.wait()
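
A bare wait() blocks indefinitely if the update never finishes. Assuming db.updated behaves like a standard threading.Event (an assumption, not confirmed here), a timeout can bound the wait. The sketch below uses a stand-in event and timer in place of the package's updater thread; wait_for_update is a hypothetical helper.

```python
import threading


def wait_for_update(updated, timeout=None):
    """Block until the event is set; return False if the timeout expires first."""
    return updated.wait(timeout)


# Stand-in for the package's updater thread, for demonstration only.
updated = threading.Event()
worker = threading.Timer(0.05, updated.set)
worker.start()

finished = wait_for_update(updated, timeout=5)
worker.join()
```

Under the same assumption, wait_for_update(db.updated, timeout=60) would return False when no update arrived within a minute, leaving the caller free to continue with the current file.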

Configuration

The package comes with the following set of default parameters, which can be overridden by builtins or environment variables:

| Configuration | Description | Default Value | Builtin Name | Environment Variable |
|---|---|---|---|---|
| Custom Database Path | Custom file path for the database (bypasses cache) | - | sc_data_db_path | SC_DATA_DB_PATH |
| Disable Updates | Whether to disable automatic updates | False | sc_data_no_update | SC_DATA_NO_UPDATE |
| Database URL | The URL of the most recent version of the database file | https://...sc-data-all.db.bz2 | sc_data_db_url | SC_DATA_DB_URL |
| HTTP Timeout | The timeout in seconds for downloading the database file | 30 | sc_data_http_timeout | SC_DATA_HTTP_TIMEOUT |
| Refresh Interval | The interval in seconds to check for database updates | 600 | sc_data_db_refresh_seconds | SC_DATA_DB_REFRESH_SECONDS |
| Cache TTL | Time in seconds before the cached database is considered stale | 86400 (1 day) | sc_data_db_cache_ttl | SC_DATA_DB_CACHE_TTL |

Note: Setting SC_DATA_DB_PATH disables caching and uses the specified file directly.
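
As an illustration of how the environment variables map onto the defaults, the sketch below resolves settings from an environment mapping. read_settings is a hypothetical helper, not the package's API; the integer parsing and the set of strings accepted as true are assumptions, and the database URL is left out because its default is not given in full here.

```python
# Strings treated as "true" for SC_DATA_NO_UPDATE (an assumption).
TRUTHY = ("1", "true", "yes")


def read_settings(env):
    """Resolve settings from an environment mapping, using the documented defaults."""
    return {
        # None means "no custom path": the local cache is used.
        "db_path": env.get("SC_DATA_DB_PATH"),
        "no_update": env.get("SC_DATA_NO_UPDATE", "").lower() in TRUTHY,
        "http_timeout": int(env.get("SC_DATA_HTTP_TIMEOUT", 30)),
        "db_refresh_seconds": int(env.get("SC_DATA_DB_REFRESH_SECONDS", 600)),
        "db_cache_ttl": int(env.get("SC_DATA_DB_CACHE_TTL", 86400)),
    }
```

In practice the variables would presumably need to be set before from sc_data import db runs, since the db object is created on import.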
