Cloud Optimized Fits

cloud-fits provides the means to index large FITS files and serve them over HTTP(S) for efficient access. A scientist or team can index a directory of FITS files, then upload that directory to a Static Cloud Provider such as Amazon Web Services, Google Cloud, Digital Ocean Spaces, or Microsoft Blob Storage. The FITS Cloud Index can then be checked into a GitHub repository, shared, or uploaded to a Static Cloud Provider.
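
To make the idea concrete, here is a minimal sketch of how an index entry could be turned into an HTTP Range request for part of a file. It is not cloud-fits's implementation: the `IndexEntry` type, its fields, and the helper names are hypothetical. The only facts it relies on are that FITS files are laid out in 2880-byte blocks and that HTTP `Range` headers use inclusive byte offsets.

```python
from dataclasses import dataclass

FITS_BLOCK_SIZE = 2880  # FITS header and data units are padded to 2880-byte blocks


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record: where one data unit starts and how long it is."""
    data_offset: int  # byte offset of the data unit within the file
    data_length: int  # unpadded length of the data unit in bytes


def padded_length(n_bytes: int) -> int:
    """Round a length up to a whole number of FITS blocks."""
    blocks = -(-n_bytes // FITS_BLOCK_SIZE)  # ceiling division
    return blocks * FITS_BLOCK_SIZE


def range_header(entry: IndexEntry, start: int, count: int) -> dict:
    """Build a Range header for `count` bytes starting `start` bytes into the data unit."""
    if start < 0 or count <= 0 or start + count > entry.data_length:
        raise ValueError("requested span falls outside the data unit")
    first = entry.data_offset + start
    last = first + count - 1  # HTTP byte ranges are inclusive on both ends
    return {"Range": f"bytes={first}-{last}"}


# Illustrative values only: a data unit that starts after two header blocks.
entry = IndexEntry(data_offset=2 * FITS_BLOCK_SIZE, data_length=10_000)
headers = range_header(entry, start=0, count=4096)
```

With an index like this stored separately, a client only needs the small index and a handful of range requests, rather than the whole multi-gigabyte file.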

cloud-fits returns Astropy data types wherever possible.

Let's index a FITS file and extract metadata from it.

Accessing Data

The Registry of Open Data on AWS provides a bucket called stpubdata. It contains data uploaded from the Transiting Exoplanet Survey Satellite (TESS). The data files used in this tutorial are about 44 GB each. It costs about $1.05 to download and index one of these 352 files from the Registry. To save time and money, we'll download and index only one.
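
For a sense of scale, the figures above can be extrapolated to the full set. This is a rough sketch that assumes every file has the same approximate size and cost as the one quoted, which may not hold in practice.

```python
FILES_IN_REGISTRY = 352    # file count quoted above
COST_PER_FILE_USD = 1.05   # approximate cost per file quoted above
FILE_SIZE_GB = 44          # approximate size per file quoted above

# Assumption: cost and size scale linearly with the number of files.
total_cost_usd = round(FILES_IN_REGISTRY * COST_PER_FILE_USD, 2)
total_size_tb = FILES_IN_REGISTRY * FILE_SIZE_GB / 1000  # decimal terabytes

print(f"All files: about ${total_cost_usd:.2f} and {total_size_tb:.1f} TB")
```

Under those assumptions, indexing everything would cost roughly $370 and move about 15.5 TB, which is why the tutorial sticks to a single file.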

Prerequisite

Before we can download data from the Registry, set up an AWS account and configure your credentials file. Then install aws-cli.
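
A quick way to check the credentials file before starting is sketched below. It assumes the standard INI layout of `~/.aws/credentials` with a `default` profile; the helper name `missing_credential_keys` is made up for this example and is not part of aws-cli or cloud-fits.

```python
import configparser
from pathlib import Path

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path: Path, profile: str = "default") -> list:
    """Return the required keys absent from `profile` (all of them if the file or profile is missing)."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a file that does not exist is silently skipped
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


if __name__ == "__main__":
    credentials = Path.home() / ".aws" / "credentials"
    missing = missing_credential_keys(credentials)
    print("credentials look complete" if not missing else f"missing: {missing}")
```

The check only confirms that the keys are present. It does not verify that they are valid or that the account can pay for requester-pays downloads.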

Tutorial

Let's create our environment and download one data file from the Registry:

$ mkdir -p /tmp/tess-data
$ cd /tmp/tess-data
$ aws s3 cp s3://stpubdata/tess/public/mast/ . --recursive --exclude "*" --include "tess-s0022-4-4-cube.fits" --request-payer
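
The `--exclude "*" --include "..."` pair works because aws-cli applies filters in order and later filters take precedence over earlier ones. The sketch below is a simplified model of that documented behavior, not the CLI's own code; the function name is hypothetical and the second file name is only there for illustration.

```python
from fnmatch import fnmatchcase


def is_selected(key: str, filters: list) -> bool:
    """Apply ("exclude" | "include", pattern) filters in order; the last matching filter wins."""
    selected = True  # everything is included until a filter says otherwise
    for action, pattern in filters:
        if fnmatchcase(key, pattern):
            selected = (action == "include")
    return selected


# Same filter order as the command above: exclude everything, then re-include one file.
filters = [("exclude", "*"), ("include", "tess-s0022-4-4-cube.fits")]
keys = ["tess-s0022-4-4-cube.fits", "tess-s0022-4-3-cube.fits"]  # second name is illustrative
selected = [key for key in keys if is_selected(key, filters)]
```

Reversing the two filters would exclude everything, since `--exclude "*"` would then be the last filter to match.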

With our data downloaded, it's time to create a bucket that'll hold our FITS Cloud Index. In some cases, we might not have write access to the data we're indexing. Here, we generate the index from a public data set, then store it in a Static Cloud Provider of our choosing. cloud-fits then provides a Pythonic API on top of this arrangement.

Let's create a bucket on AWS S3 and upload the index there:

$ AWS_DEFAULT_REGION=us-east-1 aws s3api create-bucket --bucket tess-fits-cloud-index
$ pip install cloud-fits -U
$ cloud-fits-index --index-bucket-name tess-fits-cloud-index --fits-files-directory /tmp/tess-data/ --data-bucket-path s3://stpubdata/tess/public/mast

The arguments --index-bucket-name and --fits-files-directory are named for what they do. --data-bucket-path bridges the difference in file structure between the local directory and the data bucket. For example, the data cubes are located under tess/public/mast in the bucket, but that prefix isn't captured in --fits-files-directory. --data-bucket-path supplies it, so that the paths used to download sections of a file point at the right location.
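
The relationship between the two paths can be sketched as follows. This is an assumption about how the mapping might work, not cloud-fits's actual code, and `remote_uri` is a hypothetical helper: it takes a file's path relative to the local directory and appends it to the data bucket path.

```python
from pathlib import PurePosixPath


def remote_uri(local_file: str, fits_files_directory: str, data_bucket_path: str) -> str:
    """Map a local FITS file to its assumed location under the data bucket path."""
    # Path of the file relative to the directory that was indexed
    relative = PurePosixPath(local_file).relative_to(fits_files_directory)
    return f"{data_bucket_path.rstrip('/')}/{relative}"


uri = remote_uri(
    "/tmp/tess-data/tess-s0022-4-4-cube.fits",
    "/tmp/tess-data/",
    "s3://stpubdata/tess/public/mast",
)
# uri == "s3://stpubdata/tess/public/mast/tess-s0022-4-4-cube.fits"
```

Without the `tess/public/mast` prefix, the local path alone would not say where the file lives in the bucket.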

Okay, great. We have everything we need. Now let's do some science. Start Python and enter the following:

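from cloud_fits.bucket_operations import download_index
from cloud_fits.datatypes import FitsCloudIndex

from pprint import pprint

# Bucket created earlier to hold the FITS Cloud Index
BUCKET_NAME: str = 'tess-fits-cloud-index'

# Fetch the index from the bucket
index: FitsCloudIndex = download_index(BUCKET_NAME)
# Pick the header entry at position 2, then index into it with (0, 10) and print the result
bintable_index = index.headers[2]
print(bintable_index[0, 10])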

Feature Map

  • Amazon Web Services S3
  • Pythonic API Refinement ( Planned Update )
  • Remote Indexing for all Static Cloud Providers ( Planned Support )
  • Digital Ocean Spaces ( Planned Support )
  • Google Object Storage ( Planned Support )
  • Microsoft Azure ( Planned Support )

Development and Deployment

Building Docs Locally

build.sh includes a docs command that generates the docs locally into /tmp/docs. The static HTML files can then be served from that directory with these commands:

$ bash build.sh docs
$ cd /tmp/docs
$ python3 -m http.server
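
If you would rather start the server from a script, the same thing can be done with the standard library directly. This is a minimal sketch, not part of build.sh; `serve_directory` is a name chosen for the example, and it binds to localhost only, unlike `python3 -m http.server`, which listens on all interfaces by default.

```python
import functools
import http.server
import threading


def serve_directory(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve `directory` on localhost in a background thread and return the server."""
    # Bind the handler to the target directory instead of changing the working directory
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `server = serve_directory("/tmp/docs")` starts serving at http://127.0.0.1:8000, and `server.shutdown()` stops it.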

PyPI Automation

build.sh also builds the Python wheel, sdist, and associated files, then uploads them to PyPI or TestPyPI depending on how you invoke it:

$ bash build.sh publish-test
