A utility that makes accessing static content in the cloud efficient

Project description

astro-cloud

astro-cloud provides three API tiers to access large files from numerous Static Storage Providers. The initial release of this software includes support for loading FITS headers from Amazon Web Services Simple Storage Service (S3) or any service that serves files over HTTP.

Install

$ pip install -U astro-cloud

Let's access FITS Headers from the Transiting Exoplanet Survey Satellite

The Registry of Open Data on AWS provides a bucket called stpubdata. It contains data uploaded from the Transiting Exoplanet Survey Satellite. The data files we'll be working with in this tutorial are about 44GB. If we downloaded or scanned an entire file, it would cost about $1.05. Instead, astro-cloud scans the header of a FITS file, calculates the offset of the data that follows it, then jumps to the next header and downloads that, until all headers have been found.

#!/usr/bin/env python

from astro_cloud.fits import load_headers, CloudService, PaymentSolution

url = 'https://s3.us-east-1.amazonaws.com/stpubdata/tess/public/mast/tess-s0022-4-4-cube.fits'
for header in load_headers(url, CloudService.S3, PaymentSolution.AWSRequestPayer):
    header_keys = list(header.fits.keys())
    if header.fits.get('SIMPLE', False) is True:
        # Only the primary header carries the SIMPLE keyword
        print(f'Primary Header: {len(header_keys)}')
    else:
        # Extension headers identify themselves with XTENSION instead
        xtension = header.fits['XTENSION']
        print(f'{xtension} Header Key Length: {len(header_keys)}')
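
The offset calculation described before the example can be sketched with the data-size formula from the FITS standard. This is only an illustration, not astro-cloud's actual implementation: `padded_data_size` and `next_header_offset` are hypothetical names, the header is assumed to be a plain dict of keyword values, and random-groups files are not handled.

```python
import math

FITS_BLOCK = 2880  # FITS headers and data units are padded to 2880-byte blocks


def padded_data_size(header: dict) -> int:
    """Size in bytes of the data unit that follows `header`, including padding."""
    naxis = header.get('NAXIS', 0)
    if naxis == 0:
        return 0  # no data unit follows this header
    # Number of array elements: NAXIS1 * NAXIS2 * ... * NAXISn
    elements = math.prod(header[f'NAXIS{i}'] for i in range(1, naxis + 1))
    pcount = header.get('PCOUNT', 0)  # defaults that apply to a primary header
    gcount = header.get('GCOUNT', 1)
    bytes_per_value = abs(header['BITPIX']) // 8
    size = bytes_per_value * gcount * (pcount + elements)
    # Round up to a whole number of blocks using integer arithmetic
    return (size + FITS_BLOCK - 1) // FITS_BLOCK * FITS_BLOCK


def next_header_offset(header_start: int, header_size: int, header: dict) -> int:
    """Byte offset of the next header, given where this one starts and its size."""
    return header_start + header_size + padded_data_size(header)


image_header = {'BITPIX': -32, 'NAXIS': 2, 'NAXIS1': 100, 'NAXIS2': 100}
print(next_header_offset(0, FITS_BLOCK, image_header))
```

With an offset like this in hand, only the header blocks need to be requested, and the data units in between can be skipped entirely.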

Dedication to Performance

astro-cloud returns astropy datatypes wherever possible. Wherever possible, everything is kept in memory and never touches disk, which keeps operations fast. There are also plans to support cluster computing frameworks such as Dask, as well as Python's multiprocessing module.
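
As a rough picture of what keeping everything in memory can look like, the sketch below reads FITS header cards from an `io.BytesIO` buffer instead of a file on disk. It uses only the standard library and is an assumption about the general approach, not astro-cloud's code; `read_header_cards` is a hypothetical helper name.

```python
import io

CARD_SIZE = 80     # every FITS header card is 80 characters long
BLOCK_SIZE = 2880  # headers are padded to whole 2880-byte blocks


def read_header_cards(stream) -> list:
    """Read header blocks from a binary stream until the END card is found."""
    cards = []
    while True:
        block = stream.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            raise ValueError('stream ended before the END card was found')
        for start in range(0, BLOCK_SIZE, CARD_SIZE):
            card = block[start:start + CARD_SIZE].decode('ascii')
            if card.rstrip() == 'END':
                return cards
            if card.strip():  # skip blank padding cards
                cards.append(card.rstrip())


# Build a minimal primary header in memory; no temporary file is involved
lines = [
    'SIMPLE  =                    T',
    'BITPIX  =                    8',
    'NAXIS   =                    0',
    'END',
]
raw = ''.join(line.ljust(CARD_SIZE) for line in lines).ljust(BLOCK_SIZE).encode('ascii')
for card in read_header_cards(io.BytesIO(raw)):
    print(card)
```

The same function works on any file-like object, so bytes returned by an HTTP response can be wrapped in `io.BytesIO` and parsed without writing them out first.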

Maintenance

Please feel welcome to open a conversation with me on the Astropy Slack and tell me how you'd like to use this package!

http://astropy-slack-invite.herokuapp.com/

I'll be in the #fits channel

File Format Specific API Tier

astro-cloud's three API tiers let you perform complex operations on files hosted by a Static Storage Provider. The highest tier is an implementation of authentication layers for all major Static Storage Providers.

  • Amazon Web Services: Simple Storage Service ( AWS: S3 )
  • DigitalOcean: Spaces ( Spaces by DigitalOcean ) [ planned support ]
  • Google Cloud Platform: Cloud Storage ( GCP: Cloud Storage ) [ planned support ]
  • Microsoft Azure: Blob Storage ( Azure: Blob Storage ) [ planned support ]

Flexible Image Transport System Examples

AWS S3

Let's load headers from a FITS file on AWS S3 inside the bucket stpubdata.

#!/usr/bin/env python

from astro_cloud.fits import load_headers, CloudService, PaymentSolution

url = 'https://s3.us-east-1.amazonaws.com/stpubdata/tess/public/mast/tess-s0022-4-4-cube.fits'
for header in load_headers(url, CloudService.S3, PaymentSolution.AWSRequestPayer):
    print(header.fits.get('XTENSION'))

Low Level Cloud API

Downloading a whole FITS file from a Static Storage Provider is possible too. astro-cloud uses the HTTP Range header to request partial content from a provider or service. We can bypass that by using psf/requests directly instead of load_headers.

#!/usr/bin/env python
import os
import requests

from astro_cloud.auth.aws import AWSAuth

from astropy.io import fits

CHUNK_SIZE = 1024
ENCODING = 'utf-8'
url = 'https://s3.amazonaws.com/datum-storage.org/fits-files/502nmos.fits'
filename = os.path.basename(url)
filepath = f'/tmp/{filename}'

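# FITS headers are stored in 2880-byte blocks; request only the first block
response = requests.get(url, auth=AWSAuth(request_payer=True), headers={
    'Range': 'bytes=0-2879',
})
# 206 Partial Content confirms that the server honored the Range header
assert response.status_code == 206

primary_header = fits.Header.fromstring(response.content.decode(ENCODING))
assert primary_header['SIMPLE'] is True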

# Save the file to the file system and load it with `astropy.io.fits`
streaming_response = requests.get(url, auth=AWSAuth(request_payer=True), stream=True)
assert streaming_response.status_code == 200
with open(filepath, 'wb') as file_stream:
    for part in streaming_response.iter_content(CHUNK_SIZE):
        file_stream.write(part)

# `info()` prints the HDU summary itself and returns None
with fits.open(filepath) as fits_file:
    fits_file.info()
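
The Range request in the example above can be wrapped in small helpers. The sketch below builds a `Range` header for an offset and length and parses the `Content-Range` header that a server sends back with a 206 response. The function names are hypothetical and not part of astro-cloud; only the header syntax itself comes from the HTTP specification.

```python
import re


def byte_range_header(start: int, length: int) -> dict:
    """Build a Range header covering `length` bytes beginning at `start`."""
    if start < 0 or length <= 0:
        raise ValueError('start must be >= 0 and length must be > 0')
    # HTTP byte ranges are inclusive on both ends
    return {'Range': f'bytes={start}-{start + length - 1}'}


_CONTENT_RANGE = re.compile(r'bytes (\d+)-(\d+)/(\d+|\*)')


def parse_content_range(value: str) -> tuple:
    """Return (first_byte, last_byte, total_size); total_size is None if unknown."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f'unsupported Content-Range value: {value!r}')
    first, last, total = match.groups()
    return int(first), int(last), None if total == '*' else int(total)


print(byte_range_header(0, 2880))
print(parse_content_range('bytes 0-2879/5760'))
```

Because range end positions are inclusive, a 2880-byte FITS block that starts at offset 0 is requested as `bytes=0-2879`. The dict returned by `byte_range_header` can be passed as the `headers` argument of `requests.get`.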

Download files

Source Distribution

astro-cloud-0.0.2.tar.gz (13.2 kB)

Uploaded Source

File details

Details for the file astro-cloud-0.0.2.tar.gz.

File metadata

  • Download URL: astro-cloud-0.0.2.tar.gz
  • Upload date:
  • Size: 13.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for astro-cloud-0.0.2.tar.gz
Algorithm Hash digest
SHA256 de25fa196353a36ff813a354783d80e49f89d5b647443c05d2f3d3c1fb182746
MD5 b26c756539ac7f0f289d74a5fae6da50
BLAKE2b-256 a3b1bf16c4ad0ce81c86086408f358e94ee27f1ed21137010776e69f8ab3b2d9
