
This Atlas S3 hook uses the s3fs package to gather the metadata of S3 buckets, pseudo-directories (pseudo_dir) and objects, then inserts that metadata into an Atlas instance.


Apache Atlas S3 hook in Python

This Python client uses the S3 client package s3fs to get the metadata of S3 entities (bucket, pseudo_dir and object), then inserts that metadata into an Atlas instance.

Quick start

Create a client to connect to an Atlas instance

from atlas_client.client import Atlas

hostname = "https://atlas.lab.sspcloud.fr"
port = 443

# Option 1: log in with an OIDC token
oidc_token = "<your_token>"
atlas_client = Atlas(hostname, port, oidc_token=oidc_token)

# Option 2: log in with a username and password instead
# (use one option only; this assignment replaces the client created above)
atlas_client = Atlas(hostname, port, username='', password='')
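Since the two login styles are alternatives, a small helper can pick one from whatever credentials are available. This is only a sketch: `atlas_auth_kwargs` is a hypothetical name, and the only thing taken from the example above is the set of keyword arguments (`oidc_token`, `username`, `password`).

```python
def atlas_auth_kwargs(oidc_token="", username="", password=""):
    """Return the keyword arguments for one login style (hypothetical helper)."""
    # Prefer the OIDC token when one is supplied.
    if oidc_token:
        return {"oidc_token": oidc_token}
    # Otherwise fall back to username and password; both must be non-empty.
    if username and password:
        return {"username": username, "password": password}
    raise ValueError("provide either an OIDC token or a username and password")
```

It could then be used as `Atlas(hostname, port, **atlas_auth_kwargs(oidc_token=oidc_token))`, assuming the constructor accepts the keywords exactly as shown in the quick start.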

Create an S3 metadata client to collect the metadata of S3 entities

from atlas_s3_hook.S3MetadataClient import S3MetadataClient

s3_end_point = ''
s3_access_key = ''
s3_secret_key = ''
s3_token = ''

s3_client = S3MetadataClient(s3_end_point, s3_access_key, s3_secret_key, s3_token)
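Rather than hard-coding credentials, the four arguments can be read from environment variables. The sketch below is one possible way to do that. `s3_settings_from_env` is a hypothetical helper, and `S3_ENDPOINT` is an assumed variable name; the three `AWS_*` names are the standard AWS ones.

```python
import os


def s3_settings_from_env(env=None):
    """Return (end_point, access_key, secret_key, token) read from the environment."""
    env = os.environ if env is None else env
    end_point = env.get("S3_ENDPOINT", "")          # assumed variable name
    access_key = env.get("AWS_ACCESS_KEY_ID", "")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY", "")
    token = env.get("AWS_SESSION_TOKEN", "")        # optional; may stay empty
    required = {
        "S3_ENDPOINT": end_point,
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return end_point, access_key, secret_key, token
```

The tuple is in the same positional order as the constructor call above, so `S3MetadataClient(*s3_settings_from_env())` should work if the constructor takes the arguments as shown.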

Load a single S3 entity into Atlas

To load the metadata of a single S3 entity, use the following code example:

from atlas_s3_hook.S3Hook import S3Hook

# Indicate the path of the entity which you want to load
path=''
description=''
s3_hook = S3Hook(s3_client, atlas_client)
# Get the class of the s3 entity
path_class = s3_client.get_class_from_path(path)
print(path_class)

# Get the metadata of the s3 entity
meta_data = s3_client.get_path_meta_data(path)

# S3Hook provides one loader per entity class. Call only the loader
# that matches path_class; the three calls below are alternatives.

# bucket loader
s3_hook.create_atlas_bucket(meta_data, description)

# directory loader
s3_hook.create_atlas_ps_dir(meta_data, description)

# object loader
s3_hook.create_atlas_object(meta_data, description)
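Choosing the loader by hand can be replaced with a small dispatch table. The sketch below is an illustration, not part of the package: `load_entity` and `LOADER_NAMES` are hypothetical. The keys of `LOADER_NAMES` are placeholders, because this README does not state what `get_class_from_path` returns. Print `path_class` first and adjust the keys to match. The sketch also assumes the returned value is hashable (for example a string).

```python
# Placeholder keys: replace them with the values get_class_from_path really returns.
LOADER_NAMES = {
    "bucket": "create_atlas_bucket",
    "pseudo_dir": "create_atlas_ps_dir",
    "object": "create_atlas_object",
}


def load_entity(s3_client, s3_hook, path, description, loader_names=LOADER_NAMES):
    """Look up the entity class of `path` and call the matching S3Hook loader."""
    path_class = s3_client.get_class_from_path(path)
    meta_data = s3_client.get_path_meta_data(path)
    try:
        loader_name = loader_names[path_class]
    except KeyError:
        raise ValueError(f"no loader registered for class {path_class!r}") from None
    # Call exactly one loader, e.g. s3_hook.create_atlas_object(meta_data, description).
    getattr(s3_hook, loader_name)(meta_data, description)
    return path_class
```

With the objects from the example above, the call would be `load_entity(s3_client, s3_hook, path, description)`.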

Load multiple S3 entities into Atlas

To load the metadata of multiple S3 entities, use the following code example. The S3Scanner class takes the path of an S3 entity and loads the metadata of everything it contains (e.g. sub-directories and objects).

from atlas_s3_hook.S3Scanner import S3Scanner

s3_entity_path=''
entity_owner=''

# s3_client is the S3MetadataClient created earlier
s3_scanner = S3Scanner(s3_client, atlas_client, owner=entity_owner)
s3_scanner.scan_path(s3_entity_path)
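To make "everything it contains" concrete, the sketch below lists every pseudo-directory and object under a path. It is not S3Scanner's actual implementation, which this README does not show. `list_contents` is a hypothetical helper that accepts any filesystem object with an fsspec-style `walk` method, which `s3fs.S3FileSystem` provides.

```python
def list_contents(fs, root):
    """Yield (path, kind) for each pseudo-directory and object below `root`."""
    # fs.walk yields (current_path, directory_names, file_names), like os.walk.
    for current, dirs, files in fs.walk(root):
        for name in dirs:
            yield f"{current}/{name}", "pseudo_dir"
        for name in files:
            yield f"{current}/{name}", "object"
```

With s3fs installed, `fs` could be created as `s3fs.S3FileSystem(key=..., secret=..., token=..., client_kwargs={"endpoint_url": ...})`, and each yielded path could then be handed to the single-entity loaders.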

Prerequisites

This tool requires Python 3.7 or above.

Supported OS

Windows 7/8/10

Linux

MacOS

Authors

  • Pengfei Liu

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgement

This package was built on the s3fs project.
