
This Atlas S3 hook uses the s3fs package to gather the metadata of buckets, pseudo-directories (pseudo_dir) and objects, then inserts that metadata into Atlas instances.


Apache Atlas S3 hook in Python

This Python client uses the S3 client package s3fs to collect the metadata of S3 entities such as buckets, pseudo-directories (pseudo_dir) and objects, then inserts that metadata into Atlas instances.

Quick start

Create a client to connect to an Atlas instance

from atlas_client.client import Atlas
# login with your token
hostname = "https://atlas.lab.sspcloud.fr"
port = 443
oidc_token = "<your_token>"
atlas_client = Atlas(hostname, port, oidc_token=oidc_token)

# alternatively, log in with your username and password
atlas_client = Atlas(hostname, port, username='', password='')
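
The two login styles above differ only in the keyword arguments passed to `Atlas`. As a minimal sketch, assuming you keep credentials in environment variables, a small helper can pick between them. The variable names `ATLAS_OIDC_TOKEN`, `ATLAS_USERNAME` and `ATLAS_PASSWORD` and the helper `atlas_auth_kwargs` are hypothetical and not part of this package; only the keyword names `oidc_token`, `username` and `password` come from the example above.

```python
import os
from typing import Dict, Mapping, Optional


def atlas_auth_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the login keyword arguments for Atlas(...) from environment variables.

    ATLAS_OIDC_TOKEN, ATLAS_USERNAME and ATLAS_PASSWORD are hypothetical
    variable names chosen for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get("ATLAS_OIDC_TOKEN")
    if token:
        # Prefer token login when a token is available
        return {"oidc_token": token}
    username = env.get("ATLAS_USERNAME")
    password = env.get("ATLAS_PASSWORD")
    if username and password:
        # Fall back to username/password login
        return {"username": username, "password": password}
    raise RuntimeError("Set ATLAS_OIDC_TOKEN, or ATLAS_USERNAME and ATLAS_PASSWORD")
```

With the import from the example above, the result could be passed as `Atlas(hostname, port, **atlas_auth_kwargs())`, which keeps tokens and passwords out of the source file.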

Create an S3 metadata client to collect the metadata of S3 entities

from atlas_s3_hook.S3MetadataClient import S3MetadataClient

s3_end_point = ''
s3_access_key = ''
s3_secret_key = ''
s3_token = ''

s3_client = S3MetadataClient(s3_end_point, s3_access_key, s3_secret_key, s3_token)
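
The four empty strings above have to be filled in before the client can be used. One possible way, shown here only as a sketch, is to read them from the environment and pass them on in the same positional order. The class `S3Credentials` is hypothetical, and the variable names (`AWS_S3_ENDPOINT`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`) are an assumption about your environment, not something this package reads by itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class S3Credentials:
    """Hypothetical container for the four values S3MetadataClient is given above."""

    end_point: str
    access_key: str
    secret_key: str
    token: str = ""

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "S3Credentials":
        # The variable names below are assumptions; rename them to match your setup
        env = os.environ if env is None else env
        return cls(
            end_point=env["AWS_S3_ENDPOINT"],
            access_key=env["AWS_ACCESS_KEY_ID"],
            secret_key=env["AWS_SECRET_ACCESS_KEY"],
            token=env.get("AWS_SESSION_TOKEN", ""),
        )

    def as_args(self) -> Tuple[str, str, str, str]:
        # Same positional order as in the S3MetadataClient(...) call above
        return (self.end_point, self.access_key, self.secret_key, self.token)
```

The client could then be created with `S3MetadataClient(*S3Credentials.from_env().as_args())`. A missing required variable raises `KeyError`, so a misconfigured environment fails early.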

Load a single S3 entity into Atlas

To load the metadata of a single S3 entity, use the following code example:

from atlas_s3_hook.S3Hook import S3Hook

# Path of the S3 entity that you want to load, and a description for it
path = ''
description = ''
s3_hook = S3Hook(s3_client, atlas_client)
# Get the class of the s3 entity
path_class = s3_client.get_class_from_path(path)
print(path_class)

# Get the metadata of the s3 entity
meta_data = s3_client.get_path_meta_data(path)

# The S3 hook provides a different loader for each class of S3 entity.
# Call only the loader that matches the class printed above.

# bucket loader 
s3_hook.create_atlas_bucket(meta_data,description)

# directory loader
s3_hook.create_atlas_ps_dir(meta_data,description)

# object loader
s3_hook.create_atlas_object(meta_data,description)
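
Choosing the loader by hand can be replaced with a small dispatch helper. The sketch below uses only the calls shown above (`get_class_from_path`, `get_path_meta_data` and the three `create_atlas_*` loaders). The helper name `load_s3_entity` is hypothetical, and the keys in `ASSUMED_CLASS_TO_LOADER` are guesses at what `get_class_from_path` returns: check the output of `print(path_class)` and adjust them before relying on this.

```python
from typing import Any, Mapping, Optional

# ASSUMPTION: these keys are placeholders for the values returned by
# get_class_from_path; replace them with what print(path_class) shows.
ASSUMED_CLASS_TO_LOADER = {
    "bucket": "create_atlas_bucket",
    "pseudo_dir": "create_atlas_ps_dir",
    "object": "create_atlas_object",
}


def load_s3_entity(
    s3_client: Any,
    s3_hook: Any,
    path: str,
    description: str = "",
    class_to_loader: Optional[Mapping[str, str]] = None,
) -> str:
    """Load one S3 entity with the loader matching its class; return the loader name."""
    mapping = ASSUMED_CLASS_TO_LOADER if class_to_loader is None else class_to_loader
    path_class = s3_client.get_class_from_path(path)
    loader_name = mapping.get(str(path_class))
    if loader_name is None:
        raise ValueError(f"No loader registered for class {path_class!r}")
    # Fetch the metadata only once the class is known to be supported
    meta_data = s3_client.get_path_meta_data(path)
    getattr(s3_hook, loader_name)(meta_data, description)
    return loader_name
```

Passing the mapping as a parameter keeps the guessed class names in one place, so they can be corrected without touching the dispatch logic.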

Load multiple S3 entities into Atlas

To load the metadata of multiple S3 entities, use the following code example. The S3Scanner class takes the path of an S3 entity and loads the metadata of everything it contains (e.g. sub-directories and objects).

from atlas_s3_hook.S3Scanner import S3Scanner

s3_entity_path=''
entity_owner=''

# Reuse the s3_client and atlas_client created in the previous steps
s3_scanner = S3Scanner(s3_client, atlas_client, owner=entity_owner)
s3_scanner.scan_path(s3_entity_path)
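
When several root paths need to be scanned, it can help to keep going after a failure and report the failures at the end. The following is a minimal sketch that relies only on the `scan_path` method shown above; the helper `scan_paths` and its report format are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def scan_paths(scanner: Any, paths: Iterable[str]) -> Dict[str, List[str]]:
    """Call scanner.scan_path on each path and record which paths failed."""
    report: Dict[str, List[str]] = {"scanned": [], "failed": []}
    for path in paths:
        try:
            scanner.scan_path(path)
        except Exception as exc:
            # Keep going so that one bad path does not stop the whole batch
            report["failed"].append(f"{path}: {exc}")
        else:
            report["scanned"].append(path)
    return report
```

The returned dictionary lists the successfully scanned paths under `scanned` and one message per failed path under `failed`, which can be logged or retried later.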

Prerequisites

This tool requires only Python 3.7 or above.

Supported OS

Windows XP/7/8/10

Linux

MacOS

Authors

  • Pengfei Liu

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgement

This package is built on the s3fs project.

