A small tool for managing Azure DataLake Store (ADLS) Access Control Lists (ACLs).

Azure DataLake Storage (ADLS) - Access Control List (ACL) Manager

A small CLI tool for managing Azure DataLake Storage (ADLS) Access Control Lists (ACLs) for containers and directories.

It allows you to take control of your ADLS account's directory structure and ACLs as Infrastructure as Code through the use of YAML configuration files.

Requirements

  • Python 3.12+
  • Yamale
  • azure-identity
  • azure-storage-file-datalake

Install

pip

$ pip install adls-acl 

Usage

Command line

adls-acl can be run from the command line to create directories and set the desired ACLs in an Azure Storage Account (Gen2), as defined in a user-supplied YAML file.

Containers and directories that are defined in the config file but not present in the storage account will be created during the adls-acl run. The ACLs of directories that already exist in the storage account will be overwritten with those specified in the input config file. Future releases are expected to enable alternative behaviors. For that reason, the current version of adls-acl is best suited to greenfield deployments.

The Azure Identity client (Python SDK) is used to authenticate to Microsoft Entra ID (formerly Azure AD). By default it uses DefaultAzureCredential (MS DOCS: DefaultCredential), which tries a number of authentication methods in turn. A specific authentication mechanism can be targeted with the --auth-method option of each command.
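As a rough illustration of the authentication flow described above, the sketch below builds the account endpoint and passes a DefaultAzureCredential to the Data Lake service client. It is not adls-acl's own code: the helper names account_url and make_service_client are hypothetical, and the SDK imports sit inside the function so that the module still loads where the Azure packages are not installed.

```python
def account_url(account: str) -> str:
    """Return the DFS endpoint for a storage account name."""
    return f"https://{account}.dfs.core.windows.net/"


def make_service_client(account: str):
    """Create a Data Lake service client using DefaultAzureCredential.

    Hypothetical helper; requires azure-identity and
    azure-storage-file-datalake to be installed.
    """
    # Imported lazily so that account_url() works without the Azure SDK.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    credential = DefaultAzureCredential()
    return DataLakeServiceClient(account_url(account), credential=credential)
```

Calling make_service_client("testaccount") needs a working credential source (for example an Azure CLI login or environment variables), so it is best tried against a test account.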

Usage:

Usage: adls-acl [OPTIONS] COMMAND [ARGS]...

Options:
  --debug          Enable debug messages.
  --silent         Suppress logs to stdout.
  --log-file TEXT  Redirect logs to a file.
  --help           Show this message and exit.

Commands:
  get-acl  Read the current fs and acls on dirs.
  set-acl  Read and set direcotry structure and ACLs from a YAML file.

Options:

  • --debug sets the log level of adls-acl and the Azure SDK libraries to DEBUG.
  • --silent suppresses all output to stdout.
  • --log-file writes a copy of the log messages to the given file.

set-acl command

Usage: adls-acl set-acl [OPTIONS] FILE

  Read and set direcotry structure and ACLs from a YAML file.

Options:
  --auth-method [default|environment|workload|managedid|azurecli|azureps|azuredevcli]
                                  Azure AD Authentication method
  --auth-opt <TEXT TEXT>...       Keyword arguments to pass to Azure SDK
                                  credential constructor
  --help                          Show this message and exit.

Options:

  • --auth-method selects one of the Azure Python SDK authentication methods.
  • --auth-opt passes a keyword argument to the Azure Python SDK credential constructor. It can be used multiple times in one call.

To set ACLs from an input file named test.yml, run:

adls-acl set-acl test.yml
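When adls-acl is driven from a script or a pipeline, it can help to assemble the argument list in code. The sketch below is one way to do that with the options listed above; build_set_acl_command and run_set_acl are hypothetical helper names, and the placement of global options before the command follows the usage line adls-acl [OPTIONS] COMMAND [ARGS].

```python
import subprocess
from typing import List, Optional


def build_set_acl_command(
    config_file: str,
    debug: bool = False,
    log_file: Optional[str] = None,
    auth_method: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `adls-acl set-acl`."""
    cmd = ["adls-acl"]
    # Global options come before the command name.
    if debug:
        cmd.append("--debug")
    if log_file:
        cmd += ["--log-file", log_file]
    cmd.append("set-acl")
    # Command-specific options come after it.
    if auth_method:
        cmd += ["--auth-method", auth_method]
    cmd.append(config_file)
    return cmd


def run_set_acl(config_file: str, **options) -> None:
    """Run adls-acl; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_set_acl_command(config_file, **options), check=True)
```

For example, build_set_acl_command("test.yml") yields the same invocation as the shell command above.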

get-acl command

Usage: adls-acl get-acl [OPTIONS] ACCOUNT_NAME OUTFILE

  Read the current fs and acls on dirs.

Options:
  --omit-special                  Omit special ACLs when reading the account.
  --auth-method [default|environment|workload|managedid|azurecli|azureps|azuredevcli]
                                  Azure AD Authentication method
  --auth-opt <TEXT TEXT>...       Keyword arguments to pass to Azure SDK
                                  credential constructor
  --help                          Show this message and exit.

This writes the account's current filesystem layout (directories only, no files) and the ACLs on each directory to the file given as the OUTFILE argument.

Options:

  • --omit-special omits special ACLs from the output file.
  • --auth-method selects one of the Azure Python SDK authentication methods.
  • --auth-opt passes a keyword argument to the Azure Python SDK credential constructor. It can be used multiple times in one call.

To write the ACLs of an ADLS storage account named testaccount to the file dump.yml:

adls-acl get-acl testaccount dump.yml

Input file

This section is the YAML schema reference for input files. Each input file describes the desired directory structure and ACLs for a single Azure Storage account.

Input File Example

Example of an input file for a fictitious storage account. All elements of the schema are explained in the following sections.

account: testaccount
containers:
  - name: testcontainer1 
    acls:
      - oid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
        type: "user"
        acl: r-x
      - oid: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
        type: "user"
        acl: --x
    folders:
      - name: directory_a
        acls:
          - oid: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
            type: "group"
            acl: rwx
            scope: default
        folders:
          - name: subdir_a 
            acls:
              - oid: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
                type: "user"
                acl: --x
      - name: directory_b
        acls:
          - oid: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
            type: "group"
            acl: rwx
            scope: default

The above input would create the following directory structure in the storage account testaccount:

testcontainer1 (storage container)
root/
|
├── directory_a/
|   ├── subdir_a/
|
├── directory_b/

  • Multiple containers can be specified in the config file.
  • ACLs on the container are applied on the container's root directory.
  • Subdirectories can be nested to create a desired directory hierarchy.
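To make the nesting rules concrete, the following sketch walks a config and lists every directory it describes. It uses a plain dict that mirrors the example input file (ACL blocks left out for brevity) rather than parsing YAML, and the function names and the container:/path notation are illustrative choices, not part of adls-acl.

```python
from typing import Iterator, List

# Dict mirroring the example input file; ACL blocks omitted for brevity.
CONFIG = {
    "account": "testaccount",
    "containers": [
        {
            "name": "testcontainer1",
            "folders": [
                {"name": "directory_a", "folders": [{"name": "subdir_a"}]},
                {"name": "directory_b"},
            ],
        }
    ],
}


def iter_directories(folder: dict, prefix: str = "") -> Iterator[str]:
    """Yield the path of every subdirectory below a folder, depth first."""
    for child in folder.get("folders", []):
        path = f"{prefix}{child['name']}/"
        yield path
        yield from iter_directories(child, path)


def list_paths(config: dict) -> List[str]:
    """Return 'container:/path/' strings for each directory in the config."""
    paths = []
    for container in config.get("containers", []):
        paths.append(f"{container['name']}:/")  # the container root
        for path in iter_directories(container):
            paths.append(f"{container['name']}:/{path}")
    return paths
```

For the example config, list_paths(CONFIG) returns the container root followed by directory_a/, directory_a/subdir_a/ and directory_b/.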

Account - definition

account: string # Required. The name of the Azure storage account.
containers: [ folder ] # A list of containers in the account.

account string. Required. Azure Storage Account name as in: https://<account>.dfs.core.windows.net/

containers [ folder ]. A list of objects describing directories and their ACLs. For a container, the object defines the container's name, the ACLs on the container root, and its subdirectories.

Folder - definition

name: string # Required. Directory name.
acls: [ acl ] # A list of ACLs to set on the directory.
folders: [ folder ] # A list of subdirectory objects.

name string. Required. The name of the directory.

acls [ acl ]. A list of ACLs to set on the directory.

folders [ folder ]. A list of objects describing subdirectories.

Acl - definition

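oid: string # Required. Security principal Object ID in Microsoft Entra ID.
type: string # Required. Security principal type.
acl: string # Required. Permissions in short form, e.g. r-x.
scope: string # Optional. Set to default for default ACLs.
recursive: bool # Optional. Apply the ACL recursively.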

oid string. Required. Object ID of the principal (user/group/managed identity/service principal) in Microsoft Entra ID (formerly Azure Active Directory).

type string. Required. Type of the security principal. Allowed values: user (for users, service principals, and managed identities), group (for Entra ID groups), other (for the ACL that applies to all other users), mask (for setting masks on directories).

acl string. Required. The desired permissions in short form (MS DOCS: ADLS ACLs), e.g. r-- for read-only permissions.

scope string. Optional. If set to default, the specified ACLs are set as default ACLs (MS DOCS: Types of ACLs). If omitted, they are set as access ACLs.

recursive bool. Optional. If set to True, the ACL is applied recursively to every subdirectory and file inside the directory on which it is set.
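The field rules above can be checked before a file is handed to adls-acl. The sketch below is a standard-library stand-in for such a check, not the Yamale schema the tool itself uses; validate_acl is a hypothetical helper, and it assumes that default is the only value scope accepts, which is the only one the reference documents.

```python
import re
from typing import List

ALLOWED_TYPES = {"user", "group", "other", "mask"}
# Short-form permissions: read, write, execute, each either set or "-".
PERMISSIONS = re.compile(r"[r-][w-][x-]")


def validate_acl(acl: dict) -> List[str]:
    """Return a list of problems found in one acl block (empty if valid)."""
    errors = []
    if not isinstance(acl.get("oid"), str):
        errors.append("oid: required string")
    if acl.get("type") not in ALLOWED_TYPES:
        errors.append("type: must be one of user, group, other, mask")
    perms = acl.get("acl")
    if not isinstance(perms, str) or not PERMISSIONS.fullmatch(perms):
        errors.append("acl: must be a short-form string such as r-x")
    # Assumption: "default" is the only accepted value when scope is present.
    if acl.get("scope", "default") != "default":
        errors.append("scope: only 'default' is accepted when present")
    if not isinstance(acl.get("recursive", False), bool):
        errors.append("recursive: must be a boolean")
    return errors
```

An empty oid string passes the check on purpose, because the special ACLs are written with oid: "".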

Special ACLs

adls-acl can also manage ACLs for the owning user, the owning group, and all other users, as well as set masks. Each is specified as an acl block in the adls-acl YAML input file, as shown below:

  • owning user
oid: ""
type: user
acl: ---
  • owning group
oid: ""
type: group
acl: ---
  • other users
oid: ""
type: other
acl: ---
  • masks
oid: ""
type: mask
acl: ---

All of the above can be set as default ACLs by adding the scope: default parameter.
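ADLS Gen2 expresses ACLs as comma-separated entries of the form [scope:]type:id:permissions, where an empty id denotes the owning user, owning group, other, or mask entry. The sketch below shows how an acl block could map onto that notation. How adls-acl performs this conversion internally is not stated here, so to_acl_entry and to_acl_string are hypothetical helpers.

```python
from typing import List


def to_acl_entry(acl: dict) -> str:
    """Render one acl block as an ADLS ACL entry string."""
    prefix = "default:" if acl.get("scope") == "default" else ""
    return f"{prefix}{acl['type']}:{acl['oid']}:{acl['acl']}"


def to_acl_string(acls: List[dict]) -> str:
    """Join several acl blocks into a comma-separated ACL string."""
    return ",".join(to_acl_entry(acl) for acl in acls)
```

With this mapping, the owning user block above becomes user::--- and a default group entry becomes default:group:<oid>:rwx.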

Default ACLs

The default ACLs defined on higher-level directories are pushed down to the subdirectories specified in the input file. They are not set on files that already existed in those directories before adls-acl ran. Moreover, subdirectories that exist in the account but are not specified in the input file remain untouched by adls-acl.

Future releases will allow more control over this behaviour (e.g., updating default ACLs on files created before the ACLs changed).
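The push-down of default ACLs can be pictured as a walk over the folder tree that carries the parent's default entries along. The sketch below is only an in-memory model of that idea, not adls-acl's implementation. It assumes that an entry declared on a subdirectory replaces an inherited entry with the same type and oid; collect_defaults is a hypothetical name.

```python
from typing import Dict, List, Optional


def collect_defaults(
    folder: dict,
    inherited: Optional[List[dict]] = None,
    path: str = "/",
) -> Dict[str, List[dict]]:
    """Map each directory path to the default ACL entries that reach it."""
    # Start from the entries inherited from the parent directory.
    merged = {(a["type"], a["oid"]): a for a in (inherited or [])}
    # Assumption: a directory's own default entry overrides an inherited one.
    for acl in folder.get("acls", []):
        if acl.get("scope") == "default":
            merged[(acl["type"], acl["oid"])] = acl
    result = {path: list(merged.values())}
    for child in folder.get("folders", []):
        child_path = f"{path}{child['name']}/"
        result.update(collect_defaults(child, list(merged.values()), child_path))
    return result
```

Passing a container object from the input file returns, for every directory listed under it, the default entries it would receive from its ancestors plus its own.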

Authentication Methods

By default, Azure Python SDK authentication is handled with DefaultAzureCredential.

In addition, a user can target one of the supported Azure Python SDK credential types by using the --auth-method option. The accepted values are default, environment, workload, managedid, azurecli, azureps, and azuredevcli.

To pass keyword arguments to the credential constructor, use the --auth-opt option. This option can be used multiple times, one instance per keyword argument.

For example, to pass managed_identity_client_id and exclude_cli_credential to DefaultAzureCredential:

--auth-method default --auth-opt managed_identity_client_id xxxx-xxxx-xxxxx --auth-opt exclude_cli_credential False
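Each --auth-opt instance supplies a key and a value as text, which then has to become a keyword argument. The sketch below shows one plausible way to do that, alongside a guess at which azure-identity class each --auth-method value selects. Both the boolean coercion and the name mapping are assumptions rather than documented behaviour, although the class names themselves do exist in azure-identity.

```python
from typing import Dict, Iterable, Tuple, Union

# Assumed mapping from --auth-method values to azure-identity class names.
CREDENTIAL_CLASS_NAMES = {
    "default": "DefaultAzureCredential",
    "environment": "EnvironmentCredential",
    "workload": "WorkloadIdentityCredential",
    "managedid": "ManagedIdentityCredential",
    "azurecli": "AzureCliCredential",
    "azureps": "AzurePowerShellCredential",
    "azuredevcli": "AzureDeveloperCliCredential",
}


def parse_auth_opts(
    pairs: Iterable[Tuple[str, str]],
) -> Dict[str, Union[str, bool]]:
    """Turn (key, value) text pairs into constructor keyword arguments."""
    kwargs: Dict[str, Union[str, bool]] = {}
    for key, value in pairs:
        lowered = value.lower()
        # Assumption: "True"/"False" are meant as booleans, not strings.
        if lowered in ("true", "false"):
            kwargs[key] = lowered == "true"
        else:
            kwargs[key] = value
    return kwargs
```

For the command line above, the parsed result would be managed_identity_client_id as a string and exclude_cli_credential as the boolean False.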
