Tool for managing Hadoop access using Apache Atlas and Ranger.

Cobra-policytool

Cobra-policytool is a tool that eases the management of Apache Ranger together with tags in Apache Atlas. These tools manage access policies in Hadoop environments. Cobra-policytool makes it easy to apply configuration files directly to Atlas and Ranger at scale.

The advantages are:

  • configurations can be version controlled
  • changes can be reviewed, audited and tracked
  • integrates with existing CI/CD

Cobra-policytool also adds functionality to Atlas and Ranger. It can manage row-level filtering policies for Apache Hive based on tags. Ranger requires one row-level policy per table, but with cobra-policytool you can have one rule per tag. Cobra-policytool then expands this rule into one rule for each table that has the tag.
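
The expansion from one tag rule to one rule per table can be pictured with a small sketch. This is not the tool's actual implementation or file format: the dictionary layout and the names `expand_tag_rule`, `tag_rule` and `tables_by_tag` are assumptions made up for illustration.

```python
def expand_tag_rule(tag_rule, tables_by_tag):
    """Return one row-filter rule per table that carries the rule's tag.

    Hypothetical data model: `tag_rule` is a dict with "tag", "filter" and
    "groups"; `tables_by_tag` maps a tag name to the tables that have it.
    """
    tables = sorted(tables_by_tag.get(tag_rule["tag"], []))
    return [
        {
            "table": table,
            "filter": tag_rule["filter"],
            "groups": list(tag_rule["groups"]),
        }
        for table in tables
    ]


# One rule written against a tag...
tag_rule = {
    "tag": "customer_data",
    "filter": "country = 'SE'",
    "groups": ["analysts"],
}
# ...and the tables that currently carry each tag (made-up names).
tables_by_tag = {
    "customer_data": {"sales.orders", "sales.customers"},
    "public": {"ref.calendar"},
}

table_rules = expand_tag_rule(tag_rule, tables_by_tag)
for rule in table_rules:
    print("{} -> {}".format(rule["table"], rule["filter"]))
```

Tagging a new table with `customer_data` would then yield a third table rule without anyone editing the tag rule itself.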

Most often you want the same access rights on Hive tables and on the corresponding files and directories on HDFS. Cobra-policytool can automatically convert a policy for a Hive table to a policy for HDFS.

This eases maintenance and reduces the risk of errors.
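
As a rough illustration of what such a conversion involves, the sketch below derives an HDFS path policy from a Hive table policy. Everything in it is an assumption: the warehouse root, the `<database>.db/<table>` directory layout, the permission mapping and the dictionary shapes are hypothetical and are not taken from cobra-policytool.

```python
import posixpath

# Assumed mapping from Hive permissions to HDFS permissions (illustrative only).
HIVE_TO_HDFS_ACCESS = {
    "select": {"read", "execute"},
    "update": {"write", "execute"},
}


def hive_policy_to_hdfs(policy, warehouse_root="/user/hive/warehouse"):
    """Build an HDFS policy dict for the directory assumed to back a Hive table."""
    # Assumes the common <root>/<database>.db/<table> layout.
    path = posixpath.join(
        warehouse_root, policy["database"] + ".db", policy["table"]
    )
    accesses = set()
    for access in policy["accesses"]:
        accesses |= HIVE_TO_HDFS_ACCESS.get(access, set())
    return {
        "path": path,
        "recursive": True,
        "users": list(policy["users"]),
        "accesses": sorted(accesses),
    }


hive_policy = {
    "database": "sales",
    "table": "orders",
    "users": ["etl"],
    "accesses": ["select"],
}
hdfs_policy = hive_policy_to_hdfs(hive_policy)
print(hdfs_policy["path"], hdfs_policy["accesses"])
```

Deriving the HDFS policy from the Hive one means there is a single definition to review, which is where the reduced risk of errors comes from.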

To use the tool you need the right permissions in the environment you are using. For Atlas you must be able to read and create tags, and to add them to and delete them from resources. For the Ranger rules you must, unfortunately, be an admin.

Cobra-policytool is idempotent: you can rerun it as often as you want, and the result will not change as long as you have not changed the input.
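
The idea behind an idempotent sync can be sketched as "compute the difference between the desired and the current state, then apply only that difference". The code below is a minimal model with in-memory dictionaries standing in for the input files and for Atlas; the function name `sync_tags` and the data layout are hypothetical.

```python
def sync_tags(desired, current):
    """Make `current` match `desired` and return the list of changes applied.

    Both arguments map a resource name to a set of tag names. `current`
    is updated in place, standing in for the remote state.
    """
    changes = []
    for resource in sorted(set(desired) | set(current)):
        want = set(desired.get(resource, ()))
        have = set(current.get(resource, ()))
        for tag in sorted(want - have):
            changes.append(("add", resource, tag))
        for tag in sorted(have - want):
            changes.append(("delete", resource, tag))
        if want:
            current[resource] = want
        else:
            current.pop(resource, None)
    return changes


desired = {"sales.orders": {"PII"}, "sales.items": {"public"}}
atlas_state = {"sales.orders": {"internal"}}

first_run = sync_tags(desired, atlas_state)   # applies three changes
second_run = sync_tags(desired, atlas_state)  # nothing left to do
print(first_run)
print(second_run)
```

Because the second run finds no difference, it reports no changes, which matches the behaviour described above.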

There is an introduction to using cobra-policytool on Medium.

A presentation about how Cobra-policytool is used within Svenska Spel can be found at Slideshare and Youtube.

Goals

  • Make it easy to manage access policies and metadata within an Apache Hadoop environment that uses Apache Atlas and Apache Ranger.
  • Provide an easy way to apply policies from configuration files, that can be version controlled.
  • Configuration files shall be easy to generate, for instance from a central metadata management system.

Non-Goals

  • Handle the security within the Hadoop environment. We rely on Apache Atlas, Apache Ranger and other tools within the Hadoop ecosystem.

Contributing

We welcome contributions. In order for us to be able to accept them, please review our contributor guidelines.

License

This project is released as open source according to the terms laid out in the LICENSE.

Supported features

Tagging of resources

  • Sync of table and column tags from metadata files to Atlas.
  • Keep tags on Hive tables and on their corresponding directories on HDFS in sync (use option --hdfs).
  • Audit to show differences between metadata and Atlas.
  • New tag definitions are automatically added to Atlas on sync.
  • Verbose output describing the changes made.
  • Authentication using kerberos ticket.
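
To make the audit and tag-definition features more concrete, here is a read-only sketch that compares tags from metadata files with tags found in Atlas and also lists tag names that would need a new definition. The function `audit`, its arguments and the report layout are hypothetical and only illustrate the comparison, not the tool's real output.

```python
def audit(metadata_tags, atlas_tags, known_definitions):
    """Report differences between metadata and Atlas without changing anything.

    `metadata_tags` and `atlas_tags` map a table or column name to its tags;
    `known_definitions` lists the tag names already defined in Atlas.
    """
    missing = {}
    unexpected = {}
    for resource in set(metadata_tags) | set(atlas_tags):
        want = set(metadata_tags.get(resource, ()))
        have = set(atlas_tags.get(resource, ()))
        if want - have:
            missing[resource] = sorted(want - have)
        if have - want:
            unexpected[resource] = sorted(have - want)
    used = set()
    for tags in metadata_tags.values():
        used |= set(tags)
    return {
        "missing": missing,          # in metadata but not in Atlas
        "unexpected": unexpected,    # in Atlas but not in metadata
        "new_definitions": sorted(used - set(known_definitions)),
    }


report = audit(
    metadata_tags={"sales.orders": ["PII"], "sales.orders.email": ["PII", "contact"]},
    atlas_tags={"sales.orders": ["PII", "legacy"]},
    known_definitions=["PII", "legacy"],
)
print(report)
```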

Creating policies

  • Sync policies from metadata file to Ranger.
  • Expand tag based row filtering rules to Hive row based filtering.

Requirements

  • Atlas, Ranger, and Hive installed and working.
  • Kerberos enabled on the Hadoop cluster, including Atlas and Ranger. Your client also needs a valid Kerberos ticket.
  • Python 2.7.
  • We have successfully used it on MS Windows, MacOS and Linux.

Installation

pip install cobra-policytool

Usage of CLI

To get up-to-date help on how to use the tool, run: cobra-policy --help

Whenever policytool talks to the Atlas server, a Kerberos ticket must be available.
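
A script that wraps the tool may want to check for a ticket before starting. The sketch below is one way to do that, assuming an MIT-style `klist` is on the PATH, where `klist -s` prints nothing and exits with status 0 only when a valid ticket exists. The helper name `has_kerberos_ticket` is made up for this example and is not part of cobra-policytool.

```python
import subprocess


def has_kerberos_ticket(run=subprocess.call):
    """Return True if `klist -s` reports a valid ticket.

    `run` is injectable so the check can be tested without Kerberos.
    """
    try:
        return run(["klist", "-s"]) == 0
    except OSError:
        # klist is not installed or not on the PATH.
        return False


if __name__ == "__main__":
    if has_kerberos_ticket():
        print("Kerberos ticket found")
    else:
        print("No valid Kerberos ticket, run kinit first")
```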

Create a configfile matching your environment, see docs/Configfile.md.

Read about the indata files in docs/indata.md.

Sync tag metadata information to Atlas

Policytool takes the files in the --srcdir directory, created according to the indata specification, and syncs them with Atlas, the metadata store in Hadoop. To do this run:

$ cobra-policy tags_to_atlas --srcdir src/main/tags/ --environment utv

The option --verbose gives more output from cobra-policytool, describing which tables and columns were changed. Note! If you run the same cobra-policytool command twice, you will not get any changes the second time, since all changes happened in the first round.

Syncing Ranger policies works in a similar fashion, though it requires that --project-name is provided. The project name is the name of the project you are working in. It is used to find already existing policies in Ranger and to separate the Ranger rules into multiple projects.

$ cobra-policy rules_to_ranger --srcdir src/main/tags/ --environment dev --project-name dimension_out
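
In a CI/CD job the two commands are typically run one after the other. The following sketch builds both command lines from the subcommands and options shown above and runs them through an injectable runner. The helper names `build_commands` and `run_all` are hypothetical; nothing here is part of the tool itself.

```python
import subprocess


def build_commands(srcdir, environment, project_name):
    """Return the argv lists for the tag sync and the Ranger rule sync."""
    tags = [
        "cobra-policy", "tags_to_atlas",
        "--srcdir", srcdir,
        "--environment", environment,
    ]
    rules = [
        "cobra-policy", "rules_to_ranger",
        "--srcdir", srcdir,
        "--environment", environment,
        "--project-name", project_name,
    ]
    return [tags, rules]


def run_all(commands, run=subprocess.check_call):
    """Run each command in order; check_call raises on a non-zero exit."""
    for argv in commands:
        run(argv)


commands = build_commands("src/main/tags/", "dev", "dimension_out")
for argv in commands:
    # Only print here; pass the list to run_all() to actually execute.
    print(" ".join(argv))
```

Running the tag sync first and stopping on the first failure keeps a broken run from applying Ranger rules against tags that were never synced.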

Usage of API

The package can also be used as a Python library. Here is a short example of using the Atlas Client class.

from requests_kerberos import HTTPKerberosAuth
import policytool.atlas

# Connect to the Atlas REST API, authenticating with the current Kerberos ticket.
c = policytool.atlas.Client(
    'http://atlas.test.my.org:21000/api/atlas',
    auth=HTTPKerberosAuth())
c.known_tags()
c.get_tables("hadoop_out_utv")

For details, read the Python docstrings in the code and look at how the command line client is implemented.

Other documentation

Besides this document, there is more documentation in the docs directory. You can also find a todo list including future plans.

We recommend reading the convention document and the indata document before you start.


Copyright 2018 AB SvenskaSpel

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
