Hadoop URI utility
Project description
LindexURI
BigData URI utility
The idea is that every information stored is addressable using a URI, in this case we like to work with HDFS and HIVE. There
LindexURI.isValid( uri ) : returns true if the URI is valid, it's a static method and can be used in a quick way.
luri = LindexURI(uri)
luri.isPartitioned()
returns true if the HIVE uri is defining a partitioned table
if uri == "hive://databasename/tablename?dt=201212" luri.isPartitioned returns True.
luri.getPartitions()
returns a dictionary that describes the HIVE partition
if uri == "hive://databasename/tablename?dt=201212" luri.getPartitions() returns
OrderedDict( 'dt': '201212' )
luri.getDatabase()
gets the database name from the HIVE uri ( this can be modified to work also with HDFS paths )
if uri == "hive://databasename/tablename?dt=201212" luri.getDatabase() returns 'databasename'
luri.getTable()
gets the table name from HIVE uri, can be modified to work also with HDFS paths
if uri == "hive://databasename/tablename?dt=201212" luri.getDatabase() returns 'tablename'
luri.getHDFSHostName()
gets the HDFS hostname
if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getHDFSHostName returns 'hdfs-prod'
luri.getHDFSPath()
gets the path from the HDFS uri
if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getHDFSPath() returns 'warehouse/databasename.db/tablename.db/dt=201212'
luri.getSchema()
gets the schema
if uri == "hdfs://hdfs-prod/warehouse/databasename.db/tablename.db/dt=201212" luri.getSchema() returns 'hdfs'
luri.getPartitionsAsHDFSPath()
converts the partition coordinates into an HDFS path
p = OrderedDict( 'dt' : '201212', 'country': 'AU' ) dt=201212&country=AU
luri.getHDFSPathAsPartition()
converts the HDFS path into a partition coordinates dictionary
'hdfs://hdfs-production/Vault/Docomodigital/Production/Newton/events/prod/year=2018/month=08/day=07/hour=09'
root path : "/Vault/Docomodigital/Production/Newton/events/prod/"
partitions : {
"year" : "2018",
"month" : "08",
"day" : "07",
"hour" : "09"
}
luri.looksPartitioned()
returns true if the HDFS path can define a partition
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for LindexURI-0.1.2-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | c97baf5ad18903117a493f6c68780e220bb69a0ceca8036ebce24052f0b9dfaf |
|
MD5 | a7c61011c9a3b7345e62dcd75a2e1c8c |
|
BLAKE2b-256 | c3c82c6bef75a562576a08593a2f1337e334525926a021c835182cbf897b48d9 |