A modern and asynchronous web client for WebHDFS
Project description
aiowebhdfs
I know, nobody uses Hadoop anymore, but for those who do, here is a library that handles large files asynchronously. It uses the httpx library for web requests and aiofiles for streaming data from HDFS.
Features
- Implements retries and timeout windows with retry_async from the opnieuw library
- Implements streaming through the aiofiles library
- Implements async requests through the httpx library
- Fully tested for a core subset of operations in WebHDFS v3.2.1
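To illustrate what "retries and timeout windows" means in practice, here is a minimal standard-library sketch of the idea. It is not opnieuw's implementation and does not use its API; the decorator name `retry_in_window`, its parameters, and the `flaky_request` function are all hypothetical.

```python
import asyncio
import functools
import time


def retry_in_window(max_calls, window_seconds, exceptions=(Exception,)):
    """Retry an async function until it succeeds, the call budget is spent,
    or the time window that started with the first call has closed."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            deadline = time.monotonic() + window_seconds
            for attempt in range(1, max_calls + 1):
                try:
                    return await func(*args, **kwargs)
                except exceptions:
                    # Re-raise on the last attempt or once the window has closed.
                    if attempt == max_calls or time.monotonic() >= deadline:
                        raise
                    # Short exponential backoff before the next attempt.
                    await asyncio.sleep(min(0.01 * 2 ** attempt, 1.0))
        return wrapper
    return decorator


calls = {"count": 0}


@retry_in_window(max_calls=4, window_seconds=5.0, exceptions=(ConnectionError,))
async def flaky_request():
    # Hypothetical request that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("namenode not reachable yet")
    return "ok"


result = asyncio.run(flaky_request())
```

Bounding both the number of calls and the total elapsed time keeps a slow or unreachable cluster from stalling the caller indefinitely.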
CREATE = Write File
from aiowebhdfs import WebHdfsAsyncClient
client = WebHdfsAsyncClient(host='namenode.local', port=8443, user='spark', kerberos_token=token)
client.create('c:\\temp\\bigfile.txt', '/data/agg/bigfile.txt', overwrite=False)
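For orientation, the WebHDFS REST API exposes file creation as an HTTP PUT with `op=CREATE` under the `/webhdfs/v1` prefix; the namenode normally answers with a redirect to a datanode, which receives the file data. The sketch below only builds such a URL. The helper name `webhdfs_url` and the `https` default are assumptions for illustration, and the README does not show how aiowebhdfs assembles its URLs internally.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, scheme="https", **params):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    query = urlencode({"op": op, **params})
    # quote() leaves "/" intact, so the HDFS path keeps its structure.
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


create_url = webhdfs_url(
    "namenode.local", 8443, "/data/agg/bigfile.txt", "CREATE", overwrite="false"
)
print(create_url)
```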
OPEN = Read File
from aiowebhdfs import WebHdfsAsyncClient
client = WebHdfsAsyncClient(host='namenode.local', port=8443, user='spark', kerberos_token=token)
client.open('/data/agg/bigfile.txt')
Content of the file
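Since the library is described as asynchronous, its methods are presumably coroutines that must be awaited inside an event loop; the README does not state this explicitly, so treat it as an assumption. The sketch below uses a hypothetical `StandInClient` in place of `WebHdfsAsyncClient` so that it runs without a cluster, and shows how several reads could be issued concurrently.

```python
import asyncio


class StandInClient:
    """Hypothetical stand-in for the real client; not part of aiowebhdfs."""

    async def open(self, path):
        await asyncio.sleep(0.01)  # simulate network latency
        return f"content of {path}"


async def read_many(client, paths):
    # Start all reads at once and collect the results in input order.
    return await asyncio.gather(*(client.open(p) for p in paths))


contents = asyncio.run(
    read_many(StandInClient(), ["/data/agg/a.txt", "/data/agg/b.txt"])
)
```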
GETFILESTATUS = File Info
from aiowebhdfs import WebHdfsAsyncClient
client = WebHdfsAsyncClient(host='namenode.local', port=8443, user='spark', kerberos_token=token)
client.get_file_status('/data/agg/bigfile.txt')
{
"FileStatus":
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0, //in bytes, zero for directories
"modificationTime": 1320173277227,
"owner" : "webuser",
"pathSuffix" : "",
"permission" : "777",
"replication" : 0,
"type" : "DIRECTORY" //enum {FILE, DIRECTORY}
}
}
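The `//` comments in the response above are annotations, not part of the JSON payload. A small sketch of how the fields might be interpreted, assuming the response has already been decoded into a dict with the shape shown: `modificationTime` is in milliseconds since the Unix epoch and `permission` is an octal string.

```python
import json
from datetime import datetime, timezone

# Same values as the sample response, without the explanatory comments.
response = json.loads("""
{"FileStatus": {"accessTime": 0, "blockSize": 0, "group": "supergroup",
 "length": 0, "modificationTime": 1320173277227, "owner": "webuser",
 "pathSuffix": "", "permission": "777", "replication": 0,
 "type": "DIRECTORY"}}
""")

status = response["FileStatus"]
is_directory = status["type"] == "DIRECTORY"
# Convert milliseconds to seconds before building the timestamp.
modified = datetime.fromtimestamp(status["modificationTime"] / 1000, tz=timezone.utc)
# Parse the permission string as an octal number.
mode = int(status["permission"], 8)
```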
LISTSTATUS = List Directory
from aiowebhdfs import WebHdfsAsyncClient
client = WebHdfsAsyncClient(host='namenode.local', port=8443, user='spark', kerberos_token=token)
client.list_directory('/tmp')
{
"FileStatuses":
{
"FileStatus":
[
{
"accessTime" : 1320171722771,
"blockSize" : 33554432,
"group" : "supergroup",
"length" : 24930,
"modificationTime": 1320171722771,
"owner" : "webuser",
"pathSuffix" : "a.patch",
"permission" : "644",
"replication" : 1,
"type" : "FILE"
},
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0,
"modificationTime": 1320895981256,
"owner" : "szetszwo",
"pathSuffix" : "bar",
"permission" : "711",
"replication" : 0,
"type" : "DIRECTORY"
},
...
]
}
}
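Each entry in a listing carries only a `pathSuffix` relative to the listed directory, so full paths have to be rebuilt by the caller. A minimal sketch, assuming a decoded response with the shape shown above; the helper name `split_entries` is hypothetical and the sample dict is trimmed to the fields it uses.

```python
import posixpath

listing = {"FileStatuses": {"FileStatus": [
    {"pathSuffix": "a.patch", "type": "FILE", "length": 24930},
    {"pathSuffix": "bar", "type": "DIRECTORY", "length": 0},
]}}


def split_entries(directory, listing):
    """Return (files, dirs) as full HDFS paths under the listed directory."""
    files, dirs = [], []
    for entry in listing["FileStatuses"]["FileStatus"]:
        # HDFS paths always use forward slashes, hence posixpath.
        full_path = posixpath.join(directory, entry["pathSuffix"])
        (dirs if entry["type"] == "DIRECTORY" else files).append(full_path)
    return files, dirs


files, dirs = split_entries("/tmp", listing)
```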
GETCONTENTSUMMARY = Summary of Directory
from aiowebhdfs import WebHdfsAsyncClient
client = WebHdfsAsyncClient(host='namenode.local', port=8443, user='spark', kerberos_token=token)
client.list_directory('/tmp')
{
"FileStatuses":
{
"FileStatus":
[
{
"accessTime" : 1320171722771,
"blockSize" : 33554432,
"group" : "supergroup",
"length" : 24930,
"modificationTime": 1320171722771,
"owner" : "webuser",
"pathSuffix" : "a.patch",
"permission" : "644",
"replication" : 1,
"type" : "FILE"
},
{
"accessTime" : 0,
"blockSize" : 0,
"group" : "supergroup",
"length" : 0,
"modificationTime": 1320895981256,
"owner" : "szetszwo",
"pathSuffix" : "bar",
"permission" : "711",
"replication" : 0,
"type" : "DIRECTORY"
},
...
]
}
}
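The response printed above has the `FileStatuses` shape of LISTSTATUS. For comparison, the WebHDFS REST API documents GETCONTENTSUMMARY as returning a `ContentSummary` object with aggregate counts. The sketch below reads one; the sample values are illustrative, the helper name `describe` is hypothetical, and the README does not show which client method returns this response, so the dict is hard-coded.

```python
summary_response = {"ContentSummary": {
    "directoryCount": 2,
    "fileCount": 1,
    "length": 24930,
    "quota": -1,
    "spaceConsumed": 24930,
    "spaceQuota": -1,
}}


def describe(summary_response):
    """Format a ContentSummary response as a one-line description."""
    s = summary_response["ContentSummary"]
    # A quota of -1 means that no quota is set.
    quota = "unlimited" if s["spaceQuota"] == -1 else f"{s['spaceQuota']} bytes"
    return (f"{s['fileCount']} file(s), {s['directoryCount']} dir(s), "
            f"{s['length']} bytes, space quota: {quota}")


print(describe(summary_response))
```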
File details
Details for the file aiowebhdfs-0.0.1.tar.gz.
File metadata
- Download URL: aiowebhdfs-0.0.1.tar.gz
- Upload date:
- Size: 4.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6b2e7f3a417796d442786608008c49589d4505538ceeee98a79b342c71ee730f |
| MD5 | a2dd11fb100c01d6bbb2d02246853be0 |
| BLAKE2b-256 | 906f5f9bf425a05b4e100bf93913046cb2671c9e5bb9d46e2c22a9629dfdde56 |
File details
Details for the file aiowebhdfs-0.0.1-py3-none-any.whl.
File metadata
- Download URL: aiowebhdfs-0.0.1-py3-none-any.whl
- Upload date:
- Size: 15.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f2df99cf354421a3b4d8dff2b0a89508aaa930c3d011e3086fb97fe238fab608 |
| MD5 | 63d9ccab5ffd16a470cccd28fe97cd6e |
| BLAKE2b-256 | 440dfb5aa60a0bf48880d8327b75d29fdfa3df4c71a94095b21481f1b7bd39b4 |