webhdfspy
A Python wrapper library to access the Hadoop WebHDFS REST API.
Installation
pip install webhdfspy
Python versions
webhdfspy requires Python 3.9+
Usage
import webhdfspy
# Basic usage
client = webhdfspy.WebHDFSClient("localhost", 50070, "username")
print(client.listdir("/"))
client.mkdir("/foo")
client.create("/foo/foo.txt", "just put some text here", overwrite=True)
print(client.open("/foo/foo.txt"))
client.remove("/foo", recursive=True)
client.close()
# Context manager (recommended)
with webhdfspy.WebHDFSClient("localhost", 50070, "username") as client:
    client.mkdir("/data")
    client.create("/data/hello.txt", "Hello, HDFS!", overwrite=True)
    print(client.open("/data/hello.txt"))
# HTTPS support
with webhdfspy.WebHDFSClient("host", 9871, "user", scheme="https") as client:
    print(client.listdir("/"))
# Custom timeout (default: 60s)
client = webhdfspy.WebHDFSClient("host", 50070, timeout=30.0)
Available operations
| Method | Description |
|---|---|
| `listdir(path)` | List directory contents |
| `mkdir(path, permission=None)` | Create directories |
| `remove(path, recursive=False)` | Delete files/directories |
| `rename(src, dst)` | Rename files/directories |
| `open(path, offset=None, length=None, buffersize=None)` | Read a file |
| `create(path, file_data, overwrite=None)` | Create a file |
| `append(path, file_data, buffersize=None)` | Append to a file |
| `copyfromlocal(local_path, hdfs_path, overwrite=None)` | Upload a local file |
| `status(path)` | Get file/directory status |
| `chmod(path, permission)` | Set permissions |
| `set_owner(path, owner=None, group=None)` | Set owner/group |
| `set_replication(path, replication_factor)` | Set replication factor |
| `set_times(path, modificationtime=None, accesstime=None)` | Set modification/access time |
| `get_checksum(path)` | Get file checksum |
| `get_content_summary(path)` | Get directory content summary |
| `environ_home()` | Get user home directory |
| `get_delegation_token(renewer)` | Get a delegation token |
| `renew_delegation_token(token)` | Renew a delegation token |
| `cancel_delegation_token(token)` | Cancel a delegation token |
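For orientation, the sketch below shows the kind of WebHDFS REST request that sits behind a few of these methods. The operation names, HTTP verbs, `/webhdfs/v1` prefix and `user.name` parameter come from the Hadoop WebHDFS documentation; the mapping from webhdfspy method names to operations and the helper `build_webhdfs_request` are assumptions made for illustration, not the library's actual internals. Nothing is sent over the network.

```python
from urllib.parse import quote, urlencode

# Assumed mapping from webhdfspy method names to WebHDFS REST operations.
# The verbs and operation names are from the Hadoop WebHDFS docs; which
# operation each webhdfspy method uses internally is a guess.
WEBHDFS_OPS = {
    "listdir": ("GET", "LISTSTATUS"),
    "status": ("GET", "GETFILESTATUS"),
    "open": ("GET", "OPEN"),
    "mkdir": ("PUT", "MKDIRS"),
    "rename": ("PUT", "RENAME"),
    "create": ("PUT", "CREATE"),
    "append": ("POST", "APPEND"),
    "remove": ("DELETE", "DELETE"),
}


def build_webhdfs_request(method_name, host, port, path, user=None,
                          scheme="http", **params):
    """Return (http_verb, url) for a WebHDFS call without sending it."""
    verb, op = WEBHDFS_OPS[method_name]
    query = {"op": op}
    if user is not None:
        query["user.name"] = user
    for key, value in params.items():
        if value is None:
            continue  # unset options are left out of the query string
        # Booleans are lowercased because WebHDFS expects true/false.
        query[key] = str(value).lower() if isinstance(value, bool) else str(value)
    url = f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"
    return verb, url


if __name__ == "__main__":
    print(build_webhdfs_request("listdir", "localhost", 50070, "/", user="username"))
    print(build_webhdfs_request("remove", "localhost", 50070, "/foo",
                                user="username", recursive=True))
```

Printing the verb and URL this way can help when comparing a failing client call against the same request issued by hand with a tool such as `curl`.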
Documentation
http://webhdfspy.readthedocs.org/en/latest/
Hadoop configuration
To enable WebHDFS in Hadoop, add this to your $HADOOP_DIR/conf/hdfs-site.xml:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
To enable append on HDFS:
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
More about WebHDFS: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
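To confirm that the two properties above are actually present in a configuration file, a small check along these lines may help. It is a minimal sketch: `read_hdfs_properties` and `is_enabled` are hypothetical helpers, not part of webhdfspy, and the sample text assumes the usual `<configuration>` root element of `hdfs-site.xml`. It only reports what is set explicitly in the file and says nothing about Hadoop's built-in defaults.

```python
import xml.etree.ElementTree as ET

# Sample content standing in for a real hdfs-site.xml file.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>"""


def read_hdfs_properties(xml_text):
    """Return a {name: value} dict for every <property> in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def is_enabled(props, name):
    """True only when the property is present and explicitly set to 'true'."""
    return props.get(name, "").lower() == "true"


if __name__ == "__main__":
    props = read_hdfs_properties(SAMPLE_HDFS_SITE)
    for key in ("dfs.webhdfs.enabled", "dfs.support.append"):
        print(key, is_enabled(props, key))
```

To check a real file, read its contents and pass the text to `read_hdfs_properties` in place of `SAMPLE_HDFS_SITE`.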