webhdfspy

A Python wrapper library to access the Hadoop WebHDFS REST API.

Installation

pip install webhdfspy

Python versions

webhdfspy requires Python 3.9 or later.

Usage

import webhdfspy

# Basic usage
client = webhdfspy.WebHDFSClient("localhost", 50070, "username")
print(client.listdir("/"))
client.mkdir("/foo")
client.create("/foo/foo.txt", "just put some text here", overwrite=True)
print(client.open("/foo/foo.txt"))
client.remove("/foo", recursive=True)
client.close()

# Context manager (recommended)
with webhdfspy.WebHDFSClient("localhost", 50070, "username") as client:
    client.mkdir("/data")
    client.create("/data/hello.txt", "Hello, HDFS!", overwrite=True)
    print(client.open("/data/hello.txt"))

# HTTPS support
with webhdfspy.WebHDFSClient("host", 9871, "user", scheme="https") as client:
    print(client.listdir("/"))

# Custom timeout (default: 60s)
client = webhdfspy.WebHDFSClient("host", 50070, timeout=30.0)
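
The calls above need a reachable NameNode. To exercise code written against the client without a cluster, one option is to pass the client in as a parameter and substitute a small in-memory stand-in. The following is only a sketch under stated assumptions: `FakeClient` and `write_and_read_back` are hypothetical names that are not part of webhdfspy, the stand-in mimics just the `create`, `open`, and `remove` calls shown above, and it assumes `open` returns the data that was written.

```python
class FakeClient:
    """Hypothetical in-memory stand-in for tests; NOT part of webhdfspy."""

    def __init__(self):
        self.files = {}  # maps HDFS-style path -> stored data

    def create(self, path, file_data, overwrite=None):
        if path in self.files and not overwrite:
            raise FileExistsError(path)
        self.files[path] = file_data

    def open(self, path):
        return self.files[path]

    def remove(self, path, recursive=False):
        # Delete the path itself and, if recursive, everything beneath it.
        prefix = path.rstrip("/") + "/"
        for name in list(self.files):
            if name == path or (recursive and name.startswith(prefix)):
                del self.files[name]


def write_and_read_back(client, path, text):
    """Write text to path, read it back, then delete the file."""
    client.create(path, text, overwrite=True)
    try:
        return client.open(path)
    finally:
        client.remove(path)


print(write_and_read_back(FakeClient(), "/tmp/probe.txt", "ping"))
```

Because the helper only relies on method names from the usage example, it should accept a real `webhdfspy.WebHDFSClient` in place of the stand-in, although that has not been verified here.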

Available operations

Method                                                   Description
listdir(path)                                            List directory contents
mkdir(path, permission=None)                             Create directories
remove(path, recursive=False)                            Delete files/directories
rename(src, dst)                                         Rename files/directories
open(path, offset=None, length=None, buffersize=None)    Read a file
create(path, file_data, overwrite=None)                  Create a file
append(path, file_data, buffersize=None)                 Append to a file
copyfromlocal(local_path, hdfs_path, overwrite=None)     Upload a local file
status(path)                                             Get file/directory status
chmod(path, permission)                                  Set permissions
set_owner(path, owner=None, group=None)                  Set owner/group
set_replication(path, replication_factor)                Set replication factor
set_times(path, modificationtime=None, accesstime=None)  Set modification/access time
get_checksum(path)                                       Get file checksum
get_content_summary(path)                                Get directory content summary
environ_home()                                           Get user home directory
get_delegation_token(renewer)                            Get a delegation token
renew_delegation_token(token)                            Renew a delegation token
cancel_delegation_token(token)                           Cancel a delegation token
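
Each of these methods wraps a WebHDFS REST call of the form `<scheme>://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below shows how such a URL is put together, which can help when debugging with a browser or curl. `build_webhdfs_url` is a hypothetical helper, not part of webhdfspy, and the pairing of `listdir` with `LISTSTATUS` and `mkdir` with `MKDIRS` is an assumption about what the library sends.

```python
from urllib.parse import quote, urlencode


def build_webhdfs_url(host, port, path, op, scheme="http", **params):
    """Return the WebHDFS REST URL for one operation on an absolute HDFS path.

    Hypothetical helper for illustration; not part of webhdfspy.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation name goes in the `op` query parameter; extras follow it.
    query = urlencode({"op": op, **params})
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed equivalent of client.listdir("/foo")
print(build_webhdfs_url("localhost", 50070, "/foo", "LISTSTATUS"))
# Assumed equivalent of client.mkdir("/foo", permission="755")
print(build_webhdfs_url("localhost", 50070, "/foo", "MKDIRS", permission="755"))
```

The first call prints `http://localhost:50070/webhdfs/v1/foo?op=LISTSTATUS`. Note that directory creation is an HTTP PUT in the WebHDFS API, so only read operations such as `LISTSTATUS` can be tried by pasting the URL into a browser.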

Documentation

http://webhdfspy.readthedocs.org/en/latest/

Hadoop configuration

To enable WebHDFS in Hadoop, add this property inside the <configuration> element of $HADOOP_DIR/conf/hdfs-site.xml:

<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>

To enable append on HDFS:

<property>
    <name>dfs.support.append</name>
    <value>true</value>
</property>
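
To confirm that both properties are set, the configuration file can be read with Python's standard library. This is a minimal sketch: `read_hdfs_properties` and `SAMPLE` are hypothetical names, and the sample text stands in for the contents of a real hdfs-site.xml, where properties sit inside a `<configuration>` root element.

```python
import xml.etree.ElementTree as ET


def read_hdfs_properties(xml_text):
    """Map each property <name> to its <value> in an hdfs-site.xml document."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


# Stand-in for the text of a real hdfs-site.xml (illustrative only).
SAMPLE = """<configuration>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>"""

props = read_hdfs_properties(SAMPLE)
for key in ("dfs.webhdfs.enabled", "dfs.support.append"):
    print(key, "=", props.get(key, "<not set>"))
```

For a real installation, the file contents could be loaded with `pathlib.Path(...).read_text()` and passed to the same function.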

More about WebHDFS: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
