Convenient Filesystem interface over GCS
gcsfs
GCSFS is a Python library that provides a file-system-like interface to Google Cloud Storage (GCS). Built on top of fsspec, it lets you work with buckets and objects much as you would with local directories and files, which makes it convenient for data science and data engineering workflows.
Getting Started
Installation
Install via pip or conda:
# Using pip
pip install gcsfs
# OR using conda
conda install -c conda-forge gcsfs
Basic Usage
import gcsfs
# Initialize the filesystem
fs = gcsfs.GCSFileSystem(project='my-google-project')
# List files in a bucket
files = fs.ls('my-bucket')
# Read a file's contents as bytes ('rb' opens the object in binary mode)
with fs.open('my-bucket/data.txt', 'rb') as f:
    content = f.read()
Specialized Bucket Support
GCSFS now automatically supports advanced Google Cloud Storage features through its ExtendedFileSystem implementation.
1. Hierarchical Namespace (HNS)
Hierarchical Namespace (HNS) replaces the traditional "flat" GCS structure with true logical directories.
- Atomic Renames: Moving or renaming a directory is an O(1) metadata operation. No more slow "copy-then-delete" for large folders.
- High Performance: Offers up to 8x higher initial Queries Per Second (QPS) for read/write operations.
- AI/ML Ready: Ideal for heavy checkpointing and managing millions of small files.
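As a rough illustration of the atomic-rename point, the sketch below writes a checkpoint under a temporary directory name and then renames it into place with the standard fsspec `mv` call. It assumes the bucket has HNS enabled and that gcsfs turns the call into a single folder rename, as described above; on a flat bucket the same call would copy and delete each object. The helper names, bucket, and run names are hypothetical.

```python
def checkpoint_dirs(bucket: str, run: str, step: int) -> tuple[str, str]:
    """Return (temporary, final) directory paths for one checkpoint step."""
    base = f"{bucket}/{run}"
    return f"{base}/step-{step}.tmp", f"{base}/step-{step}"


def rename_directory(fs, src: str, dst: str) -> None:
    """Rename a directory by delegating to the filesystem's mv().

    Assumption: on an HNS bucket this is a metadata-only operation.
    """
    fs.mv(src, dst, recursive=True)


def publish_checkpoint(step: int) -> None:
    """Example wiring; needs gcsfs installed and valid credentials."""
    import gcsfs  # imported lazily so the helpers above stay dependency-free

    fs = gcsfs.GCSFileSystem(project="my-google-project")
    tmp, final = checkpoint_dirs("my-bucket", "run-1", step)
    rename_directory(fs, tmp, final)
```

Writing to a temporary name first means readers never see a half-written checkpoint directory, provided the rename really is atomic on the target bucket.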
2. Rapid Buckets (Zonal Storage)
Rapid Buckets are zonal storage resources designed for ultra-low latency and maximum throughput.
- Zonal Co-location: Place your data in the same zone as your GPU/TPU clusters to minimize network lag.
- True Appends: Unlike standard GCS objects, you can append data to existing objects in Rapid buckets without a full rewrite.
- Streaming I/O: Optimized for high-speed model loading and real-time logging.
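The append behaviour might be used for logging along the following lines. This is a minimal sketch that assumes the object lives in a Rapid bucket and that opening it in append mode (`'ab'`) is how gcsfs exposes appends; that mode is an assumption here, not something the text above confirms. The helper names and the JSON-lines layout are hypothetical.

```python
import json


def encode_record(record: dict) -> bytes:
    """Serialize one log record as a newline-terminated JSON line."""
    return (json.dumps(record, sort_keys=True) + "\n").encode("utf-8")


def append_record(fs, path: str, record: dict) -> int:
    """Append one record to an existing object and return the bytes written.

    Assumption: append mode ("ab") is expected to work only on Rapid buckets.
    """
    payload = encode_record(record)
    with fs.open(path, "ab") as f:
        f.write(payload)
    return len(payload)
```

A caller would pass a `GCSFileSystem` instance as `fs`, for example `append_record(fs, 'my-rapid-bucket/train.jsonl', {'step': 1})`, where the bucket name is a placeholder.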
Integration & Auth
GCSFS plays nicely with the rest of the Python data ecosystem.
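One common form of this integration is passing a `gs://` URL straight to a library that understands fsspec. The sketch below uses pandas' `storage_options` argument to forward settings to gcsfs; it assumes pandas and gcsfs are both installed and that the object is publicly readable. The helper names, bucket, and key are hypothetical.

```python
def gcs_url(bucket: str, key: str) -> str:
    """Build a gs:// URL from a bucket name and an object key."""
    return f"gs://{bucket}/{key.lstrip('/')}"


def read_public_csv(bucket: str, key: str):
    """Load a CSV object into a DataFrame using anonymous access."""
    import pandas as pd  # imported lazily; pandas hands gs:// URLs to gcsfs via fsspec

    # storage_options is forwarded to the GCSFileSystem constructor.
    return pd.read_csv(gcs_url(bucket, key), storage_options={"token": "anon"})
```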
Authentication Modes
- Default: Uses your local gcloud credentials or environment service accounts.
- Cloud: Explicitly use Google Metadata service (token='cloud').
- Anonymous: Access public data without a login (token='anon').
- Service Account: Pass the path to your JSON key file (token='path/to/key.json').
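The modes listed above all come down to the value passed as `token`. A minimal sketch of that mapping follows; the mode labels (`'default'`, `'service_account'`) and the helper names are made up for illustration, while the `token` values are the ones listed above, with `None` standing for the default behaviour.

```python
from typing import Optional


def token_for(mode: str, key_path: Optional[str] = None) -> Optional[str]:
    """Map a (hypothetical) mode label to the value passed as token=..."""
    if mode == "default":
        return None  # let gcsfs discover credentials on its own
    if mode in ("cloud", "anon"):
        return mode
    if mode == "service_account":
        if not key_path:
            raise ValueError("service_account mode needs a path to a JSON key file")
        return key_path
    raise ValueError(f"unknown mode: {mode!r}")


def make_filesystem(mode: str, key_path: Optional[str] = None, project: Optional[str] = None):
    """Create a GCSFileSystem for the given mode; needs gcsfs installed."""
    import gcsfs  # imported lazily so token_for() can be used without it

    return gcsfs.GCSFileSystem(project=project, token=token_for(mode, key_path))
```

For example, `make_filesystem('anon')` would give read access to public buckets only.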
Note on Async: GCSFS is built on aiohttp. If you are building high-concurrency applications, you can use the asynchronous API by passing asynchronous=True to the GCSFileSystem constructor.
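A sketch of what concurrent reads might look like with the asynchronous API is shown below. It relies on the fsspec convention that async filesystems expose coroutine versions of their methods under underscore-prefixed names such as `_cat_file`. The bucket, object names, and helper names are hypothetical, and session cleanup is left out for brevity.

```python
import asyncio


async def read_many(fs, paths):
    """Fetch several objects concurrently and return {path: bytes}."""
    blobs = await asyncio.gather(*(fs._cat_file(p) for p in paths))
    return dict(zip(paths, blobs))


async def demo():
    """Example wiring; needs gcsfs installed and network access."""
    import gcsfs  # imported lazily so read_many() stays dependency-free

    # asynchronous=True means the filesystem must be used from a running event loop.
    fs = gcsfs.GCSFileSystem(asynchronous=True, token="anon")
    return await read_many(fs, ["my-bucket/a.txt", "my-bucket/b.txt"])
```

Running `asyncio.run(demo())` would issue both reads at once instead of one after the other.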
Support
Work on this repository is supported in part by:
"Anaconda, Inc. - Advancing AI through open source."
File details
Details for the file gcsfs-2026.5.0.tar.gz.
File metadata
- Download URL: gcsfs-2026.5.0.tar.gz
- Upload date:
- Size: 922.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 283941a7a53bdbfed2133f6b471c640e61c0a1379eb52280a26618b83acc49c0 |
| MD5 | 89835508041c95f3c126df7858cfbbc1 |
| BLAKE2b-256 | 78da915029bc34b541c5ac3e6ba8b32ca14d278cd17588bbbb6e4b297a944cd9 |
Provenance
The following attestation bundles were made for gcsfs-2026.5.0.tar.gz:
Publisher: release.yml on fsspec/gcsfs
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: gcsfs-2026.5.0.tar.gz
  - Subject digest: 283941a7a53bdbfed2133f6b471c640e61c0a1379eb52280a26618b83acc49c0
- Sigstore transparency entry: 1462146410
- Sigstore integration time:
- Permalink: fsspec/gcsfs@255e4f866ae2c66dbca14a0aaa3dea024156997f
- Branch / Tag: refs/tags/2026.5.0
- Owner: https://github.com/fsspec
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@255e4f866ae2c66dbca14a0aaa3dea024156997f
- Trigger Event: push
File details
Details for the file gcsfs-2026.5.0-py3-none-any.whl.
File metadata
- Download URL: gcsfs-2026.5.0-py3-none-any.whl
- Upload date:
- Size: 77.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a6abf1e6ee4b8ad7e0f137ca423c2f1441610ac98c3b933e11c505f8c62a90a9 |
| MD5 | 0ada03987b28f955fd43d15b37abe621 |
| BLAKE2b-256 | 0dda7bed8e864afbb8f1870f8120dc4148ecec5480ba988d4241769a2b308a35 |
Provenance
The following attestation bundles were made for gcsfs-2026.5.0-py3-none-any.whl:
Publisher: release.yml on fsspec/gcsfs
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: gcsfs-2026.5.0-py3-none-any.whl
  - Subject digest: a6abf1e6ee4b8ad7e0f137ca423c2f1441610ac98c3b933e11c505f8c62a90a9
- Sigstore transparency entry: 1462146416
- Sigstore integration time:
- Permalink: fsspec/gcsfs@255e4f866ae2c66dbca14a0aaa3dea024156997f
- Branch / Tag: refs/tags/2026.5.0
- Owner: https://github.com/fsspec
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@255e4f866ae2c66dbca14a0aaa3dea024156997f
- Trigger Event: push