An FSSpec Implementation using the Pelican System
PelicanFS
Overview
PelicanFS is a file system interface (fsspec) for the Pelican Platform. For more information about Pelican, see our main website or GitHub page. For more information about fsspec, visit the filesystem-spec page.
Limitations
PelicanFS is built on top of the http fsspec implementation. As such, any functionality that isn’t available in the http implementation is also not available in PelicanFS.
Installation
To install PelicanFS, run:
pip install pelicanfs
To install from source, run:
git clone https://github.com/PelicanPlatform/pelicanfs.git
cd pelicanfs
pip install -e .
Using PelicanFS
To use PelicanFS, first create a PelicanFileSystem and provide it with the Pelican federation URL. For example, using the OSDF federation:
from pelicanfs.core import PelicanFileSystem
pelfs = PelicanFileSystem("pelican://osg-htc.org")
Once pelfs is pointed at your federation's director, fsspec commands can be applied to Pelican namespaces. For example:
hello_world = pelfs.cat('/ospool/uc-shared/public/OSG-Staff/validation/test.txt')
print(hello_world)
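When you start from a full pelican:// URL rather than a separate federation and path, it may help to split it into the two pieces used above. The following is a minimal sketch using only the standard library. The helper name split_pelican_url is hypothetical and not part of PelicanFS, and it assumes the URL host is the federation and the URL path is the namespace path.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_pelican_url(url: str) -> Tuple[str, str]:
    """Split a pelican:// URL into (federation URL, namespace path).

    Hypothetical helper for illustration; not part of the PelicanFS API.
    """
    parts = urlsplit(url)
    if parts.scheme != "pelican":
        raise ValueError(f"expected a pelican:// URL, got: {url!r}")
    if not parts.netloc:
        raise ValueError(f"URL has no federation host: {url!r}")
    # Assumption: the host names the federation, the path names the object.
    federation = f"pelican://{parts.netloc}"
    path = parts.path or "/"
    return federation, path


federation, path = split_pelican_url(
    "pelican://osg-htc.org/ospool/uc-shared/public/OSG-Staff/validation/test.txt"
)
print(federation)
print(path)
```

The two values could then be passed to PelicanFileSystem(federation) and pelfs.cat(path) respectively, as in the example above.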
Getting an FSMap
Some systems that interact with fsspec expect a key-value mapper rather than a URL. To create one, call the PelicanMap function with the namespace path and a PelicanFileSystem object rather than using the fsspec get_mapper call. For example:
import xarray
from pelicanfs.core import PelicanFileSystem, PelicanMap
pelfs = PelicanFileSystem("some-director-url")
file1 = PelicanMap("/namespace/file/1", pelfs=pelfs)
file2 = PelicanMap("/namespace/file/2", pelfs=pelfs)
ds = xarray.open_mfdataset([file1, file2], engine='zarr')
Specifying Endpoints
The following describes how to specify endpoints to get data from, rather than letting PelicanFS and the director determine the best cache. PelicanFS allows you to specify whether to read directly from the origin (bypassing data staging altogether) or to name a specific cache to stage data into.
Note
If both direct reads and a specific cache are set, PelicanFS will use the specified cache and ignore the direct reads setting.
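The precedence in the note can be pictured with a small sketch. This is not PelicanFS code: pick_read_mode and its return labels are hypothetical, and the function only models the rule stated above using the direct_reads and preferred_caches constructor arguments.

```python
from typing import Optional, Sequence


def pick_read_mode(
    direct_reads: bool = False,
    preferred_caches: Optional[Sequence[str]] = None,
) -> str:
    """Model which kind of endpoint a read would use (illustration only)."""
    if preferred_caches:
        # A named cache wins, even when direct_reads is also True.
        return "preferred-cache"
    if direct_reads:
        # No cache named: read from the origin, bypassing data staging.
        return "origin"
    # Neither option set: the director determines the best cache.
    return "director-selected-cache"


print(pick_read_mode(direct_reads=True,
                     preferred_caches=["https://cache.example.com"]))
```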
Enabling Direct Reads
Sometimes you might wish to read data directly from an origin rather than via a cache. To enable this at PelicanFileSystem creation, pass direct_reads=True to the constructor:
pelfs = PelicanFileSystem("pelican://osg-htc.org", direct_reads=True)
Specifying a Cache
To name a specific cache to stage your data into (as opposed to the highest-priority working cache), pass one or more cache URLs during PelicanFileSystem construction via the preferred_caches argument:
pelfs = PelicanFileSystem("pelican://osg-htc.org", preferred_caches=["https://cache.example.com"])
or
pelfs = PelicanFileSystem("pelican://osg-htc.org", preferred_caches=["https://cache.example.com",
"https://cache2.example.com", "+"])
Note that the special cache value "+" indicates that the provided preferred caches should be prepended to the list of caches from the director.
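The effect of "+" can be illustrated with a short sketch of how the final cache order might be assembled. This is not the PelicanFS implementation: build_cache_list is a hypothetical helper, the director's list is passed in as plain data, and it assumes that without "+" only the preferred caches are tried.

```python
from typing import List, Sequence


def build_cache_list(
    preferred_caches: Sequence[str],
    director_caches: Sequence[str],
) -> List[str]:
    """Return the order in which caches would be tried (illustration only)."""
    # Keep the explicitly named caches, in the order given.
    ordered = [c for c in preferred_caches if c != "+"]
    if "+" in preferred_caches:
        # "+" present: the preferred caches come first, followed by the
        # director's list.
        ordered.extend(director_caches)
    # Assumption: without "+", the director's list is not consulted.
    return ordered


director = ["https://director-cache-a.example.com",
            "https://director-cache-b.example.com"]

print(build_cache_list(["https://cache.example.com"], director))
print(build_cache_list(["https://cache.example.com",
                        "https://cache2.example.com", "+"], director))
```

The URLs under example.com are placeholders in the same spirit as those in the examples above.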