
An FSSpec Implementation using the Pelican System


PelicanFS


Overview

PelicanFS is a file system interface (fsspec) for the Pelican Platform. For more information about Pelican, see our main website or GitHub page. For more information about fsspec, visit the filesystem-spec page.

Limitations

PelicanFS is built on top of the http fsspec implementation. As such, any functionality that isn’t available in the http implementation is also not available in PelicanFS.

Installation

To install PelicanFS, run:

pip install pelicanfs

To install from source, run:

git clone https://github.com/PelicanPlatform/pelicanfs.git
cd pelicanfs
pip install -e .

Using PelicanFS

To use PelicanFS, first create a PelicanFileSystem and provide it with the Pelican federation URL. For example, using the OSDF federation:

from pelicanfs.core import PelicanFileSystem

pelfs = PelicanFileSystem("pelican://osg-htc.org")

Once pelfs is pointed at your federation's director, fsspec commands can be applied to Pelican namespaces. For example:

hello_world = pelfs.cat('/ospool/uc-shared/public/OSG-Staff/validation/test.txt')
print(hello_world)
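
Beyond cat, other standard fsspec calls follow the same pattern. The sketch below is illustrative rather than part of PelicanFS: read_head and demo are hypothetical helper names, and it assumes that open behaves as in the http fsspec implementation PelicanFS builds on. The PelicanFS import is kept inside demo so the helper can be defined without the package installed.

```python
def read_head(fs, path, nbytes=64):
    """Return the first `nbytes` bytes of `path` from any fsspec-style filesystem."""
    # fsspec filesystems expose open(path, mode), returning a file-like object.
    with fs.open(path, "rb") as f:
        return f.read(nbytes)


def demo():
    """Read the start of the example file through the OSDF federation (needs network)."""
    # Imported lazily: only required when talking to a real federation.
    from pelicanfs.core import PelicanFileSystem

    pelfs = PelicanFileSystem("pelican://osg-htc.org")
    path = "/ospool/uc-shared/public/OSG-Staff/validation/test.txt"
    print(read_head(pelfs, path))
```

Calling demo() performs the actual read. Reading only a prefix can be handy for large objects where fetching the whole file with cat is unnecessary.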

Getting an FSMap

Some systems that consume an fsspec filesystem expect a key-value mapper rather than a URL. In that case, call the PelicanMap function with the namespace path and a PelicanFileSystem object instead of using the fsspec get_mapper call. For example:

import xarray
from pelicanfs.core import PelicanFileSystem, PelicanMap

# "some-director-url" and the namespace paths below are placeholders.
pelfs = PelicanFileSystem("some-director-url")
file1 = PelicanMap("/namespace/file/1", pelfs=pelfs)
file2 = PelicanMap("/namespace/file/2", pelfs=pelfs)
ds = xarray.open_mfdataset([file1, file2], engine='zarr')

Specifying Endpoints

The following describes how to specify endpoints to get data from, rather than letting PelicanFS and the director determine the best cache. PelicanFS allows you to specify whether to read directly from the origin (bypassing data staging altogether) or to name a specific cache to stage data into.

Note

If both direct reads and a specific cache are set, PelicanFS will use the specified cache and ignore the direct reads setting.
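
The precedence in this note can be pictured with a small decision function. This is only a sketch of the rule as stated, not PelicanFS's internal code: read_source is a hypothetical name, and its two parameters mirror the direct_reads and preferred_caches constructor arguments described in the following subsections.

```python
from typing import List, Optional


def read_source(direct_reads: bool = False,
                preferred_caches: Optional[List[str]] = None) -> str:
    """Illustrative only: describe which kind of endpoint a read would target."""
    if preferred_caches:
        # A named cache wins, even when direct_reads is also True.
        return "preferred cache"
    if direct_reads:
        # No cache named: read straight from the origin.
        return "origin"
    # Neither option set: the director picks the cache.
    return "cache chosen by the director"


print(read_source(direct_reads=True,
                  preferred_caches=["https://cache.example.com"]))
```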

Enabling Direct Reads

Sometimes you might wish to read data directly from an origin rather than via a cache. To enable this at PelicanFileSystem creation, just pass in direct_reads=True to the constructor.

pelfs = PelicanFileSystem("pelican://osg-htc.org", direct_reads=True)

Specifying a Cache

To stage your data into a particular cache (as opposed to the highest priority working cache), pass one or more cache URLs to the PelicanFileSystem constructor via the preferred_caches argument:

pelfs = PelicanFileSystem("pelican://osg-htc.org", preferred_caches=["https://cache.example.com"])

or

pelfs = PelicanFileSystem("pelican://osg-htc.org", preferred_caches=["https://cache.example.com",
    "https://cache2.example.com", "+"])

Note that the special cache value "+" indicates that the provided preferred caches should be prepended to the list of caches from the director.
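
To make the effect of "+" concrete, here is a minimal pure-Python sketch of how the final cache order could be derived. It is not PelicanFS's implementation: merge_cache_lists is a hypothetical name, the director list is made up for illustration, and the sketch assumes that without "+" only the listed caches are tried.

```python
from typing import List


def merge_cache_lists(preferred: List[str], director: List[str]) -> List[str]:
    """Illustrative only: combine preferred caches with the director's list."""
    # Everything except the "+" marker is an actual cache URL.
    explicit = [c for c in preferred if c != "+"]
    if "+" not in preferred:
        # Assumption: without "+", only the caches the user listed are tried.
        return explicit
    # With "+", the preferred caches are prepended to the director's list.
    return explicit + list(director)


# Hypothetical director response, for illustration only.
director_caches = ["https://director-cache.example.com"]

print(merge_cache_lists(["https://cache.example.com"], director_caches))
print(merge_cache_lists(
    ["https://cache.example.com", "https://cache2.example.com", "+"],
    director_caches,
))
```

In the second call the two preferred caches come first, followed by the director's entry, so the director's caches act as fallbacks.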

