fs-synapse

A Synapse implementation of the fsspec interface.

fs-synapse allows us to leverage the fsspec API to interface with Synapse files, folders, and projects. By learning this API, you can write code that is agnostic to where your files are physically located. This is achieved by referring to Synapse entities using URLs. Commented examples are included below.

syn://syn50545516               # Synapse project

syn://syn50557597               # Folder in the above Synapse project
syn://syn50545516/syn50557597   # Same folder, but using a full path
syn://syn50545516/TestSubDir    # Same folder, but referenced by name

syn://syn50555279               # File in the above Synapse project
syn://syn50545516/syn50555279   # Same file, but using a full path
syn://syn50545516/test.txt      # Same file, but referenced by name

syn://syn50545516/ExploratoryTests/report.json      # Nested file
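To make the URL forms above concrete, here is a minimal sketch that splits a syn:// URL into path segments and labels each one as a Synapse ID or a name. The helper describe_syn_url and the "syn followed by digits" pattern are assumptions made for illustration. They are not part of fs-synapse, which resolves these URLs itself.

```python
import re

# Assumption: Synapse IDs look like "syn" followed by digits, as in the URLs above.
SYNAPSE_ID = re.compile(r"^syn\d+$")


def describe_syn_url(url):
    """Label each path segment of a syn:// URL as a Synapse ID or a name."""
    prefix = "syn://"
    if not url.startswith(prefix):
        raise ValueError(f"Not a Synapse URL: {url!r}")
    segments = [s for s in url[len(prefix):].split("/") if s]
    return [(s, "id" if SYNAPSE_ID.match(s) else "name") for s in segments]


print(describe_syn_url("syn://syn50545516/ExploratoryTests/report.json"))
```

Only the first segment needs to be an ID in the name-based forms; the remaining segments are resolved by name relative to it.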

Benefits

There are several benefits to using the fs-synapse API over synapseclient. The examples in this section assume a filesystem object created in one of the following ways.

from synapsefs import SynapseFS

fs = SynapseFS()

# Or using fsspec directly
import fsspec
fs = fsspec.filesystem("syn")

Interact with Synapse using a Pythonic interface

file_url = "syn://syn50555279"

with fs.open(file_url, "a") as fp:
    fp.write("Appending some text to a Synapse file")

Access to several convenience functions

folder_url = "syn://syn50696438"

fs.makedirs(f"{folder_url}/creating/nested/folders/with/one/operation")

Refer to Synapse files and folders by name

You don't have to track as many Synapse IDs. You only need to care about the top-level projects or folders and refer to subfolders and files by name.

project_url = "syn://syn50545516"

data_url = f"{project_url}/data/raw.csv"
output_url = f"{project_url}/outputs/processed.csv"

with fs.open(data_url, "r") as data_fp, fs.open(output_url, "a") as output_fp:
    # number_cruncher stands in for your own processing function
    results = number_cruncher(data_fp.read())
    output_fp.write(results)

Write Synapse-agnostic code

Every time you use synapseclient for file and folder operations, you hard-code a dependency on Synapse into your project. Using fs-synapse avoids this hard dependency and makes your code more portable to other file backends (e.g. S3): you can swap in another file system by using its URL scheme (e.g. s3://). Here's an index of available file systems that you can swap in.
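As a minimal sketch of what backend-agnostic code can look like, the function below lets fsspec choose the filesystem from the URL scheme. The function name count_lines and the demo path are made up for illustration, and the in-memory backend is used only so the example runs without Synapse credentials.

```python
from fsspec.core import url_to_fs


def count_lines(url):
    """Count lines in a text file, whichever backend the URL points to."""
    # url_to_fs picks the filesystem implementation from the URL scheme.
    fs, path = url_to_fs(url)
    with fs.open(path, "r") as fp:
        return sum(1 for _ in fp)


# Demo with the in-memory backend: write a small CSV, then count its lines.
mem_fs, mem_path = url_to_fs("memory://agnostic-demo/raw.csv")
mem_fs.pipe_file(mem_path, b"id,value\n1,10\n2,20\n")
print(count_lines("memory://agnostic-demo/raw.csv"))
```

Passing a syn:// or s3:// URL instead should work the same way, assuming the corresponding backend is installed and configured.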

Rely on code covered by integration tests

You don't have to write the Synapse integration tests yourself. These tests tend to be slow, so delegating that responsibility to an externally managed package like fs-synapse keeps your test suite fast and focused on what you care about.

In your test code, you can use the memory filesystem for faster I/O instead of storing and retrieving files on Synapse.

def test_some_feature_of_your_code():
    fs = fsspec.filesystem("memory")
    cruncher = NumberCruncher(fs=fs)
    cruncher.save("report.json")
    assert fs.exists("report.json")
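The test above relies on a NumberCruncher class that is not defined here. The following is a minimal sketch of one possible shape for it, assuming the class receives its filesystem through the constructor. The class body and the placeholder results are hypothetical.

```python
import json

import fsspec


class NumberCruncher:
    """Hypothetical class that writes results through an injected filesystem."""

    def __init__(self, fs):
        self.fs = fs

    def save(self, path):
        # Placeholder results; a real implementation would compute these.
        payload = json.dumps({"total": 42}).encode()
        self.fs.pipe_file(path, payload)


def test_some_feature_of_your_code():
    fs = fsspec.filesystem("memory")
    cruncher = NumberCruncher(fs=fs)
    cruncher.save("report.json")
    assert fs.exists("report.json")
    assert json.loads(fs.cat_file("report.json")) == {"total": 42}


test_some_feature_of_your_code()
```

Because the class only calls generic fsspec methods, production code could pass a SynapseFS instance instead without changing NumberCruncher.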

Migration from PyFilesystem2 to fsspec

This package previously used PyFilesystem2 (fs) as its base. It now uses fsspec. The tables below map the old API to the new one.

Initialization

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| from fs import open_fs | import fsspec |
| fs = open_fs("syn://") | fs = fsspec.filesystem("syn") |
| fs = open_fs("syn://syn50545516") | fs = SynapseFS(root="syn50545516") |

File operations

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| fs.open(path, "r") | fs.open(path, "r") |
| fs.readtext(path) | fs.cat_file(path).decode() |
| fs.readbytes(path) | fs.cat_file(path) |
| fs.writetext(path, text) | fs.pipe_file(path, text.encode()) |
| fs.writebytes(path, data) | fs.pipe_file(path, data) |
| fs.create(path) / fs.touch(path) | fs.touch(path) |
| fs.download(name, file_obj) | fs.get(path, local_path) |
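The file-operation replacements can be tried out without Synapse access. The sketch below runs the fsspec calls from the right-hand column against the in-memory filesystem; the paths are made up, and behaviour on SynapseFS is assumed to follow the same interface.

```python
import os
import tempfile

import fsspec

fs = fsspec.filesystem("memory")
path = "/migration-demo/notes.txt"

fs.pipe_file(path, "hello".encode())      # was fs.writetext(path, "hello")
text = fs.cat_file(path).decode()         # was fs.readtext(path)
fs.touch("/migration-demo/empty.txt")     # was fs.create(...) / fs.touch(...)

# fs.get copies a file to a local path (was fs.download(name, file_obj)).
with tempfile.TemporaryDirectory() as tmp:
    local_path = os.path.join(tmp, "notes.txt")
    fs.get(path, local_path)
    with open(local_path) as fp:
        local_text = fp.read()

print(text, local_text)
```

Note that fs.get takes a destination path, whereas the old fs.download took an open file object.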

Directory operations

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| fs.listdir(path) | fs.ls(path, detail=False) (returns full paths) |
| fs.makedir(path) | fs.mkdir(path) |
| fs.makedirs(path) | fs.makedirs(path) |
| fs.opendir(path) | (no equivalent; use full paths) |
| fs.tree(path=path) | fs.ls(path, detail=True) |
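The difference between the two ls forms is easiest to see side by side. This is a small sketch against the in-memory filesystem with made-up paths; entries returned by SynapseFS may carry additional keys.

```python
import fsspec

fs = fsspec.filesystem("memory")
fs.makedirs("/dir-demo/a/b", exist_ok=True)
fs.pipe_file("/dir-demo/a/data.csv", b"x\n1\n")

# detail=False returns full paths rather than bare names (unlike fs.listdir).
names = sorted(fs.ls("/dir-demo/a", detail=False))

# detail=True returns one info dict per entry.
types = {entry["name"]: entry["type"] for entry in fs.ls("/dir-demo/a", detail=True)}

print(names)
print(types)
```

Code that previously relied on fs.listdir returning bare names can take the last path component of each result.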

Removal

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| fs.remove(path) | fs.rm(path) |
| fs.removedir(path) | fs.rmdir(path) |
| fs.removetree(path) | fs.rm(path, recursive=True) |

Info and metadata

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| info = fs.getinfo(path, namespaces=["details", "synapse"]) | info = fs.info(path) |
| info.name | info["name"] |
| info.is_dir | info["type"] == "directory" |
| info.get("details", "size") | info["size"] |
| info.get("synapse", "id") | info["synapse_id"] |
| info.get("synapse", "content_type") | info["synapse_content_type"] |
| info.get("synapse", "etag") | info["synapse_etag"] |
| fs.getsize(path) | fs.info(path)["size"] |
| fs.gettype(path) | fs.info(path)["type"] |
| fs.exists(path) | fs.exists(path) |
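With fsspec, fs.info returns a plain dictionary instead of an Info object. The sketch below shows the generic keys on the in-memory filesystem. The synapse_* keys listed above are specific to fs-synapse, so this example reads them with .get() on the assumption that other backends do not provide them.

```python
import fsspec

fs = fsspec.filesystem("memory")
fs.pipe_file("/info-demo/report.json", b"{}")

info = fs.info("/info-demo/report.json")
print(info["name"], info["type"], info["size"])

# Absent on non-Synapse backends, so .get() keeps the lookup portable.
synapse_id = info.get("synapse_id")
print(synapse_id)
```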

Errors

| Old (PyFilesystem2) | New (fsspec) |
| --- | --- |
| fs.errors.ResourceNotFound | FileNotFoundError |
| fs.errors.FileExists / DirectoryExists | FileExistsError |
| fs.errors.FileExpected | IsADirectoryError |
| fs.errors.DirectoryExpected | NotADirectoryError |
| fs.errors.CreateFailed | ValueError |
| fs.errors.ResourceInvalid | ValueError |
| fs.errors.RemoveRootError | PermissionError |
| fs.errors.DirectoryNotEmpty | OSError |
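Since the new errors are Python built-ins, exception handlers no longer need to import anything from the filesystem package. A minimal sketch, using the in-memory filesystem and a hypothetical helper named read_or_default:

```python
import fsspec

fs = fsspec.filesystem("memory")


def read_or_default(fs, path, default=b""):
    """Return file contents, or a default when the path does not exist."""
    try:
        return fs.cat_file(path)
    except FileNotFoundError:  # previously fs.errors.ResourceNotFound
        return default


print(read_or_default(fs, "/errors-demo/missing.txt", default=b"n/a"))
```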

PyScaffold

This project has been set up using PyScaffold 4.3. For details and usage information on PyScaffold see PyScaffold.

putup --name fs-synapse --markdown --github-actions --pre-commit --license Apache-2.0 fs-synapse
