fsspec filesystem for OSS

OSSFS is a Python file system interface to OSS (Object Storage Service). With OSSFS, users can operate on OSS objects through fsspec's standard API.

Installation

You can install OSSFS via pip from PyPI:

$ pip install ossfs

An up-to-date package is also available from the conda-forge channel:

$ conda install -c conda-forge ossfs

Quick Start

Here is a simple example of locating and reading an object in OSS.

>>> import ossfs
>>> fs = ossfs.OSSFileSystem(endpoint='http://oss-cn-hangzhou.aliyuncs.com')
>>> fs.ls('/dvc-test-anonymous/LICENSE')
[{'name': '/dvc-test-anonymous/LICENSE',
  'Key': '/dvc-test-anonymous/LICENSE',
  'type': 'file',
  'size': 11357,
  'Size': 11357,
  'StorageClass': 'OBJECT',
  'LastModified': 1622761222}]
>>> with fs.open('/dvc-test-anonymous/LICENSE') as f:
...     print(f.readline())
b'                                 Apache License\n'

For more use cases and APIs, please refer to the fsspec documentation.
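As a small illustration of working with the listing format shown above, the sketch below totals the sizes of the file entries returned by ls. The helper name total_size is hypothetical and is not part of ossfs or fsspec; it only assumes that each entry is a dict with 'type' and 'size' keys, as in the output above.

```python
def total_size(entries):
    """Sum the 'size' of every entry whose 'type' is 'file'."""
    return sum(e["size"] for e in entries if e.get("type") == "file")


# Sample entry copied from the listing above. With a live connection this
# list would come from fs.ls(path) instead of being written by hand.
sample = [
    {
        "name": "/dvc-test-anonymous/LICENSE",
        "type": "file",
        "size": 11357,
    }
]

print(total_size(sample))  # 11357
```

Because the helper only touches plain dicts, it should work on listings from any fsspec file system that reports 'type' and 'size' in the same way.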

Async OSSFS

Async OSSFS is a variant of ossfs that uses the third-party async OSS backend aiooss2 rather than the official synchronous one, oss2. Async OSSFS allows concurrent calls within bulk operations such as cat, put, and get, even from ordinary synchronous code, and enables the direct use of fsspec in async code without blocking. Usage is similar to the synchronous variant: simply replace OSSFileSystem with AioOSSFileSystem.

>>> import ossfs
>>> fs = ossfs.AioOSSFileSystem(endpoint='http://oss-cn-hangzhou.aliyuncs.com')
>>> print(fs.cat('/dvc-test-anonymous/LICENSE'))
b'                                 Apache License\n'
...
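To give a feel for why concurrent calls help bulk operations, here is a minimal sketch that uses only the standard library asyncio module. It is not how ossfs or aiooss2 is implemented: fake_get, bulk_get, and pool_size are hypothetical names, and the network request is simulated with a short sleep.

```python
import asyncio


async def fake_get(key, delay=0.01):
    """Stand-in for one network request; a real backend would do I/O here."""
    await asyncio.sleep(delay)
    return key.upper()


async def bulk_get(keys, pool_size=8):
    """Run fake_get for every key, with at most pool_size requests in flight."""
    limit = asyncio.Semaphore(pool_size)

    async def bounded(key):
        async with limit:
            return await fake_get(key)

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(k) for k in keys))


results = asyncio.run(bulk_get([f"file-{i}" for i in range(20)]))
print(results[:3])  # ['FILE-0', 'FILE-1', 'FILE-2']
```

With 20 keys and a pool of 8, the simulated requests overlap instead of running one after another, which is the general effect a concurrency pool has on many small transfers.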

Because aiooss2 is not officially supported, some features are still lacking. However, in tests that put and get 1200 small files, the async version of ossfs ran roughly 10 to 13 times faster than the synchronous variant; the exact speedup depends on the size of the concurrency pool.

Task                                         Time cost in seconds

put 1200 small files via OSSFileSystem       35.2688 (13.53)
put 1200 small files via AioOSSFileSystem    2.6060 (1.0)
get 1200 small files via OSSFileSystem       32.9096 (12.63)
get 1200 small files via AioOSSFileSystem    3.3497 (1.29)
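The figures in parentheses above appear to be ratios to the fastest run (the 2.6060 second put). That reading is an assumption, since the source does not state it, but the short check below reproduces them from the timings. The dictionary keys are shortened labels chosen for this example.

```python
# Timings in seconds, copied from the table above.
timings = {
    "put via OSSFileSystem": 35.2688,
    "put via AioOSSFileSystem": 2.6060,
    "get via OSSFileSystem": 32.9096,
    "get via AioOSSFileSystem": 3.3497,
}

# Express every timing as a multiple of the fastest one.
fastest = min(timings.values())
relative = {task: round(t / fastest, 2) for task, t in timings.items()}

for task, ratio in relative.items():
    print(f"{task}: {ratio}")
```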

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the Apache 2.0 license, OSSFS is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.
