fsspec interface for Weights & Biases (wandb)

🍱 fsspec interface for Weights & Biases (wandb)

Weights & Biases (wandb) describes itself as follows: "Weights & Biases is the machine learning platform for developers to build better models faster. Use W&B's lightweight, interoperable tools to quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, visualize results and spot regressions, and share findings with colleagues." (Source: https://docs.wandb.ai/)

You may be wondering what wandb has to do with a file system. Strictly speaking it is not one, but it lets you upload and download files and stores them remotely, which makes it behave much like a file system. wandb also provides an API for interacting with those stored files, and wandbfsspec wraps that API to make it easier to work with wandb's file storage.

The wandbfsspec implementation is based on https://github.com/fsspec/filesystem_spec.

🔮 Future TODOs

Since wandb's main purpose is to track and monitor ML experiments, it also includes an artifact store, where experiment artifacts can be saved for data versioning and model tracking. More information at https://wandb.ai/site/artifacts.

Going forward, wandbfsspec will add a new interface to handle not only the files that can be uploaded to and downloaded from wandb, but also its artifacts. The next release is planned to add a new AbstractFileSystem subclass named WandbArtifactStore, with the protocol wandbas, to access the artifact store as if it were a regular file system.

More notes on how to use wandb's artifact store are available at https://docs.wandb.ai/guides/artifacts.

Once that's done, we'll open a PR in https://github.com/fsspec/filesystem_spec to register both protocols supported by wandbfsspec: wandbfs and wandbas.

🚸 Usage

Here's an example of how to locate and open a file:

>>> from wandbfsspec.core import WandbFileSystem
>>> fs = WandbFileSystem(api_key="YOUR_API_KEY")
>>> fs.ls("alvarobartt/wandbfsspec-tests/3s6km7mp")
['alvarobartt/wandbfsspec-tests/3s6km7mp/config.yaml', 'alvarobartt/wandbfsspec-tests/3s6km7mp/file.yaml', 'alvarobartt/wandbfsspec-tests/3s6km7mp/files', 'alvarobartt/wandbfsspec-tests/3s6km7mp/output.log', 'alvarobartt/wandbfsspec-tests/3s6km7mp/requirements.txt', 'alvarobartt/wandbfsspec-tests/3s6km7mp/wandb-metadata.json', 'alvarobartt/wandbfsspec-tests/3s6km7mp/wandb-summary.json']
>>> with fs.open("alvarobartt/wandbfsspec-tests/3s6km7mp/file.yaml", "rb") as f:
...     print(f.read())
b'some: data\nfor: testing'

📌 Note that the same can be done through fsspec, as long as wandbfsspec is installed:

>>> import fsspec
>>> fs = fsspec.filesystem("wandbfs")
>>> fs.ls("alvarobartt/wandbfsspec-tests/3s6km7mp")
['alvarobartt/wandbfsspec-tests/3s6km7mp/config.yaml', 'alvarobartt/wandbfsspec-tests/3s6km7mp/file.yaml', 'alvarobartt/wandbfsspec-tests/3s6km7mp/files', 'alvarobartt/wandbfsspec-tests/3s6km7mp/output.log', 'alvarobartt/wandbfsspec-tests/3s6km7mp/requirements.txt', 'alvarobartt/wandbfsspec-tests/3s6km7mp/wandb-metadata.json', 'alvarobartt/wandbfsspec-tests/3s6km7mp/wandb-summary.json']
>>> with fs.open("alvarobartt/wandbfsspec-tests/3s6km7mp/file.yaml", "rb") as f:
...     print(f.read())
b'some: data\nfor: testing'
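The paths in the examples above appear to follow an entity/project/run ID/file layout. As a minimal sketch under that assumption, the helpers below split and join such paths using only the standard library. WandbPath, split_wandb_path and join_wandb_path are hypothetical names for illustration and are not part of wandbfsspec's API.

```python
from typing import NamedTuple, Optional


class WandbPath(NamedTuple):
    """Components of a path such as 'entity/project/run_id/some/file.yaml'."""

    entity: str
    project: Optional[str] = None
    run_id: Optional[str] = None
    file_path: Optional[str] = None


def split_wandb_path(path: str) -> WandbPath:
    """Split a path into entity, project, run ID and file path.

    Hypothetical helper: the layout is inferred from the usage examples,
    not taken from wandbfsspec itself.
    """
    # Split into at most four pieces; the last one keeps any nested folders.
    parts = path.strip("/").split("/", 3)
    if not parts[0]:
        raise ValueError("path must start with an entity name")
    # Pad with None so shorter paths (e.g. only 'entity/project') still work.
    padded = parts + [None] * (4 - len(parts))
    return WandbPath(*padded)


def join_wandb_path(entity: str, project: str, run_id: str, *file_parts: str) -> str:
    """Build a path string from its components (hypothetical helper)."""
    return "/".join([entity, project, run_id, *file_parts])


parsed = split_wandb_path("alvarobartt/wandbfsspec-tests/3s6km7mp/file.yaml")
print(parsed.run_id, parsed.file_path)
```

A string built with join_wandb_path could then be passed to fs.ls or fs.open as in the examples above.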

📝 Documentation

Coming soon... (https://github.com/mkdocs/mkdocs)

🧪 How to test it

To run the tests, first set the following environment variables so that wandb can be used as the file system under test.

WANDB_ENTITY = ""
WANDB_PROJECT = ""
WANDB_API_KEY = ""

Both the entity and project values can be found in your https://wandb.ai/ account. The entity name is your account name, and the project name can either refer to an existing project or be a new one, in which case it will be created during pytest init. To get the API Key, go to https://wandb.ai/settings, scroll down to Danger Zone -> API Keys, and copy your personal API Key from there.

⚠️ Make sure that you don't publish your API Key anywhere. That's why it is defined as an environment variable: to avoid accidentally committing code that contains the actual API Key value.
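Before running the tests, it can help to confirm that the three variables are actually set. The snippet below is a minimal sketch of such a check using only the standard library. The function name load_wandb_test_settings is hypothetical and is not part of wandbfsspec or its test suite.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("WANDB_ENTITY", "WANDB_PROJECT", "WANDB_API_KEY")


def load_wandb_test_settings(environ=None):
    """Return the required settings, or raise if any is missing or empty.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Report only the variable names, never their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_wandb_test_settings() with no arguments reads from the current process environment. Passing a plain dictionary instead makes the check easy to exercise without touching real credentials.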

Then, to run the tests, use either of the following commands:

  • poetry run pytest
  • poetry run make tests

If you're not using poetry, run pytest or make tests directly instead.
