A framework for research code

pyfra

The Python Framework for Research Applications.

Design Philosophy

Research code has some of the fastest-shifting requirements of any type of code. It is nearly impossible to plan the proper abstractions ahead of time, because what you originally thought was your main focus will very likely stop being the focus partway through the project. Further, research code (especially in ML) often involves big and complicated pipelines spanning many machines, which are run either by hand or with shell scripts that are far more complicated than any shell script ever should be.

Therefore, the objective of pyfra is to make it as fast and low-friction as possible to write research code involving complex pipelines over many machines. This entails making it as easy as possible to implement a research idea in reality, at the cost of fine-grained control and the long-term maintainability of the system. In other words, pyfra expects that code will either be rapidly obsoleted by newer code, or rewritten using some other framework once it is no longer a research project and requirements have settled down.

Pyfra is in its very early stages of development. The interface may change rapidly and without warning.

Features:

  • Extremely elegant shell integration: run commands on any server seamlessly, combining the best parts of bash and Python
  • Handle files on remote servers with a pathlib-like interface the same way you would local files (WIP, partially implemented)
  • Automated remote environment setup, so you never have to worry about provisioning machines by hand again
  • Idempotent resumable data and training pipelines with no cognitive overhead
  • Spin up an internal webserver complete with a permissions system using only a few lines of code
  • (Coming soon) High level API for experiment management/scheduling and resource provisioning
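
To make "idempotent resumable" concrete, here is a minimal sketch in plain Python of one way a pipeline step can be made safe to re-run. It is not pyfra's API or implementation; `run_once`, `state_dir`, and the marker-file layout are hypothetical names chosen for illustration. Each step stores its result under a key derived from the step name and its arguments, so a restarted script skips work that already finished.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def run_once(state_dir, name, func, *args):
    """Run func(*args) unless a stored result for this step and arguments exists.

    Hypothetical helper for illustration only; arguments and results must be
    JSON-serializable.
    """
    # Key the cached result on the step name and its arguments.
    key = hashlib.sha256(json.dumps([name, args], sort_keys=True).encode()).hexdigest()[:16]
    marker = Path(state_dir) / f"{name}-{key}.json"
    if marker.exists():
        # The step already completed in an earlier run: reuse its result.
        return json.loads(marker.read_text())
    result = func(*args)
    # Write to a temporary file and rename it, so an interrupted run
    # never leaves a half-written marker behind.
    tmp = marker.with_suffix(".tmp")
    tmp.write_text(json.dumps(result))
    tmp.replace(marker)
    return result


calls = []


def tokenize(text):
    calls.append(text)  # record real executions so skipped runs are visible
    return text.split()


state_dir = tempfile.mkdtemp()
first = run_once(state_dir, "tokenize", tokenize, "a b c")
second = run_once(state_dir, "tokenize", tokenize, "a b c")  # served from the marker file
```

After both calls, `calls` contains a single entry: the second call returned the stored result without running `tokenize` again.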

Want to dive in? See the documentation.

Example code

from pyfra import *

rem1 = Remote("user@example.com")
rem2 = Remote("goose@8.8.8.8")

# env creates an environment object, which behaves very similarly to a Remote, but comes with a fresh python environment in a newly created directory (optionally initialized from a git repo)
env1 = rem1.env("tokenization")
env2 = rem2.env("neox", "https://github.com/EleutherAI/gpt-neox")

# path creates a RemotePath object, which behaves similarly to a pathlib Path.
raw_data = local.path("training_data.txt")
tokenized_data = env2.path("tokenized_data")

# tokenize
copy("https://goose.com/files/tokenize_script.py", env1.path("tokenize.py")) # copy can copy from local/remote/url to local/remote
env1.sh(f"python tokenize.py --input {raw_data} --output {tokenized_data}") # implicitly copy files just by using the path object in an f-string

# start training run
env2.path("config.json").jwrite({...})
env2.sh("python train.py --input tokenized_data --config config.json")
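
The example above relies on a path object doing something when it is interpolated into an f-string. How pyfra does this internally is not described here; the following toy sketch only shows the general Python hook that makes such behavior possible. `TrackedPath` and its `log` argument are hypothetical and are not part of pyfra.

```python
class TrackedPath:
    """Toy path object that records every time it is formatted into a string."""

    def __init__(self, host, path, log):
        self.host = host
        self.path = path
        self.log = log

    def __format__(self, spec):
        # f-strings call __format__ on interpolated objects, so this is a
        # point where a framework could schedule a file copy before the
        # command runs. Here the use is only recorded.
        self.log.append((self.host, self.path))
        return format(self.path, spec)


used = []
raw = TrackedPath("local", "training_data.txt", used)
cmd = f"python tokenize.py --input {raw}"
```

After the f-string is evaluated, `cmd` holds the plain command text and `used` lists the path that was referenced, which is the information a framework would need to decide what to copy.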

Installation

pip3 install pyfra

Webserver screenshots

[Two screenshots of the pyfra webserver interface]
