Planet-scale distributed computing in Python.

Pythagoras

!!! RESEARCH PREVIEW !!!

What is it?

Pythagoras is a super-scalable, easy-to-use, and low-maintenance framework for (1) massive algorithm parallelization and (2) hardware usage optimization in Python. It simplifies and speeds up data science, machine learning, and AI workflows.

Pythagoras excels at complex, long-running, resource-demanding computations. It’s not recommended for real-time, latency-sensitive workflows.

For a comprehensive list of terms and definitions, see the Glossary.

Tutorials

Pythagoras elevates two popular techniques — memoization and parallelization — to a global scale and then fuses them, unlocking performance and scalability that were previously out of reach.

Drawing from many years of functional-programming practice, Pythagoras extends these proven ideas to the next level. In a Pythagoras environment, you can seamlessly employ your preferred functional patterns, augmented by new capabilities.

!!! BOOKMARK THIS PAGE AND COME BACK LATER, WE WILL PUBLISH MORE TUTORIALS SOON !!!

Usage Examples

Importing Pythagoras:

from pythagoras.core import *
import pythagoras as pth

Creating a portal based on a (shared) folder:

my_portal = get_portal("./my_local_folder")

Checking the state of a portal:

my_portal.describe()

Decorating a function:

@pure()
def my_long_running_function(a:float, b:float) -> float:
  from time import sleep # imports must be placed inside a pure function
  sleep(5)
  return a+10*b

Using a decorated function synchronously:

result = my_long_running_function(a=1, b=2) # only named arguments are allowed

Using a decorated function asynchronously:

future_result_address = my_long_running_function.swarm(a=10, b=20)
if ready(future_result_address):
    result = get(future_result_address)
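
The `if ready(...)` check above inspects the address once. When a caller prefers to block until the value exists, a polling loop is one option. The sketch below is a hypothetical helper, not part of Pythagoras: the name `wait_for_result` and the `poll_seconds` and `timeout_seconds` parameters are assumptions. It receives the `ready` and `get` callables as arguments, so it can be tried without a portal.

```python
import time
from typing import Any, Callable


def wait_for_result(
    address: Any,
    ready: Callable[[Any], bool],
    get: Callable[[Any], Any],
    poll_seconds: float = 1.0,
    timeout_seconds: float = 60.0,
) -> Any:
    """Poll ready(address) until it is true, then return get(address).

    Hypothetical helper: raises TimeoutError if the result does not
    appear within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while not ready(address):
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no result for {address!r} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
    return get(address)
```

With the imports from the examples above, a call would presumably look like `wait_for_result(future_result_address, ready, get)`. Because swarming gives no timing guarantees, a generous timeout is the safer default.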

Pre-conditions for executing a function:

@pure(pre_validators=[
    unused_ram(Gb=5),
    installed_packages("scikit-learn","pandas"),
    unused_cpu(cores=10)])
def my_long_running_function(a:float, b:float) -> float:
  from time import sleep
  sleep(5)
  return a+10*b
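
To illustrate the kind of question a passive check such as `installed_packages(...)` has to answer, here is a standalone sketch that uses only the standard library. It is not Pythagoras's implementation, and `missing_packages` is a hypothetical name. It looks up distribution names (for example "scikit-learn"), which can differ from import names (`sklearn`).

```python
from importlib import metadata


def missing_packages(*distribution_names: str) -> list[str]:
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in distribution_names:
        try:
            # Raises PackageNotFoundError when the distribution is absent.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A passive validator would only report the missing names; an active one could go on to install them.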

Recursion:

@pure(pre_validators=[recursive_parameters("n")])
def factorial(n:int)->int:
  if n <= 1:
    return 1
  else:
    return n*factorial(n=n-1) # only named arguments are allowed
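
Caching pays off for recursion because each smaller sub-problem is computed only once. The sketch below shows this with an in-process analogy built on `functools.cache`. It does not use Pythagoras and does not reproduce its persistent, cross-machine cache; `cached_factorial` and the `executions` counter are hypothetical names.

```python
from functools import cache

executions = 0  # counts how many times the function body actually runs


@cache
def cached_factorial(n: int) -> int:
    global executions
    executions += 1
    if n <= 1:
        return 1
    return n * cached_factorial(n - 1)


first = cached_factorial(5)   # body runs for n = 5, 4, 3, 2, 1
runs_after_first = executions
second = cached_factorial(6)  # body runs only for n = 6; n = 5 is a cache hit
runs_after_second = executions
```

Here `runs_after_first` is 5 and `runs_after_second` is 6: the second call reuses every result the first call produced.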

Partial function application:

@pure()
def my_map(input_list:list, transformer: PureFn)->list:
  result = []
  for element in input_list:
    transformed_element = transformer(x=element)
    result.append(transformed_element)
  return result

@pure()
def my_square(x):
  return x*x

result = my_map(input_list=[1,2,3,4,5], transformer=my_square)

my_square_map = my_map.fix_kwargs(transformer = my_square)

result = my_square_map(input_list=[1,2,3,4,5])
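
For readers who know the standard library, `fix_kwargs` in the example above plays a role similar to `functools.partial` with keyword arguments. The plain-Python analogy below uses no Pythagoras decorators; `plain_map`, `plain_square`, and `plain_square_map` are hypothetical names.

```python
from functools import partial
from typing import Callable


def plain_map(input_list: list, transformer: Callable) -> list:
    # Apply the transformer to each element, passing it by keyword.
    return [transformer(x=element) for element in input_list]


def plain_square(x):
    return x * x


# Fix the `transformer` argument; only `input_list` remains to be supplied.
plain_square_map = partial(plain_map, transformer=plain_square)

result = plain_square_map(input_list=[1, 2, 3, 4, 5])
```

The analogy covers only the calling convention. Any caching or distribution that the `@pure()` decorator adds is outside its scope.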

Mutually recursive functions:

@pure(pre_validators=recursive_parameters("n"))
def is_even(n:int, is_odd ,is_even)->bool:
  if n in {0,2}:
    return True
  else:
    return is_odd(n=n-1, is_even=is_even, is_odd=is_odd)

@pure(pre_validators=recursive_parameters("n"))
def is_odd(n:int, is_even, is_odd)->bool:
  if n in {0,2}:
    return False
  else:
    return is_even(n=n-1, is_odd=is_odd, is_even=is_even)

(is_even, is_odd) = (
  is_even.fix_kwargs(is_odd=is_odd, is_even=is_even)
  , is_odd.fix_kwargs(is_odd=is_odd, is_even=is_even) )

assert is_even(n=10)
assert is_odd(n=11)

Core Concepts

  • Portal: A persistent gateway that connects your application to the world beyond the current execution session. Portals link runtime state to persistent storage that survives across multiple runs and machines. They provide a unified interface for data persistence, caching, and state management, abstracting away storage backend complexities (local filesystem, cloud storage, etc.) and handling serialization transparently. A program can use multiple portals, each with its own storage backend, and each portal can serve multiple applications. Portals define the execution context for pure functions, enabling result caching and retrieval.

  • Autonomous Function: A self-contained function with no external dependencies. All imports must be done inside the function body. These functions cannot use global objects (except built-ins), yield statements, or nonlocal variables, and must be called with keyword arguments only. This design ensures complete isolation and portability, making autonomous functions ideal building blocks for distributed computing—they carry all dependencies with them and maintain clear interfaces.

  • Pure Function: An autonomous function that has no side effects and always returns the same result for the same arguments. Pythagoras caches pure function results using content-based addressing: if a function is called multiple times with identical arguments, it executes only once, and cached results are returned for subsequent calls. This memoization works seamlessly across machines in a distributed system, enabling significant performance improvements for computationally intensive workflows.

  • Validator: An autonomous function that checks conditions before or after executing a pure function. Pre-validators run before execution, post-validators run after. Validators can be passive (e.g., check available RAM) or active (e.g., install a missing library). They help ensure reliable distributed execution by validating requirements and system state. Multiple validators can be combined using standard decorator syntax.

  • Value Address: A globally unique, content-derived address for an immutable value. It consists of a human-readable descriptor (based on type and shape/length) and a hash signature (SHA-256) split into parts for storage efficiency. Creating a ValueAddr(data) computes the content hash and stores the value in the active portal's storage, allowing later retrieval via the address. Value addresses identify stored results and reference inputs/outputs across distributed systems.

  • Execution Result Address: A Value Address representing the result of a pure function execution. It combines the function's signature with input parameters to create a unique identifier. In swarm mode, functions immediately return an Execution Result Address that acts as a "future" reference for checking execution status and retrieving results. These addresses remain valid across application restarts and can be shared between machines.

  • Swarming: An asynchronous execution model where you don't know when, where, or how many times your function will execute. Pythagoras guarantees eventual execution at least once but offers no further guarantees. This model maximizes flexibility by decoupling function calls from execution—functions can be queued, load-balanced, retried on failure, and parallelized automatically. The trade-off is reduced control over timing and location in exchange for improved scalability, fault tolerance, and resource utilization.
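
The ideas behind value addresses and result caching can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not Pythagoras's implementation: the in-memory `store` dictionary stands in for a portal's storage, and `value_address`, `result_address`, `call_cached`, and the address layout are hypothetical. It identifies a function by name only and hashes pickled bytes, which are not a canonical encoding for every type.

```python
import hashlib
import pickle
from typing import Any, Callable

store: dict[str, bytes] = {}  # stand-in for a portal's persistent storage


def value_address(value: Any) -> str:
    """Store the value and return '<descriptor>/<sha256 hex digest>'."""
    payload = pickle.dumps(value)
    digest = hashlib.sha256(payload).hexdigest()
    length = len(value) if hasattr(value, "__len__") else 1
    descriptor = f"{type(value).__name__}_len_{length}"
    address = f"{descriptor}/{digest}"
    store.setdefault(address, payload)
    return address


def result_address(fn: Callable, kwargs: dict) -> str:
    """Derive an address from the function name and its keyword arguments."""
    key = (fn.__qualname__, sorted(kwargs.items()))
    return "result/" + hashlib.sha256(pickle.dumps(key)).hexdigest()


def call_cached(fn: Callable, **kwargs: Any) -> Any:
    """Run fn only if no result is stored under its result address."""
    address = result_address(fn, kwargs)
    if address not in store:
        store[address] = pickle.dumps(fn(**kwargs))
    return pickle.loads(store[address])
```

Calling `call_cached(f, a=1, b=2)` and then `call_cached(f, b=2, a=1)` executes `f` once, because both calls map to the same result address. Replacing the dictionary with shared storage is what would let such a cache work across machines.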

For a complete list of terms and detailed definitions, see the Glossary.

How to get it?

The source code is hosted on GitHub at: https://github.com/pythagoras-dev/pythagoras

Installers for the latest released version are available from the Python Package Index (PyPI) at: https://pypi.org/project/pythagoras

Using uv:

uv add pythagoras

Using pip (legacy alternative to uv):

pip install pythagoras

Project Statistics

Metric                          Main code   Unit Tests   Total
Lines Of Code (LOC)             11175       15736        26911
Source Lines Of Code (SLOC)     4518        9785         14303
Classes                         53          48           101
Functions / Methods             479         1286         1765
Files                           62          224          286

Contributing

Interested in contributing to Pythagoras? Please see our Contributing Guidelines.

About The Name

Pythagoras of Samos was a famous ancient Greek thinker and scientist who was the first man to call himself a philosopher ("lover of wisdom"). He is most recognised for his many mathematical findings, including the Pythagorean theorem.

Not everyone knows that in antiquity, Pythagoras was also credited with major astronomical discoveries, such as the sphericity of the Earth and the identity of the morning and evening stars as the planet Venus.
