Planet-scale distributed computing in Python.
Pythagoras
!!! RESEARCH PREVIEW !!!
What is it?
Pythagoras is a super-scalable, easy-to-use, and low-maintenance framework for (1) massive algorithm parallelization and (2) hardware usage optimization in Python. It simplifies and speeds up data science, machine learning, and AI workflows.
Pythagoras excels at complex, long-running, resource-demanding computations. It’s not recommended for real-time, latency-sensitive workflows.
For a comprehensive list of terms and definitions, see the Glossary.
Tutorials
Pythagoras elevates two popular techniques — memoization and parallelization — to a global scale and then fuses them, unlocking performance and scalability that were previously out of reach.
Drawing from many years of functional-programming practice, Pythagoras extends these proven ideas to the next level. In a Pythagoras environment, you can seamlessly employ your preferred functional patterns, augmented by new capabilities.
!!! BOOKMARK THIS PAGE AND COME BACK LATER, WE WILL PUBLISH MORE TUTORIALS SOON !!!
Videos
Usage Examples
Importing Pythagoras:
from pythagoras.core import *
import pythagoras as pth
Creating a portal based on a (shared) folder:
my_portal = get_portal("./my_local_folder")
Checking the state of a portal:
my_portal.describe()
Decorating a function:
@pure()
def my_long_running_function(a:float, b:float) -> float:
    from time import sleep # imports must be placed inside a pure function
    sleep(5)
    return a+10*b
Using a decorated function synchronously:
result = my_long_running_function(a=1, b=2) # only named arguments are allowed
Using a decorated function asynchronously:
future_result_address = my_long_running_function.swarm(a=10, b=20)
if ready(future_result_address):
    result = get(future_result_address)
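For readers who know the standard library, the swarm / ready / get flow above loosely resembles submit / done / result in concurrent.futures. The sketch below shows only that analogy and does not use Pythagoras. The function name slow_add is hypothetical, and unlike an Execution Result Address, a Future lives only inside the current process.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_add(a: float, b: float) -> float:
    """Hypothetical stand-in for a long-running computation."""
    time.sleep(0.1)
    return a + 10 * b

with ThreadPoolExecutor(max_workers=1) as pool:
    # Loosely analogous to my_long_running_function.swarm(a=10, b=20):
    # the call returns a handle immediately, not a value.
    future = pool.submit(slow_add, a=10, b=20)

    # Loosely analogous to ready(...): a non-blocking status check.
    while not future.done():
        time.sleep(0.01)

    # Loosely analogous to get(...): fetch the finished result.
    result = future.result()
```

The important difference is scope: the handle here disappears when the process exits, whereas the addresses described in this README are said to remain valid across restarts and machines.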
Pre-conditions for executing a function:
@pure(pre_validators=[
    unused_ram(Gb=5),
    installed_packages("scikit-learn","pandas"),
    unused_cpu(cores=10)])
def my_long_running_function(a:float, b:float) -> float:
    from time import sleep
    sleep(5)
    return a+10*b
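To make the idea of a passive pre-validator concrete, here is a minimal standard-library sketch of the kind of check a validator such as installed_packages might perform. It illustrates the concept and is not Pythagoras's implementation. The names packages_available and run_if_valid are hypothetical, and the check uses import names (for example sklearn), which can differ from distribution names (scikit-learn).

```python
from importlib.util import find_spec
from typing import Callable

def packages_available(*import_names: str) -> Callable[[], bool]:
    """Build a check that passes only if every top-level module can be found."""
    def check() -> bool:
        return all(find_spec(name) is not None for name in import_names)
    return check

def run_if_valid(fn: Callable, validators: list, **kwargs):
    """Run fn only when every validator passes; otherwise return None."""
    if all(validator() for validator in validators):
        return fn(**kwargs)
    return None  # assumption: a real scheduler could retry later or elsewhere

def add(a: float, b: float) -> float:
    return a + 10 * b

# Both modules ship with Python, so the validator passes and add() runs.
ok = run_if_valid(add, [packages_available("json", "math")], a=1, b=2)

# This module does not exist, so the function is never executed.
skipped = run_if_valid(add, [packages_available("no_such_module_xyz")], a=1, b=2)
```

Keeping validators as small zero-argument callables makes them easy to combine in a list, which mirrors the pre_validators=[...] shape shown above.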
Recursion:
@pure(pre_validators=[recursive_parameters("n")])
def factorial(n:int)->int:
    if n == 1:
        return 1
    else:
        return n*factorial(n=n-1) # only named arguments are allowed
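If each recursive call goes through the same result cache, as the Pure Function description in this README suggests, then a later call with a larger argument can reuse everything computed earlier. The sketch below shows that effect in a single process with functools.lru_cache. It is an analogy, not Pythagoras code, and the calls counter exists only for demonstration.

```python
from functools import lru_cache

calls = 0  # counts how many times the function body actually runs

@lru_cache(maxsize=None)
def factorial(n: int) -> int:
    global calls
    calls += 1
    if n <= 1:
        return 1
    return n * factorial(n=n - 1)  # keyword argument, mirroring the README

fact_5 = factorial(n=5)
calls_after_5 = calls   # the body ran for n = 5, 4, 3, 2, 1

fact_6 = factorial(n=6)
calls_after_6 = calls   # only n = 6 was new; factorial(n=5) came from the cache
```

The second call adds a single execution because every smaller sub-result is already cached.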
Partial function application:
@pure()
def my_map(input_list:list, transformer: PureFn)->list:
    result = []
    for element in input_list:
        transformed_element = transformer(x=element)
        result.append(transformed_element)
    return result

@pure()
def my_square(x):
    return x*x

result = my_map(input_list=[1,2,3,4,5], transformer=my_square)

my_square_map = my_map.fix_kwargs(transformer = my_square)

result = my_square_map(input_list=[1,2,3,4,5])
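In plain Python, the closest standard-library counterpart to fixing keyword arguments is functools.partial. The sketch below repeats the example above without Pythagoras so the pattern can be tried in isolation. Whether fix_kwargs shares any behavior with partial beyond what is shown in this README is an assumption.

```python
from functools import partial

def my_map(input_list: list, transformer) -> list:
    """Apply transformer to each element, passing it as keyword argument x."""
    return [transformer(x=element) for element in input_list]

def my_square(x):
    return x * x

# Bind the transformer once; the resulting callable only needs input_list.
my_square_map = partial(my_map, transformer=my_square)

result = my_square_map(input_list=[1, 2, 3, 4, 5])
```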
Mutually recursive functions:
@pure(pre_validators=recursive_parameters("n"))
def is_even(n:int, is_odd, is_even)->bool:
    if n in {0,2}:
        return True
    else:
        return is_odd(n=n-1, is_even=is_even, is_odd=is_odd)

@pure(pre_validators=recursive_parameters("n"))
def is_odd(n:int, is_even, is_odd)->bool:
    if n in {0,2}:
        return False
    else:
        return is_even(n=n-1, is_odd=is_odd, is_even=is_even)

(is_even, is_odd) = (
    is_even.fix_kwargs(is_odd=is_odd, is_even=is_even),
    is_odd.fix_kwargs(is_odd=is_odd, is_even=is_even),
)

assert is_even(n=10)
assert is_odd(n=11)
Core Concepts
- Portal: A persistent gateway that connects your application to the world beyond the current execution session. Portals link runtime state to persistent storage that survives across multiple runs and machines. They provide a unified interface for data persistence, caching, and state management, abstracting away storage backend complexities (local filesystem, cloud storage, etc.) and handling serialization transparently. A program can use multiple portals, each with its own storage backend, and each portal can serve multiple applications. Portals define the execution context for pure functions, enabling result caching and retrieval.
- Autonomous Function: A self-contained function with no external dependencies. All imports must be done inside the function body. These functions cannot use global objects (except built-ins), yield statements, or nonlocal variables, and must be called with keyword arguments only. This design ensures complete isolation and portability, making autonomous functions ideal building blocks for distributed computing: they carry all dependencies with them and maintain clear interfaces.
- Pure Function: An autonomous function that has no side effects and always returns the same result for the same arguments. Pythagoras caches pure function results using content-based addressing: if a function is called multiple times with identical arguments, it executes only once, and cached results are returned for subsequent calls. This memoization works seamlessly across machines in a distributed system, enabling significant performance improvements for computationally intensive workflows.
- Validator: An autonomous function that checks conditions before or after executing a pure function. Pre-validators run before execution, post-validators run after. Validators can be passive (e.g., check available RAM) or active (e.g., install a missing library). They help ensure reliable distributed execution by validating requirements and system state. Multiple validators can be combined using standard decorator syntax.
- Value Address: A globally unique, content-derived address for an immutable value. It consists of a human-readable descriptor (based on type and shape/length) and a hash signature (SHA-256) split into parts for storage efficiency. Creating a ValueAddr(data) computes the content hash and stores the value in the active portal's storage, allowing later retrieval via the address. Value addresses identify stored results and reference inputs/outputs across distributed systems.
- Execution Result Address: A Value Address representing the result of a pure function execution. It combines the function's signature with input parameters to create a unique identifier. In swarm mode, functions immediately return an Execution Result Address that acts as a "future" reference for checking execution status and retrieving results. These addresses remain valid across application restarts and can be shared between machines.
- Swarming: An asynchronous execution model where you don't know when, where, or how many times your function will execute. Pythagoras guarantees eventual execution at least once but offers no further guarantees. This model maximizes flexibility by decoupling function calls from execution: functions can be queued, load-balanced, retried on failure, and parallelized automatically. The trade-off is reduced control over timing and location in exchange for improved scalability, fault tolerance, and resource utilization.
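The ideas of content-derived addresses and cached pure-function results can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not how Pythagoras is implemented: the in-memory dict stands in for a portal's persistent storage, the address hashes only the function name and JSON-serializable keyword arguments, and the names result_address and cached_call are hypothetical.

```python
import hashlib
import json

store = {}       # stand-in for a portal's persistent storage (assumption)
executions = 0   # counts real executions, for demonstration only

def result_address(fn_name: str, kwargs: dict) -> str:
    """Derive a deterministic SHA-256 address from the name and arguments."""
    payload = json.dumps({"fn": fn_name, "kwargs": kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_call(fn, **kwargs):
    """Execute fn at most once per distinct set of keyword arguments."""
    global executions
    address = result_address(fn.__name__, kwargs)
    if address not in store:
        executions += 1
        store[address] = fn(**kwargs)
    return store[address]

def weighted_sum(a: float, b: float) -> float:
    return a + 10 * b

first = cached_call(weighted_sum, a=1, b=2)
second = cached_call(weighted_sum, b=2, a=1)  # same arguments, different order
```

Because the address depends only on the call's content, any process that computes the same address can look up the same result. It also means that a duplicate execution, which the swarming model permits, would simply write the same value to the same address.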
For a complete list of terms and detailed definitions, see the Glossary.
How to get it?
The source code is hosted on GitHub at: https://github.com/pythagoras-dev/pythagoras
Installers for the latest released version are available on the Python Package Index (PyPI): https://pypi.org/project/pythagoras
Using uv:
uv add pythagoras
Using pip (legacy alternative to uv):
pip install pythagoras
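After installing, one quick way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. The helper name installed_version below is hypothetical; the sketch relies only on importlib.metadata and does not import Pythagoras itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None

pythagoras_version = installed_version("pythagoras")
print(pythagoras_version or "pythagoras is not installed in this environment")
```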
Dependencies
- persidict
- mixinforge
- jsonpickle
- joblib
- lz4
- pandas
- numpy
- psutil
- boto3
- pytest
- moto
- scipy
- scikit-learn
- autopep8
- deepdiff
- nvidia-ml-py
- uv
Project Statistics
| Metric | Main code | Unit Tests | Total |
|---|---|---|---|
| Lines Of Code (LOC) | 11175 | 15736 | 26911 |
| Source Lines Of Code (SLOC) | 4518 | 9785 | 14303 |
| Classes | 53 | 48 | 101 |
| Functions / Methods | 479 | 1286 | 1765 |
| Files | 62 | 224 | 286 |
Contributing
Interested in contributing to Pythagoras? Please see our Contributing Guidelines.
For project documentation standards, see:
Key Contacts
About The Name
Pythagoras of Samos was a famous ancient Greek thinker and scientist who was the first man to call himself a philosopher ("lover of wisdom"). He is most recognised for his many mathematical findings, including the Pythagorean theorem.
Not everyone knows that in antiquity, Pythagoras was also credited with major astronomical discoveries, such as the sphericity of the Earth and the identity of the morning and evening stars as the planet Venus.