Python implementation of Bluesky PDS and AT Protocol, including repo, MST, and sync methods

arroba

Python implementation of Bluesky PDS and AT Protocol, including data repository, Merkle search tree, and com.atproto.sync XRPC methods.

You can build your own PDS on top of arroba with just a few lines of Python and run it in any WSGI server. You can build a more involved PDS with custom logic and behavior. Or you can build a different ATProto service, eg an AppView, relay (née BGS), or something entirely new!

Install from PyPI with pip install arroba.

Arroba is the Spanish word for the @ character ("at sign").

License: This project is placed in the public domain. You may also use it under the CC0 License.

Usage

Here's minimal example code for a multi-repo PDS on top of arroba and Flask:

from flask import Flask
from google.cloud import ndb
from lexrpc.flask_server import init_flask

from arroba import server
from arroba.datastore_storage import DatastoreStorage
from arroba.xrpc_sync import send_events

# for Google Cloud Datastore
ndb_client = ndb.Client()

server.storage = DatastoreStorage(ndb_client=ndb_client)
server.repo.callback = lambda _: send_events()  # to subscribeRepos

app = Flask('my-pds')
init_flask(server.server, app)

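# WSGI middleware that runs every request inside an ndb context, so that
# datastore calls made while handling the request have a context available.
def ndb_context_middleware(wsgi_app):
    def wrapper(environ, start_response):
        with ndb_client.context():
            return wsgi_app(environ, start_response)
    return wrapper

app.wsgi_app = ndb_context_middleware(app.wsgi_app)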

See app.py for a more comprehensive example, including a CORS handler for OPTIONS preflight requests and a catch-all app.bsky.* XRPC handler that proxies requests to the AppView.

Overview

Arroba consists of these parts:

Configuration

Configure arroba with these environment variables:

  • APPVIEW_HOST, default api.bsky-sandbox.dev
  • RELAY_HOST, default bgs.bsky-sandbox.dev
  • PLC_HOST, default plc.bsky-sandbox.dev
  • PDS_HOST, where you're running your PDS
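
As a rough sketch of how these settings resolve, the snippet below computes the effective host values from an environment mapping. The helper name effective_hosts and the pds.example.com placeholder are made up for illustration and are not part of arroba. The precedence of RELAY_HOST over the older BGS_HOST is an assumption; the changelog only says that BGS_HOST is still supported.

```python
import os

# Defaults copied from the list above.
DEFAULT_HOSTS = {
    'APPVIEW_HOST': 'api.bsky-sandbox.dev',
    'RELAY_HOST': 'bgs.bsky-sandbox.dev',
    'PLC_HOST': 'plc.bsky-sandbox.dev',
}


def effective_hosts(environ):
    """Hypothetical helper: returns the host settings a process would see."""
    hosts = {name: environ.get(name) or default
             for name, default in DEFAULT_HOSTS.items()}
    # BGS_HOST is the older name for RELAY_HOST (see the 0.5 changelog).
    # Assumption: RELAY_HOST wins when both are set.
    if not environ.get('RELAY_HOST') and environ.get('BGS_HOST'):
        hosts['RELAY_HOST'] = environ['BGS_HOST']
    # PDS_HOST has no documented default, so it may be None here.
    hosts['PDS_HOST'] = environ.get('PDS_HOST')
    return hosts


if __name__ == '__main__':
    # 'pds.example.com' is a placeholder, not a real deployment.
    os.environ.setdefault('PDS_HOST', 'pds.example.com')
    for name, value in effective_hosts(os.environ).items():
        print(f'{name}={value}')
```

If you set these from Python rather than from the shell, it is probably safest to do so before importing any arroba modules, in case values are read at import time.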

Optional, only used in com.atproto.repo, .server, and .sync XRPC handlers:

  • REPO_TOKEN, static token to use as both accessJwt and refreshJwt, defaults to contents of repo_token file. Not required to be an actual JWT. If not set, XRPC methods that require auth will return HTTP 501 Not Implemented.
  • ROLLBACK_WINDOW, number of events to serve in the subscribeRepos rollback window. Defaults to no limit.
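
The two optional settings can be pictured with the sketch below. It is not arroba's code: every function name is hypothetical, the Bearer header format and the 401 response for a wrong token are assumptions, and the rollback logic is only one plausible reading of "number of events to serve". Only the 501 response when no token is configured comes from the description above.

```python
from pathlib import Path


def load_repo_token(environ, token_file='repo_token'):
    """Returns REPO_TOKEN, else the contents of the repo_token file, else None."""
    token = environ.get('REPO_TOKEN')
    if token:
        return token.strip()
    path = Path(token_file)
    return path.read_text().strip() if path.is_file() else None


def auth_status(authorization, repo_token):
    """Returns the HTTP status an auth-requiring method might respond with."""
    if not repo_token:
        return 501  # documented: no token configured means Not Implemented
    if authorization == f'Bearer {repo_token}':
        return 200
    return 401  # assumption: the source doesn't say what a bad token returns


def rollback_window(environ):
    """Parses ROLLBACK_WINDOW; None means no limit, the documented default."""
    raw = environ.get('ROLLBACK_WINDOW')
    return int(raw) if raw else None


def seqs_to_serve(seqs, cursor, window):
    """Picks which sequence numbers to replay to a subscriber.

    seqs must be sorted ascending. Returns the ones after cursor, capped to
    the most recent `window` events when a window is set.
    """
    newer = [s for s in seqs if s > cursor]
    if window is None:
        return newer
    return newer[max(0, len(newer) - window):]
```

For example, with stored sequence numbers 1 through 5, a subscriber at cursor 0, and a window of 2, this sketch replays only events 4 and 5.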

Changelog

0.6 - 2024-06-24

Breaking changes:

  • datastore_storage:
    • DatastoreStorage: add new required ndb_client kwarg to constructor, used to get new context in lexrpc websocket subscription handlers that run server methods like subscribeRepos in separate threads (snarfed/lexrpc#8).
    • DatastoreStorage.read_blocks_by_seq: if the ndb context gets closed while we're still running, log a warning and return. (This can happen in eg flask_server if the websocket client disconnects early.)
    • AtpRemoteBlob: if the blob URL doesn't return the Content-Type header, infer type from the URL, or fall back to application/octet-stream (bridgy-fed#1073).
  • did:
    • Cache resolve_plc, resolve_web, and resolve_handle for 6h, up to 5000 total results per call.
  • storage: rename Storage.read_commits_by_seq to read_events_by_seq for new account tombstone support.
  • xrpc_sync: rename send_new_commits to send_events, ditto.
  • xrpc_repo: stop requiring auth for read methods: getRecord, listRecords, describeRepo.
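
To illustrate the did caching change above, here is a minimal standard-library sketch of memoization bounded by both age and size. It is not how arroba implements its cache; the decorator name ttl_cache and the stand-in resolve_handle function are hypothetical, and only the 6 hour and 5000 entry figures come from the changelog entry.

```python
import time
from collections import OrderedDict
from functools import wraps


def ttl_cache(maxsize=5000, ttl=6 * 60 * 60, clock=time.monotonic):
    """Caches results per positional args for `ttl` seconds, up to `maxsize`."""
    def decorator(fn):
        cache = OrderedDict()  # args -> (expires_at, value), oldest first

        @wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = cache.get(args)
            if hit is not None and hit[0] > now:
                cache.move_to_end(args)  # mark as recently used
                return hit[1]
            value = fn(*args)
            cache[args] = (now + ttl, value)
            cache.move_to_end(args)
            while len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used
            return value

        return wrapper
    return decorator


@ttl_cache()
def resolve_handle(handle):
    # Stand-in for a network lookup; returns a fake DID for illustration.
    return f'did:example:{handle}'
```

Repeated calls with the same handle inside the TTL return the stored value without running the lookup again, which is the point of caching resolution results that change rarely.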

Non-breaking changes:

  • did:
    • Add HANDLE_RE regexp for handle validation.
  • storage:
    • Add new Storage.tombstone_repo method, implemented in MemoryStorage and DatastoreStorage. Used to delete accounts. (bridgy-fed#783)
    • Add new Storage.load_repos method, implemented in MemoryStorage and DatastoreStorage. Used for com.atproto.sync.listRepos.
  • util:
    • service_jwt: add optional aud kwarg.
  • xrpc_sync:
    • subscribeRepos:
      • Add support for non-commit events, starting with account tombstones.
      • Add ROLLBACK_WINDOW environment variable to limit size of rollback window. Defaults to no limit.
      • For commits with create or update operations, always include the record block, even if it already existed in the repo beforehand (snarfed/bridgy-fed#1016).
      • Bug fix: populate the time field with the time each commit was created instead of the current time (snarfed/bridgy-fed#1015).
    • Start serving getRepo queries with the since parameter. since still isn't actually implemented, but we now serve the entire repo instead of returning an error.
    • Implement getRepoStatus method.
    • Implement listRepos method.
    • getRepo bug fix: include the repo head commit block.
  • xrpc_repo:
  • xrpc_*: return RepoNotFound and RepoDeactivated errors when appropriate (snarfed/bridgy-fed#1083).

0.5 - 2024-03-16

  • Bug fix: base32-encode TIDs in record keys, at:// URIs, commit revs, etc. Before, we were using the integer UNIX timestamp directly, which happened to be the same 13 character length. Oops.
  • Switch from BGS_HOST environment variable to RELAY_HOST. BGS_HOST is still supported for backward compatibility.
  • datastore_storage:
    • Bug fix for DatastoreStorage.last_seq, handle new NSID.
    • Add new AtpRemoteBlob class for storing "remote" blobs, available at public HTTP URLs, that we don't store ourselves.
  • did:
    • create_plc: strip padding from genesis operation signature (for did-method-plc#54, atproto#1839).
    • resolve_handle: return None on bad domain, eg .foo.com.
    • resolve_handle bug fix: handle charset specifier in HTTPS method response Content-Type.
  • util:
    • new_key: add seed kwarg to allow deterministic key generation.
  • xrpc_repo:
    • getRecord: try to load record locally first; if not available, forward to AppView.
  • xrpc_sync:
    • Implement getBlob, right now only based on "remote" blobs stored in AtpRemoteBlobs in datastore storage.
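
For context on the TID bug fix at the top of this release, here is a small sketch of TID encoding as the ATProto specification describes it, to my understanding: a 64-bit integer with a zero top bit, 53 bits of microseconds since the UNIX epoch, and a 10-bit clock identifier, written as 13 characters of sortable base32. The function names are illustrative and are not arroba's API, and arroba's own helpers may differ in detail.

```python
B32_SORTABLE = '234567abcdefghijklmnopqrstuvwxyz'


def tid_encode(timestamp_us, clock_id=0):
    """Encodes a microsecond timestamp and clock id as a 13 character TID."""
    n = ((timestamp_us & ((1 << 53) - 1)) << 10) | (clock_id & 0x3FF)
    chars = []
    for _ in range(13):
        chars.append(B32_SORTABLE[n & 31])  # lowest 5 bits first
        n >>= 5
    return ''.join(reversed(chars))


def tid_decode(tid):
    """Returns (timestamp_us, clock_id) for a 13 character TID."""
    n = 0
    for c in tid:
        n = n * 32 + B32_SORTABLE.index(c)
    return n >> 10, n & 0x3FF


if __name__ == '__main__':
    ts = 1_710_547_200_000_000  # an arbitrary example time, in microseconds
    tid = tid_encode(ts, clock_id=1)
    print(tid, tid_decode(tid))
```

A millisecond timestamp such as 1710547200000 is also 13 characters long when written in decimal, which is presumably why the unencoded values looked plausible at a glance.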

0.4 - 2023-09-19

  • Migrate to ATProto repo v3. Specifically, the existing subscribeRepos sequence number is reused as the new rev field in commits. (Discussion.)
  • Add new did module with utilities to create and resolve did:plcs and resolve did:webs.
  • Add new util.service_jwt function that generates ATProto inter-service JWTs.
  • Repo:
    • Add new signing_key/rotation_key attributes. Generate, store, and load both in datastore_storage.
    • Remove format_init_commit, migrate existing calls to format_commit.
  • Storage:
    • Rename read_from_seq => read_blocks_by_seq (and in MemoryStorage and DatastoreStorage), add new read_commits_by_seq method.
    • Merge load_repo did/handle kwargs into did_or_handle.
  • XRPCs:
    • Make subscribeRepos check storage for all new commits every time it wakes up.
      • As part of this, replace xrpc_sync.enqueue_commit with new send_new_commits function that takes no parameters.
    • Drop bundled app.bsky/com.atproto lexicons, use lexrpc's instead.

0.3 - 2023-08-29

Big milestone: arroba is successfully federating with the ATProto sandbox! See app.py for the minimal demo code needed to wrap arroba in a fully functional PDS.

  • Add Google Cloud Datastore implementation of repo storage.
  • Implement com.atproto XRPC methods needed to federate with sandbox, including most of repo and sync.
    • Notably, includes subscribeRepos server side over websocket.
  • ...and much more.

0.2 - 2023-05-18

Implement repo and commit chain in new Repo class, including pluggable storage. This completes the first pass at all PDS data structures. Next release will include initial implementations of the com.atproto.sync.* XRPC methods.

0.1 - 2023-04-30

Initial release! Still very much in progress. MST, Walker, and Diff classes are mostly complete and working. Repo, commits, and sync XRPC methods are still in progress.

Release instructions

Here's how to package, test, and ship a new release.

  1. Run the unit tests.

    source local/bin/activate.csh
    python -m unittest discover
    
  2. Bump the version number in pyproject.toml and docs/conf.py. git grep the old version number to make sure it only appears in the changelog. Change the current changelog entry in README.md for this new version from unreleased to the current date.

  3. Build the docs. If you added any new modules, add them to the appropriate file(s) in docs/source/. Then run ./docs/build.sh. Check that the generated HTML looks fine by opening docs/_build/html/index.html and looking around.

  4. Set the version and commit the release.

    setenv ver X.Y
    git commit -am "release v$ver"
    
  5. Upload to test.pypi.org for testing.

    python -m build
    twine upload -r pypitest dist/arroba-$ver*
    
  6. Install from test.pypi.org.

    cd /tmp
    python -m venv local
    source local/bin/activate.csh
    # make sure we force pip to use the uploaded version
    pip uninstall arroba
    pip install --upgrade pip
    pip install -i https://test.pypi.org/simple --extra-index-url https://pypi.org/simple arroba==$ver
    deactivate
    
  7. Smoke test that the code trivially loads and runs.

    source local/bin/activate.csh
    python
    # TODO: test code
    deactivate
    
  8. Tag the release in git. In the tag message editor, delete the generated comments at bottom, leave the first line blank (to omit the release "title" in github), put ### Notable changes on the second line, then copy and paste this version's changelog contents below it.

    git tag -a v$ver --cleanup=verbatim
    git push && git push --tags
    
  9. Click here to draft a new release on GitHub. Enter vX.Y in the Tag version box. Leave Release title empty. Copy ### Notable changes and the changelog contents into the description text box.

  10. Upload to pypi.org!

    twine upload dist/arroba-$ver*
    
  11. Wait for the docs to build on Read the Docs, then check that they look ok.

  12. On the Versions page, check that the new version is active. If it's not, activate it in the Activate a Version section.
