
A CLI for downloading posts in bulk from a specified Bluesky account


mass-downloader-for-bluesky

mass-downloader-for-bluesky (mdfb) is a Python CLI application that can download large numbers of posts from any given Bluesky account.

Installation

Python must be installed to use this CLI.

Install mdfb with pip:

pip install mdfb

Manual

Make sure Poetry is installed.

Then clone the project, enter the project directory, open a Poetry shell and install the dependencies:

git clone git@github.com:IbrahimHajiAbdi/mass-downloader-for-bluesky.git
cd mass-downloader-for-bluesky
poetry shell
poetry install

Usage

mdfb uses the public API offered by Bluesky to retrieve the posts liked, reposted or posted by the target account.
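
As a rough illustration of what one such request can look like, the sketch below builds a URL for the AT Protocol `com.atproto.repo.listRecords` endpoint, which lists the records of one collection in an account's repository. This README does not say which endpoints mdfb actually calls, so the endpoint choice, the `https://bsky.social` host, the mapping of post types to collections and the helper name `build_list_records_url` are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed mapping from mdfb's post-type flags to AT Protocol record collections.
COLLECTIONS = {
    "like": "app.bsky.feed.like",
    "repost": "app.bsky.feed.repost",
    "post": "app.bsky.feed.post",
}


def build_list_records_url(actor, post_type, limit=100, cursor=None,
                           host="https://bsky.social"):
    """Build the URL for one page of an account's records (hypothetical helper).

    `actor` is a handle or DID; `cursor` is the paging token returned by the
    previous page, if any. The default host is an assumption.
    """
    params = {"repo": actor, "collection": COLLECTIONS[post_type], "limit": limit}
    if cursor is not None:
        params["cursor"] = cursor
    return f"{host}/xrpc/com.atproto.repo.listRecords?{urlencode(params)}"


print(build_list_records_url("bsky.app", "like", limit=10))
```

A real client would send a GET request to this URL and pass each response's cursor into the next call until enough records have been collected.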

For each post, mdfb downloads the post's information and any accompanying media (a video or one or more images). If a post has no media, only its information is downloaded. The information is saved as a JSON file containing data such as the text of the post, its creation time and author details. Currently, posts are retrieved from newest to oldest.

If you installed mdfb manually, you will need to be inside a Poetry shell to use it.

Examples

Some example commands would be:

mdfb --handle bsky.app -l 10 --like --threads 3 --format "{RKEY}_{HANDLE}" ./media/
mdfb -d did:plc:z72i7hdynmk6r22z27h6tvur --archive --like --threads 3 --format "{DID}_{HANDLE}" ./media/

Naming Convention

By default, mdfb's naming convention is: "{rkey}_{handle}_{text}". If it is downloading a post with multiple images, the naming will be: "{rkey}_{handle}_{text}_{i}", where "i" is the position of the image in the post, ranging from 1 to 4. Filenames are limited to 256 bytes and are truncated to that size if they exceed it.

However, you can specify the name of the files by using the --format flag and passing a valid format string, e.g. "{RKEY}_{DID}". You can put any text in the format string between the keywords. The keywords are case-sensitive.

For --format, the valid keywords are:

  • RKEY
  • DID
  • HANDLE
  • TEXT
  • DISPLAY_NAME
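
To make the naming rules concrete, here is a minimal sketch of how a format string could be expanded and truncated. It is not mdfb's actual implementation: the function name `render_filename`, the shape of the `post` dictionary, raising an error on unknown keywords and reserving room for the image index before truncating are all assumptions.

```python
import re

VALID_KEYWORDS = {"RKEY", "DID", "HANDLE", "TEXT", "DISPLAY_NAME"}
MAX_FILENAME_BYTES = 256


def render_filename(format_string, post, index=None):
    """Expand {KEYWORD} placeholders and cap the result at 256 bytes.

    Hypothetical helper: `post` maps lower-case keyword names to values, and
    `index` is the 1-based image position for multi-image posts.
    """
    def substitute(match):
        keyword = match.group(1)
        # Keywords are case-sensitive, so "{rkey}" is rejected here.
        if keyword not in VALID_KEYWORDS:
            raise ValueError(f"invalid keyword: {keyword}")
        return str(post[keyword.lower()])

    name = re.sub(r"\{(\w+)\}", substitute, format_string)
    suffix = f"_{index}" if index is not None else ""
    # Truncate on bytes, leaving room for the suffix; a multi-byte character
    # split by the cut is dropped rather than left half-encoded.
    budget = MAX_FILENAME_BYTES - len(suffix.encode("utf-8"))
    base = name.encode("utf-8")[:budget].decode("utf-8", errors="ignore")
    return base + suffix


post = {"rkey": "3kabc", "did": "did:plc:xyz", "handle": "bsky.app",
        "text": "hello", "display_name": "Bluesky"}
print(render_filename("{RKEY}_{HANDLE}_{TEXT}", post, index=2))
```

Truncating by bytes rather than characters matters because post text often contains multi-byte characters such as emoji, and filesystem limits are typically expressed in bytes.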

Download Amount

When a limit is specified, it applies separately to each type of post being downloaded. For example:

mdfb --handle bsky.app -l 100 --like --repost --post ./media/

This would download 100 likes, 100 reposts and 100 posts, totalling 300 posts downloaded.

Furthermore, you can archive whole accounts. For example:

mdfb --handle bsky.app --archive --like --repost --threads 3 --format "{DID}_{HANDLE}" ./media/

This would download all likes and reposts.

Note

The maximum number of threads is currently 3; this can be changed in the mdfb/utils/constants.py file. That file also contains other constants that can be changed, such as the delay between requests and the number of retries before a post is marked as a failure and skipped.
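
The interplay of these settings can be sketched as follows. This is an illustration under stated assumptions, not mdfb's code: the constant names `MAX_THREADS`, `DELAY_SECONDS` and `MAX_RETRIES`, their values other than the thread cap of 3, and both helper functions are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names and values; the real constants live in mdfb/utils/constants.py.
MAX_THREADS = 3
DELAY_SECONDS = 0.1  # pause after a failed attempt
MAX_RETRIES = 3


def download_with_retries(download, post):
    """Attempt one download up to MAX_RETRIES times; return True on success."""
    for _ in range(MAX_RETRIES):
        try:
            download(post)
            return True
        except Exception:
            time.sleep(DELAY_SECONDS)
    return False  # give up on this post and let the caller continue


def download_all(download, posts, threads):
    """Download posts on a pool capped at MAX_THREADS; return the failures."""
    workers = max(1, min(threads, MAX_THREADS))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: download_with_retries(download, p), posts))
    return [post for post, ok in zip(posts, results) if not ok]
```

Here `download` stands in for whatever function fetches a single post. A post that keeps failing is reported at the end instead of aborting the whole run.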

Options

  • --handle
    • The handle of the target account.
  • --did, -d
    • The DID of the target account.
  • --limit, -l
    • The number of posts to download.
  • --archive
    • Downloads all posts of the selected post type(s).
  • directory
    • Positional argument: the directory where all downloaded files are saved. Required.
  • --threads
    • The number of threads used to download posts more efficiently. The maximum is 3.
  • --format
    • Format string that files will use for their names. The keywords are case-sensitive and must be all upper case.
  • --like
    • Retrieves liked posts.
  • --repost
    • Retrieves reposts.
  • --post
    • Retrieves posts.

Note

At least one of the flags: --like, --repost, --post is required.

--did, -d and --handle are mutually exclusive, as are --archive and --limit, -l. Exactly one option from each pair is required.
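
These rules map naturally onto Python's `argparse`. The sketch below shows one way to express them. It is an assumption about how such a parser could be written, not mdfb's actual parser, and the `--threads` default of 1 is a guess.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="mdfb")

    # Exactly one of --handle / --did must be given.
    account = parser.add_mutually_exclusive_group(required=True)
    account.add_argument("--handle")
    account.add_argument("--did", "-d")

    # Exactly one of --limit / --archive must be given.
    amount = parser.add_mutually_exclusive_group(required=True)
    amount.add_argument("--limit", "-l", type=int)
    amount.add_argument("--archive", action="store_true")

    parser.add_argument("--threads", type=int, default=1)  # default is assumed
    parser.add_argument("--format")
    parser.add_argument("--like", action="store_true")
    parser.add_argument("--repost", action="store_true")
    parser.add_argument("--post", action="store_true")
    parser.add_argument("directory")
    return parser


def parse(argv):
    """Parse argv and enforce that at least one post type was selected."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not (args.like or args.repost or args.post):
        parser.error("at least one of --like, --repost, --post is required")
    return args


args = parse(["--handle", "bsky.app", "-l", "10", "--like", "./media/"])
print(args.handle, args.limit, args.directory)
```

Mutually exclusive groups cover the two either/or pairs, while the "at least one post type" rule needs an explicit check because `argparse` has no built-in "one or more of these flags" group.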


