An updated bot that posts to Tumblr, based on your very own blog!

Project description

tumblrbot

Description of original project:

4tv-tumblrbot was a collaborative project I embarked on with my close friend Dima, who goes by @smoqueen on Tumblr. The aim of this endeavor was straightforward yet silly: to develop a Tumblr bot powered by a machine-learning model. This bot would be specifically trained on the content from a particular Tumblr blog or a selected set of blogs, allowing it to mimic the style, tone, and thematic essence of the original posts.

This fork is largely a rewrite of the source code with similarities in its structure and process.

Features:

An interactive console for all steps of generating posts for the blog:
1. Asks for OpenAI and Tumblr tokens.
2. Retrieves Tumblr OAuth tokens.
3. Downloads posts from specified blogs (configurable).
  - Skips redownloading already downloaded posts.
  - Shows progress and previews the current post.
4. Creates examples to fine-tune the model from the downloaded posts.
  - Filters out posts that contain more than just text data.
  - Filters out posts that match user-specified regular expressions (configurable).
  - Only uses the most recent posts from each blog (configurable).
  - Adds custom user messages and assistant responses to the dataset (configurable).
5. Filters out any posts flagged by the OpenAI Moderation API.
6. Uploads examples to OpenAI and begins the fine-tuning process.
  - Provides cost estimates if the currently saved examples are used to fine-tune a base model (configurable).
  - Resumes monitoring the same fine-tuning process when restarted.
  - Deletes the uploaded examples file if fine-tuning does not succeed (optional).
  - Stores the output model automatically when fine-tuning is completed.
7. Generates and uploads posts to a blog using the fine-tuned model (configurable).
  - Creates tags by extracting keywords using the base model (configurable).
  - Uploads posts as drafts.
  - Reblogs posts from allowed blogs (configurable).
  - Shows progress and previews the current post.
Colorful output, progress bars, and post previews using rich.
Automatically keeps the config file up-to-date and recreates it if missing (without overriding user settings).
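
The cost estimate mentioned in step 6 comes down to simple arithmetic: the number of tokens in the examples file, multiplied by the number of epochs, multiplied by the training price per token. Below is a minimal sketch of that calculation. The function name is hypothetical, and it assumes the price is expressed per million tokens, as on the OpenAI pricing page; the unit this program uses for token_price may differ.

```python
def estimate_fine_tuning_cost(
    total_tokens: int,
    epochs: int,
    price_per_million_tokens: float,
) -> float:
    """Estimate the training cost in dollars.

    Hypothetical helper: assumes the price is quoted per one million
    training tokens and that every epoch processes the whole file once.
    """
    if total_tokens < 0 or epochs < 0 or price_per_million_tokens < 0:
        raise ValueError("all inputs must be non-negative")
    return total_tokens * epochs * price_per_million_tokens / 1_000_000


if __name__ == "__main__":
    # Example numbers only; they are not taken from any real price list.
    cost = estimate_fine_tuning_cost(200_000, 3, 25.0)
    print(f"Estimated cost: ${cost:.2f}")
```

The token count itself depends on the tokenizer of the chosen base model, which is why changing base_model also affects the estimate.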

Known Issues:

Fine-tuning can fail after the validation phase due to the examples file not passing OpenAI moderation checks. There are a few workarounds for this that can be tried in combination:
- You can retry with the same examples file. This has, on rare occasions, worked.
- You can submit the examples file to the OpenAI moderation API with this program's guided prompts. This has worked consistently for our dataset, but others have reported it not being thorough enough.
- You can use regular expressions to filter out training data in the config. This is more of a brute-force solution, but it can work if the other solutions do not.
- You can try limiting your dataset by specifying fewer blogs to download from or limiting the number of posts taken from each one in the config.
- If all else fails, you can manually remove data from the examples file until it passes. It is unfortunately not a definitive resource, but it can help to read about what the OpenAI moderation API flags.
Sometimes, you will get an error about the training file not being found when starting fine-tuning. We do not currently have a fix or workaround for this. If it keeps happening, use the online fine-tuning portal instead. Read more in Manual Fine-Tuning.
Post counts are incorrect when downloading posts. We are not certain what the cause of this is, but our tests suggest this is a Tumblr API problem that is giving inaccurate numbers.
During post downloading or post generation, you may receive a “Limit Exceeded” error message from the Tumblr API. This is caused by server-side rate-limiting by Tumblr. The only workaround is trying again or waiting for a period of time before retrying. In most cases, you either have to wait for a minute or an hour for the limits to reset. You can read more about the limits in the Tumblr API documentation on rate limits.
Similar to the above issue, you may sometimes get a message saying your IP is blocked. This block is temporary and probably follows the same rules as previously described.
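
If you script around the Tumblr API yourself, the "wait and retry" workaround for rate limits can be automated. The following is only a sketch of a generic retry loop with exponential backoff. RateLimitError and retry_with_backoff are hypothetical names, not part of this program or of any Tumblr client library, and the default one-minute starting delay is an assumption based on the limits described above.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever error signals "Limit Exceeded"."""


def retry_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    initial_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), waiting longer after each rate-limit failure."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
            delay *= 2  # double the wait before the next attempt
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter exists so the waiting behavior can be swapped out, for example in tests. Any error other than RateLimitError is passed straight through instead of being retried.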

Please submit an issue or contact us for features you want added/reimplemented.

Installation & Usage

Downloadable Binary

Pros	Cons
Easier to install	Harder to update
No risk of dependencies breaking	Dependencies may be older

Download the latest release's tumblrbot.exe.
Launch tumblrbot.exe in the install location.

PyPI

Pros	Cons
Easier to update	Harder to install
Dependencies may be newer	Dependencies may break

Install the latest version of Python:
- Windows: winget install python3
- Linux (apt): apt install python3-pip
- Linux (pacman): pacman -S python-pip
Install the pip package: pip install tumblrbot
- Alternatively, you can install from this repository: pip install git+https://github.com/MaidScientistIzutsumiMarin/tumblrbot.git
- On Linux, you will have to create a virtual environment or pass pip's --break-system-packages flag to install packages system-wide.
Run tumblrbot from anywhere. Run tumblrbot --help for command-line options. Every command-line option corresponds to a value from the config.

Obtaining Tokens

OpenAI

An API token can be created here: OpenAI Tokens.

Leave everything at the defaults and set Project to Default Project.
Press Create secret key.
Press Copy to copy the API token to your clipboard.

Tumblr

API tokens can be created here: Tumblr Tokens.

Press + Register Application.
Enter anything for Application Name and Application Description.
Enter any URL for Application Website and Default callback URL, like https://example.com.
Enter any email address for Administrative contact email. It probably doesn't need to be one you have access to.
Press the checkbox next to I'm not a robot and complete the CAPTCHA.
Press Register.
You now have access to your consumer key next to Oauth Consumer Key.
Press Show secret key to see your Consumer Secret.

When running this program, you will be prompted to enter all of these tokens. If something goes wrong while entering the tokens, you can always reset them by running the program again and answering y to the relevant prompt.

After inputting the Tumblr tokens, you will be given a URL that you need to open in your browser. Press Allow, then copy and paste the URL of the page you are redirected to into the console.

Configuration

All config options can be found in config.toml after running the program once. This will be kept up-to-date if there are changes to the config's format in a future update. This also means it may be worthwhile to double-check the config file after an update. Any changes to the config should be in the changelog for a given version.

All file options can include directories that will be created when the program is run.

All config options that involve blog identifiers expect any version of a blog URL, which is explained in more detail in the Tumblr API documentation on blog identifiers.

A valid post:

Contains any content.
Only has text.
Is not an ask.
Is not a reblog.
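
The four conditions above can be sketched as a single predicate. This is an illustration, not this program's actual check. It assumes posts arrive in Tumblr's Neue Post Format, where content is a list of blocks with a type field, an ask shows up as a layout entry of type "ask", and a reblog carries a non-empty trail. The function name is hypothetical.

```python
from typing import Any


def is_valid_post(post: dict[str, Any]) -> bool:
    """Return True if the post meets all four conditions listed above.

    Assumes Neue Post Format field names ("content", "layout", "trail").
    """
    content = post.get("content", [])
    layout = post.get("layout", [])
    trail = post.get("trail", [])

    has_content = bool(content)
    only_text = all(block.get("type") == "text" for block in content)
    is_ask = any(entry.get("type") == "ask" for entry in layout)
    is_reblog = bool(trail)

    return has_content and only_text and not is_ask and not is_reblog
```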

Specific Options:

custom_prompts_file - This file should use the following format:
```
{"user message 1": "assistant response 1"}
{"user message 2": "assistant response 2"}
{"user message 3": "assistant response 3", "user message 4": "assistant response 4"}
```
To be specific, it should follow the JSON Lines file format with one collection of name/value pairs (a dictionary) per line. You can validate your file using the JSON Lines Validator.
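
If you would rather check the file locally, the format described above is simple to validate with Python's standard library. The sketch below is an assumption about what a reasonable check looks like, not the parser this program uses. load_custom_prompts is a hypothetical name, and tolerating blank lines is a choice made here for convenience.

```python
import json


def load_custom_prompts(text: str) -> list[tuple[str, str]]:
    """Parse JSON Lines text into (user message, assistant response) pairs.

    Raises ValueError naming the first line that is not a JSON object
    mapping strings to strings.
    """
    pairs: list[tuple[str, str]] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped (an assumption of this sketch)
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as error:
            raise ValueError(f"line {line_number}: invalid JSON ({error.msg})") from error
        if not isinstance(entry, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        for user_message, response in entry.items():
            if not isinstance(response, str):
                raise ValueError(f"line {line_number}: responses must be strings")
            pairs.append((user_message, response))
    return pairs
```

To check a file on disk, read it first, for example with pathlib.Path("custom_prompts.jsonl").read_text(encoding="utf-8"), where the file name is only a placeholder for whatever path is set in the config.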
post_limit - At most, this many valid posts will be included in the training data. This effectively is a filter to select the N most recent valid posts from each blog. 0 will use every available valid post.
moderation_batch_size - This controls the batch size when submitting posts to the OpenAI Moderation API. There is no limit, but higher numbers will cause you to be rate-limited more often, which can be slower overall. Low numbers reduce rate-limiting, but can sometimes take longer because more requests are needed. The best value will depend on your computer, internet connection, and any number of factors on OpenAI's side. The default value is just what worked best for our computer.
filtered_words - During training data generation, any posts containing the specified words will be removed. Word boundaries are not checked by default, so “the” will also filter out posts with “them” or “thematic”. This setting supports regular expressions, so you can explicitly look for word boundaries by surrounding an entry with “\\b”, e.g., “\\bthe\\b”. Backslashes have to be doubled like this because of how strings in the config file are read in. If you are familiar with regular expressions, it may be useful to know that all entries are joined with “|” into a single pattern, which is then used to search the post content for any matches.
developer_message - This message is used for fine-tuning the AI as well as for generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
user_message - This setting is used and works in the same way as developer_message.
expected_epochs - The default value here is the default number of epochs for base_model. You may have to change this value if you change base_model. After running fine-tuning once, you will see the number of epochs used in the fine-tuning portal under Hyperparameters. This value will also be updated automatically if you run fine-tuning through this program.
token_price - The default value here is the default token price for base_model. You can find the up-to-date value in OpenAI Pricing, in the Training column.
job_id - If there is any value here, this program will resume monitoring the corresponding job instead of starting a new one. This gets set when fine-tuning starts and is cleared when it completes. You can read more in Manual Fine-Tuning.
base_model - This value is used to choose the tokenizer for estimating fine-tuning costs. It is also the base model that will be fine-tuned and the model that is used to generate tags. You can find a list of options in the fine-tuning portal by pressing + Create and opening the drop-down list for Base Model. Be sure to update token_price if you change this value.
fine_tuned_model - Set automatically after monitoring fine-tuning if the job has succeeded. You can read more in Manual Fine-Tuning.
tags_chance - This should be between 0 and 1. Setting it to 0 corresponds to a 0% chance (never) to add tags to a post. 1 corresponds to a 100% chance (always) to add tags to a post. Adding tags incurs a very small token cost.
reblog_blog_identifiers - Whenever a reblog is attempted, a random blog from this list will be chosen to be reblogged from.
reblog_chance - This setting works the same way as tags_chance.
reblog_user_message - This setting is a format string. The only argument it is formatted with is the content of the post being reblogged. In simple terms, the {} will be replaced with said content. Alternatively, you can leave out the {} so that the reblogged post is appended to the end.
- Note: The bot is only given the latest message in a reblog chain due to the required complexity and added costs of including the entire chain.
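
To make the filtered_words behavior more concrete, here is a small sketch of joining the entries with "|" and searching post content with the result. The function names are hypothetical, and details such as case sensitivity are assumptions; the program itself may compile the pattern differently. Note that the Python raw string r"\bthe\b" corresponds to “\\bthe\\b” as written in the config.

```python
import re
from typing import Optional


def build_filter(filtered_words: list[str]) -> Optional[re.Pattern[str]]:
    """Join every entry with "|" into one compiled pattern.

    Returns None for an empty list, because an empty pattern would
    match (and therefore filter out) every post.
    """
    if not filtered_words:
        return None
    return re.compile("|".join(filtered_words))


def is_filtered(content: str, pattern: Optional[re.Pattern[str]]) -> bool:
    """Return True if the post content matches any filtered entry."""
    return pattern is not None and pattern.search(content) is not None


if __name__ == "__main__":
    loose = build_filter(["the"])
    strict = build_filter([r"\bthe\b"])
    print(is_filtered("thematic essence", loose))   # substring match
    print(is_filtered("thematic essence", strict))  # whole-word match only
```

One consequence of joining with "|" is that an entry containing its own unparenthesized "|" behaves like several separate entries.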

Manual Fine-Tuning

You can manually upload the examples file to OpenAI and start the fine-tuning here: fine-tuning portal.

Press + Create.
Select the desired Base Model from the dropdown. This should ideally match the model set in the config.
Upload the generated examples file to the section under Training data. You can find the path for this in the config.
Press Create.
(Optional) Copy the value next to Job ID and paste it into the config under job_id. You can then run the program and monitor its progress as usual.
If you do not do the above, you will have to copy the value next to Output model once the job is complete and paste it into the config under fine_tuned_model.

Project details

Release history

2.2.2

Apr 28, 2026

2.2.1

Apr 16, 2026

2.2.0

Apr 14, 2026

2.1.0

Feb 23, 2026

2.0.0

Feb 17, 2026

1.10.1

Feb 3, 2026

1.10.0

Feb 3, 2026

1.9.7

Nov 23, 2025

This version

1.9.6

Nov 10, 2025

1.9.5

Oct 17, 2025

1.9.4

Sep 4, 2025

1.9.3

Sep 3, 2025

1.9.2

Aug 28, 2025

1.9.1

Aug 28, 2025

1.9.0

Aug 23, 2025

1.8.0

Aug 18, 2025

1.7.0

Aug 17, 2025

1.6.0

Aug 5, 2025

1.5.0

Aug 3, 2025

1.4.7

Jul 15, 2025

1.4.6

Jul 12, 2025

1.4.5

Jul 12, 2025

1.4.4

Jul 10, 2025

1.4.3

Jul 9, 2025

1.4.2

Jul 9, 2025

1.4.1

Jul 8, 2025

1.4.0

Jul 8, 2025

1.3.2

Jul 7, 2025

1.3.1

Jul 6, 2025

1.3.0

Jul 6, 2025

1.2.1

Jul 5, 2025

1.2.0

Jul 5, 2025

1.1.5

Jul 5, 2025

1.1.3

Jul 5, 2025

1.1.2

Jul 4, 2025

1.1.1

Jul 4, 2025

1.1.0

Jul 3, 2025

1.0.0

Jul 3, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tumblrbot-1.9.6.tar.gz (137.7 kB view details)

Uploaded Nov 10, 2025 Source

Built Distribution

tumblrbot-1.9.6-py3-none-any.whl (19.9 kB view details)

Uploaded Nov 10, 2025 Python 3

File details

Details for the file tumblrbot-1.9.6.tar.gz.

File metadata

Download URL: tumblrbot-1.9.6.tar.gz
Upload date: Nov 10, 2025
Size: 137.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.32.5

File hashes

Hashes for tumblrbot-1.9.6.tar.gz
Algorithm	Hash digest
SHA256	`1a1027d682facae818745179a70809de817b29683bcfc755cd962bd44a51dbf1`
MD5	`3fe4a0e359376698430a7bdead1525c3`
BLAKE2b-256	`a4afec8e6c07ae66d4eb88fa2711d37a9792ef852f80f6c353edbbdbb118d25d`


File details

Details for the file tumblrbot-1.9.6-py3-none-any.whl.

File metadata

Download URL: tumblrbot-1.9.6-py3-none-any.whl
Upload date: Nov 10, 2025
Size: 19.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: python-requests/2.32.5

File hashes

Hashes for tumblrbot-1.9.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58c0af3f02977eacc282b935d673fedf2f7d5a4392169810d08a0dc9034a8a45`
MD5	`0f211e75fe68e19379718608db9498f7`
BLAKE2b-256	`2c641b8683f2de7869472a91dd0bfcbf5dd0edc50e7c82a1afc8f0a4440e22bb`

