Skip to main content

An updated bot that posts to Tumblr, based on your very own blog!

Project description

tumblrbot

PyPI - Version

Description of original project:

4tv-tumblrbot was a collaborative project I embarked on with my close friend Dima, who goes by @smoqueen on Tumblr. The aim of this endeavor was straightforward yet silly: to develop a Tumblr bot powered by a machine-learning model. This bot would be specifically trained on the content from a particular Tumblr blog or a selected set of blogs, allowing it to mimic the style, tone, and thematic essence of the original posts.

This fork is largely a rewrite of the source code with similarities in its structure and process.

Features:

  • An interactive console for all steps of generating posts for the blog:
    1. Asks for OpenAI and Tumblr tokens.
      • Stores API tokens using keyring.
      • Prevents API tokens from printing to the console.
    2. Retrieves Tumblr OAuth tokens.
    3. Downloads posts from the configured Tumblr blogs.
      • Skips redownloading already downloaded posts.
      • Shows progress and previews the current post.
    4. Creates examples to fine-tune the model from your posts.
      • Filters out posts that contain more than just text data.
      • Filters out any posts flagged by the OpenAI Moderation API (optional).
        • Shows progress and previews the current post.
      • Adds custom user messages and assistant responses to the dataset from the configured file.
    5. Provides cost estimates if the currently saved examples are used to fine-tune the configured model.
    6. Uploads examples to OpenAI and begins the fine-tuning process.
      • Resumes monitoring the same fine-tuning process when restarted.
      • Deletes the uploaded examples file if fine-tuning does not succeed (optional).
      • Stores the output model automatically when fine-tuning is completed.
    7. Generates and uploads posts to the configured Tumblr blog using the configured fine-tuned model.
      • Creates tags by extracting keywords at the configured frequency using the configured model.
      • Uploads posts as drafts to the configured Tumblr blog.
      • Shows progress and previews the current post.
  • Colorful output, progress bars, and post previews using rich.
  • Automatically keeps the config file up-to-date and recreates it if missing.

To-Do:

  • Add code documentation.
  • Fix inaccurate post counts when downloading posts.
  • Fix file not found error when starting fine-tuning.

Please submit an issue or contact us for features you want added/reimplemented.

Installation

  1. Install the latest version of Python:
    • Windows: winget install python3
    • Linux (apt): apt install python-pip
    • Linux (pacman): pacman install python-pip
  2. Install the pip package: pip install tumblrbot
    • Alternatively, you can install from this repository: pip install git+https://github.com/MaidThatPrograms/tumblrbot.git
    • On Linux, you will have to make a virtual environment or use the flag to install packages system-wide.
    • See keyring for additional requirements if you are not on Windows.

Usage

Run tumblrbot from anywhere. Run tumblrbot --help for command-line options. Every command-line option corresponds to a value from the config.

Obtaining Tokens

OpenAI

API token can be created here.

  1. Leave everything at the defaults and set Project to Default Project.
  2. Press Create secret key.
  3. Press Copy to copy the API token to your clipboard.

Tumblr

API tokens can be created here.

  1. Press + Register Application.
  2. Enter anything for Application Name and Application Description.
  3. Enter any URL for Application Website and Default callback URL, like https://example.com.
  4. Enter any email address for Administrative contact email. It probably doesn't need to be one you have access to.
  5. Press the checkbox next to I'm not a robot and complete the CAPTCHA.
  6. Press Register.
  7. You now have access to your consumer key next to Oauth Consumer Key.
  8. Press Show secret key to see your Consumer Secret.

When running this program, you will be prompted to enter all of these tokens. The fields are password-protected, so there will be no output to the console. If something goes wrong while entering the tokens, you can always reset them by running the program again and answering y to the relevant prompt.

After inputting the Tumblr tokens, you will be given a URL that you need to open in your browser. Press Allow, then copy and paste the URL of the page you are redirected to into the console.

Configuration

All config options can be found in config.toml after running the program once. This will be kept up-to-date if there are changes to the config's format in a future update. This also means it may be worthwhile to double-check the config file after an update. Any changes to the config should be in the changelog for a given version.

All file options can include directories that will be created when the program is run.

  • custom_prompts_file You will have to create this file yourself. It should follow the following format:
    {"user message 1": "assistant response 1",
     "user message 2": "assistant response 2"}
    
  • developer_message - This message is used in for fine-tuning the AI as well as generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
  • user_message - This message is used in the same way as developer_message and should be treated the same.
  • expected_epochs - The default value here is the default number of epochs for base_model. You may have to change this value if you change base_model. After running fine-tuning once, you will see the number of epochs used in the fine-tuning portal under Hyperparameters. This value will also be updated automatically if you run fine-tuning through this program.
  • token_price - The default value here is the default token price for base_model. You can find the up-to-date value here, in the Training column.
  • job_id - If there is any value here, this program will resume monitoring the corresponding job, instead of starting a new one. This gets set when starting the fine-tuning and is cleared when it is completed. You can find job IDs in the fine-tuning portal.
  • base_model - This value is used to choose the tokenizer for estimating fine-tuning costs. It is also the base model that will be fine-tuned and the model that is used to generate tags. You can find a list of options in the fine-tuning portal by pressing + Create and opening the drop-down list for Base Model. Be sure to update token_price if you change this value.
  • tags_chance - This should be between 0 and 1. Setting it to 0 corresponds to a 0% chance (never) to add tags to a post. 1 corresponds to a 100% chance (always) to add tags to a post. Adding tags incurs a very small token cost.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tumblrbot-1.4.3.tar.gz (18.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tumblrbot-1.4.3-py3-none-any.whl (16.3 kB view details)

Uploaded Python 3

File details

Details for the file tumblrbot-1.4.3.tar.gz.

File metadata

  • Download URL: tumblrbot-1.4.3.tar.gz
  • Upload date:
  • Size: 18.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.32.4

File hashes

Hashes for tumblrbot-1.4.3.tar.gz
Algorithm Hash digest
SHA256 5e8c3f60eba2bad3a94cd98ac3496a1f8f5d214495125d3ccb78bb79308f2d27
MD5 484159fdb8d530f2f6ee2b98f85a8062
BLAKE2b-256 5b8d6a7b2bc9fbae9e40e65d5ce874f423263b6164ae5290be40a30354afa4d7

See more details on using hashes here.

File details

Details for the file tumblrbot-1.4.3-py3-none-any.whl.

File metadata

  • Download URL: tumblrbot-1.4.3-py3-none-any.whl
  • Upload date:
  • Size: 16.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.32.4

File hashes

Hashes for tumblrbot-1.4.3-py3-none-any.whl
Algorithm Hash digest
SHA256 c51215576638d68490fdb86c1d6051e7a90fecd33a45d2e393f7711003ddaab9
MD5 582786effa90170ba8364a7e2cab6c34
BLAKE2b-256 a4717138e19e0f8cbe217e3770c4bb216844593ff2d5e93da382e111846b04fb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page