
File replicator


ContentMonster

ContentMonster is a Python package used to replicate the contents of directories on one server ("shore") to other servers ("vessels") using SFTP over unstable network connections. Files are split into smaller chunks, which are transferred separately and reassembled on the vessel.
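The chunking idea can be illustrated with a minimal sketch. This is not ContentMonster's actual implementation; the function names `split_file` and `reassemble` are hypothetical, and the only value taken from this README is the default chunk size of 10 MiB.

```python
CHUNK_SIZE = 10 * 1024 * 1024  # 10 MiB, the documented ChunkSize default


def split_file(path, chunk_size=CHUNK_SIZE):
    """Yield (index, data) pairs that cover the file in order."""
    with open(path, "rb") as handle:
        index = 0
        while True:
            data = handle.read(chunk_size)
            if not data:
                break
            yield index, data
            index += 1


def reassemble(chunks):
    """Concatenate chunks in index order, regardless of arrival order."""
    return b"".join(data for _, data in sorted(chunks))
```

Because each chunk carries its index, a transfer that fails midway only needs to repeat the affected chunk rather than the whole file, which is presumably the point of chunking on an unstable connection.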

It comes with a daemon application (worker.py) which monitors the configured local directories for changes and instantly pushes them to the vessels. Once a file has been replicated to all vessels, it is moved to a "processed" subdirectory of its source directory and removed from the queue.
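The "processed" step described above might look roughly like the following sketch. The helper names and the way completed vessels are tracked are assumptions for illustration; the real worker may keep this state differently.

```python
import shutil
from pathlib import Path


def mark_processed(file_path):
    """Move a file into the "processed" subdirectory of its source directory."""
    source = Path(file_path)
    target_dir = source.parent / "processed"
    target_dir.mkdir(exist_ok=True)
    target = target_dir / source.name
    shutil.move(str(source), str(target))
    return target


def finish_if_replicated(file_path, vessels, completed):
    """Move the file only once every configured vessel has a complete copy."""
    if set(vessels) <= set(completed):
        return mark_processed(file_path)
    return None  # at least one vessel is still missing the file
```

Here `vessels` is the list of vessel names the file should reach and `completed` is the set of vessels that have already confirmed a full copy.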

Prerequisites

ContentMonster is written in Python 3 and makes use of syntactical features introduced in Python 3.8. It depends on two packages installable with pip: paramiko (for SSH/SFTP connections) and watchdog (to monitor local directories for changes).

It was tested on Ubuntu 21.04 and Debian 10, but I don't see a reason why it would not work on other Unixoids or even Windows (although it might need some changes to properly work on the latter) as all dependencies are platform-independent.

Vessels (destination servers) need to have an SSH server with SFTP support. This has been tested with a default OpenSSH server as well as a Dropbear server with OpenSSH's sftp-server. They also have to provide the cat command which is used to reassemble the uploaded chunks.
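To show why cat is needed on the vessel, here is a sketch of how a reassembly command could be built. The function name and the chunk file names are hypothetical; only the use of cat and the default TempDir path come from this README.

```python
import shlex


def build_cat_command(chunk_paths, destination):
    """Return a shell command that concatenates chunk files into destination."""
    if not chunk_paths:
        raise ValueError("at least one chunk is required")
    # Quote every path so spaces or shell metacharacters cannot break the command.
    sources = " ".join(shlex.quote(path) for path in chunk_paths)
    return f"cat {sources} > {shlex.quote(destination)}"


command = build_cat_command(
    ["/tmp/.ContentMonster/abc_0", "/tmp/.ContentMonster/abc_1"],  # assumed naming
    "/home/user/replication/my file.bin",
)
```

A string like this could then be run on the vessel over the existing SSH connection, for example with paramiko's `SSHClient.exec_command`.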

Installation

It is recommended that you use a virtual environment in order to maintain a clean Python environment independent from system updates and other Python projects on the same host. Note that you may have to install the venv package from your OS's package repositories first (on Debian-based distributions: apt install python3-venv).

In a terminal, navigate to the ContentMonster directory, then (assuming you are running bash) execute the following commands:

python3 -m venv venv  # Create a virtual environment in the "venv" subdirectory
. venv/bin/activate  # Activate the virtual environment (just in case)
pip install -Ur requirements.txt  # Install the package dependencies (paramiko/watchdog)

Configuration

The application is configured using the settings.ini file. Start off by copying the provided settings.example.ini to settings.ini and opening it in a text editor. Note that all keys and values are case-sensitive. Required keys are identified as such in the comments below; all other keys are optional. The file consists of (at least) three sections:

MONSTER

The MONSTER section contains a few global configuration options for the application:

[MONSTER]
ChunkSize = 10485760  # Size of individual chunks in bytes (default: 10 MiB)

Directory

You can configure as many directories to be replicated as you want by adding multiple Directory sections. The directories are replicated to the same location on the vessels that they are located at on the shore.

[Directory sampledir]  # Each directory needs a unique name - here: "sampledir"
Location = /home/user/replication  # Required: File system location of the directory

Note: Currently, the same Location value is used on both the shore and the vessels, although this may be configurable in a future version. The directory has to be writable by the configured users on all of the configured vessels. In the above example, files are taken from /home/user/replication on the shore and put into /home/user/replication on each of the vessels.
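As a rough illustration of how the MONSTER and Directory sections fit together, the following sketch reads them with Python's standard configparser. It is not ContentMonster's own loader: the function name is hypothetical, and it assumes that trailing # comments like those in the samples should be stripped.

```python
import configparser

SAMPLE = """
[MONSTER]
ChunkSize = 10485760  # Size of individual chunks in bytes

[Directory sampledir]
Location = /home/user/replication  # Required
"""


def load_directories(text):
    """Return (chunk_size, {directory name: location}) from settings text."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keep keys case-sensitive, as this README requires
    parser.read_string(text)
    chunk_size = parser.getint("MONSTER", "ChunkSize", fallback=10485760)
    directories = {}
    for section in parser.sections():
        if section.startswith("Directory "):
            name = section.split(" ", 1)[1]
            # Location is required, so a missing key raises KeyError here.
            directories[name] = parser[section]["Location"]
    return chunk_size, directories


chunk_size, directories = load_directories(SAMPLE)
```

Setting `optionxform` to `str` matters because configparser lowercases keys by default, which would conflict with case-sensitive keys.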

Vessel

You can configure as many vessels to replicate your files to as you want by adding multiple Vessel sections. All configured directories are replicated to all vessels by default, but you can use the IgnoreDirs directive to exclude a directory from a given vessel. If you want to use an SSH key to authenticate on the vessels, make sure that it is picked up by the local SSH agent (i.e. you can log in using the key when connecting with the ssh command).

[Vessel samplevessel]  # Each vessel needs a unique name - here: "samplevessel"
Address = example.com  # Required: Hostname / IP address of the vessel
TempDir = /tmp/.ContentMonster  # Temporary directory for uploaded chunks (default: /tmp/.ContentMonster) - needs to be writable
Username = replication  # Username to authenticate as on the vessel (default: same as user running ContentMonster)
Password = verysecret  # Password to use to authenticate on the vessel (default: none, use SSH key)
Passphrase = moresecret  # Passphrase of the SSH key you use to authenticate (default: none, key has no passphrase)
Port = 22  # Port of the SSH server on the vessel (default: 22)
IgnoreDirs = sampledir, anotherdir  # Names of directories *not* to replicate to this vessel, separated by commas
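The effect of IgnoreDirs can be sketched as a small planning function that lists, per vessel, which directories would be replicated. The function name, the second directory, and the second vessel are made up for the example; the actual application may resolve this differently.

```python
import configparser

SAMPLE = """
[Directory sampledir]
Location = /home/user/replication

[Directory otherdir]
Location = /srv/other

[Vessel samplevessel]
Address = example.com
IgnoreDirs = sampledir, anotherdir

[Vessel secondvessel]
Address = example.org
"""


def section_names(parser, prefix):
    """Return the names following "<prefix> " in matching section headers."""
    return [s.split(" ", 1)[1] for s in parser.sections() if s.startswith(prefix + " ")]


def replication_plan(text):
    """Map each vessel name to the directory names it should receive."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keys are case-sensitive
    parser.read_string(text)
    directories = section_names(parser, "Directory")
    plan = {}
    for vessel in section_names(parser, "Vessel"):
        raw = parser[f"Vessel {vessel}"].get("IgnoreDirs", "")
        ignored = {item.strip() for item in raw.split(",") if item.strip()}
        plan[vessel] = [name for name in directories if name not in ignored]
    return plan


plan = replication_plan(SAMPLE)
```

In this sample, samplevessel would receive only otherdir, while secondvessel would receive both directories. Ignored names that match no configured directory (anotherdir here) simply have no effect in this sketch.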

Running

To run the application after creating the settings.ini, navigate to ContentMonster's base directory in a terminal and make sure you are in the right virtual environment:

. venv/bin/activate

Then, you can run the worker like this:

python worker.py

Keep an eye on the output for the first minute or so, to check for any issues during initialization.

systemd Service

You may want to run ContentMonster as a systemd service to make sure it starts automatically after a system reboot. Assuming that it is installed into /opt/ContentMonster/ following the instructions above and is supposed to run as the replication user, something like this should work:

[Unit]
Description=ContentMonster
After=syslog.target network.target

[Service]
Type=simple
User=replication
WorkingDirectory=/opt/ContentMonster/
ExecStart=/opt/ContentMonster/venv/bin/python -u /opt/ContentMonster/worker.py
Restart=on-abort

[Install]
WantedBy=multi-user.target

Write this to /etc/systemd/system/contentmonster.service, then enable the service like this:

systemctl daemon-reload
systemctl enable --now contentmonster
systemctl status contentmonster  # Check that the service started properly

The service should now start automatically after every reboot. You can use commands like systemctl status contentmonster and journalctl -xeu contentmonster to keep an eye on the status of the service.

