Skip to main content

A one-eyed tool to copy files with.

Project description

oeleo

Python package / app that can be used for transferring files from an instrument-PC to a data server.

Features (or limitations)

  • Transferring using an ssh connection should preferably be used with key-pairs. This might involve some setting up on your server (ACL) to prevent security issues (the oeleo user should only have access to the data folder on your server).
  • Accessing ssh can be done using password if you are not able to figure out how to set proper ownerships on your server.
  • oeleo is one-eyed. Meaning that tracking of the "state of the duplicates" is only performed on the local side (where oeleo is running).
  • However, oeleo contains a check method that can help you figure out if starting copying is a
    good idea or not. And populate the database if you want.
  • The db that stores information about the "state of the duplicates" is stored relative to the folder oeleo is running from. If you delete it (by accident?), oeleo will make a new empty one from scratch next time you run.
  • Configuration is done using environmental variables.

Usage

Install

$ pip install oeleo

Run

  1. Create an oeleo worker instance.
  2. Connect the worker's bookkeeper to a sqlite3 database.
  3. Filter local files.
  4. Run to copy files.
  5. Repeat from step 3.

Examples and descriptions

Simple script for copying between local folders

import os
from pathlib import Path
import time

import dotenv

from oeleo.checkers import ChecksumChecker
from oeleo.models import SimpleDbHandler
from oeleo.connectors import LocalConnector
from oeleo.workers import Worker
from oeleo.utils import start_logger


def main():
  log = start_logger()
  # assuming you have made a .env file:
  dotenv.load_dotenv()

  db_name = os.environ["OELEO_DB_NAME"]
  base_directory_from = Path(os.environ["OELEO_BASE_DIR_FROM"])
  base_directory_to = Path(os.environ["OELEO_BASE_DIR_TO"])
  filter_extension = os.environ["OELEO_FILTER_EXTENSION"]

  # Making a worker using the Worker class.
  # You can also use the `factory` functions in `oeleo.worker`
  # (e.g. `ssh_worker` and `simple_worker`)
  bookkeeper = SimpleDbHandler(db_name)
  checker = ChecksumChecker()
  local_connector = LocalConnector(directory=base_directory_from)
  external_connector = LocalConnector(directory=base_directory_to)

  worker = Worker(
    checker=checker,
    local_connector=local_connector,
    external_connector=external_connector,
    bookkeeper=bookkeeper,
    extension=filter_extension
  )

  # Running the worker with 5 minutes intervals.
  # You can also use an oeleo scheduler for this.
  worker.connect_to_db()
  while True:
    worker.filter_local()
    worker.run()
    time.sleep(300)


if __name__ == "__main__":
  main()

Environment .env file

OELEO_BASE_DIR_FROM=C:\data\local
OELEO_BASE_DIR_TO=C:\data\pub
OELEO_FILTER_EXTENSION=.csv
OELEO_DB_NAME=local2pub.db
OELEO_LOG_DIR=C:\oeleo\logs

## only needed for advanced connectors:
# OELEO_DB_HOST=<db host>
# OELEO_DB_PORT=<db port>
# OELEO_DB_USER=<db user>
# OELEO_DB_PASSWORD=<db user>
# OELEO_EXTERNAL_HOST<ssh hostname>
# OELEO_USERNAME=<ssh username>
# OELEO_PASSWORD=<ssh password>
# OELEO_KEY_FILENAME=<ssh key-pair filename>

## only needed for SharePointConnector:
# OELEO_SHAREPOINT_USERNAME=<sharepoint username (fallbacks to ssh username if missing)>
# OELEO_SHAREPOINT_URL=<url to sharepoint>
# OELEO_SHAREPOINT_SITENAME=<name of sharepoint site>
# OELEO_SHAREPOINT_DOC_LIBRARY=<name of sharepoint library>

Database

The database contains one table called filelist:

id processed_date local_name external_name checksum code
1 2022-07-05 15:55:02.521154 file_number_1.xyz C:\oeleo\check\to\file_number_1.xyz c976e564825667d7c11ba200457af263 1
2 2022-07-05 15:55:02.536152 file_number_10.xyz C:\oeleo\check\to\file_number_10.xyz d502512c0d32d7503feb3fd3dd287376 1
3 2022-07-05 15:55:02.553157 file_number_2.xyz C:\oeleo\check\to\file_number_2.xyz cb89d576f5bd57566c78247892baffa3 1

The processed_date is when the file was last updated (meaning last time oeleo found a new checksum for it).

The table below shows what the different values of code mean:

code meaning
0 should-be-copied
1 should-be-copied-if-changed
2 should-not-be-copied

Hint! You can lock (chose to never copy) a file by editing the code manually to 2.

Using an oeleo scheduler

Instead of for example using a while loop to keep oeleo running continuously or at selected intervals, you can use a scheduler (e.g. rocketry, watchdog, schedule, or more advanced options such as AirFlow).

oeleo also includes its own schedulers. This is an example of how to use the oeleo.SimpleScheduler:

import dotenv

from oeleo.schedulers import SimpleScheduler
from oeleo.workers import simple_worker

# assuming you have created an appropriate .env file
dotenv.load_dotenv()
worker = simple_worker()
s = SimpleScheduler(
        worker,
        run_interval_time=4,  # seconds
        max_run_intervals=4,
    )
s.start()

Copy files from a Windows PC to a Linux server through ssh

import logging
import os
from pathlib import Path

import dotenv

from oeleo.connectors import register_password
from oeleo.utils import start_logger
from oeleo.workers import ssh_worker

log = start_logger()

print(" ssh ".center(80, "-"))
log.setLevel(logging.DEBUG)
log.info(f"Starting oeleo!")
dotenv.load_dotenv()

external_dir = "/srv/data"
filter_extension = ".res"

register_password(os.environ["OELEO_PASSWORD"])

worker = ssh_worker(
  db_name="ssh_to_server.db",
  base_directory_from=Path(r"data\raw"),
  base_directory_to=external_dir,
  extension=filter_extension,
)
worker.connect_to_db()
try:
  worker.check(update_db=True)
  worker.filter_local()
  worker.run()
finally:
  worker.close()

Future planned improvements

Just plans, no promises given.

  • make even nicer printing and logging.
  • create CLI.
  • create an executable.
  • create a web-app.
  • create a GUI (not likely).

Status

  • Works on my PC → PC
  • Works on my PC → my server
  • Works on my server → my server
  • Works on my instrument PC → my instrument PC
  • Works on my instrument PC → my server
  • Works OK
  • Deployable
  • On testpypi
  • On pypi
  • Code understandable for others
  • Looking good
  • Fairly easy to use
  • Easy to use
  • Easy to debug runs (e.g. editing sql)

Licence

MIT

Development

  • Developed using poetry on python 3.11.
  • Must also run on python 3.8 for Windows 7 support.

Some useful commands

Update version

# update version e.g. from 0.3.1 to 0.3.2:
poetry version patch

Then edit __init__.py:

__version__ = "0.3.2"

Build

poetry build

Publish

If you are using 2-factor authentication, you need to create a token on pypi.org and run:

poetry config pypi-token.pypi <token>

Then run:

poetry publish

Next

  • Improve logging

Development lead

  • Jan Petter Maehlen, IFE

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oeleo-0.5.1.tar.gz (20.7 kB view hashes)

Uploaded Source

Built Distribution

oeleo-0.5.1-py3-none-any.whl (24.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page