A data movement app that can use different sources and targets to move data around.

Data Transfer Microservice

This microservice is a multi-mode file movement wizard. It can transfer files, at a scheduled interval, between two different storage devices, using different transfer protocols and storage types.
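As a rough illustration of the batch-and-interval behaviour described above, here is a minimal sketch for the simplest case of two local folders. The function names `transfer_batch` and `run_forever` are hypothetical and are not taken from the project's code; the real application also supports FTP, SFTP and S3 storage, which this sketch does not attempt to cover.

```python
import shutil
import time
from pathlib import Path


def transfer_batch(source, dest, max_files=5):
    """Move up to max_files regular files from source to dest.

    Returns the names of the files that were moved.
    """
    dest.mkdir(parents=True, exist_ok=True)
    # Sort for a deterministic order and skip sub-directories, so a
    # destination nested inside the source folder is never picked up.
    candidates = sorted(p for p in source.iterdir() if p.is_file())
    moved = []
    for path in candidates[:max_files]:
        shutil.move(str(path), str(dest / path.name))
        moved.append(path.name)
    return moved


def run_forever(source, dest, max_files=5, interval=5.0):
    """Run one batch, sleep for interval seconds, then repeat."""
    while True:
        transfer_batch(source, dest, max_files)
        time.sleep(interval)
```

Calling `transfer_batch(Path("/upload/files"), Path("/upload/files/done"))` would perform a single run; `run_forever` simply repeats that on a timer.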

The application is built for Python 3, but also tested against Python 2.7. It is not compatible with Python 2.6.

Installing and getting started

The application should be installed using pip3 (or pip for Python 2.7).

To install from a private PyPI server we suggest using ~/.pypirc to configure your private PyPI connection details.

pip3 install data-transfer --extra-index-url <Repo-URL>

After installing and setting the configuration settings, the application can be started with the following command:

data-transfer

Developing

Start by cloning the project:

git clone git@github.com:UKHomeOffice/data-transfer.git

Ensure that python3 is installed and on your path.

Installing for local development

These steps install the application as a local, editable pip package (using symlinks), so any changes you make to the files are picked up automatically the next time you run the application or the tests.

Using venv

To install the app using the standard Python 3 venv module, run the following commands from the project root folder:

python3 -m venv ~/.virtualenvs/data-transfer
source ~/.virtualenvs/data-transfer/bin/activate
pip3 install -e . -r requirements.txt

Using virtualenvwrapper

Alternatively, if you are using virtualenvwrapper, run the following:

mkvirtualenv data-transfer -p python3
pip3 install -e . -r requirements.txt

Dependencies for local testing

The project’s tests require the following dependencies:

  • An AWS S3 bucket or a mock

  • An FTP server

  • An SFTP server

For local development and testing, we suggest running these services in Docker containers.

Test

Once the application is installed and the dependencies are in place, run the tests:

pytest tests

Building & publishing

This project uses setuptools to build the distributable package.

Remember to update the version in setup.py before building the package.

python setup.py sdist

This will create a .tar.gz distributable package in dist/. This should be uploaded to an appropriate PyPI registry.

Deploying

The application should be installed using pip3 (or pip for Python 2.7).

If installing from a private PyPI server then we suggest using ~/.pypirc to configure your private PyPI connection details.

pip3 install data-transfer --extra-index-url <Repo-URL>

Configuration

The application requires the following environment variables to be set before running.

All configuration settings automatically default to suitable values for the tests, based on the local test dependencies running in the Docker images suggested in this guide.
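To illustrate the "environment variable with a fallback default" pattern, here is a minimal sketch. The function name `load_settings` and the fallback values are assumptions: the values are borrowed from the examples listed in this guide, and the application's real defaults may differ.

```python
import os

# Hypothetical fallback values, borrowed from the examples in this guide.
# The application's real defaults may differ.
ASSUMED_DEFAULTS = {
    "INGEST_SOURCE_PATH": "/upload/files",
    "INGEST_DEST_PATH": "/upload/files/done",
    "MAX_FILES_BATCH": "5",
    "PROCESS_INTERVAL": "5",
}


def load_settings(environ=None):
    """Return application settings, preferring the environment over defaults."""
    environ = os.environ if environ is None else environ
    raw = {
        name: environ.get(name, default)
        for name, default in ASSUMED_DEFAULTS.items()
    }
    return {
        "source_path": raw["INGEST_SOURCE_PATH"],
        "dest_path": raw["INGEST_DEST_PATH"],
        # Environment variables are always strings, so convert the numbers.
        "max_files_batch": int(raw["MAX_FILES_BATCH"]),
        "process_interval": int(raw["PROCESS_INTERVAL"]),
    }
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests.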

Application settings

These settings control the application's behaviour:

| Environment Variable | Example | Required | Description |
| --- | --- | --- | --- |
| INGEST_SOURCE_PATH | /upload/files | Yes | Source path |
| INGEST_DEST_PATH | /upload/files/done | Yes | Destination path |
| MAX_FILES_BATCH | 5 | No | Number of files to process on each run |
| PROCESS_INTERVAL | 5 | No | Runs the task every (x) seconds |
| READ_STORAGE_TYPE | See footnote | Yes | The type of read storage |
| WRITE_STORAGE_TYPE | See footnote | Yes | The type of write storage |

Note: the read and write storage types must include the datatransfer.storage prefix. The options are:

  • datatransfer.storage.FolderStorage

  • datatransfer.storage.FtpStorage

  • datatransfer.storage.SftpStorage

  • datatransfer.storage.S3Storage
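Because the storage types are given as dotted paths, one common way to turn such a string into a class is a dynamic import. The sketch below shows that general technique; `resolve_class` is a hypothetical helper, and whether the application resolves its storage types this way is an assumption.

```python
import importlib


def resolve_class(dotted_path):
    """Import and return the object named by 'package.module.ClassName'."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        # No dot at all, e.g. "FolderStorage" without its prefix.
        raise ValueError("expected a dotted path, got %r" % dotted_path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

With the package installed, `resolve_class("datatransfer.storage.FolderStorage")` would be expected to return the storage class named above. A value without the prefix raises `ValueError`.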

Source / read settings

Provide the connection settings for FTP, SFTP or S3. You only need to configure the settings associated with the source storage type.

| Environment Variable | Example | Description |
| --- | --- | --- |
| READ_FTP_HOST | localhost | Hostname or IP of server |
| READ_FTP_PASSWORD | pass | Password |
| READ_FTP_PORT | 21 | Port the server uses |
| READ_AWS_ACCESS_KEY_ID | accessKey1 | Access key for S3 |
| READ_AWS_S3_BUCKET_NAME | aws-ingest | Bucket name |
| READ_AWS_S3_HOST | http://localhost:8000 | URL of S3 |
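Since the source and target settings share the same names apart from the READ_ and WRITE_ prefixes, they can be gathered by prefix. This is a minimal sketch of that idea; `collect_settings` is a hypothetical helper and not part of the application.

```python
import os


def collect_settings(prefix, environ=None):
    """Return {name-without-prefix: value} for variables starting with prefix."""
    environ = os.environ if environ is None else environ
    return {
        name[len(prefix):]: value
        for name, value in environ.items()
        if name.startswith(prefix)
    }
```

For example, `collect_settings("READ_")` would gather every source setting currently set in the environment, keyed by names such as `FTP_HOST`.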

Target / write settings

Provide the connection settings for FTP, SFTP or S3. You only need to configure the settings associated with the target storage type.

| Environment Variable | Example | Description |
| --- | --- | --- |
| WRITE_FTP_HOST | localhost | Hostname or IP of server |
| WRITE_FTP_USER | user | Username |
| WRITE_FTP_PASSWORD | pass | Password |
| WRITE_FTP_PORT | 21 | Port for server |
| WRITE_AWS_ACCESS_KEY_ID | accesskey1 | Access key for S3 |
| WRITE_AWS_SECRET_ACCESS_KEY | verysecret | Secret key |
| WRITE_AWS_S3_BUCKET_NAME | aws-ingest | Bucket name |
| WRITE_AWS_S3_HOST | http://localhost:8000 | URL of S3 |
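A quick pre-flight check can confirm that the settings for the chosen target storage type are present before the application starts. The sketch below is an assumption-laden illustration: the grouping of variables per storage type is inferred from the variable names listed above, `missing_write_settings` is a hypothetical helper, and SFTP is left out because this guide does not list its variable names.

```python
import os

# Assumed grouping, inferred from the variable names in this guide.
# SFTP is omitted because its variable names are not documented here.
REQUIRED_WRITE_SETTINGS = {
    "datatransfer.storage.FolderStorage": [],
    "datatransfer.storage.FtpStorage": [
        "WRITE_FTP_HOST",
        "WRITE_FTP_USER",
        "WRITE_FTP_PASSWORD",
        "WRITE_FTP_PORT",
    ],
    "datatransfer.storage.S3Storage": [
        "WRITE_AWS_ACCESS_KEY_ID",
        "WRITE_AWS_SECRET_ACCESS_KEY",
        "WRITE_AWS_S3_BUCKET_NAME",
        "WRITE_AWS_S3_HOST",
    ],
}


def missing_write_settings(storage_type, environ=None):
    """Return the assumed-required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    # Storage types missing from the mapping yield an empty list.
    required = REQUIRED_WRITE_SETTINGS.get(storage_type, [])
    return [name for name in required if not environ.get(name)]
```

An empty list means nothing in the assumed mapping is missing; it does not guarantee that the connection itself will succeed.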

Running the application

To run the application from the command line:

data-transfer

For production use we recommend running the application using PM2:

pm2 start bin/data-transfer-start --interpreter=python --name data-transfer

Contributing

This project is open source and we welcome contributions and suggestions to improve the application. Please raise issues in the usual way on GitHub. To contribute code:

  • Fork the repo on GitHub

  • Clone the project locally

  • Commit your changes to your own branch

  • Push your work back to your fork

  • Submit a Pull Request so that we can review the changes

Licensing

This application is released under the [BSD license](LICENSE.txt).

Source Distribution

data-transfer-1.0.4.tar.gz (12.2 kB)
