Secure transport of python objects using TLS encryption
tls-python-object (tlspyo)
A library for easy and secure transfer of python objects over network.
tlspyo provides a simple API to transfer python objects in a robust and safe way via TLS, between several machines (and/or processes) called Endpoints.
- Endpoints are part of one to several groups,
- Arbitrarily many Endpoints connect together via a central Relay,
- Each Endpoint can broadcast or produce python objects to the desired groups.
:information_source: Please carefully read the Security section before using tlspyo anywhere other than your own secure private network.
Principle
tlspyo provides two classes: Relay and Endpoint.
- The Relay is the center point of all communication between Endpoints,
- An Endpoint is a node in your network. It connects to the Relay and is part of one to several groups.
Endpoints can do a multitude of things, including:
- broadcast python objects to whole groups of Endpoints,
- retrieve the objects broadcast to the group(s) it is part of,
- produce a single object that will be consumed by a single Endpoint of a target group,
- notify the Relay that it is ready to consume a produced object and wait until it receives it.
By default, tlspyo relies on Transport Layer Security (TLS) to secure object transfers over network.
Example usage
from tlspyo import Relay, Endpoint
if __name__ == "__main__":
    # Create a relay to allow connectivity between endpoints
    re = Relay(
        port=3000,  # this must be the same on your Relay and Endpoints
        password="VerySecurePassword",  # must be the same on Relay and Endpoints, AND be strong
        local_com_port=3001  # needs to be non-overlapping if Relays/Endpoints are on the same machine
    )

    # Create an Endpoint in group "producers" (arbitrary name)
    prod = Endpoint(
        ip_server='127.0.0.1',  # IP of the Relay (here: localhost)
        port=3000,  # must be same port as the Relay
        password="VerySecurePassword",  # must be same (strong) password as the Relay
        groups="producers",  # this endpoint is part of the group "producers"
        local_com_port=3002
    )

    # Create a bunch of other Endpoints in group "consumers" (arbitrary name)
    cons_1 = Endpoint(
        ip_server='127.0.0.1',
        port=3000,
        password="VerySecurePassword",
        groups="consumers",  # this endpoint is part of group "consumers"
        local_com_port=3003
    )
    cons_2 = Endpoint(
        ip_server='127.0.0.1',
        port=3000,
        password="VerySecurePassword",
        groups="consumers",  # this endpoint is part of group "consumers"
        local_com_port=3004,
    )

    # Producer broadcasts an object to any and all endpoints in the destination group "consumers"
    prod.broadcast("I HAVE BEEN BROADCAST", "consumers")

    # Producer sends an object to the shared queue of destination group "consumers"
    prod.produce("I HAVE BEEN PRODUCED", "consumers")

    # Consumer 1 notifies the Relay that it wants one produced object destined for "consumers"
    cons_1.notify("consumers")

    # Consumer 1 is able to retrieve the broadcast AND the consumed object:
    res = []
    while len(res) < 2:
        res += cons_1.receive_all(blocking=True)
    print(f"Consumer 1 has received: {res}")

    # Consumer 2 is able to retrieve only the broadcast object:
    res = cons_2.receive_all(blocking=True)
    print(f"Consumer 2 has received: {res}")

    # Let us close everyone gracefully:
    prod.stop()
    cons_1.stop()
    cons_2.stop()
    re.stop()
Getting started
:information_source: The machine hosting your Relay must be visible to the machines hosting your Endpoints through the chosen port, via its public ip_server.
When using tlspyo over the Internet, this typically requires you to configure your router such that it forwards port to the IP of the machine hosting your Relay on your local network.
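Before debugging anything else, it can help to check from an Endpoint machine that the Relay's port is reachable at all. The following is a minimal sketch using only the standard library; `is_port_reachable` is a hypothetical helper, not part of tlspyo, and it only tests whether a TCP connection can be opened (something must already be listening on the port, e.g. a running Relay).

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure
        # (refused connection, unreachable host, timeout, ...)
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For instance, `is_port_reachable("127.0.0.1", 3000)` would test the local setup used in this guide; over the Internet you would pass the public IP of the machine hosting your Relay instead.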
Installation
From PyPI:
pip install tlspyo
TLS setup:
:information_source: You can skip this section if you do not want to use TLS, for instance if you use tlspyo on your own private secure network.
When using tlspyo over the Internet, you should of course use TLS (read the Security section if you do not understand why).
- Generate TLS credentials:
tlspyo makes the process of generating your TLS credentials straightforward.
:arrow_forward: On the machine that will host your Relay, execute the following command line:
python -m tlspyo --generate
This will generate two files in the tlspyo/credentials data directory: key.pem and certificate.pem.
:information_source: In case you wish to customize your TLS certificate, add the --custom option in the previous command line.
Now, you need to retrieve your certificate.pem on the machines that will host your Endpoints (note: you can skip the following steps if your Endpoints are on the same machine as your Relay).
This can be achieved via either of the following methods:
- METHOD 1: manually copy the public certificate (more secure):
:arrow_forward: On the machines that will host your Endpoints, execute:
python -m tlspyo --credentials
This creates and displays the target folder where you need to copy the certificate.pem that you generated on the machine that will host the Relay (the source folder was displayed when you executed --generate).
- METHOD 2: transfer the public certificate via TCP (not secure):
:warning: This method is not secure. In particular, a man-in-the-middle can impersonate the certificate-broadcasting server and send you a fraudulent TLS certificate. Use with caution.
:arrow_forward: On the machine that will host your Relay, start a certificate-broadcasting server:
python -m tlspyo --broadcast --port=<port>
where <port> is a port through which other machines will attempt to retrieve your certificate via TCP.
:arrow_forward: On the machines that will host your Endpoints, execute:
python -m tlspyo --retrieve --ip=<ip> --port=<port>
where <ip> is the public IP of the certificate-broadcasting machine, and <port> is the same as previously.
And you are all set! :sunglasses:
You can now stop the certificate-broadcasting server by closing the terminal where it runs.
A Simple Producer-Consumer Example
Let us now see how to make basic usage of tlspyo.
In this example, we will create a Relay and two Endpoints on the same machine, and have them transfer objects via localhost.
The full script for this example can be found here.
Import the Relay and Endpoint classes:
from tlspyo import Relay, Endpoint
Relay
Every tlspyo application requires a central Relay.
The Relay lives on a machine that can be reached by all Endpoints.
Typically, you will want this machine to be accessible to your Endpoints via your private local network, or via the Internet through port forwarding.
Note however that, before you make your Relay visible to the Internet via, e.g., port forwarding, it is important that you read the Security section.
Creating a Relay is straightforward:
# Initialize a relay to allow connectivity between endpoints
re = Relay(
    port=3000,  # this must be the same on your Relay and Endpoints
    password="VerySecurePassword",  # this must be the same on Relay and Endpoints, AND be strong
    local_com_port=3001,  # this needs to be non-overlapping if Relays/Endpoints live on the same machine
    security="TLS"  # this is the default; replace by None if you do not want to use TLS
)
As soon as your Relay is created, it is up and running.
Behind the scenes, it is now waiting for TLS connections from Endpoints.
This is done in a background process that listens to port (3000 in this example).
This process also communicates with your Relay via local_com_port (3001 in this example).
Usually, you can ignore local_com_port and leave it to the default, unless you use several Endpoints/Relays on the same machine, which we will do.
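When several Relays/Endpoints share a machine, picking non-overlapping local_com_port values by hand can get tedious. As a sketch, assuming local_com_port is an ordinary TCP port on the local machine, you could ask the operating system for unused ports. `find_free_ports` is a hypothetical helper, not part of tlspyo, and a port it returns could in principle be taken by another program before you use it.

```python
import socket


def find_free_ports(count: int, host: str = "127.0.0.1") -> list:
    """Return `count` distinct TCP ports that are currently unused on `host`."""
    socks = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            socks.append(sock)
            sock.bind((host, 0))  # port 0 lets the OS choose a free port
        # all sockets are still open here, so the OS cannot hand out a port twice
        return [sock.getsockname()[1] for sock in socks]
    finally:
        for sock in socks:
            sock.close()


ports = find_free_ports(4)
print(ports)  # four distinct port numbers chosen by the OS
```

The four numbers could then be used as the local_com_port of the Relay and of the three Endpoints in this guide.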
Endpoints
Now that our Relay is ready, let us create a bunch of Endpoints.
This is also pretty straightforward:
# Initialize a producer endpoint
prod = Endpoint(
    ip_server='127.0.0.1',  # IP of the Relay (here: localhost)
    port=3000,  # must be same port as the Relay
    password="VerySecurePassword",  # must be same (strong) password as the Relay
    groups="producers",  # this endpoint is part of the group "producers"
    local_com_port=3002,  # must be unique
    security="TLS"  # this is the default; replace by None if you do not want to use TLS
)

# Initialize consumer endpoints
cons_1 = Endpoint(
    ip_server='127.0.0.1',
    port=3000,
    password="VerySecurePassword",
    groups="consumers",  # this endpoint is part of group "consumers"
    local_com_port=3003,  # must be unique
    security="TLS"
)
cons_2 = Endpoint(
    ip_server='127.0.0.1',
    port=3000,
    password="VerySecurePassword",
    groups="consumers",  # this endpoint is part of group "consumers"
    local_com_port=3004,  # must be unique
    security="TLS"
)
A nice thing about tlspyo is that all communication is handled behind the scenes.
The above calls have all launched processes in the background which handle connection and communication between Endpoints through the Relay.
Let us now send some objects from the producer to the consumers.
As you may have noticed, we created two different groups here.
We put the producer in a group that we have named "producers", and the consumers in another group that we have called "consumers".
Note that an Endpoint can be created as being part of any number of groups (groups can take a list of strings).
When communicating between endpoints, you can use those groups to make sure the right endpoints receive the right objects.
There are two ways for Endpoints to send objects in tlspyo:
- Broadcasting is used to send an object to all endpoints in a given group. Furthermore, when an Endpoint connects to the Relay, it receives the last object that was broadcast to each of its groups.
  # Producer broadcasts an object to any and all endpoints in the destination group "consumers"
  prod.broadcast("I HAVE BEEN BROADCAST", "consumers")
- Producing is used to send an object to a queue (FIFO) that is shared between all Endpoints of a given group. The endpoints of the receiving group must notify the Relay to get access to an object that has been put in that shared queue.
  # Producer sends an object to the shared queue of destination group "consumers"
  prod.produce("I HAVE BEEN PRODUCED", "consumers")
  # Consumer notifies the Relay that it wants one produced object destined for "consumers"
  cons_1.notify("consumers")
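To make the difference between the two sending modes concrete, here is a toy in-memory model of the semantics described above. It involves no networking and is not how tlspyo is implemented; `ToyRelay` and its methods are hypothetical names used for illustration only, and each endpoint is reduced to a plain list acting as its inbox.

```python
from collections import defaultdict, deque


class ToyRelay:
    """In-memory model of the broadcast / produce / notify semantics."""

    def __init__(self):
        self.members = defaultdict(list)    # group -> inboxes of its endpoints
        self.last_broadcast = {}            # group -> last object broadcast to it
        self.produced = defaultdict(deque)  # group -> shared FIFO of produced objects
        self.waiting = defaultdict(deque)   # group -> inboxes that have notified

    def connect(self, groups):
        """Register an endpoint in `groups` and return its inbox."""
        inbox = []
        for group in groups:
            self.members[group].append(inbox)
            # a newly connected endpoint gets the last broadcast of each of its groups
            if group in self.last_broadcast:
                inbox.append(self.last_broadcast[group])
        return inbox

    def broadcast(self, obj, group):
        """Deliver obj to every endpoint currently in the group."""
        self.last_broadcast[group] = obj
        for inbox in self.members[group]:
            inbox.append(obj)

    def produce(self, obj, group):
        """Put obj in the group's shared FIFO."""
        self.produced[group].append(obj)
        self._dispatch(group)

    def notify(self, inbox, group):
        """The endpoint owning `inbox` asks for one produced object."""
        self.waiting[group].append(inbox)
        self._dispatch(group)

    def _dispatch(self, group):
        # each produced object goes to exactly one endpoint that has notified
        while self.produced[group] and self.waiting[group]:
            self.waiting[group].popleft().append(self.produced[group].popleft())


relay = ToyRelay()
cons_1 = relay.connect(["consumers"])
cons_2 = relay.connect(["consumers"])
relay.broadcast("I HAVE BEEN BROADCAST", "consumers")
relay.produce("I HAVE BEEN PRODUCED", "consumers")
relay.notify(cons_1, "consumers")
late = relay.connect(["consumers"])  # connects after the broadcast

print(cons_1)  # ['I HAVE BEEN BROADCAST', 'I HAVE BEEN PRODUCED']
print(cons_2)  # ['I HAVE BEEN BROADCAST']
print(late)    # ['I HAVE BEEN BROADCAST']
```

In this model, a broadcast object reaches every member of the group (including late joiners, who get the last one), whereas a produced object reaches only the single endpoint that asked for it.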
Once objects reach the consumer endpoint, they are stored in a local queue from which you can retrieve objects whenever you want. To do this, there are multiple options:
- To retrieve from the local queue in a FIFO fashion, use pop(blocking=blocking, max_items=max_items).
- To retrieve the most recent item(s) in the local queue and discard the rest, use get_last(blocking=blocking, max_items=max_items).
- To get all items that are currently in the local queue, use receive_all(blocking=blocking).
:information_source: Notes:
- All calls above return a list of objects. If no objects are returned, the result will be an empty list.
- If blocking is True, all methods above will block until at least one item is received (defaults to False).
- In pop and get_last, use max_items to specify a maximum number of items to be returned (defaults to 1).
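The three retrieval methods differ only in which items they return and which they drop. The sketch below models the non-blocking case with a plain deque; `ToyLocalQueue` is a hypothetical class, not tlspyo's implementation, and the ordering of the list returned by `get_last` (oldest to newest) is an assumption made for this illustration.

```python
from collections import deque


class ToyLocalQueue:
    """Non-blocking model of an Endpoint's local queue of received objects."""

    def __init__(self, items=()):
        self._items = deque(items)  # oldest item on the left

    def pop(self, max_items=1):
        """Remove and return up to max_items objects, oldest first (FIFO)."""
        n = min(max_items, len(self._items))
        return [self._items.popleft() for _ in range(n)]

    def get_last(self, max_items=1):
        """Return up to max_items most recent objects and discard all the others."""
        items = list(self._items)
        self._items.clear()
        return items[-max_items:] if max_items > 0 else []

    def receive_all(self):
        """Remove and return every object currently in the queue."""
        items = list(self._items)
        self._items.clear()
        return items


q = ToyLocalQueue(["a", "b", "c", "d"])
print(q.pop())             # ['a']
print(q.pop(max_items=2))  # ['b', 'c']
print(q.receive_all())     # ['d']
print(q.receive_all())     # []

q = ToyLocalQueue(["a", "b", "c", "d"])
print(q.get_last(max_items=2))  # ['c', 'd']
print(q.receive_all())          # [] (the older items were discarded)
```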
Now, let our consumers retrieve their loot:
# Consumer 1 is able to retrieve the broadcast AND the consumed object:
res = []
while len(res) < 2:
    res += cons_1.receive_all(blocking=True)
print(f"Consumer 1 has received: {res}")
# Consumer 2 is able to retrieve only the broadcast object:
res = cons_2.receive_all(blocking=True)
print(f"Consumer 2 has received: {res}")
which prints:
Consumer 1 has received: ['I HAVE BEEN BROADCAST', 'I HAVE BEEN PRODUCED']
Consumer 2 has received: ['I HAVE BEEN BROADCAST']
Once we are done, we can stop all Endpoints, and then the Relay for the sake of a graceful exit:
# Let us close everyone gracefully:
prod.stop()
cons_1.stop()
cons_2.stop()
re.stop()
There you go! You have now sent your first object over the network using tlspyo.
Please check out the API documentation for more advanced usage.
Security
DISCLAIMER
We are doing our best to make tlspyo reasonably secure when used correctly, but we provide ABSOLUTELY NO GUARANTEE that it is secure in any sense.
We are a small open-source community, and we greatly appreciate your contribution to tackle any potentially unreasonable security concerns or important missing information.
Please submit a detailed issue if you are aware of any important exploit not covered in this section.
Implementation
tlspyo relies on the Twisted framework regarding TLS implementation and network management.
Important to know
:warning: Objects transferred by tlspyo are serialized with pickle by default, so that you can transfer most python objects easily.
NEVER TRANSFER PICKLED OBJECTS OVER A PUBLIC NETWORK WITHOUT tlspyo, as this would make you vulnerable to dangerous exploits.
This is because unpickling untrusted pickled objects (i.e., pickled objects created by a malicious user) can lead to arbitrary code execution on your machine.
To prevent this from happening, tlspyo provides two interdependent layers of security:
- Endpoints authenticate your Relay via TLS, which must use your own secret key and public certificate. This ensures your Endpoints are indeed talking to your Relay and not to some man-in-the-middle, provided you keep your secret key secure. This also prevents anyone else from eavesdropping thanks to TLS encryption.
- Every object transfer is protected by a password known to both the Relay and the Endpoints (the password argument). No object is deserialized without verification of the password. This ensures that anyone posing as an endpoint will never be able to send undesired objects through your relay unless they know your password.
If a malicious user successfully posed as your Relay, your Endpoint would send them messages that they could decrypt, including your password (this is prevented by TLS when using your own secret key and public certificate).
If they successfully posed as your Endpoint, they could send malicious pickled objects to your Relay (this is prevented by them not knowing your password).
In a nutshell, when using tlspyo you want your password to be as strong as possible, and your TLS secret key to be kept... well, secret :lock:
For safety-critical applications, we recommend you ditch pickle altogether and instead code a secure custom serialization protocol, on top of the TLS layer provided by tlspyo.
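To see why unpickling untrusted data is dangerous, consider the following harmless demonstration, which uses only the standard pickle module and is independent of tlspyo. The `Sneaky` class is a hypothetical example: through `__reduce__`, the pickled bytes themselves name the callable that `pickle.loads` will invoke. Here that callable is the innocuous `int`, but a malicious sender could name something destructive instead.

```python
import pickle


class Sneaky:
    """An object whose pickle tells the unpickler which callable to run."""

    def __reduce__(self):
        # On load, pickle calls int("42") instead of rebuilding a Sneaky instance.
        # The receiver has no say in which callable is invoked.
        return (int, ("42",))


payload = pickle.dumps(Sneaky())
result = pickle.loads(payload)
print(type(result).__name__, result)  # int 42
```

The receiver asked for an object and instead executed a call chosen by the sender, which is why pickled data should only ever be loaded from trusted, authenticated peers.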
Custom serialization
By default, tlspyo uses pickle for serialization and relies on TLS to prevent attacks.
In advanced applications, you may want to use another serialization protocol instead.
For instance, you may want to transfer non-picklable objects, further optimize the security of your application, or simply use a pickle serialization protocol of your choice instead of your Python's default.
In particular, in security=None mode (i.e., with TLS disabled) over a public network, using your own secure serialization protocol is critical.
tlspyo makes this easy.
All you need to do is code your own serialization protocol following the pickle.dumps/pickle.loads signature, and pass it to the serializer/deserializer arguments of both your Relay and Endpoints.
For instance:
import pickle as pkl
from tlspyo import Relay, Endpoint


# We define a custom serialization protocol based on pickle for simplicity.
# Of course, this is only for illustration.
# In practice, you may not want to use pickle here.

def my_custom_serializer(obj):
    """
    Takes a python object as input and outputs a bytestring
    """
    return b"header" + pkl.dumps(["TEST", pkl.dumps(obj)])


def my_custom_deserializer(bytestring):
    """
    Takes a bytestring as input and outputs a python object
    """
    assert len(bytestring) > len(b"header")
    assert bytestring[:len(b"header")] == b"header"
    bytestring = bytestring[len(b"header"):]
    tmp = pkl.loads(bytestring)
    assert isinstance(tmp, list)
    assert len(tmp) == 2
    assert tmp[0] == "TEST"
    obj = pkl.loads(tmp[1])
    return obj


if __name__ == '__main__':

    re = Relay(
        port=3000,
        password="VerySecurePassword",
        local_com_port=3001,
        security="TLS",
        serializer=my_custom_serializer,
        deserializer=my_custom_deserializer
    )

    ep = Endpoint(
        ip_server='127.0.0.1',
        port=3000,
        password="VerySecurePassword",
        groups="group1",
        local_com_port=3002,
        security="TLS",
        serializer=my_custom_serializer,
        deserializer=my_custom_deserializer
    )
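As a pickle-free alternative, a serializer pair built on the standard json module follows the same pickle.dumps/pickle.loads signature. This is only a sketch: `json_serializer` and `json_deserializer` are hypothetical names, and it assumes that everything passed through these functions is JSON-compatible (dicts, lists, strings, numbers, booleans, None). Note that tuples come back as lists.

```python
import json


def json_serializer(obj):
    """Takes a JSON-compatible python object as input and outputs a bytestring."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def json_deserializer(bytestring):
    """Takes a bytestring as input and outputs a python object.

    Raises ValueError if the bytes are not valid UTF-8 encoded JSON.
    """
    return json.loads(bytestring.decode("utf-8"))


message = {"step": 3, "weights": [0.1, 0.2], "done": False}
data = json_serializer(message)
print(data)                     # b'{"step":3,"weights":[0.1,0.2],"done":false}'
print(json_deserializer(data))  # {'step': 3, 'weights': [0.1, 0.2], 'done': False}
```

Unlike pickle, decoding JSON never invokes arbitrary callables, at the cost of supporting fewer object types. These functions would be passed to the serializer/deserializer arguments of both the Relay and the Endpoints, exactly like my_custom_serializer and my_custom_deserializer above.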
External links
tlspyo is an open-source project hosted at Polytechnique Montreal - MISTlab.
We use it in various projects, ranging from parallel meta-learning to data transfer between multiple learning robots.
tlspyo relies on Twisted to manage network robustness and security.
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
Distributed under the MIT License. See LICENSE.txt for more information.