Server for BastionLab Confidential Analytics.

BastionLab

Mithril Security – BastionLab

Website | Documentation | Discord | Blog | LinkedIn | Twitter


👋 Welcome to BastionLab!

Where data owners and data scientists can securely collaborate without exposing data - opening the way to projects that were too risky to consider.

⚙️ What is BastionLab?

BastionLab is a simple privacy framework for data science collaboration.

It acts as an access control solution that lets data owners protect the privacy of their datasets, and as a guard that enforces two rules: only privacy-friendly operations are allowed on the data, and only anonymized outputs are shown to the data scientist.

  • Data owners can let external or internal data scientists explore and extract value from their datasets, according to a strict privacy policy they define in BastionLab.
  • Data scientists can remotely run queries on data frames without seeing the original data or intermediary results.
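
To make the idea of a privacy policy more concrete, here is a minimal sketch in plain Python. It does not use BastionLab and is not how BastionLab implements policies; the names `MIN_GROUP_SIZE` and `mean_by_group` are hypothetical. It only illustrates one kind of rule a data owner might want: an aggregated value is returned only if enough rows went into it.

```python
from collections import defaultdict

# Hypothetical threshold: each returned value must aggregate at least this many rows.
MIN_GROUP_SIZE = 3


def mean_by_group(rows, key, value, min_group_size=MIN_GROUP_SIZE):
    """Return per-group means, suppressing groups smaller than the threshold."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        group: sum(values) / len(values)
        for group, values in groups.items()
        if len(values) >= min_group_size
    }


# Made-up rows for illustration only.
rows = [
    {"pclass": 1, "fare": 80.0},
    {"pclass": 1, "fare": 70.0},
    {"pclass": 1, "fare": 90.0},
    {"pclass": 3, "fare": 8.0},
]

# The single-row group (pclass 3) is dropped because it would expose one individual.
result = mean_by_group(rows, "pclass", "fare")
```

In this toy version the data scientist only ever receives `result`, never `rows`.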

BastionLab is an open-source project. Our solution is coded in Rust 🦀 and uses Polars 🐻, a pandas-like library for data exploration.

👌 Built to be easy and safe to use

Until now, remote and safe data science collaboration was not possible in highly regulated fields like health, finance, or advertising. When data owners wanted to put their assets to good use, they had to open unrestricted access to their dataset, often through a Jupyter notebook. This was dangerous because too many operations were allowed, and the data scientist had numerous ways to extract information from the remote infrastructure (printing the whole database, saving the dataset in the model weights, etc.).

That is why we built BastionLab: to ensure privacy while fitting easily into the data science workflow of both data owners and data scientists.

🚀 Quick tour

You can try out our Quick tour in the documentation to discover BastionLab with a hands-on example using the famous Titanic dataset.

But here’s a taste of what using BastionLab could look like 🍒

Data Owner's side

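from bastionlab import Connection
import polars as pl

# Load the dataset locally on the data owner's machine.
df = pl.read_csv("titanic.csv")

# Connect to the BastionLab server and upload the data frame.
with Connection("bastionlab.example.com") as client:
    client.polars.send_df(df)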

Data Scientist's side

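from bastionlab import Connection

with Connection("bastionlab.example.com") as client:
    # List the data frames available on the server and pick the first one.
    all_remote_dfs = client.polars.list_dfs()
    remote_df = all_remote_dfs[0]
    # Run head(5) on the server, then download and display the result.
    print(remote_df.head(5).collect().fetch())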

👀 What is this wheel?

This wheel makes it easy to deploy BastionLab's server in Google Colab or Jupyter Notebook environments.

Please remember that while you will have most of BastionLab's functionality, this wheel was not made for production environments. If you want to customize the server further and get the security features, it is recommended that you deploy the server yourself. Please refer to the documentation for more information.

🗝️ Key Features

  • Access control: data owners can define an interactive privacy policy that filters the data scientist's queries. They no longer have to open unrestricted access to their datasets.
  • Limited expressivity: BastionLab limits the types of operations that data scientists can execute, to avoid arbitrary code execution.
  • Transparent remote access: the data scientists never access the dataset directly. They only manipulate a local object that contains metadata to interact with a remotely hosted dataset. Calls can always be seen by data owners.
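
The last two features can be illustrated with a small, self-contained sketch. This is a toy model written for this page, not BastionLab's code or API: `ToyServer`, `RemoteHandle`, and `ALLOWED_OPS` are hypothetical names, and the "server" is just an in-process object. The local handle stores only a plan of named operations, the server rejects any operation outside its allowlist, and every request is logged so the data owner can review it.

```python
# Hypothetical allowlist of operations the toy server is willing to run.
ALLOWED_OPS = {"head", "select", "filter_eq"}


class ToyServer:
    """Holds the real rows; stands in for a remotely hosted dataset."""

    def __init__(self, rows):
        self._rows = rows
        self.audit_log = []  # every requested plan, visible to the data owner

    def execute(self, plan):
        # Log first, so rejected requests are recorded too.
        self.audit_log.append(list(plan))
        # Validate the whole plan before touching the data.
        for op, _ in plan:
            if op not in ALLOWED_OPS:
                raise PermissionError(f"operation not allowed: {op}")
        rows = self._rows
        for op, arg in plan:
            if op == "head":
                rows = rows[:arg]
            elif op == "select":
                rows = [{col: row[col] for col in arg} for row in rows]
            elif op == "filter_eq":
                col, wanted = arg
                rows = [row for row in rows if row[col] == wanted]
        return rows


class RemoteHandle:
    """Local object holding only a reference to the server and a query plan."""

    def __init__(self, server, plan=()):
        self._server = server
        self._plan = tuple(plan)

    def op(self, name, arg):
        # Each call returns a new handle; nothing runs until collect().
        return RemoteHandle(self._server, self._plan + ((name, arg),))

    def head(self, n):
        return self.op("head", n)

    def select(self, cols):
        return self.op("select", cols)

    def filter_eq(self, col, wanted):
        return self.op("filter_eq", (col, wanted))

    def collect(self):
        return self._server.execute(self._plan)


# Made-up rows for illustration only.
server = ToyServer([
    {"name": "Allen", "pclass": 1},
    {"name": "Braund", "pclass": 3},
])

result = RemoteHandle(server).filter_eq("pclass", 1).select(["name"]).collect()

# An operation outside the allowlist is refused, but still logged.
try:
    RemoteHandle(server).op("exec", "print(rows)").collect()
    rejected = False
except PermissionError:
    rejected = True
```

Because the handle never holds rows, the only data that reaches the data scientist is whatever `collect()` returns after the server's checks.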

🙋 Getting Help

🚨 Disclaimer

BastionLab is still in development. Do not use it for production workloads yet. We will audit our solution in the future to attest that it meets market security standards.

📝 License

BastionLab is licensed under the Apache License, Version 2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

See the License for the specific language governing permissions and limitations under the License.
