
Simulation of protecting AI model from poisoning attack

Project description

[!TIP] This project is complementary to the Attack AI Model project, which can be found in the repo. To understand the entire idea, make sure to read both documentations.

About the Package

Author's Words

Welcome to the first edition of the official documentation of PROTECT AI MODEL SIMULATION, the protectai tool. I am Dr. Deniz Dahman, the creator of the BireyselValue algorithm and the author of this package. The following sections give a brief introduction to the principal idea of the protectai tool, along with a reference to the academic publication on this potential type of cyberattack and a potential defence. Before going ahead, I would like to let you know that I have done this work as an independent scientist, without any funding or similar capacity. I am dedicated to proceeding with and further improving the proposed method at all costs. To this end, if you wish to contribute in any way to this work, please find further details in the contributing section.

Contributing

If you wish to contribute to the creator and author of this project, you can check the following options:

  1. view subscription options on Dahman's Phi Services Website
  2. subscribe to the Dahman's Phi Services channel
  3. support on Patreon

If you prefer any other way of contributing, please feel free to contact me directly via the contact page.

Thank you

Introduction

The Data poisoning

The current AI revolution has made AI frameworks almost the main element of every solution we use daily. Many industries rely heavily on those AI models to generate responses accordingly. In fact, it has become a trend that once a product uses AI as its backend, its potential to penetrate the marketplace is substantially higher than that of one that doesn't.

This trend has pushed many industries to consider implementing AI models in the tiers of their business processes. This rush is understandable given the way those industries see things: for businesses to secure a place in today's competitive market, they must catch up with the most recent advances in technology. However, one must ask: what is this AI problem-solving paradigm anyway?

In my published project, the Big Bang of Data Science, I provide a comprehensive answer to this question from both abstract and concrete perspectives. But let me summarize both perspectives in a few lines.

Basically, an AI model is a mathematical tool, so to speak. It relies mainly on one important stage, the training stage. As a metaphor, imagine it as a human brain that learns over time from the surroundings and circumstances in which it lives. Those surroundings and circumstances are the cultures, beliefs, people, etc. Once this brain is shaped and formed, it starts to make decisions and offer answers. Yet we, from the outside, start to judge those decisions and answers, and the brain reacts to those judgements.

AI models mimic this paradigm. The human brain is the mathematical equation of the model; the surroundings and circumstances are the training samples we feed to that equation so it can learn; and the judgements by the surroundings are the calculations over the misclassified cases, known as obtaining the derivatives. Obviously, we then aim for a model that gives accurate answers with a minimum level of mistakes.

Gif: training an AI model.

Once the technical workflow of AI is understood, it should be clear that the training samples from which the AI model learns are the most important element of the entire flow. This element can be thought of as the adjudicator of whether the model succeeds or fails. To this end, it is a target for adversaries who aim to make the model fail. If such an attack succeeds, it is known as a data poisoning attack.

Data poisoning is a type of cyberattack in which an adversary intentionally compromises a training dataset used by an AI or machine learning (ML) model in order to influence or manipulate the operation of that model.

Such an attack can be carried out in several ways:

  • Intentionally injecting false or misleading information within the training dataset,
  • Modifying the existing dataset,
  • Deleting a portion of the dataset.

Unfortunately, such a cyberattack can go undetected for a long time, due in the first place to the nature of the AI framework itself, and further due to the lack of fundamental understanding of the AI black box and the use of ready-to-use AI models by industry practitioners without a comprehensive understanding of the mathematics behind the entire framework.

However, there are some signs that might lead to the observation that the AI model is compromised. Some of those signs are:

  1. Model degradation: Has the performance of the model inexplicably worsened over time? Unintended outputs: Does the model behave unexpectedly and produce unintended results that cannot be explained by the training team?
  2. Increase in false positives/negatives: Has the accuracy of the model inexplicably changed over time? Has the user community noticed a sudden spike in problematic or incorrect decisions?
  3. Biased results: Does the model return results that skew toward a certain direction or demographic (indicating the possibility of bias introduction)?
  4. Breaches or other security events: Has the organization experienced an attack or security event that could indicate they are an active target and/or that could have created a pathway for adversaries to access and manipulate training data?
  5. Unusual employee activity: Does an employee show an unusual interest in understanding the intricacies of the training data and/or the security measures employed to protect it?

This gif illustrates this kind of data poisoning attack on an AI model. It basically shows how the alphas, or weights, are influenced by the new training samples which the model uses to update itself.

To this end, such matters must be considered by a company's AI division once it decides to employ the AI problem-solving paradigm.

protectai package __version__1.0

I have presented two types of attack that an AI model can face: the first is the corrupt data training sample attack, and the second is the crazy the model attack. Both are explained in detail in the attackai repo project, where I have illustrated both attacks. Now I shall:

  1. First, illustrate a healthy image of the AI model. In this case I will carry on the same simulation using the scenario of the storyline, so I will show the original creation of the x-ray-ai.h5 model. Check the details.

  2. Second, show the effect of both attacks that I illustrated using the attackai tool. You may want to check the illustrations of attack type one and attack type two. As a result, we will observe the performance after both attacks.

  3. Finally, present the proposed defence strategy against both attacks.

Here, I introduce the tool protectai, which is going to accomplish the three outlines mentioned above.

[!IMPORTANT] This tool illustrates the prior three objectives for an educational purpose ONLY. The author provides NO WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

Installation

[!TIP] The simulation using protectai is done on a binary-class image dataset, referenced in the section below. The gif illustrations show the storyline as assumed.

Data Availability

The NIH chest radiographs that support the findings of this project are publicly available at https://nihcc.app.box.com/v/ChestXray-NIHCC and https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. The Indiana University Hospital Network database is available at https://openi.nlm.nih.gov/. The WCMC pediatric data that support the findings of this study are available under the identifier 10.17632/rscbjbr9sj.3.

Project Setup

Story line

Basically, the storyline of the simulation is built around the chest X-ray dataset outlined above. The outline of the story goes as follows:

  1. A fictitious healthcare center operates as a chest X-ray diagnosis facility.
  2. In its workflow, called the old paradigm, the case person has a chest X-ray scan; the scanned image is then delivered to a domain expert, the Pulmonologist, to examine; and finally, the results are presented to the case person.
  3. The clinic decides to move to a new paradigm by integrating a new diagnosis layer that sits between the case input and the domain expert's diagnosis. The newly proposed block implements an AI agent that essentially diagnoses the case and predicts its label as Normal or Pneumonia; the result is then passed to the domain expert for confirmation.

Environment Setup

In the first phase the clinic creates the AI model by following these steps:

  1. A main folder contains the subfolders (train and test); each of these holds class subfolders for the (normal and pneumonia) cases.
  2. The images are transformed into tensors, which are then fed into a neural network with x hidden layers; the system runs until it produces the main predictive model, x-ray-ai.h5.
  3. This model contains the valid ratios (weights) that do the math to make the predictions. A hedged sketch of such a setup follows this list.
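For orientation only, here is a minimal sketch of what such a first-phase setup might look like, assuming a Keras/TensorFlow workflow; the folder names, image size, architecture, and number of epochs are illustrative assumptions, not the clinic's actual configuration:

# Illustrative sketch only, not the actual x-ray-ai.h5 pipeline.
# Assumes the folder layout described above: data/train/{normal,pneumonia}
# and data/test/{normal,pneumonia}.
import tensorflow as tf

IMG_SIZE = (180, 180)  # assumed input size

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32, label_mode="binary")
test_ds = tf.keras.utils.image_dataset_from_directory(
    "data/test", image_size=IMG_SIZE, batch_size=32, label_mode="binary")

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: normal vs pneumonia
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=test_ds, epochs=5)
model.save("x-ray-ai.h5")  # the original predictive model the clinic relies on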

Pipeline Setup

The AI team then decides to create the pipeline to update the model as follows:

  1. On a weekly basis the new cases are collected.
  2. Then the same folder setup as described above is created.
  3. The original model is then used to make the current update using the new batches of X-ray images.
  4. Technically speaking, that new input updates the model in a way that changes the model's weight values; see the sketch after this list.
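A hedged sketch of such a weekly update step, again assuming a Keras-style model and illustrative paths:

# Illustrative sketch only, not the protectai pipeline itself.
import tensorflow as tf

model = tf.keras.models.load_model("x-ray-ai.h5")   # the existing model
weekly_ds = tf.keras.utils.image_dataset_from_directory(
    "updates/week_01", image_size=(180, 180), batch_size=32, label_mode="binary")

model.fit(weekly_ds, epochs=1)   # the new batch shifts the model's weight values
model.save("x-ray-ai.h5")        # overwrite with the updated model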

AI Model Protection

The AI team has decided to add a security layer before the update of the AI model. Basically, the new security layer shall inspect the weekly batches before feeding them to the AI model. The new security layer shall perform inspection against:

  1. Attack type one.

  2. Attack type two.

Install protectai

[!TIP] Make sure to create the project setup as outlined above.

To install the package, all you have to do is:

pip install protectai

You should then be able to use the package. You may want to confirm the installation:

pip show protectai

The result should look like this:

Name: protectai
Version: 1.0.3
Summary: Simulation of protecting AI model from poisoning attack
Home-page: https://github.com/dahmansphi/protectai
Author: Dr. Deniz Dahman's
Author-email: dahmansphi@gmail.com

Employing protectai - Conditions

[!IMPORTANT] To use the first edition of protectai, it is mandatory to make sure that the update folder has the normal and Pneumonia subfolders, as illustrated in the gif.

Detour into the protectai package - Built-in functions

Once your installation is done and you have met all the conditions, you may want to check the built-in functions of protectai and understand each one.
Essentially, you create an instance of protectai like so:

from protectai.protectai import ProtectAI
inst = ProtectAI()

Now this inst instance offers you access to the built-in functions that you need. Here is a screenshot:

Screenshot of built-in functions of the protectai tool.

Once you have a protectai instance, here are the details of the right sequence to employ the defence:

Set the path for the main model:

First, we call the set_path_models() function to prepare the training and testing datasets with the specifications for the main model.

inst.set_path_models(paths)

The function takes one argument: the path to the training and testing folders, or basically any folders you want to use together to create a predictive image model. Here are some pictures illustrating the result after calling the function.

Screenshot of attack one with stamp.
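For illustration only, a possible call might look like the following; the exact structure of the paths argument (here shown as a list with the training and testing folders) is an assumption based on the project setup above, so check the package's own behaviour:

# Hypothetical folder layout following the project setup above.
paths = ["data/train", "data/test"]
inst.set_path_models(paths)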

make model function:

This function, as the name implies, creates the skeleton AI predictive model. This is a neural network model that accepts binary classes.

inst.make_model()

train and test the model

In the standard workflow for creating a neural network, we use training samples to train the model, and eventually the model ends up with its set of ratios/weights. However, we should also test that trained model. To this end, the train_test_model() function does exactly that. It requires two arguments: the first is the skeleton neural network model created by the previous function, and the second is a list of paths to the training and testing folders.

inst.train_test_model(model, paths)

The result of the training and testing can be seen in the figure below: Screenshot train test model.

Once the result is fine and satisfactory, you can move on and create the original model.

Produce AI Model

Once the results from the train and test are satisfactory, you can use the produce_model() function to output the model. This function takes two arguments: the first is the skeleton model we output above, and the second is a list containing the path to the training samples and the path where the new model should be saved. Once this function is executed, you have the original model, as shown in the picture below. Notably, you have options for the type of the saved model, e.g. h5, Keras, or any other structure in which you wish to save the ratios and any other information.

inst.produce_model(model=model, paths_folder=paths_produce)

original model

Prediction Task

Since the model is ready, it is time to see it in action using the predict() function. This function takes two arguments: the first is the path to the folder of images whose labels we want to predict, and the second is the path to the original model.

inst.predict(img_path=path_to_imgs, model_path=original_model)

If, just for testing purposes, we point the model at a folder of normal images, we see the results in the figure below with an accuracy of 100%; likewise, for the Pneumonia images we also see almost 100%.

Prediction by the model for normal images.
Prediction by the model for pneumonia images.

Model Update Framework

I have discussed the workflow of updating the existing model in detail in the pipeline setup section. To put that workflow into action, the functions inst.update_trained_model_with_test() and inst.update_trained_model_produce() are used, in that order. As the names suggest, the first function performs the update with a test, and the second performs the update and saves the new updated model. Both functions take two arguments: the first is the path to the existing model to update, and the second is the path to the new weekly batch of X-ray images.

inst.update_trained_model_with_test(model=paths_to_existing_model, paths=paths_new_imgs)
inst.update_trained_model_produce(model=paths_to_existing_model, paths=paths_new_imgs)
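Purely for orientation, the update sequence might be wired up as below; the path values are placeholders, and whether each argument expects a single path or a list is an assumption to verify against the package itself:

# Hypothetical placeholders -- adjust to your own setup.
paths_to_existing_model = "models/x-ray-ai.h5"
paths_new_imgs = "updates/week_01"

# First verify the update against test data, then produce the updated model.
inst.update_trained_model_with_test(model=paths_to_existing_model, paths=paths_new_imgs)
inst.update_trained_model_produce(model=paths_to_existing_model, paths=paths_new_imgs)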

Effect of poisoning attack

Assume either type of attack has happened, as discussed in the types of attack. We can see the effect of that attack on the accuracy of the model using:

inst.update_trained_model_produce(model=paths_to_existing_model, paths=paths_new_imgs)

and

inst.predict(img_path=path_to_imgs, model_path=original_model)

Observe that the prediction task, on which we hit 100% for both classes, now keeps the same level of accuracy for normal images, but drops to almost 0% for the other class, the pneumonia images. That means that if a person actually has pneumonia, the model says they do not, as presented in the figures below.

result of updated model accuracy

prediction by attacked model

result of prediction on normal classes

prediction by model for normal imgs

result of prediction on pneumonia classes

results by attacked model

[!CAUTION] You can see how misleading the effect of such an attack is. This is a sample illustration whose effect we can see after a single round; however, such an attack could go undetected for some time, poisoning the model slowly until it loses its main core of ratios.

Secure layer using the norm culture solution

As discussed earlier in the simulated storyline, the AI team decided to add a security layer before the actual update happens. This is where I introduce the norm culture solution. The mathematical intuition behind the method can be found in the published academic paper; in this demonstration, however, I will show how such a layer can help to capture images that have been attacked by either type one or type two.

Norm culture logo.

In the following figure you can see the location of this norm culture box. It is basically the stage where you check whether the folder of new training images is compromised in any way. To employ the norm culture box we use the inst.protect_me() function, which in principle calculates the following main elements:

  1. two numpy arrays, one per class, called alpha
  2. two norms, one per class, called norms
  3. two flattened arrays, one per class, called flatten

The figure illustrates this:

norm images

The following call illustrates the implementation, with two arguments: the path to the original clean training dataset and the path where the three elements should be saved:

inst.protect_me(paths=paths, save_path_norms=save_folder)
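The actual norm culture computation is defined in the published paper; the sketch below only illustrates the general idea of building per-class reference elements (a flattened matrix, an alpha vector, and its norm) from the clean training images. Folder names, image size, and the statistics themselves are assumptions for illustration, not the protect_me() implementation:

# Conceptual sketch only -- not the protect_me() implementation.
import numpy as np
from pathlib import Path
from PIL import Image

def class_reference(folder, size=(180, 180)):
    # Flatten every clean image of one class into a row vector.
    flat = np.stack([
        np.asarray(Image.open(p).convert("L").resize(size), dtype=np.float32).ravel()
        for p in Path(folder).glob("*") if p.is_file()
    ])
    alpha = flat.mean(axis=0)        # a class-level reference vector ("alpha")
    norm = np.linalg.norm(alpha)     # its norm, one scalar per class
    return flat, alpha, norm

# One set of elements per class, persisted for the later inspections.
Path("norms").mkdir(exist_ok=True)
refs = {cls: class_reference(f"clean/train/{cls}") for cls in ("normal", "pneumonia")}
for cls, (flat, alpha, norm) in refs.items():
    np.save(f"norms/{cls}_alpha.npy", alpha)
    np.save(f"norms/{cls}_norm.npy", np.array(norm))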

Then two functions are used, in sequence: inst.inspect_attack_one() and inst.inspect_attack_two().

inspect_attack_one()

This function, as the name implies, inspects whether the new batch of images has been compromised in any way by the first type of attack. It takes two arguments: the first is the path to inspect, and the second is the path to the model norms discussed above. The function is called as follows:

inst.inspect_attack_one(paths_to_inpect=paths_to_inspect, paths_to_norms=paths_to_norm)

Here is a figure showing the result of inspecting the folder that had been attacked previously.

Type one attack caught by inspect.
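Again as a conceptual sketch only (the real check lives inside inst.inspect_attack_one()): one way to think about such an inspection is to measure how far each incoming image sits from the class alpha vector and flag images whose distance falls well outside what the clean data ever produced. The threshold logic and names below are my assumptions:

# Conceptual sketch only -- not the inspect_attack_one() implementation.
import numpy as np
from pathlib import Path
from PIL import Image

def flag_suspicious(folder, alpha, clean_flat, size=(180, 180), slack=1.10):
    # Baseline: the largest distance to alpha observed on the clean class images.
    baseline = np.linalg.norm(clean_flat - alpha, axis=1).max()
    suspicious = []
    for p in Path(folder).glob("*"):
        if not p.is_file():
            continue
        vec = np.asarray(Image.open(p).convert("L").resize(size), dtype=np.float32).ravel()
        if np.linalg.norm(vec - alpha) > slack * baseline:   # illustrative threshold
            suspicious.append(p.name)
    return suspicious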

inspect_attack_two()

This function, as the name implies, inspects whether the new batch of images has been compromised in any way by the second type of attack. It takes two important arguments: the first is the path to the folder that needs to be inspected, and the second is the path to the original predictive model that has been verified as originally clean. The function is called as follows:

inst.inspect_attack_two(paths_to_inpect=paths_to_inspect, model_verified=updated_trained_model_path)

Here is a figure showing the result of inspecting the folder that had been attacked previously.

Type two attack caught by inspect.
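Once more as a conceptual sketch only (the real check lives inside inst.inspect_attack_two()): since this inspection relies on a model already verified as clean, one plausible illustration is to run that model over the new batch of a known class and flag the batch when the predicted labels disagree far more than expected. The threshold, and the assumption that the model outputs a sigmoid score, are mine, not the package's:

# Conceptual sketch only -- not the inspect_attack_two() implementation.
import tensorflow as tf

def batch_looks_poisoned(folder, expected_label, model_path, threshold=0.20):
    model = tf.keras.models.load_model(model_path)         # the verified clean model
    ds = tf.keras.utils.image_dataset_from_directory(
        folder, labels=None, image_size=(180, 180), batch_size=32, shuffle=False)
    preds = (model.predict(ds).ravel() > 0.5).astype(int)  # assumed sigmoid output
    disagreement = (preds != expected_label).mean()
    return disagreement > threshold                        # illustrative threshold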

Conclusion on installation and employing

I hope this simulation shows the idea of such an attack, its effect, and a proposal for defence.

Reference

Please follow the publications on the website to find the published academic paper.

