# attacks-on-drl

Framework for running existing attacks on trained Stable-Baselines3 (SB3) DRL policies.
## Example Usage

```python
from attacks_on_drl.attacker import FGSMAttacker
from attacks_on_drl.runner import AttackRunner
from attacks_on_drl.victim import DQNVictim

# Assumes an environment (env) and a trained SB3 policy (policy) are
# already defined. The SB3 policy must be wrapped in a Victim class.
victim = DQNVictim(policy)
attacker = FGSMAttacker(victim)
runner = AttackRunner(env, attacker, victim, episode_max_frames=10_000)
runner.run(n_episodes=10)
```
## Implemented Attacks
- FGSM $\ell_\infty$ Attacker [1]
- Value Function Attack [2]
- FGSM Every N Steps Attacker [2]
- Strategically Timed Attack [3]
- Critical Point Attack [4]
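The core idea behind the FGSM $\ell_\infty$ attack [1] is to shift the observation by `eps` in the direction of the sign of the loss gradient, so the policy's preferred action becomes less likely. As a minimal, self-contained sketch (not the package's API), here it is for a linear softmax policy, where the gradient can be written analytically:

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def fgsm_linear_policy(W, obs, eps):
    """FGSM l_inf perturbation for a toy linear softmax policy pi(a|s) = softmax(W s).

    Illustrative only: the gradient of the cross-entropy loss w.r.t. the
    greedy action is computed analytically, then the observation is shifted
    by eps * sign(grad), the defining step of FGSM.
    """
    probs = softmax(W @ obs)
    a = int(np.argmax(probs))  # greedy action the attack targets
    one_hot = np.zeros_like(probs)
    one_hot[a] = 1.0
    # d/d(obs) of -log pi(a|obs) = (probs - one_hot(a)) @ W
    grad = (probs - one_hot) @ W
    return obs + eps * np.sign(grad)
```

By construction, every component of the result differs from the original observation by exactly `eps`, which is what makes the attack an $\ell_\infty$-bounded perturbation.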
## Adding A Custom Attack

Any custom attack must subclass `BaseAttacker` and implement its abstract `step` method. For example, suppose we introduce an attacker which perturbs the observation with FGSM on every other step:
```python
import torch

from attacks_on_drl.attacker.common import BaseAttacker
# BaseVictim, VictimModuleWrapper, FGSM and VecEnvObs are assumed to be
# importable from the relevant attacks_on_drl / SB3 modules.


class EveryOtherStepAttacker(BaseAttacker):
    def __init__(self, victim: BaseVictim, eps: float) -> None:
        super().__init__()
        self.victim = victim
        wrapped_victim = VictimModuleWrapper(self.victim)
        self._perturbation_method = FGSM(wrapped_victim, eps=eps)
        self.attack = True

    def step(self, observation: VecEnvObs) -> tuple[VecEnvObs, bool]:
        if self.attack:
            # Perturb the observation with FGSM on attack steps.
            observation = self._perturbation_method(torch.from_numpy(observation)).numpy()
        attacked = self.attack
        self.attack = not self.attack  # toggle: attack only every other step
        return observation, attacked
```
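Stripped of the framework, the every-other-step cadence is just a boolean toggle, which can be checked in isolation (class name hypothetical, not part of the package):

```python
class EveryOtherStepSchedule:
    """Mirrors the attack/no-attack toggle of the custom attacker above."""

    def __init__(self) -> None:
        self.attack = True  # first step is attacked

    def step(self) -> bool:
        attacked = self.attack
        self.attack = not self.attack  # flip for the next step
        return attacked
```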
## References
[1] Huang, S., Papernot, N., Goodfellow, I., Duan, Y. and Abbeel, P., 2017. Adversarial attacks on neural network policies. arXiv preprint arXiv:1702.02284.
[2] Kos, J. and Song, D., 2017. Delving into adversarial attacks on deep policies. arXiv preprint arXiv:1705.06452.
[3] Lin, Y.C., Hong, Z.W., Liao, Y.H., Shih, M.L., Liu, M.Y. and Sun, M., 2017. Tactics of adversarial attack on deep reinforcement learning agents. arXiv preprint arXiv:1703.06748.
[4] Sun, J., Zhang, T., Xie, X., Ma, L., Zheng, Y., Chen, K. and Liu, Y., 2020, April. Stealthy and efficient adversarial attacks against deep reinforcement learning. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 04, pp. 5883-5891).
## File details

Details for the file `attacks_on_drl-0.1.0.tar.gz`.

### File metadata
- Download URL: attacks_on_drl-0.1.0.tar.gz
- Upload date:
- Size: 10.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `90b7efab06645e75e9c9aa3eeeac7ab0a1f65155be255ad07f9ef738809ef11a` |
| MD5 | `847a8dda9622475a6c08579b352f2b50` |
| BLAKE2b-256 | `01128714a78f657f71d5ea7aeb10b357964964e3aeb6ee81f792ebcaeed160ab` |
## File details

Details for the file `attacks_on_drl-0.1.0-py3-none-any.whl`.

### File metadata
- Download URL: attacks_on_drl-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3fdc656e6516bb9b9f6f17ed85fe78e50915491f94b57097ec41ff95f1b9ed4b` |
| MD5 | `cbcbdb697ff8a8759dfeb8fecbb5c123` |
| BLAKE2b-256 | `986264c9f9614ec6342b0b009c71aa163cd4a723ebbf15b51cd4cc7248bd5137` |