AI Safety & Misinformation Detection - Detects 6 perturbation types from ACL 2025

Project description

🛡️ CogniGuard AI Safety Platform

AI Safety & Misinformation Detection Platform

The first multi-agent AI communication security platform.

Based on the ACL 2025 paper: "When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits"

Installation

pip install cogniguard

## 🎯 Features

- **Multi-Stage Threat Detection** - 4-stage pipeline catches complex threats
- **Real-World Prevention** - Stops Sydney, Samsung, and Auto-GPT style attacks
- **Research-Based** - Built on ACL 2025 and EMNLP 2023 papers
- **Beautiful Dashboard** - Cyberpunk-themed Streamlit interface

## 🚀 Live Demo

[**Try it live!**](https://your-cogniguard.streamlit.app)

## 📊 Screenshots

![Dashboard](https://via.placeholder.com/800x400.png?text=CogniGuard+Dashboard)

## 🔧 Local Development

```bash
# Clone the repo
git clone https://github.com/your-username/cogniguard.git
cd cogniguard

# Install dependencies
pip install -r requirements.txt

# Run locally
streamlit run app.py
```

## 📚 Documentation

Visit the "About & Documentation" page in the app for full details.

## 🛡️ Threat Detection

CogniGuard detects:

- ✅ Goal hijacking (Sydney-style)
- ✅ Data exfiltration (Samsung-style)
- ✅ Power-seeking (Auto-GPT-style)
- ✅ Emergent collusion
- ✅ Social engineering

## 📄 License

Educational and research use.

## 👤 Author

Built by Louisa Wamuyu Saburi



Protecting the Future of Multi-Agent AI Communication 🚀

Python usage:

from cogniguard import ClaimAnalyzer

# Analyze a claim written with character substitutions and slang
analyzer = ClaimAnalyzer()
result = analyzer.analyze("Th3 vaxx is s4fe fr fr no cap")

# Report each perturbation the analyzer flagged
if result.is_perturbed:
    print("⚠️ Perturbations detected!")
    for p in result.perturbations_detected:
        print(f"  - {p.perturbation_type.value}: {p.explanation}")

Command-line usage:

cogniguard analyze "Th3 vaxx is s4fe fr fr no cap"
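
The sample input `"Th3 vaxx is s4fe fr fr no cap"` mixes digit-for-letter substitutions with slang. To make concrete what a character-level perturbation looks like, here is a minimal, self-contained sketch that undoes the digit substitutions. It is an illustration only: the substitution table and both function names are assumptions made for this example, and nothing here is taken from CogniGuard's implementation or from the ACL 2025 paper.

```python
# Hypothetical look-alike table, chosen for this illustration only.
LEET_MAP = str.maketrans(
    {"3": "e", "4": "a", "0": "o", "1": "l", "5": "s", "7": "t"}
)


def normalize_leetspeak(text):
    """Replace look-alike digits with letters in tokens that mix both."""
    words = []
    for word in text.split():
        has_letter = any(c.isalpha() for c in word)
        has_digit = any(c.isdigit() for c in word)
        # Leave pure numbers such as "2025" untouched.
        if has_letter and has_digit:
            word = word.translate(LEET_MAP)
        words.append(word)
    return " ".join(words)


def has_digit_substitutions(text):
    """True if normalizing the text changes any token."""
    return normalize_leetspeak(text) != " ".join(text.split())


print(normalize_leetspeak("Th3 vaxx is s4fe fr fr no cap"))
```

A table like this only covers one narrow edit style. Slang such as "fr fr" or "no cap" passes through unchanged, which is part of why a dedicated analyzer is useful.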

        

MIT License

Copyright (c) 2024 Your Name

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.



Download files

Source Distribution

cogniguard-1.0.0.tar.gz (50.7 kB)

Uploaded Source

Built Distribution

cogniguard-1.0.0-py3-none-any.whl (59.0 kB)

Uploaded Python 3

File details

Details for the file cogniguard-1.0.0.tar.gz.

File metadata

  • Download URL: cogniguard-1.0.0.tar.gz
  • Upload date:
  • Size: 50.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for cogniguard-1.0.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 37fa13d73cd1551c94948bd53d8603c27f996e8dd5b85392e17b7aa148c85307 |
| MD5 | 67321ea34746005c503c52dc2b0c5ac5 |
| BLAKE2b-256 | a70d13107bba5b36c34f80d764f2653806d255c978165f8726651d72bf9c2d54 |

File details

Details for the file cogniguard-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: cogniguard-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 59.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for cogniguard-1.0.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 8d7ab7c492f3467b6029d12eb016d7c57cbb23c3f6ac4dd1de3ce2c1238489a6 |
| MD5 | 62644b5935fde6ef26d77162e310d141 |
| BLAKE2b-256 | dd9f00eabd32458ec7fed930fe2f18473e8a020b8b2be4461a9cb9896c5254ad |
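
The SHA256 digests listed for the two files can be checked against a local download. The sketch below uses only the Python standard library. The digests are copied from the file hash listings on this page, while the function names and the assumption that the downloaded files keep their original names are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings on this page.
EXPECTED_SHA256 = {
    "cogniguard-1.0.0.tar.gz":
        "37fa13d73cd1551c94948bd53d8603c27f996e8dd5b85392e17b7aa148c85307",
    "cogniguard-1.0.0-py3-none-any.whl":
        "8d7ab7c492f3467b6029d12eb016d7c57cbb23c3f6ac4dd1de3ce2c1238489a6",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True only if the file name is listed and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_hash("cogniguard-1.0.0.tar.gz")` returns `True` only when the file in the current directory is byte-for-byte the published source distribution. A file with an unlisted name is treated as a mismatch rather than raising an error.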
