DISKOVERY: Disk Forensics Tool for Data Categorization & Keyword Filtering

DISKOVERY is a Python-based digital forensics tool designed to analyze disk images. It performs a multi-stage forensic analysis including imaging, partition parsing, file categorization, keyword-based filtering, and automatic PDF reporting. The tool supports both complete and filtered analysis outputs and provides investigators with a concise overview of disk contents. It is a command-line interface (CLI) tool that works well on Ubuntu and Debian-based systems.


โš™๏ธ Features

  • Disk Image Support (.img, .E01, .dd)
  • Partition Parsing using mmls
  • File Categorization:
    • Deleted
    • Encrypted
    • Current
    • Hidden
  • File Type Filtering (e.g., .pdf, .docx)
  • Keyword Search in extracted text-based files
  • Visual Summary via pie charts
  • PDF Report Generation with file listings and visualizations
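
The partition-parsing feature above relies on mmls from The Sleuth Kit. As a rough illustration of what that stage might involve, the sketch below turns mmls text output into structured records. The `Partition` type, the `parse_mmls` function, and the sample output are hypothetical and written for this example; they are not taken from the DISKOVERY source.

```python
import re
from typing import NamedTuple


class Partition(NamedTuple):
    """One row of an mmls partition table (hypothetical helper type)."""
    index: int
    slot: str
    start: int      # first sector
    end: int        # last sector (inclusive)
    length: int     # number of sectors
    description: str


# Matches rows such as:
# 002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
ROW = re.compile(r"^(\d+):\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$")


def parse_mmls(output: str) -> list[Partition]:
    """Parse the text printed by mmls into a list of Partition records."""
    partitions = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip the header and blank lines
        idx, slot, start, end, length, desc = match.groups()
        partitions.append(
            Partition(int(idx), slot, int(start), int(end), int(length), desc.strip())
        )
    return partitions


# Illustrative sample in the layout mmls prints for a DOS partition table.
SAMPLE = """\
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000002047   0000002048   Unallocated
002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
"""

if __name__ == "__main__":
    for part in parse_mmls(SAMPLE):
        print(part)
```

The `start` value of a data partition is the sector offset that tools such as fls and fsstat accept through their `-o` option, which is why keeping it as an integer is convenient.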

Steps to use

  1. Insert the pen drive (USB flash drive).
  2. Find the device name it was assigned: sudo fdisk -l
  3. Go to the script folder and run main.py: sudo python3 main.py

๐Ÿ“ Project Structure

DISKOVERY/
โ”œโ”€โ”€ stages/
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ stage1_disk_imaging.py
โ”‚   โ”œโ”€โ”€ stage2_extraction.py
โ”‚   โ”œโ”€โ”€ stage3_categorization.py
โ”‚   โ”œโ”€โ”€ stage4_filtering.py
โ”‚   โ”œโ”€โ”€ stage4_2_keyword.py
โ”‚   โ””โ”€โ”€ stage5_reporting.py
โ”œโ”€โ”€ utils/
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ””โ”€โ”€ run_command.py
โ”œโ”€โ”€ main.py
โ”œโ”€โ”€ LICENSE
โ”œโ”€โ”€ README.md
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ setup.py
โ”œโ”€โ”€ MANIFEST.in
โ””โ”€โ”€ pyproject.toml
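
The tree lists a utils/run_command.py module, and the stages depend on external tools such as mmls and fls. The actual contents of that module are not shown here, so the following is only a guess at what a shared helper of this kind could look like: a thin wrapper around `subprocess.run`. The function name and signature are assumptions made for illustration.

```python
import subprocess
import sys


def run_command(args, timeout=None):
    """Run an external command and return its standard output as text.

    `args` is a list such as ["mmls", "image.dd"]. Passing a list (rather
    than a shell string) avoids shell-quoting problems with file names.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise on a non-zero exit code
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # Uses the current Python interpreter so the demo runs anywhere.
    print(run_command([sys.executable, "-c", "print('ok')"]))
```

Routing every tool invocation through one function like this keeps error handling and output decoding in a single place.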

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/simmithapad/DISKOVERY.git
cd DISKOVERY

2. Run Setup (Installs Python Packages)

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

3. Start the Tool

python3 main.py

๐Ÿ› ๏ธ Dependencies

System Tools (Installed via setup.sh)

  • dcfldd
  • sleuthkit (for mmls, fls, fsstat)
  • binwalk
  • grep and pdfgrep

Python Packages

  • fpdf
  • elasticsearch
  • docx2txt
  • re (part of the Python standard library; no installation needed)
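
To show how keyword filtering over extracted text files can work with the re module listed above, here is a minimal sketch. It is not the tool's own stage4_2_keyword.py; the function name `search_keywords`, its parameters, and the choice to read files as UTF-8 are assumptions for this example. It handles plain-text files only, whereas .docx or .pdf content would first need text extraction (for instance with docx2txt or pdfgrep).

```python
import re
from pathlib import Path


def search_keywords(root, keywords, extensions=(".txt",)):
    """Return {file path: sorted matched keywords} for files under `root`.

    Matching is case-insensitive and keywords are treated as literal
    strings, not as regular expressions.
    """
    if not keywords:
        return {}
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    wanted = {ext.lower() for ext in extensions}
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        # Apply the file-type filter before reading anything.
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    # Hypothetical call against the extraction directory named in this README.
    for file_path, words in search_keywords(
        "./output_files/extracted_files", ["invoice", "password"]
    ).items():
        print(file_path, words)
```

Escaping each keyword with `re.escape` means that search terms containing characters such as `.` or `+` are matched literally.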

📄 Output

  • Disk images saved in ./output_files/
  • PDF reports saved in ./output_files/reports/
  • Extracted files saved in ./output_files/extracted_files/
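
The output layout above can be expressed as a small helper that creates the directories before a run. This is only a sketch: the functions `output_layout` and `ensure_output_dirs` and the dictionary keys are hypothetical, and only the directory names come from this README.

```python
from pathlib import Path


def output_layout(base="."):
    """Map each kind of output to its directory under `base`."""
    root = Path(base) / "output_files"
    return {
        "images": root,                          # disk images
        "reports": root / "reports",             # PDF reports
        "extracted": root / "extracted_files",   # extracted files
    }


def ensure_output_dirs(base="."):
    """Create the output directories if they are missing and return the layout."""
    layout = output_layout(base)
    for directory in layout.values():
        directory.mkdir(parents=True, exist_ok=True)
    return layout


if __name__ == "__main__":
    for name, directory in ensure_output_dirs().items():
        print(f"{name}: {directory}")
```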

📬 Future Work

  • GPU Acceleration
  • Memory Forensics Integration

👤 Authors

Simmi Thapad
Vrinda Abrol


License

This project is licensed under the MIT License - see the LICENSE file for details.


🔒 Disclaimer

Important: This tool is intended for educational and lawful forensic analysis only. Use it responsibly.
