DISKOVERY: Disk Forensics Tool for Data Categorization & Keyword Filtering

DISKOVERY is a Python-based digital forensics tool designed to analyze disk images. It performs a multi-stage forensic analysis including imaging, partition parsing, file categorization, keyword-based filtering, and automatic PDF reporting. The tool supports both complete and filtered analysis outputs and provides investigators with a concise overview of disk contents. It is a command-line interface (CLI) tool that works well on Ubuntu and Debian-based systems.


⚙️ Features

  • Disk Image Support (.img, .E01, .dd)
  • Partition Parsing using mmls
  • File Categorization:
    • Deleted
    • Encrypted
    • Current
    • Hidden
  • File Type Filtering (e.g., .pdf, .docx)
  • Keyword Search in extracted text-based files
  • Visual Summary via pie charts
  • PDF Report Generation with file listings and visualizations
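
As an illustration of the partition-parsing step, here is a minimal sketch of turning `mmls` text output into structured records. It is not taken from the DISKOVERY source: the `Partition` type, the `parse_mmls` function, and the sample text are hypothetical, and the sample is only modeled on typical `mmls` output for a DOS partition table.

```python
import re
from typing import List, NamedTuple


class Partition(NamedTuple):
    slot: str
    start: int        # first sector
    end: int          # last sector
    length: int       # size in sectors
    description: str


# One table row: "<index>:  <slot>  <start>  <end>  <length>  <description>"
ROW = re.compile(r"^\d+:\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$")


def parse_mmls(output: str) -> List[Partition]:
    """Return the allocated partitions found in mmls text output (hypothetical helper)."""
    partitions = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header lines and blank lines
        slot, start, end, length, desc = match.groups()
        # Skip metadata rows and unallocated gaps (slot shown as dashes).
        if slot == "Meta" or set(slot) == {"-"}:
            continue
        partitions.append(
            Partition(slot, int(start), int(end), int(length), desc.strip())
        )
    return partitions


# Illustrative sample only, not output from a real case.
SAMPLE = """\
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000002047   0000002048   Unallocated
002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
"""

for part in parse_mmls(SAMPLE):
    print(part.slot, part.start, part.length, part.description)
```

The start sector of each partition is the value that Sleuth Kit tools such as `fls` and `fsstat` accept as the offset (`-o`) when reading a file system inside the image.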

Steps to use

  1. Insert the pen drive (USB flash drive).
  2. Find the device path assigned to it: sudo fdisk -l
  3. Go to the script folder and run main.py: sudo python3 main.py

📁 Project Structure

DISKOVERY/
├── stages/
│   ├── __init__.py
│   ├── stage1_disk_imaging.py
│   ├── stage2_extraction.py
│   ├── stage3_categorization.py
│   ├── stage4_filtering.py
│   ├── stage4_2_keyword.py
│   └── stage5_reporting.py
├── utils/
│   ├── __init__.py
│   └── run_command.py
├── main.py
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
├── MANIFEST.in
└── pyproject.toml
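
The stages shell out to external forensic tools, so a small command wrapper is a natural fit for `utils/run_command.py`. The sketch below is an assumption about what such a helper could look like, not the project's actual code; the `run_command` name and signature are hypothetical.

```python
import subprocess
import sys
from typing import List, Optional


def run_command(args: List[str], timeout: Optional[float] = None) -> str:
    """Run a command without a shell and return its stdout as text.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,           # fail loudly on a non-zero exit status
    )
    return result.stdout


# Usage example with a command that exists wherever Python does.
print(run_command([sys.executable, "-c", "print('ok')"]).strip())
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with image paths that contain spaces.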

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/simmithapad/DISKOVERY.git
cd DISKOVERY

2. Create a Virtual Environment and Install Python Packages

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

3. Start the Tool

python3 main.py

🛠️ Dependencies

System Tools (Installed via setup.sh)

  • dcfldd
  • sleuthkit (for mmls, fls, fsstat)
  • binwalk
  • grep and pdfgrep
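
Before starting an analysis it can help to confirm that these tools are on the PATH. The check below is a minimal sketch and not part of DISKOVERY; the `REQUIRED_TOOLS` list and the `missing_tools` function are hypothetical names, with the executable names taken from the list above.

```python
import shutil

# Executables named in the dependency list (sleuthkit provides mmls, fls, fsstat).
REQUIRED_TOOLS = ["dcfldd", "mmls", "fls", "fsstat", "binwalk", "grep", "pdfgrep"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
else:
    print("All system tools found.")
```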

Python Packages

  • fpdf
  • elasticsearch
  • docx2txt
  • re (part of the Python standard library; no installation needed)

📄 Output

  • Disk images saved in ./output_files/
  • PDF reports saved in ./output_files/reports/
  • Extracted files saved in ./output_files/extracted_files/
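
The keyword filter can be pictured as a walk over the extracted files directory. The following is a minimal sketch under stated assumptions, not the tool's implementation: `find_keyword_hits` is a hypothetical function, it handles plain-text files only (PDF and .docx content would need pdfgrep or docx2txt), and the demo uses a temporary directory in place of ./output_files/extracted_files/.

```python
import tempfile
from pathlib import Path


def find_keyword_hits(root, keywords, extensions=(".txt", ".csv", ".log")):
    """Map each matching file path to the keywords it contains (case-insensitive)."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        # File type filtering: skip directories and unlisted extensions.
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        found = [word for word in keywords if word.lower() in text]
        if found:
            hits[str(path)] = found
    return hits


# Demo on throwaway files; the file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes.txt").write_text("Wire the PAYMENT on Friday", encoding="utf-8")
    (Path(tmp) / "image.bin").write_bytes(b"\x00payment\x00")
    demo_hits = find_keyword_hits(tmp, ["payment", "invoice"])
    demo_names = {Path(p).name: found for p, found in demo_hits.items()}

print(demo_names)  # {'notes.txt': ['payment']}
```

Reading with `errors="ignore"` keeps the scan from stopping on files that carry a text extension but contain undecodable bytes, which is common for recovered data.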

📬 Future Work

  • GPU Acceleration
  • Memory Forensics Integration

👤 Authors

Simmi Thapad
Vrinda Abrol


License

This project is licensed under the MIT License - see the LICENSE file for details.


🔒 Disclaimer

Important: This tool is intended for educational and lawful forensic analysis only. Use responsibly.
