DISKOVERY: Disk Forensics Tool for Data Categorization & Keyword Filtering

DISKOVERY is a Python-based digital forensics tool designed to analyze disk images. It performs a multi-stage forensic analysis including imaging, partition parsing, file categorization, keyword-based filtering, and automatic PDF reporting. The tool supports both complete and filtered analysis outputs and provides investigators with a concise overview of disk contents. It is a command-line interface (CLI) tool that works well on Ubuntu and Debian-based systems.


⚙️ Features

  • Disk Image Support (.img, .E01, .dd)
  • Partition Parsing using mmls
  • File Categorization:
    • Deleted
    • Encrypted
    • Current
    • Hidden
  • File Type Filtering (e.g., .pdf, .docx)
  • Keyword Search in extracted text-based files
  • Visual Summary via pie charts
  • PDF Report Generation with listings and visualizations
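
As an illustration of the partition-parsing feature, the sketch below runs mmls on an image and turns its table into Python dictionaries. It is not taken from DISKOVERY's source: the names `parse_mmls`, `list_partitions`, and `SAMPLE` are hypothetical, and the sample text is only an example of the usual mmls layout for a DOS partition table.

```python
import re
import subprocess

# One partition row of mmls output, for example:
# "002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)"
ROW = re.compile(r"^(\d+):\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.+)$")


def parse_mmls(text):
    """Return one dict per partition row found in mmls output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skip headers and blank lines
        index, slot, start, end, length, desc = match.groups()
        rows.append({
            "index": int(index),
            "slot": slot,
            "start": int(start),    # first sector
            "end": int(end),        # last sector (inclusive)
            "length": int(length),  # size in sectors
            "description": desc.strip(),
        })
    return rows


def list_partitions(image_path):
    """Run mmls (from sleuthkit) on a disk image and parse the result."""
    result = subprocess.run(
        ["mmls", image_path], capture_output=True, text=True, check=True
    )
    return parse_mmls(result.stdout)


# Illustrative sample only; real output depends on the image.
SAMPLE = """\
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000002047   0000002048   Unallocated
002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
"""

if __name__ == "__main__":
    for row in parse_mmls(SAMPLE):
        print(row["slot"], row["start"], row["length"], row["description"])
```

The start sector of each row is the offset that later sleuthkit calls such as fls typically need, which is why the sketch keeps it as an integer.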

Steps to use

  1. Insert the USB drive (pendrive) to be analyzed.
  2. Find the device path assigned to it: sudo fdisk -l
  3. Go to the folder containing main.py and run it with root privileges: sudo python3 main.py

📁 Project Structure

diskovery/
├── diskovery/                       # Main package
│   ├── __init__.py
│   ├── main.py                      # CLI entry point
│   ├── stages/                      # Stage-wise modular pipeline
│   │   ├── __init__.py
│   │   ├── stage1_disk_imaging.py
│   │   ├── stage2_extraction.py
│   │   ├── stage3_categorization.py
│   │   ├── stage4_filtering.py
│   │   ├── stage4_2_keyword.py
│   │   └── stage5_reporting.py
│   └── utils/                       # Utility functions
│       ├── __init__.py
│       └── run_command.py
│
├── README.md                        # Project overview and usage
├── LICENSE                          # MIT License
├── setup.py                         # Packaging configuration
├── requirements.txt                 # Python dependencies
├── MANIFEST.in                      # Include non-code files for PyPI
├── pyproject.toml                   # Build configuration
└── .gitignore                       # Git ignore rules
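
The stages rely on external tools, so a small wrapper such as utils/run_command.py is a natural place to centralize process calls. The actual contents of that file are not shown here; the following is only a guess at what such a helper might look like, with a hypothetical signature and return value.

```python
import subprocess
import sys


def run_command(args, timeout=None):
    """Run an external program and return (exit_code, stdout, stderr).

    args is a list such as ["mmls", "disk.img"]. No shell is involved,
    so paths containing spaces are passed through unchanged.
    """
    completed = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Harmless demonstration: ask the current interpreter to print a line.
    code, out, err = run_command([sys.executable, "-c", "print('ok')"])
    print(code, out.strip())
```

Returning the exit code instead of raising lets each stage decide whether a failing tool is fatal or just worth a note in the report.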

🚀 Quick Start

1. Clone the Repository

git clone https://github.com/simmithapad/DISKOVERY.git
cd DISKOVERY

2. Set Up a Virtual Environment and Install Python Packages

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

3. Start the Tool

python3 main.py

🛠️ Dependencies

System Tools (Installed via setup.sh)

  • dcfldd
  • sleuthkit (for mmls, fls, fsstat)
  • binwalk
  • grep and pdfgrep
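
Before starting an analysis it can help to confirm that these tools are on the PATH. The snippet below is a minimal sketch of such a check using only the standard library; `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the list simply mirrors the tools named above.

```python
import shutil

# Executables named in the list above (mmls, fls, fsstat come from sleuthkit).
REQUIRED_TOOLS = ["dcfldd", "mmls", "fls", "fsstat", "binwalk", "grep", "pdfgrep"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH.

    The lookup function is injectable so the check can be tested
    without the tools being installed.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system tools:", ", ".join(missing))
    else:
        print("All system tools found.")
```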

Python Packages

  • fpdf
  • elasticsearch
  • docx2txt
  • re (part of the Python standard library; no installation needed)
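
To show how `re` can support the file-type filtering and keyword search features, here is a minimal sketch for plain-text files. It is an assumption about the general approach, not DISKOVERY's code: `search_keywords` and its parameters are hypothetical, and formats such as .docx or .pdf would need a text-extraction step first (for example docx2txt or pdfgrep).

```python
import re
from pathlib import Path


def search_keywords(root, keywords, extensions=(".txt",)):
    """Map each matching file path to the sorted keywords found in it.

    Only files whose suffix is in `extensions` are read; matching is
    case-insensitive and keywords are treated as literal strings.
    """
    if not keywords:
        return {}
    pattern = re.compile(
        "|".join(re.escape(word) for word in keywords), re.IGNORECASE
    )
    wanted = {ext.lower() for ext in extensions}
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            hits[str(path)] = found
    return hits


if __name__ == "__main__":
    # Example call; the directory matches the Output section of this README.
    print(search_keywords("./output_files/extracted_files", ["password"]))
```

Escaping each keyword with `re.escape` keeps characters such as `.` or `+` in a search term from being interpreted as regular-expression syntax.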

📄 Output

  • Disk images saved in ./output_files/
  • PDF reports saved in ./output_files/reports/
  • Extracted files saved in ./output_files/extracted_files/
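
The layout above can be created up front so that every stage has a place to write. The following sketch assumes only the three paths listed; `prepare_output_dirs` and the dictionary keys are hypothetical names.

```python
from pathlib import Path


def prepare_output_dirs(base="."):
    """Create the output folders under `base` and return their paths."""
    root = Path(base) / "output_files"
    dirs = {
        "images": root,                          # disk images
        "reports": root / "reports",             # PDF reports
        "extracted": root / "extracted_files",   # extracted files
    }
    for path in dirs.values():
        # exist_ok makes repeated runs safe
        path.mkdir(parents=True, exist_ok=True)
    return dirs


if __name__ == "__main__":
    for name, path in prepare_output_dirs().items():
        print(name, path)
```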

📬 Future Work

  • GPU Acceleration
  • Memory Forensics Integration

👤 Authors

Simmi Thapad
Vrinda Abrol


License

This project is licensed under the MIT License - see the LICENSE file for details.


🔒 Disclaimer

Important: This tool is intended for educational and lawful forensic analysis only. Use responsibly.


