
A Darkweb OSINT tool for scraping and searching onion sites.

Project description

Darkweb OSINT Package

A modular Open Source Intelligence (OSINT) tool designed to search and scrape data from the Darkweb (Tor network). This package aggregates results from multiple darkweb search engines and provides a concurrent scraper to extract content from .onion sites.

Features

  • Multi-Engine Search: Queries 15+ darkweb search engines (Ahmia, Torch, OnionLand, etc.) concurrently.
  • Concurrent Scraping: Rapidly extracts text content from multiple .onion URLs simultaneously.
  • Tor Integration: Built-in SOCKS5 proxy configuration for seamless routing through the Tor network.
  • User-Agent Rotation: Randomizes user agents to minimize blocking.
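User-agent rotation is typically implemented by attaching a randomly chosen browser string to each outgoing request. A minimal sketch of the idea (the agent strings and `random_headers` helper here are illustrative, not the package's actual list or API):

```python
import random

# Illustrative user-agent strings; the package ships its own rotation list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
]

def random_headers() -> dict:
    """Pick a random User-Agent header for the next request."""
    return {"User-Agent": random.choice(USER_AGENTS)}
```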

Prerequisites

Crucial: You must have the Tor service running on your machine for this package to function. This package routes traffic through the local Tor SOCKS5 proxy (default port 9050).

  • Linux (Debian/Ubuntu):
    sudo apt install tor
    sudo service tor start
    
  • Windows: Download the Tor Browser or the Tor Expert Bundle and ensure it is running.
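Routing a Python HTTP client through the local Tor proxy comes down to a requests-style proxies mapping. A sketch, assuming the default `127.0.0.1:9050` endpoint; note that the `socks5h` scheme (rather than `socks5`) pushes DNS resolution through Tor, which is required for `.onion` hostnames to resolve at all:

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build a requests-style proxies mapping for the local Tor SOCKS5 proxy.

    'socks5h' makes DNS resolution happen inside Tor, which .onion
    hostnames require.
    """
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}

# Usage with requests (needs the 'requests[socks]' extra installed):
# requests.get("http://example.onion", proxies=tor_proxies(), timeout=30)
```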

Installation

This package is managed via pyproject.toml. You can install it directly:

pip install osint-darkweb-pkg

Usage

**1. Searching**

Search for a keyword across multiple hidden-service search engines.

```python
from osint_darkweb_pkg import get_search_results

query = "example query"

# Returns a list of dictionaries: [{'title': '...', 'link': '...'}]
results = get_search_results(query)

print(f"Found {len(results)} links.")
```
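Because the same link can surface from several of the aggregated engines, the combined results may contain duplicates. A sketch of deduplicating by link while preserving order (`dedupe_results` is an illustrative helper, not part of the package's API):

```python
def dedupe_results(results: list[dict]) -> list[dict]:
    """Keep only the first occurrence of each 'link', preserving order."""
    seen = set()
    unique = []
    for item in results:
        link = item.get("link")
        if link and link not in seen:
            seen.add(link)
            unique.append(item)
    return unique
```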

**2. Scraping**

Extract text content from a list of .onion URLs.

```python
from osint_darkweb_pkg import scrape_multiple

# Specific .onion links, or results from the search step
urls = [
    {'link': 'http://example1.onion', 'title': 'Example 1'},
    {'link': 'http://example2.onion', 'title': 'Example 2'},
]

# Returns a dictionary: {'http://example1.onion': 'Page content...'}
data = scrape_multiple(urls)

for url, content in data.items():
    print(f"URL: {url}\nContent: {content[:100]}...\n")
```
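Concurrent scraping of this kind is commonly built on a thread pool, since the work is I/O-bound. A sketch of one way it can be structured, with the actual fetch function passed in as a parameter (this helper is illustrative, not the package's internal implementation):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def scrape_concurrently(urls, fetch, max_workers=8):
    """Fetch many URLs in parallel; returns {url: content or None on error}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # A slow or unreachable hidden service should not
                # abort the whole batch.
                results[url] = None
    return results
```

Keying the futures by URL lets results arrive in completion order while still being reported per-URL, and swallowing per-request exceptions keeps one dead onion service from failing the batch.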

Structure

  • search.py: Handles sending queries to search engine endpoints and parsing results.
  • scrape.py: Handles visiting specific .onion URLs and extracting text.
  • utils.py: Manages shared resources like Tor proxy configuration and User-Agent rotation.

Disclaimer

This tool is intended for educational and research purposes only (e.g., security research, OSINT investigations). The authors are not responsible for any misuse of this tool. Accessing the dark web may be illegal or monitored in certain jurisdictions. Use responsibly.



Download files

Download the file for your platform.

Source Distribution

osint_darkweb_pkg-0.1.0.tar.gz (7.0 kB)

Uploaded Source

Built Distribution


osint_darkweb_pkg-0.1.0-py3-none-any.whl (9.9 kB)

Uploaded Python 3

File details

Details for the file osint_darkweb_pkg-0.1.0.tar.gz.

File metadata

  • Download URL: osint_darkweb_pkg-0.1.0.tar.gz
  • Upload date:
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for osint_darkweb_pkg-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7a42152d02d3493ec01e8913964e05b4b850107f9be06f50ba89978795463e42
MD5 4d248b6e8e78543bc963470309dd5722
BLAKE2b-256 74cabdb1b370bb98b7738848c2fa6b08643dd34fc441695d70d88a8d6eec233e


File details

Details for the file osint_darkweb_pkg-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for osint_darkweb_pkg-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1154f5afc66a0c2b15f0690539ead7695ccff6bf2120268203342cbf10329fb9
MD5 12392e0fc7aafef15a07dbbc2b35384e
BLAKE2b-256 f267a8ecffa411ab4e9e0eab759929bd760dd79c1adbe894b0eac78195b2a611

