Tor onion site scraping tool

torspy

torspy is a Python package for scraping .onion sites over the Tor network. It offers a straightforward interface for retrieving HTML content from .onion URLs, searching that content for specific text, and saving the results to a file. It can also check a site for directories named in a list file.

Installation

You can install torspy via pip:

pip install torspy

Usage

Command-Line Interface

torspy allows you to interact with .onion sites from the command line:

  • To display the content of a .onion site:
torspy http://example.onion
  • To save the displayed content to a file:
torspy http://example.onion -s file.html
  • The -s flag indicates saving, and you can specify any file name.

  • To search for specific text within the content and save the results to a file:

torspy http://example.onion --find "search query" -s search_results.html
  • The --find flag followed by the search query indicates searching for specific text.

  • The -s flag followed by the file name indicates saving the search results.

  • To save the content to a specified directory:

torspy http://example.onion -s file.html -d /path/to/directory
  • The -d flag followed by the directory path indicates where to save the file.

  • To check for directories listed in a file:

torspy http://example.onion --dir directories.txt
  • The --dir flag followed by the file name checks the site for each directory listed in that file.
  • To save the results of the directory check to a file:
torspy http://example.onion --dir directories.txt -s output.txt
  • For more information on available options, you can use the --help flag:
torspy --help
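
As a rough illustration of what a directory check like --dir involves, the sketch below turns a list of directory names into candidate URLs to probe. It does not reproduce torspy's own implementation, which is not documented here; the function name candidate_urls and the rule of skipping blank lines are assumptions made for the example.

```python
def candidate_urls(base_url, wordlist_lines):
    """Build one candidate URL per non-empty wordlist entry (hypothetical helper)."""
    base = base_url.rstrip("/") + "/"
    urls = []
    for line in wordlist_lines:
        # Drop surrounding whitespace and slashes so "/login/" and "login" match.
        entry = line.strip().strip("/")
        if not entry:
            continue  # assumption: blank lines in the list are ignored
        urls.append(base + entry)
    return urls


if __name__ == "__main__":
    # Example entries; in practice these would be read from directories.txt.
    for url in candidate_urls("http://example.onion", ["admin", "/login/", "", "backup"]):
        print(url)
```

Each resulting URL would then be requested through Tor to see whether the directory responds.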

Additional Examples

  • Search a .onion site for "important information" and save the results to a file named results.html in the specified directory:
torspy http://example.onion --find "important information" -s results.html -d /path/to/directory
  • Save the entire HTML content of a .onion site to a file named full_content.html in the current directory:
torspy http://example.onion -s full_content.html
  • Display the content of a .onion site and save it to a file named output.txt in the current directory:
torspy http://example.onion -s output.txt

Using torspy in a Bash Script

  • You can incorporate torspy into your Bash scripts for automated tasks. Here's an example script that fetches content from a list of .onion URLs and saves it to individual files:
#!/bin/bash

# List of .onion URLs
urls=("http://example1.onion" "http://example2.onion" "http://example3.onion")

# Loop through each URL
for url in "${urls[@]}"; do
    # Fetch content and save it to a file named after the text following
    # the last "/" in the URL (e.g. example1.onion.html)
    torspy "$url" -s "${url##*/}.html"
done
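
The same loop can be sketched in Python. This is a minimal version under two assumptions that the text above does not confirm: that torspy is on the PATH, and that it exits with a non-zero status when a fetch fails. The helper names output_name and fetch_all, and the runner parameter (included only so the loop can be exercised without Tor), are hypothetical.

```python
import subprocess
from urllib.parse import urlparse


def output_name(url):
    """Derive a file name such as 'example1.onion.html' from the URL's host."""
    host = urlparse(url).netloc or "page"
    return f"{host}.html"


def fetch_all(urls, runner=subprocess.run):
    """Run `torspy <url> -s <file>` for each URL and return the URLs that failed."""
    failed = []
    for url in urls:
        result = runner(["torspy", url, "-s", output_name(url)])
        # Assumption: a non-zero exit status signals a failed fetch.
        if result.returncode != 0:
            failed.append(url)
    return failed
```

Calling fetch_all(["http://example1.onion", "http://example2.onion"]) would invoke torspy once per URL. Naming files after the host rather than the text after the last slash avoids an empty file name when a URL ends in "/", which is what ${url##*/} would produce in the Bash version.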

Integrating torspy with Other Languages

Ruby

  • You can call the torspy command-line tool from Ruby using the system method; passing the arguments separately bypasses the shell:
system("torspy", "http://example.onion", "-s", "output.html")

Python

  • You can use the subprocess module to call torspy from a Python script:
import subprocess

subprocess.run(["torspy", "http://example.onion", "-s", "output.html"])
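
For scripts that need the other flags or some error handling, a small wrapper may help. The sketch below uses only the --find, -s and -d flags described earlier. The function names build_command and run_torspy, the 120-second timeout, and the assumption that torspy returns a non-zero exit status on failure are illustrative choices, not documented behaviour.

```python
import subprocess


def build_command(url, find=None, save=None, directory=None):
    """Assemble a torspy argument list from the documented flags."""
    cmd = ["torspy", url]
    if find is not None:
        cmd += ["--find", find]
    if save is not None:
        cmd += ["-s", save]
    if directory is not None:
        cmd += ["-d", directory]
    return cmd


def run_torspy(url, timeout=120, **options):
    """Run torspy and return its stdout, or None if the run did not succeed."""
    try:
        result = subprocess.run(
            build_command(url, **options),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError:
        # torspy is not installed or not on PATH.
        return None
    except subprocess.TimeoutExpired:
        # Tor connections can be slow; the default timeout is an arbitrary choice.
        return None
    # Assumption: a zero exit status means the request succeeded.
    return result.stdout if result.returncode == 0 else None
```

Passing the arguments as a list, as the original example does, means a search query containing spaces or quotes reaches torspy as a single argument without shell quoting.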

PHP

  • You can use the shell_exec function to call torspy from PHP:
<?php
shell_exec("torspy http://example.onion -s output.html");
?>

Node.js

  • You can use the child_process module to call torspy from Node.js:
const { exec } = require('child_process');

exec('torspy http://example.onion -s output.html', (error, stdout, stderr) => {
    if (error) {
        console.error(`Error: ${error.message}`);
        return;
    }
    if (stderr) {
        console.error(`Stderr: ${stderr}`);
        return;
    }
    console.log(`Output: ${stdout}`);
});

How torspy Works

torspy processes a .onion URL in the following steps:

  • Checking Site Existence: It verifies if the .onion site exists and is reachable through the Tor network.
  • Fetching HTML Content: It retrieves the HTML content of the .onion site using Tor for anonymity.
  • Scraping and Searching: If specified, torspy searches for specific text within the content and extracts matching results.
  • Saving Results: Optionally, torspy allows you to save the retrieved content, either the entire HTML or the search results, to a file.
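
The searching and saving steps above can be pictured with a short sketch. It is not torspy's actual code: the case-insensitive, line-by-line matching rule and the names find_matches and save_results are assumptions for illustration. The fetch step is left out because it needs a running Tor client.

```python
from pathlib import Path


def find_matches(html, query):
    """Return the stripped lines of `html` that contain `query`, ignoring case."""
    needle = query.lower()
    return [line.strip() for line in html.splitlines() if needle in line.lower()]


def save_results(lines, file_name, directory="."):
    """Write the lines to directory/file_name and return the resulting path."""
    target = Path(directory) / file_name
    # Create the target directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("\n".join(lines), encoding="utf-8")
    return target
```

Given HTML that has already been fetched, save_results(find_matches(html, "important information"), "results.html", "/path/to/directory") would mirror the --find, -s and -d combination shown earlier.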

Contributing to torspy

If you're interested in contributing to torspy, you can:

  • Report issues encountered while using torspy.
  • Suggest new features or enhancements.
  • Submit pull requests with improvements or fixes.

Disclaimer

This tool is intended for ethical use only. The author is not responsible for any misuse or damage caused by this tool. Users are responsible for ensuring their activities comply with all relevant laws and regulations.
    

