A Python library to detect technologies used by websites

TechFinder

TechFinder is a Python library that detects the technologies a website uses. It inspects both the HTML content and the HTTP headers returned for a given URL to identify frameworks, libraries, and server software. The library ships with a default set of patterns and also accepts user-defined patterns from a JSON configuration file.

Features

  • Technology Detection: Detects web technologies such as frameworks (e.g., React, Angular), server technologies (e.g., Apache, Nginx), and cloud services (e.g., AWS, Google Cloud).
  • Custom Patterns: Users can define their own patterns for technology detection via a custom JSON file.
  • HTML and HTTP Header Parsing: Examines both HTML content and HTTP headers to identify technologies.
  • Extensible: Easily extendable to support additional technologies by updating the patterns.
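
The pattern-based approach described in the features above can be illustrated with a short, self-contained sketch. This is not techfinder's actual implementation: the `match_patterns` function, the sample patterns, and the choice to treat patterns as case-insensitive regular expressions are all assumptions made for illustration.

```python
import re

# Hypothetical pattern tables, shaped like the custom JSON file described
# later in this README. These are NOT techfinder's built-in defaults.
PATTERNS = {
    "html_patterns": {"React": r"data-reactroot|react(?:\.production)?\.min\.js"},
    "header_patterns": {"Nginx": r"nginx"},
}


def match_patterns(html, headers, patterns):
    """Return the sorted names of technologies whose pattern matches."""
    found = set()
    # Check every HTML pattern against the page source.
    for name, pattern in patterns.get("html_patterns", {}).items():
        if re.search(pattern, html, re.IGNORECASE):
            found.add(name)
    # Flatten the headers into text so one search covers names and values.
    header_text = "\n".join(f"{key}: {value}" for key, value in headers.items())
    for name, pattern in patterns.get("header_patterns", {}).items():
        if re.search(pattern, header_text, re.IGNORECASE):
            found.add(name)
    return sorted(found)


sample_html = '<div id="root" data-reactroot=""></div>'
sample_headers = {"Server": "nginx/1.25.3"}
result = match_patterns(sample_html, sample_headers, PATTERNS)
print(result)
```

Keeping HTML and header patterns in separate tables mirrors the two sections of the custom JSON file, so a new technology can be added by editing data rather than code.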

Installation

To install the techfinder library, run the following command:

pip install techfinder

Alternatively, clone this repository and install dependencies manually:

git clone https://github.com/Prathameshsci369/Detector.git
cd Detector
pip install -r requirements.txt

Usage

Importing and Initializing

First, import the Detector class from the techfinder package:

from techfinder import Detector

You can initialize the detector with a default pattern set or provide a custom JSON file containing user-defined patterns.

# Initialize with default patterns
detector = Detector()

# Initialize with custom patterns (provide path to your custom JSON config)
detector = Detector('custom_patterns.json')

Basic Detection Use Case

To detect technologies from a website, you can use the final_function method, which fetches the URL and analyzes its HTML and headers:

url = 'https://example.com'
detected_tech = detector.final_function(url)

print("Detected Technologies:", detected_tech)

Custom Patterns

If you want to use your own patterns to detect specific technologies, you can create a custom JSON file like the one below:

custom_patterns.json

{
  "html_patterns": {
    "MyCustomTech": "mycustomtech"
  },
  "header_patterns": {
    "MyCustomServer": "mycustomserver"
  }
}

You can then initialize Detector with this file:

detector = Detector('custom_patterns.json')
detected_tech = detector.final_function('https://example.com')
print("Detected Technologies:", detected_tech)
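
If the patterns are maintained in code, the custom_patterns.json file can also be generated from Python. The sketch below uses only the standard library; the `write_patterns` helper is hypothetical, and its check that both sections are present is an assumption, since this README does not say whether the library requires both.

```python
import json
import os
import tempfile

patterns = {
    "html_patterns": {"MyCustomTech": "mycustomtech"},
    "header_patterns": {"MyCustomServer": "mycustomserver"},
}


def write_patterns(path, patterns):
    """Check the two top-level sections, then write them as JSON."""
    for section in ("html_patterns", "header_patterns"):
        # Assumption: both sections must be JSON objects (Python dicts).
        if not isinstance(patterns.get(section), dict):
            raise ValueError(f"missing or invalid section: {section}")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(patterns, fh, indent=2)


# Written to a temporary directory so the sketch leaves no files behind
# in the working directory.
path = os.path.join(tempfile.mkdtemp(), "custom_patterns.json")
write_patterns(path, patterns)

# Read the file back to confirm it round-trips unchanged.
with open(path, encoding="utf-8") as fh:
    loaded = json.load(fh)
print(loaded == patterns)
```

The resulting path can then be passed to Detector in the same way as the hand-written file.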

Example Outputs

Example 1: Default Patterns

For a URL like https://example.com, the output might be:

Detected Technologies: ['React', 'Node.js', 'Express']

Example 2: Custom Patterns

If the URL matches custom patterns in custom_patterns.json, the output might look like:

Detected Technologies: ['MyCustomTech', 'MyCustomServer']

Logging

The techfinder library uses Python's built-in logging module to provide detailed information during execution. By default, it logs important actions such as pattern loading and technology detection. You can customize the logging level as needed:

import logging
logging.basicConfig(level=logging.DEBUG)  # Change logging level to DEBUG
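
For finer control, the standard logging module also allows a custom format and per-logger levels. In the sketch below, the logger name "techfinder" is an assumption; this README does not state which logger name the library uses.

```python
import logging

# Configure the root logger with timestamps and logger names.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: the library logs under a logger named "techfinder".
# If it logs through the root logger instead, this has no effect on its
# messages and the basicConfig level above applies.
tech_logger = logging.getLogger("techfinder")
tech_logger.setLevel(logging.DEBUG)
```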

Error Handling

The library handles common errors, such as invalid URLs or failed requests, gracefully. If something goes wrong, an error message is written to the log and the program continues running.

detected_tech = detector.final_function('https://invalid-url.com')
# Will log an error: "Error fetching the URL"
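
This README does not specify what final_function returns after a failed fetch, so calling code may want to normalize the result. The following is a minimal sketch under that assumption: `safe_detect` is a hypothetical helper, and `StubDetector` is a stand-in for techfinder's Detector so the example runs without network access.

```python
import logging

logger = logging.getLogger(__name__)


def safe_detect(detector, url):
    """Call detector.final_function(url) and always return a list."""
    try:
        result = detector.final_function(url)
    except Exception:
        # Defensive: the README says errors are logged, but this guard
        # protects the caller if an exception does escape.
        logger.exception("Detection failed for %s", url)
        return []
    # Assumption: a failed fetch may yield None or an empty value.
    return list(result) if result else []


class StubDetector:
    """Hypothetical stand-in for techfinder.Detector, used only for this demo."""

    def final_function(self, url):
        if "invalid" in url:
            return None
        return ["Apache"]


stub = StubDetector()
print(safe_detect(stub, "https://invalid-url.com"))
print(safe_detect(stub, "https://example.com"))
```

With a real Detector instance, `safe_detect(detector, url)` would be called in the same way.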

Use Cases

Use Case 1: Identify Web Frameworks and Libraries

techfinder can determine which frameworks and libraries a website uses, for example whether it is built with React, Vue.js, or Angular.

detector = Detector()
url = 'https://some-react-site.com'
detected_tech = detector.final_function(url)
print(detected_tech)  # Possible output: ['React']

Use Case 2: Identify Server Technologies

You can use this library to detect the server-side technology used by a website, such as Apache, Nginx, or a cloud platform like AWS.

detector = Detector()
url = 'https://some-apache-server.com'
detected_tech = detector.final_function(url)
print(detected_tech)  # Possible output: ['Apache']

Use Case 3: Customize Patterns for Specific Technologies

If you have specific technologies that are not part of the default set, you can define your own patterns in a custom JSON file.

{
  "html_patterns": {
    "MyCustomTech": "mycustomtech"
  },
  "header_patterns": {
    "MyCustomServer": "mycustomserver"
  }
}

This allows you to track and detect technologies that are unique to your environment or your use case.

Use Case 4: Monitor Technology Changes

By integrating techfinder into your monitoring tools, you can keep track of which technologies are being used on various websites over time. This could be useful for identifying when websites update their tech stack.

detector = Detector()
url = 'https://example.com'
detected_tech = detector.final_function(url)
# Log detected technologies every week
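
One way to turn periodic scans into change reports is to compare each result with the previous one. The `diff_stacks` helper and the sample data below are hypothetical; in practice the two lists would come from final_function calls made at different times.

```python
def diff_stacks(previous, current):
    """Compare two lists of detected technologies."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),    # present now, absent before
        "removed": sorted(prev - curr),  # present before, absent now
    }


# Made-up snapshots standing in for two weekly scans of the same site.
last_week = ["React", "Nginx"]
this_week = ["React", "Apache", "Express"]

changes = diff_stacks(last_week, this_week)
print(changes)
```

Storing each week's list (for example as JSON) and diffing it against the next scan gives a simple record of when a site's stack changed.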

Contributing

We welcome contributions to the techfinder library! If you'd like to report bugs, suggest new features, or help improve the documentation, feel free to open an issue or submit a pull request.
