
Domain enrichment kit


Richkit

Richkit is a Python 3 package that provides tools which take a domain name as input and return additional information on that domain. The information can come from an analysis of the domain itself, be looked up in databases, be retrieved from other services, or some combination thereof.

The purpose of richkit is to provide a reusable library of domain name-related analysis, lookup, and retrieval functions that are shared within the Network Security research group at Aalborg University and are also available to the public for reuse and modification.

Documentation can be found at https://richkit.readthedocs.io/en/latest/.

Requirements

  • Python >= 3.5

Installation

To install richkit, run pip install richkit in a terminal.

Usage

The following snippets retrieve the effective TLD and the URL category, respectively.

  • Retrieving the effective top-level domain of a given URL:

    >>> from richkit.analyse import tld
    >>> urls = ["www.aau.dk", "www.github.com", "www.google.com"]
    >>>
    >>> for url in urls:
    ...     print(tld(url))
    dk
    com
    com
    
  • Retrieving the category of a given URL:

    >>> from richkit.retrieve.symantec import fetch_from_internet
    >>> from richkit.retrieve.symantec import LocalCategoryDB
    >>>
    >>> urls = ["www.aau.dk", "www.github.com", "www.google.com"]
    >>>
    >>> local_db = LocalCategoryDB()
    >>> for url in urls:
    ...     url_category = local_db.get_category(url)
    ...     if url_category == '':
    ...         url_category = fetch_from_internet(url)
    ...     print(url_category)
    Education
    Technology/Internet
    Search Engines/Portals
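
The category example above follows a local-first pattern: ask the local store, and only contact the remote service when nothing is found. A minimal sketch of that pattern in isolation is shown here. The function categorize and the stand-ins fake_local and fake_remote are hypothetical names used for illustration and are not part of richkit; the sketch assumes, as the example does, that an empty string means "not found locally".

```python
def categorize(url, local_lookup, remote_fetch):
    """Return the category from the local store, else from the remote source."""
    category = local_lookup(url)
    if category == "":  # assumption: '' signals a local miss
        category = remote_fetch(url)
    return category


# Hypothetical stand-ins so the sketch runs offline; not richkit functions.
_local = {"www.aau.dk": "Education"}
remote_calls = []  # records which URLs would have left the machine


def fake_local(url):
    return _local.get(url, "")


def fake_remote(url):
    remote_calls.append(url)
    return "Uncategorized"


print(categorize("www.aau.dk", fake_local, fake_remote))       # Education
print(categorize("www.example.org", fake_local, fake_remote))  # Uncategorized
print(remote_calls)                                            # ['www.example.org']
```

Only the URL that was missing locally reaches the remote function, which is why the local store is consulted first.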
    

Modules

Richkit defines a set of functions, organized into the following modules:

  • richkit.analyse: This module provides functions that can be applied to a domain name. Like richkit.lookup, and in contrast to richkit.retrieve, it does this without disclosing the domain name to third parties, so confidentiality is not breached.

  • richkit.lookup: This module provides the ability to look up domain names in local resources, i.e. the domain name cannot be sent off to third parties. The module may fetch resources, such as lists or databases, but this must be done in a way that keeps the domain name confidential. Contrast this with richkit.retrieve.

  • richkit.retrieve: This module provides the ability to retrieve data on domain names of any sort. It comes without the "confidentiality contract" of richkit.lookup.
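
To illustrate the kind of purely local computation that fits the richkit.analyse contract, the sketch below extracts an effective TLD by matching against a suffix set held in memory, so the domain name never leaves the process. It is not richkit's implementation: the function effective_tld and the tiny KNOWN_SUFFIXES set are assumptions made for illustration, standing in for a complete suffix list.

```python
# Assumption: a tiny hard-coded suffix set standing in for a full suffix list.
KNOWN_SUFFIXES = {"dk", "com", "co.uk"}


def effective_tld(domain, suffixes=KNOWN_SUFFIXES):
    """Return the longest known suffix of `domain`, or its last label."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest candidate first: "a.b.c" -> "b.c" -> "c".
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            return candidate
    # No known suffix matched; fall back to the last label.
    return labels[-1]


for name in ["www.aau.dk", "www.github.com", "www.example.co.uk"]:
    print(effective_tld(name))
# dk
# com
# co.uk
```

Because the lookup is a set membership test on local data, no network request is involved at any point.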

Run Tests on Docker

To avoid environment-related problems, we provide a Dockerfile.test file that builds a Docker image for running Richkit's tests.

  • The only thing you need to add is your MAXMIND_LICENSE_KEY in .github/local-test/run-test.sh at line 3. It is required for the test cases of the lookup module to pass.

Commands to run the tests in the Docker environment:

  • docker build -t richkit-test -f Dockerfile.test . : Builds the image required to run the test cases.

  • docker run -e MAXMIND_LICENSE_KEY="<licence-key>" richkit-test : Runs the run-test.sh file in the Docker image.
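
Since the lookup tests depend on the licence key being present, it can help to fail early with a clear message when it is missing. The sketch below shows one way to do that. The helper read_license_key is hypothetical and not part of richkit; the variable name is the one passed in the docker run command above, and the value "abc123" is a made-up placeholder.

```python
import os


def read_license_key(env=os.environ, name="MAXMIND_LICENSE_KEY"):
    """Return the key from `env`, with surrounding whitespace removed."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError("{} is not set; the lookup tests need it".format(name))
    return key


# Demonstration with a plain dict instead of the real environment.
print(read_license_key({"MAXMIND_LICENSE_KEY": " abc123 "}))  # abc123
```

Stripping whitespace guards against a stray space ending up inside the quoted value on the command line.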

Contributing

Contributions are most welcome.

We use the gitflow branching strategy, so if you plan to push a branch to this repository, please follow it. Note that we test branch names with .githooks/check-branch-name.py. The git pre-commit hook can be used to check this automatically on commit. A sample hook that can be used as-is is available for Linux and can be enabled like this (assuming python>=3.6 and bash):

ln -s $(pwd)/.githooks/pre-commit.linux.sample $(pwd)/.git/hooks/pre-commit
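
For a rough idea of what a gitflow branch-name check involves, here is a minimal sketch. It is not the contents of .githooks/check-branch-name.py: the pattern below assumes the conventional gitflow prefixes (master, develop, feature/, release/, hotfix/, support/), and the repository's own script may accept a different set. The function name is_gitflow_branch is hypothetical.

```python
import re

# Assumption: conventional gitflow branch names; the real hook may differ.
GITFLOW_PATTERN = re.compile(
    r"master|develop|(feature|release|hotfix|support)/[A-Za-z0-9._-]+"
)


def is_gitflow_branch(name):
    """Return True if the whole branch name matches the assumed pattern."""
    return GITFLOW_PATTERN.fullmatch(name) is not None


for branch in ["develop", "feature/add-whois-lookup", "my-fix"]:
    print(branch, is_gitflow_branch(branch))
# develop True
# feature/add-whois-lookup True
# my-fix False
```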

Credits
