
Python package for retrieving WHOIS information of domains.


whoisdomain

A Python package for retrieving WHOIS information for domains only.

This package will not support querying IP CIDR ranges or AS information.

This is a copy of the original DanyCork 'whois'.

Versioning starts at 1.x.x: the second item is the release date as YYYYMMDD, and the third item starts at 1 and is only incremented when more than one update has to be released on the same day.
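As an illustration of this scheme, a version string can be split into its three parts with the standard library alone. This is a minimal sketch; the names `ReleaseVersion` and `parse_release_version` are hypothetical and not part of the package.

```python
from datetime import date, datetime
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """Hypothetical container for the three parts of a 1.YYYYMMDD.N version."""
    major: int
    released: date
    sequence: int


def parse_release_version(version: str) -> ReleaseVersion:
    """Split '1.YYYYMMDD.N' into major number, release date and same-day sequence."""
    major, stamp, sequence = version.split(".")
    released = datetime.strptime(stamp, "%Y%m%d").date()
    return ReleaseVersion(int(major), released, int(sequence))


print(parse_release_version("1.20230906.1"))
```

Because the date is the second component, comparing the parsed tuples orders releases chronologically, and the third component only breaks ties within a single day.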

Features

  • Python wrapper for the "whois" cli command of your operating system.
  • Simple interface to access parsed WHOIS data for a given domain.
  • Able to extract data for all the popular TLDs (com, org, net, biz, info, pl, jp, uk, nz, ...).
  • Query a WHOIS server directly instead of going through an intermediate web service like many others do.
  • Works with Python >= 3.9
  • All dates as datetime objects.
  • Possibility to cache results.
  • Verbose output on stderr during debugging, to show how the internal functions do their work.
  • Raises an exception on "quota exceeded" type responses.
  • Raises an exception on PrivateRegistry TLDs, where we know the TLD and know that no information is available.
  • Optional cleaning of the response before extracting information.
  • Optional translation of IDNs to Punycode.
  • Optionally specify the whois command with query(...,cmd="whois"), as in: https://github.com/gen1us2k/python-whois/
  • The module is now 'mypy --strict' clean.
  • The module now also exports a cli command, whoisdomain.
  • Both the module and the cli support showing the version, with lib:whois.getVersion() or cli:whoisdomain -V.
  • The cli whoisdomain can output json data (one per domain, e.g. 'whoisdomain -d google.com -j').
  • withRedacted: bool = False has been added to query(); if set to True, redacted fields are shown as well (also supported in the cli whoisdomain as --withRedacted).
  • An analizer directory is present in the github repo; it is used to look for new IANA TLDs that are currently unsupported but match known whois servers.
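To make the IDN feature concrete, the sketch below shows what translating an internationalized domain name to Punycode means, using Python's built-in `idna` codec. It only illustrates the concept: the helper name `to_punycode` is hypothetical, and this is not necessarily how the package performs the conversion internally.

```python
def to_punycode(domain: str) -> str:
    """Encode an internationalized domain name to its ASCII (Punycode) form.

    Uses Python's built-in 'idna' codec; plain ASCII names pass through unchanged.
    """
    return domain.encode("idna").decode("ascii")


print(to_punycode("münchen.de"))
print(to_punycode("google.com"))
```

WHOIS servers generally expect the ASCII form, which is why a name containing non-ASCII characters has to be converted before it is queried.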

Dependencies

  • Please also install the command-line "whois" of your distribution, as this library parses the output of the "whois" cli command of your operating system.

Notes for Mac users

  • The default cli whois on Mac has been observed to show each forwarding step in its output, which makes parsing the result very unreliable.
  • Installing whois with brew (brew install whois) generally gives better results.

Docker

https://hub.docker.com/r/mbootgithub/whoisdomain

  • docker pull mbootgithub/whoisdomain:latest
  • docker run mbootgithub/whoisdomain -V # show version
  • docker run mbootgithub/whoisdomain -d google.com # run one domain
  • docker run mbootgithub/whoisdomain -a # run all tld
  • docker run mbootgithub/whoisdomain -d google.com -j | jq -r . # run one domain, output in json and reformat with jq
  • docker run mbootgithub/whoisdomain -d google.com -j | jq -r '.expiration_date' # output only the expiration date
  • docker run mbootgithub/whoisdomain -d google.com -j | jq -r '[ .expiration_date, .creation_date ]' # output the expiration and creation dates

Usage example

Install the cli whois of your operating system if it is not already present, e.g. 'apt install whois' or 'yum install whois'.

# fedora 37
sudo yum install whois
pip install whoisdomain
python
>>> import whoisdomain as whois
>>> d = whois.query('google.com')
>>> print(d.__dict__)
{'name': 'google.com', 'tld': 'com', 'registrar': 'MarkMonitor, Inc.', 'registrant_country': 'US', 'creation_date': datetime.datetime(1997, 9, 15, 9, 0), 'expiration_date': datetime.datetime(2028, 9, 13, 9, 0), 'last_updated': datetime.datetime(2019, 9, 9, 17, 39, 4), 'status': 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'statuses': ['clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)', 'clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)', 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)', 'serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)', 'serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)'], 'dnssec': False, 'name_servers': ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com'], 'registrant': 'Google LLC', 'emails': ['abusecomplaints@markmonitor.com', 'whoisrequest@markmonitor.com']}
>>> print (d.expiration_date)
2028-09-13 09:00:00

>>> print(d.name)
google.com

>>> print (d.creation_date)
1997-09-15 09:00:00
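Because the dates are plain datetime objects, date arithmetic works directly on the result. The sketch below computes the number of days left before a domain expires. The helper `days_until_expiry` is hypothetical, and a `SimpleNamespace` stands in for the object returned by whois.query() so the example runs without network access; the sample values are copied from the session above.

```python
from datetime import datetime
from types import SimpleNamespace
from typing import Any, Optional


def days_until_expiry(domain_info: Any, now: datetime) -> Optional[int]:
    """Return whole days between 'now' and the expiration date, or None if unknown."""
    expires = getattr(domain_info, "expiration_date", None)
    if expires is None:
        return None
    return (expires - now).days


# Stand-in for the result of whois.query('google.com'); not a real query.
d = SimpleNamespace(name="google.com", expiration_date=datetime(2028, 9, 13, 9, 0))
print(days_until_expiry(d, datetime(2028, 9, 3, 9, 0)))
```

With a real result, `days_until_expiry(whois.query('google.com'), datetime.now())` would be the equivalent call, assuming the returned datetime is naive as in the output shown above.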

whoisdomain

# fedora 37
sudo yum install whois
pip3 install whoisdomain
whoisdomain -d google.com

test domain: <<<<<<<<<< google.com >>>>>>>>>>>>>>>>>>>>
name               'google.com'
tld                'com'
registrar          'MarkMonitor, Inc.'
registrant_country 'US'
creation_date      1997-09-15 09:00:00
expiration_date    2028-09-13 09:00:00
last_updated       2019-09-09 17:39:04
status             'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)'
statuses           ['clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)', 'clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)', 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)', 'serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)', 'serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)']
dnssec             False
name_servers       ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com']
registrant         'Google LLC'
emails             ['abusecomplaints@markmonitor.com', 'whoisrequest@markmonitor.com']

A short intro to the cli whoisdomain command

whoisdomain
    [ -h | --usage ]
        print this text and exit

    [ -V | --Version ]
        print the build version string
        and exit

    [ -S | --SupportedTld ]
        print all known top level domains
        and exit

    [ -a | --all]
        test all existing tld currently supported
        and exit

    [ -f <filename> | --file = <filename> ]
        use the named file to test all domains (one domain per line)
        lines starting with # or empty lines are skipped, anything after the domain is ignored
        the option can be repeated to specify more than one file
        exits after processing all the files

    [ -D <directory> | --Directory = <directory> ]
        use the named directory, and use all files ending in .txt as files containing domains
        files are processed as in the -f option so comments and empty lines are skipped
        the option can be repeated to specify more than one directory
        exits after processing all the dirs

    [ -d <domain> | --domain = <domain> ]
        only analyze the given domains
        the option can be repeated to specify more domains

    [ -v | --verbose ]
        set verbose to True,
        verbose output will be printed on stderr only

    [ -j | --json ]
        print each result as json

    [ -I | --IgnoreReturncode ]
        sets the IgnoreReturncode to True,

    [ -p | --print ]
        also print text containing the raw output of the cli whois

    [ -R | --Ruleset ]
        dump the ruleset for the requested tld and exit
        should be combined with -d to specify tld's

    [ -C <file> | --Cleanup <file> ]
        read the input file specified and run the same cleanup as in whois.query,
        then exit

    # test two domains with verbose and IgnoreReturncode
    example: whoisdomain -v -I -d meta.org -d meta.com

    # test all supported tld's with verbose and IgnoreReturncode
    example: whoisdomain -v -I -a

    # test one specific file with verbose and IgnoreReturncode
    example: whoisdomain -v -I -f tests/ok-domains.txt

    # test one specific directory with verbose and IgnoreReturncode
    example: whoisdomain -v -I -D tests
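The file format accepted by -f is simple enough to reproduce when preparing or checking a domain list. The sketch below follows the rules stated in the help text (one domain per line, comment and empty lines skipped, anything after the domain ignored). The function `read_domains` is hypothetical and is not the package's own parser; it assumes that "anything after the domain" means everything after the first whitespace.

```python
from typing import List


def read_domains(text: str) -> List[str]:
    """Extract domains from text laid out like a -f input file."""
    domains = []
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines and comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        # Keep only the first whitespace-separated token (assumed to be the domain).
        domains.append(line.split()[0])
    return domains


sample = """# known-good domains
google.com  some trailing note

meta.org
"""
print(read_domains(sample))
```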

Json output

{
  "name": "hello.xyz",
  "tld": "xyz",
  "registrar": "Namecheap",
  "registrant_country": "IS",
  "creation_date": "2014-03-20 15:01:22",
  "expiration_date": "2024-03-20 23:59:59",
  "last_updated": "2023-03-14 09:24:32",
  "status": "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
  "statuses": [
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited"
  ],
  "dnssec": false,
  "name_servers": [
    "dns1.registrar-servers.com",
    "dns2.registrar-servers.com"
  ],
  "registrant": "Privacy service provided by Withheld for Privacy ehf",
  "emails": [
    "abuse@namecheap.com"
  ]
}
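In the json output the dates are strings rather than datetime objects, so a consumer has to convert them back. A minimal sketch, assuming every date uses the "YYYY-MM-DD HH:MM:SS" layout seen in the sample above; the helper `load_result` and the `DATE_FIELDS` tuple are hypothetical names.

```python
import json
from datetime import datetime

# Date fields as they appear in the sample output above.
DATE_FIELDS = ("creation_date", "expiration_date", "last_updated")


def load_result(raw: str) -> dict:
    """Parse one json result and convert its date strings to datetime objects."""
    record = json.loads(raw)
    for field in DATE_FIELDS:
        value = record.get(field)
        if value:
            # Assumed layout; adjust if a TLD yields a different date format.
            record[field] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return record


raw = (
    '{"name": "hello.xyz", "creation_date": "2014-03-20 15:01:22", '
    '"expiration_date": "2024-03-20 23:59:59", "last_updated": "2023-03-14 09:24:32"}'
)
record = load_result(raw)
print(record["name"], record["expiration_date"].year)
```

The raw string here is a trimmed copy of the sample; in practice it would be the text printed by 'whoisdomain -d hello.xyz -j'.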

ccTLD & TLD support

see the file: ./whoisdomain/tld_regexpr.py or call lib:whoisdomain.validTlds() or cli:whoisdomain -S

Support

  • Python 3.x is supported for x >= 9
  • Python 2.x IS NOT supported.

Authors

Updates

  • 1.20230627.2 add Kenya proper whois server and known second level domains
  • 1.20230627.3 add rw tld proper whois server and second level ; restore mistakenly deleted .toml file
  • 1.20230627.3 additional Kenya second level domains
  • 1.20230712.2 tld .edu now can have up to 10 nameservers; remove action on pull request
  • 1.20230717.1 add tld: com.ru, msk.ru, spb.ru (all have a test documented), also update the tld: ru, the newlines are not needed.
  • 1.20230717.2 add option to parse partial result after timeout has occurred (parse_partial_response:bool default False); this will need stdbuf installed otherwise it will fail
  • 1.20230718.3 fix typo in whois server hint for tld: ru
  • 1.20230720.1 add gov.tr; switch off status:available and status:free as None response, we should not interpret the result by default (we can add an option later)
  • 1.20230720.2 fix server hints for derived second level "xxx.tr", add processing "_test" hints during 'test2.py -a'
  • add external caching framework that can be overridden for use of your own caching implementation
  • renaming various vars to make them more verbose
  • preparing for capturing all parameters in one object and passing that object around instead of many arguments in methods/functions
  • switch to json so we don't need an additional dependency in ParamContext
  • finish rework args to ParameterContext, split off domain as file
  • 1.20230803.1 frenzy refactor-release
  • 1.20230804.1 testing
  • 1.20230804.2 testing after remove of leading dot in rw second level domains
  • 1.20230804.3 simplify cache implementation after feedback from baderdean
  • "more lembas bread", refactor parse and query
  • remove option to typecheck CACHE_STUB, use try/catch/exit instead, does not work when timeout happens, removed ;-(
  • refactor doQuery create processWhoisDomainRequest, split off lastWhois
  • 1.20230806.1 testing done, prep new release: "more lembas bread"
  • bug found with the default timeout: if no timeout is specified the program fails: all pypi releases before 2023-07-17 yanked
  • 1.20230807.1 fix default timeout
  • add DummyCache, DBMCache, RedisCache with simple test in testCache.py, testing custom cache options
  • 1.20230811.1 ; replace type hint | with Union for py3.9 compat; switch off experimental redis tools
  • switch off 3.[6-8] minimal is 3.9 we test against
  • start working on dataContext;
  • add more _test items; reorder parts of tld_regexpr;
  • propagate all meta domains servers as they are not inherited, testing , some domains have been retracted mboot; 2023-08-23;
  • add suggestion from baderdean to parse fr domains with more focus on ORGANISATION
  • 2023-08-24: mboot: more _test added to tld
  • verify all _test on whois.nic. _test: nic. fix where needed; remove some abandoned tld's
  • build: 1.20230824.1 mboot; to combine all new tests and changes, "the galloping Chutzpah release"
  • build: 1.20230824.5 mboot; fix missing module in whl
  • restore python 3.6 test as I still use it on one remaining app with python 3.6 (make testP36)
  • finalize verification of all tld's in iana, add test where this can be auto generated from whois.nic. 2023-08-28; mboot
  • 1.20230829.1 mboot; all _test now work, using analizer tool to verify that iana tld db web site and tld_regexpr match
  • add DEBUG to all verbose strings
  • remove tldString and dList and domain, all go via dc (dataContext) now
  • run tests and add new TODO
  • moving all TLD_RE activities to tldInfo.py, and all exported helper funcs to helpers.py
  • thinking about adding more complicated nested regex extractors to target contact info
  • start with dependency inject: parser is passed as arg
  • add cli interface to dependency inject, rightsize after test
  • finish dependency inject move Domain create outside
  • prep for other types or regex; all simple regex strings in tld_regexpr.py now need R() around them
  • use currying to make all regex strings into function calls in whoisParser.py; all regexes in tld_regexpr.py are now converted on import to function calls via R()
  • update tld: sk to use contextual extract, test with google.sk
  • add findFromToAndLookForWithFindFirst contextual search based on a previous findFirst, used in "fr" tld, example google.fr, {} is used to add to fromStr
  • test: 1.20230904.1, only on pypi-test
  • 1.20230906.1: introduce parsing based on functions, allow contextual search in split data and plain data, allow contextual search based on earlier result; fix a few tld to return the proper registrant string (not nic handle)
