asyncio-compatible Python module for performing WHOIS queries for any domain.
Project description
asyncwhois
asyncio-compatible Python module for performing WHOIS queries for any domain.
Installation
pip install asyncwhois
Quickstart
# Opens a connection to the appropriate WhoIs server, submits the query, and parses the output.
result = asyncwhois.lookup('google.com')
# [for asyncio] result = await asyncwhois.aio_lookup('google.com')
result.query_output # raw output from the whois server
result.parser_output # dictionary of key/values extracted from query_output
# Equivalent to running "whois <domain>" from the shell. Uses the "subprocess" package.
result = asyncwhois.whois_cmd_shell('google.com')
# [for asyncio] result = await asyncwhois.aio_whois_cmd_shell('google.com')
result.query_output # raw output from the whois server
result.parser_output # dictionary of key/values extracted from query_output
Examples
standard
import asyncwhois
def main():
urls = [
'www.google.co.uk',
'en.wikipedia.org/wiki/Pi',
'https://twitch.tv',
'https://www.bing.com/search?q=llama',
'agar.io',
'172.217.3.110'
]
for url in urls:
asyncwhois.lookup(url)
if __name__ == '__main__':
main()
asyncio
import asyncio
import asyncwhois
async def main():
urls = [
'www.google.co.uk',
'en.wikipedia.org/wiki/Pi',
'https://twitch.tv',
'https://www.bing.com/search?q=llama',
'agar.io',
'172.217.3.110'
]
tasks = []
for url in urls:
awaitable = asyncwhois.aio_lookup(url)
tasks.append(awaitable)
await asyncio.gather(*tasks)
if __name__ == '__main__':
asyncio.run(main())
Contributions
Top Level Domain Parsers are located in asyncwhois/parser.py and are based on those found in richardpenman/pywhois. For additional TLD support, simply create a new class like the one below:
class RegexORG(BaseParser):
_org_expressions = {} # the custom regular expressions needed to parse the output from this whois server
def __init__(self):
super().__init__()
self.server = 'whois.pir.org' # the whois server for this TLD
self.update_reg_expressions(self._org_expressions)
Unfortunately, “the format of responses [from a Whois server] follow a semi-free text format”. This means that situations will arise where you may find yourself needing more control over how parsing happens. Fortunately, you can override the “BaseParser.parse” method to suit your needs. Tests are obviously encouraged if you plan on doing this.
For example, this is a snippet of the output from running a “whois google.ir” command.
domain: google.ir
ascii: google.ir
remarks: (Domain Holder) Google Inc.
remarks: (Domain Holder Address) 1600 Amphitheatre Parkway, Mountain View, CA, US
holder-c: go438-irnic
...
In this case, the address, city, state, and country can all be extracted from the the “registrant_address” field. So, as seen below, the parse method is overwritten to include this extra step.
class RegexIR(BaseParser):
_ir_expressions = {
BaseKeys.UPDATED : r'last-updated: *(.+)',
BaseKeys.EXPIRES : r'expire-date: *(.+)',
BaseKeys.REGISTRANT_ORGANIZATION : r'org: *(.+)',
BaseKeys.REGISTRANT_NAME : r'remarks:\s+\(Domain Holder\) *(.+)',
BaseKeys.REGISTRANT_ADDRESS : r'remarks:\s+\(Domain Holder Address\) *(.+)',
BaseKeys.NAME_SERVERS : r'nserver: *(.+)'
}
def __init__(self):
super().__init__()
self.server = 'whois.nic.ir'
self.update_reg_expressions(self._ir_expressions)
def parse(self, blob: str) -> Dict[str, Any]:
"""
Custom address parsing is required.
"""
parsed_output = {}
for key, regex in self.reg_expressions.items():
if key == BaseKeys.REGISTRANT_ADDRESS:
match = self.find_match(regex, blob)
# need to break up the address field
address, city, state, country = match.split(', ')
parsed_output[BaseKeys.REGISTRANT_ADDRESS] = address
parsed_output[BaseKeys.REGISTRANT_CITY] = city
parsed_output[BaseKeys.REGISTRANT_STATE] = state
parsed_output[BaseKeys.REGISTRANT_COUNTRY] = country
elif not parsed_output.get(key):
parsed_output[key] = self.find_match(regex, blob, many=key in self.multiple_match_keys)
# convert dates
if key in self.date_keys and parsed_output.get(key, None):
parsed_output[key] = self._parse_date(parsed_output.get(key))
return parsed_output
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for asyncwhois-0.2.4-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 56772b47cf63ee313d272bd6977bda8bb995c957092f0376cc5ba487b25c4f4d |
|
MD5 | 02e1e874310d30793fb865832025388e |
|
BLAKE2b-256 | 10808ed88dcf55fb1af59300f4b36f2b74e4f1f338117e8ee1d6c595d9e450a7 |