Get data on IP addresses
Know your IP
Get data on IP addresses. Learn where they are located (lat/long, country, city, time zone), whether they are blacklisted (by abuseipdb, virustotal, ipvoid, etc.), for what, and when, which ports are open and what services are running (via shodan), and what you get when you ping or issue a traceroute.
If you are curious about potential applications of the package, we have a presentation on its use in cybersecurity analysis workflows.
The package exposes a single function know_your_ip that takes a csv file with a single column of IP addresses, details about the API keys and which columns you would like from which service (in know_your_ip.cfg), and appends the requested results to the IP list.
Brief Primer on Functionality
Geocoding IPs: There is no simple way to discern the location of an IP. The locations are typically inferred from data on delay and topology along with information from private and public databases. For instance, one algorithm starts with a database of locations of various 'landmarks', calculates the maximum distance of the last router before the IP from each landmark using Internet speed, builds a boundary within which the router must lie, and then takes the centroid of that region. The accuracy of these inferences is generally unknown, but can be fairly 'poor.' For instance, most geolocation services place my IP more than 30 miles away from where I am. Try http://www.geoipinfo.com/.
The script provides a hook to the Maxmind City Lite DB. It expects a copy of the database to be in the folder in which the script is run. To download the database, go here. The function returns city, country, lat/long, etc.
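The landmark-based approach described above can be illustrated with a rough sketch. This is not the code used by the package or by Maxmind. It assumes each landmark comes with a measured round-trip time, that signals cover about 200 km per millisecond (roughly two-thirds of the speed of light), and that a coarse one-degree grid is good enough to approximate the feasible region. The names estimate_location, haversine_km, and KM_PER_MS are made up for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: about 200 km per millisecond, i.e. roughly 2/3 the speed of light.
KM_PER_MS = 200.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def estimate_location(landmarks, step_deg=1.0):
    """Estimate a location from (lat, lon, rtt_ms) landmark measurements.

    Each landmark bounds the target to a circle of radius rtt/2 * speed.
    Returns the centroid (lat, lon) of the grid points that satisfy every
    bound, or None if no grid point does.
    """
    limits = [(lat, lon, rtt_ms / 2.0 * KM_PER_MS)
              for lat, lon, rtt_ms in landmarks]
    sx = sy = sz = 0.0
    found = False
    for i in range(int(180 / step_deg) + 1):
        lat = -90.0 + i * step_deg
        phi = math.radians(lat)
        weight = math.cos(phi)  # grid cells shrink towards the poles
        for j in range(int(360 / step_deg)):
            lon = -180.0 + j * step_deg
            if all(haversine_km(lat, lon, la, lo) <= d for la, lo, d in limits):
                lam = math.radians(lon)
                # Average unit vectors so the result is safe near +/-180 degrees.
                sx += weight * math.cos(phi) * math.cos(lam)
                sy += weight * math.cos(phi) * math.sin(lam)
                sz += weight * math.sin(phi)
                found = True
    if not found:
        return None
    return (math.degrees(math.atan2(sz, math.hypot(sx, sy))),
            math.degrees(math.atan2(sy, sx)))
```

Even in this toy version the estimate is only as good as the delay bounds: with a 4 ms round trip, each landmark constrains the target to a circle 400 km in radius, which is consistent with the 'poor' accuracy noted above.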
Timezone: In theory, there are 24 time zones. In practice, there are a few more. For instance, countries like India have half-hour offsets. Theoretical mappings can easily be created from lat/long data based on the 15-degree longitude span of each zone. For practical mappings, one strategy is to map the (nearest) city to a time zone (recall the smallish lists that you scroll through in your computer's time/date program). There are a variety of services for getting the timezone.
For its ease, we choose a Python hook to nodeJS lat/long to timezone. To get the timezone, we first need to geocode the IP (see above). The function takes lat/long and returns timezone.
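The theoretical mapping mentioned above is simple enough to sketch. The function below is only an illustration of the 15-degree rule, not what the package does (the package relies on a lat/long to timezone lookup); the name nominal_utc_offset is made up for this example.

```python
import math


def nominal_utc_offset(longitude):
    """Return the nominal UTC offset, in whole hours, for a longitude.

    Each theoretical zone spans 15 degrees and is centred on a multiple
    of 15, so the offset is the longitude divided by 15, rounded to the
    nearest integer.
    """
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return int(math.floor(longitude / 15.0 + 0.5))
```

For New Delhi (longitude about 77.2) this gives 5, while the actual offset is UTC+5:30, which is why practical mappings go through a city or polygon lookup instead.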
Ping: Sends out an ICMP echo request and waits for the reply. Measures round-trip time (min, max, and mean) and reports errors and packet loss. For now, the function works only on Linux machines. If there is a timeout, the function adds nothing. If there is a reply, it adds the columns packets_sent, packets_received, packets_lost, min_time, max_time, and avg_time.
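To make the output columns concrete, here is a small sketch of how per-packet round-trip times could be summarized into them. It is not the package's implementation: the function name summarize_ping and the convention of using None for a packet that timed out are assumptions made for this example.

```python
def summarize_ping(rtts_ms):
    """Summarize per-packet round-trip times (milliseconds) into ping columns.

    rtts_ms holds one entry per packet sent; None marks a packet that got
    no reply. If no packet was answered, an empty dict is returned, which
    mirrors the "adds nothing on timeout" behaviour described above.
    """
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    if not replies:
        return {}
    return {
        "packets_sent": len(rtts_ms),
        "packets_received": len(replies),
        "packets_lost": len(rtts_ms) - len(replies),
        "min_time": min(replies),
        "max_time": max(replies),
        "avg_time": sum(replies) / len(replies),
    }
```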
Traceroute: Sends a UDP (or ICMP) packet. Builds the path for how the request is routed, noting routers and time.
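As the Installation notes say, traceroute relies on the operating system command: traceroute on Linux and tracert on Windows. A minimal sketch of that platform switch might look like the following. The helper names traceroute_command and run_traceroute are hypothetical, and the package's own handling (options, timeouts, output parsing) may well differ.

```python
import platform
import subprocess


def traceroute_command(ip, system=None):
    """Return the OS-specific command line for tracing the route to ip."""
    system = system or platform.system()
    if system == "Windows":
        return ["tracert", ip]
    return ["traceroute", ip]


def run_traceroute(ip, timeout_s=120):
    """Run the OS traceroute command and return its raw text output."""
    completed = subprocess.run(
        traceroute_command(ip),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return completed.stdout
```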
Backgrounder:
censys.io: Performs ZMap and ZGrab scans of the IPv4 address space. To use censys.io, you must first register. Once you register and have the API key, put it in here. The function takes an IP and returns asn, timezone, country, etc. For a full list, see https://censys.io/ipv4/help.
shodan.io: Scans devices connected to the Internet for services, open ports, etc. You must register to use shodan.io. Querying costs money. Once you register and have the API key, put it in here. The script implements two API calls: shodan/host/ip and shodan/scan. The function takes a list of IPs and returns
Blacklists and Backgrounders: The number of services that maintain blacklists is enormous. Here’s a list of some of the services: TornevallNET, BlockList_de, Spamhaus, MyWOT, SpamRATS, Malc0de, SpyEye, GoogleSafeBrowsing, ProjectHoneypot, etc. Some of the services report results from other services as part of their results. In this script, we implement hooks to the following three:
virustotal.com: A Google company that analyzes and tracks suspicious files, URLs, and IPs. You must register to use virustotal. Once you register and have the API key, put it in here. The function implements the "retrieving IP address reports" method.
abuseipdb.com: Tracks reports on IPs. You must register to use the API. Once you register and have the API key, put it in here. There is a limit of 5k pings per month. The function that we implement here is a mixture of API calls and scraping, as the API doesn't return details of the reports filed.
ipvoid.com: Tracks information on IPs. There is no API. We scrape information about IPs including status on various blacklist sites.
Query Limits
Service | Query Limits | More Info |
---|---|---|
Censys.io | 120/5 minutes | |
Virustotal | 4/minute | |
AbuseIPDB | 2500/month | |
IPVoid | - | |
Shodan | - | |
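When looping over many IPs, it helps to pace requests so they stay within the limits listed above. The sketch below shows one generic way to do that with a sliding window. It is not part of the package: the RateLimiter class is hypothetical, and the clock and sleep parameters exist only so the behaviour can be tested without actually waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls calls within any period_s-second window."""

    def __init__(self, max_calls, period_s, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period_s = period_s
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._calls and now - self._calls[0] >= self._period_s:
            self._calls.popleft()
        if len(self._calls) >= self._max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self._period_s - (now - self._calls[0]))
            self._calls.popleft()
            now = self._clock()
        self._calls.append(now)


# Limits taken from the table above.
virustotal_limiter = RateLimiter(max_calls=4, period_s=60.0)
censys_limiter = RateLimiter(max_calls=120, period_s=300.0)
```

Calling virustotal_limiter.wait() before each Virustotal request would then keep a batch job at or below 4 requests per minute.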
Installation
The script depends on some libraries. Currently, traceroute uses the operating system command: traceroute on Linux and tracert on Windows.
The ping function is based on a pure Python ping implementation that uses a raw socket, so you must have root (on Linux) or Admin (on Windows) privileges to run it.
# Install package and dependencies
pip install know_your_ip

# On Ubuntu Linux (if traceroute command not installed)
sudo apt-get install traceroute
Note: If you use anaconda on Windows, it is best to install Shapely via:
conda install -c scitools shapely
General Layout of the Software
In the config file (default: know_your_ip.cfg), settings are grouped by API.
For the Maxmind API, the script expects a copy of the database to be in the folder specified by dbpath in the config file. To download the database, go here.
The columns file (default: columns.txt) lists the data columns to be output by the script. There can be more than one columns file, but only one is used at a time; it is selected by setting the columns variable in the output section of the config file.
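As a rough illustration of how such settings could be read, the sketch below parses an INI-style config with Python's configparser. The output section, the columns variable, and dbpath come from the description above; the section name maxmind, the INI format itself, and the sample values are assumptions, so check the know_your_ip.cfg shipped with the package for the real layout.

```python
import configparser

# Hypothetical sample; the real know_your_ip.cfg may be organized differently.
SAMPLE_CFG = """
[maxmind]
dbpath = ./db

[output]
columns = columns.txt
"""


def load_settings(cfg_text):
    """Return the database folder and the columns file named in the config."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        "dbpath": parser.get("maxmind", "dbpath"),
        "columns_file": parser.get("output", "columns"),
    }


settings = load_settings(SAMPLE_CFG)
```

Switching to a different set of output columns would then only require pointing columns at another file.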
Usage
usage: know_your_ip [-h] [-f FILE] [-c CONFIG] [-o OUTPUT] [-n MAX_CONN]
                    [--from FROM_ROW] [--to TO] [-v] [--no-header]
                    [ip [ip ...]]

Know Your IP

positional arguments:
  ip                    IP Address(es)

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  List of IP addresses file
  -c CONFIG, --config CONFIG
                        Configuration file
  -o OUTPUT, --output OUTPUT
                        Output CSV file name
  -n MAX_CONN, --max-conn MAX_CONN
                        Max concurrent connections
  --from FROM_ROW       From row number
  --to TO               To row number
  -v, --verbose         Verbose mode
  --no-header           Output without header at the first row
General Examples
know_your_ip --file input.csv
Please also look at example.py, which shows how to use the package as an external library.
Documentation
For more information please visit the project documentation page.
Contributor Code of Conduct
The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.
License
The package is released under the MIT License.