Polars extension for IP address parsing and enrichment including geolocation
Polars IPTools
Polars IPTools is a Rust-based extension that accelerates IP address manipulation and enrichment in Polars dataframes. The library includes utility functions for working with IPv4 and IPv6 addresses, plus GeoIP and anonymization/proxy enrichment using MaxMind databases.
Install
pip install polars-iptools
Examples
Simple enrichments
IPTools' Rust implementation gives you speedy answers to basic IP questions like "is this a private IP?"
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']})
>>> df.with_columns(ip.is_private(pl.col('ip')).alias('is_private'))
shape: (6, 2)
┌─────────────────┬────────────┐
│ ip ┆ is_private │
│ --- ┆ --- │
│ str ┆ bool │
╞═════════════════╪════════════╡
│ 8.8.8.8 ┆ false │
│ 2606:4700::1111 ┆ false │
│ 192.168.100.100 ┆ true │
│ 172.21.1.1 ┆ true │
│ 172.34.5.5 ┆ false │
│ a.b.c.d ┆ false │
└─────────────────┴────────────┘
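For comparison, the same question can be answered one address at a time with Python's standard `ipaddress` module. This is only a sketch for cross-checking results on small inputs, not how IPTools is implemented; the helper name `is_private` below is hypothetical, and the standard library's definition of "private" may differ from the extension's on edge cases such as loopback or link-local ranges.

```python
import ipaddress


def is_private(value: str) -> bool:
    """Return True for private addresses, False for public or unparseable input."""
    try:
        return ipaddress.ip_address(value).is_private
    except ValueError:
        # Assumption: treat invalid input as False, matching the 'a.b.c.d' row above.
        return False


ips = ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']
results = {ip: is_private(ip) for ip in ips}
```

Note that `172.34.5.5` is not private: the RFC 1918 block `172.16.0.0/12` only covers `172.16.0.0` through `172.31.255.255`.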
is_in but for network ranges
Pandas and Polars have is_in functions to perform membership lookups. IPTools extends this to enable IP address membership in IP networks. This function works seamlessly with both IPv4 and IPv6 addresses and converts the specified networks into a Level-Compressed trie (LC-Trie) for fast, efficient lookups.
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '1.1.1.1', '2606:4700::1111']})
>>> networks = ['8.8.8.0/24', '2606:4700::/32']
>>> df.with_columns(ip.is_in(pl.col('ip'), networks).alias('is_in'))
shape: (3, 2)
┌─────────────────┬───────┐
│ ip ┆ is_in │
│ --- ┆ --- │
│ str ┆ bool │
╞═════════════════╪═══════╡
│ 8.8.8.8 ┆ true │
│ 1.1.1.1 ┆ false │
│ 2606:4700::1111 ┆ true │
└─────────────────┴───────┘
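The membership semantics can be sketched with the standard `ipaddress` module. This version scans the networks linearly instead of building an LC-Trie, so it illustrates what is being computed rather than how IPTools computes it. The helper name `build_matcher` is hypothetical, and returning False for unparseable addresses is an assumption.

```python
import ipaddress
from typing import Callable, Iterable


def build_matcher(networks: Iterable[str]) -> Callable[[str], bool]:
    """Parse the CIDR strings once and return a function that tests membership."""
    parsed = [ipaddress.ip_network(n) for n in networks]

    def matches(value: str) -> bool:
        try:
            addr = ipaddress.ip_address(value)
        except ValueError:
            return False  # assumption: invalid input never matches
        # Compare only against networks of the same IP version (4 or 6).
        return any(addr.version == net.version and addr in net for net in parsed)

    return matches


in_networks = build_matcher(['8.8.8.0/24', '2606:4700::/32'])
flags = [in_networks(ip) for ip in ['8.8.8.8', '1.1.1.1', '2606:4700::1111']]
```

A linear scan costs one comparison per network for every address, which is why a trie-based lookup pays off once the network list grows large.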
GeoIP enrichment
Using MaxMind's GeoLite2-ASN.mmdb and GeoLite2-City.mmdb databases, IPTools provides offline enrichment of network ownership and geolocation.
ip.geoip.full returns a Polars struct containing all available metadata parameters. If you just want the ASN and AS organization, you can use ip.geoip.asn.
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df.with_columns([ip.geoip.full(pl.col("ip")).alias("geoip")])
shape: (4, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip ┆ geoip │
│ --- ┆ --- │
│ str ┆ struct[11] │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8 ┆ {15169,"GOOGLE","","NA","","",… │
│ 192.168.1.1 ┆ {0,"","","","","","","",0.0,0.… │
│ 2606:4700::1111 ┆ {13335,"CLOUDFLARENET","","","… │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
>>> df.with_columns([ip.geoip.asn(pl.col("ip")).alias("asn")])
shape: (4, 2)
┌─────────────────┬───────────────────────┐
│ ip ┆ asn │
│ --- ┆ --- │
│ str ┆ str │
╞═════════════════╪═══════════════════════╡
│ 8.8.8.8 ┆ AS15169 GOOGLE │
│ 192.168.1.1 ┆ │
│ 2606:4700::1111 ┆ AS13335 CLOUDFLARENET │
│ 999.abc.def.123 ┆ │
└─────────────────┴───────────────────────┘
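If you need the AS number and organization as separate values, the string can be split after the fact. The sketch below assumes the `AS<number> <organization>` layout seen in the output above and an empty string when no match is found; the helper name `split_asn` is hypothetical and not part of IPTools.

```python
from typing import Optional, Tuple


def split_asn(label: str) -> Tuple[Optional[int], str]:
    """Split 'AS15169 GOOGLE' into (15169, 'GOOGLE'); return (None, '') otherwise."""
    if not label.startswith("AS"):
        return None, ""
    # Everything up to the first space is the number; the rest is the organization.
    number, _, org = label[2:].partition(" ")
    if not number.isdigit():
        return None, ""
    return int(number), org


google = split_asn("AS15169 GOOGLE")
empty = split_asn("")
```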
Spur enrichment
Spur is a commercial service that provides "data to detect VPNs, residential proxies, and bots". One of its offerings is a feed in MaxMind mmdb format containing at most 2,000,000 of the "busiest" Anonymous or Anonymous+Residential IPs.
ip.spur.full returns a Polars struct containing all available metadata parameters.
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "999.abc.def.123"]})
>>> df.with_columns([ip.spur.full(pl.col("ip")).alias("spur")])
shape: (3, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip ┆ spur │
│ --- ┆ --- │
│ str ┆ struct[7] │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8 ┆ {0.0,"","","","","",null} │
│ 192.168.1.1 ┆ {0.0,"","","","","",null} │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
Environment Configuration
IPTools uses two MaxMind databases: GeoLite2-ASN.mmdb and GeoLite2-City.mmdb. You only need these files if you call the geoip functions.
Set the MAXMIND_MMDB_DIR environment variable to tell the extension where these files are located.
export MAXMIND_MMDB_DIR=/path/to/your/mmdb/files
# or, for Windows users
set MAXMIND_MMDB_DIR=c:\path\to\your\mmdb\files
If the environment variable is not set, polars_iptools will check two other common locations (on Mac/Linux):
/usr/local/share/GeoIP
/opt/homebrew/var/GeoIP
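The search order described above can be mirrored in Python to check your setup before calling the geoip functions. This is a sketch under stated assumptions, not the extension's actual resolution logic (which lives in the Rust code): the helper name `find_mmdb_dir` is hypothetical, and requiring both database files to be present in the same directory is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Fallback locations and file names taken from the text above.
FALLBACK_DIRS = ("/usr/local/share/GeoIP", "/opt/homebrew/var/GeoIP")
REQUIRED_FILES = ("GeoLite2-ASN.mmdb", "GeoLite2-City.mmdb")


def find_mmdb_dir(
    environ: Mapping[str, str] = os.environ,
    fallbacks: Sequence[str] = FALLBACK_DIRS,
) -> Optional[Path]:
    """Return the first candidate directory holding both databases, else None."""
    env_dir = environ.get("MAXMIND_MMDB_DIR")
    # The fallbacks are only consulted when the variable is not set.
    candidates = [env_dir] if env_dir else list(fallbacks)
    for candidate in candidates:
        path = Path(candidate)
        # Assumption: both files must exist for the directory to count.
        if all((path / name).is_file() for name in REQUIRED_FILES):
            return path
    return None
```

Calling `find_mmdb_dir()` with no arguments inspects the current process environment; a `None` result suggests the geoip functions may not find their databases.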
Spur Environment
If you're a Spur customer, export the feed as spur.mmdb and specify its location using the SPUR_MMDB_DIR environment variable.
export SPUR_MMDB_DIR=/path/to/spur/mmdb
# or, for Windows users
set SPUR_MMDB_DIR=c:\path\to\spur\mmdb
Credit
Developing this extension was super easy by following Marco Gorelli's tutorial and cookiecutter template.