Polars extension for IP address parsing and enrichment including geolocation

Polars IPTools

Polars IPTools is a Rust-based extension that accelerates IP address manipulation and enrichment in Polars dataframes. The library includes utility functions for working with IPv4 and IPv6 addresses, plus GeoIP and anonymization/proxy enrichment using MaxMind databases.

Install

pip install polars-iptools

Examples

Simple enrichments

IPTools' Rust implementation gives you speedy answers to basic IP questions like "is this a private IP?"

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']})
>>> df.with_columns(ip.is_private(pl.col('ip')).alias('is_private'))
shape: (6, 2)
┌─────────────────┬────────────┐
│ ip              ┆ is_private │
│ ---             ┆ ---        │
│ str             ┆ bool       │
╞═════════════════╪════════════╡
│ 8.8.8.8         ┆ false      │
│ 2606:4700::1111 ┆ false      │
│ 192.168.100.100 ┆ true       │
│ 172.21.1.1      ┆ true       │
│ 172.34.5.5      ┆ false      │
│ a.b.c.d         ┆ false      │
└─────────────────┴────────────┘
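
For comparison, here is a minimal pure-Python sketch of the same question using the standard library's ipaddress module. It is not the extension's Rust implementation, and the helper name is_private below is only illustrative. The standard library's definition of "private" also covers ranges such as loopback and link-local, so it may not agree with the extension on every address; treating unparseable input as False is an assumption chosen to mirror the output above.

```python
import ipaddress

def is_private(value):
    """Return True for private addresses, False otherwise (including unparseable input)."""
    try:
        return ipaddress.ip_address(value).is_private
    except ValueError:
        # Not a valid IPv4 or IPv6 address, e.g. 'a.b.c.d'
        return False

ips = ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']
results = [is_private(ip) for ip in ips]
print(results)
```

A per-row Python function like this is the kind of work the extension moves into Rust.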

is_in but for network ranges

Pandas and Polars provide isin and is_in functions, respectively, to perform membership lookups. IPTools extends this idea to test whether an IP address belongs to any of a set of IP networks. This function works seamlessly with both IPv4 and IPv6 addresses and converts the specified networks into a Level-Compressed trie (LC-Trie) for fast, efficient lookups.

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '1.1.1.1', '2606:4700::1111']})
>>> networks = ['8.8.8.0/24', '2606:4700::/32']
>>> df.with_columns(ip.is_in(pl.col('ip'), networks).alias('is_in'))
shape: (3, 2)
┌─────────────────┬───────┐
│ ip              ┆ is_in │
│ ---             ┆ ---   │
│ str             ┆ bool  │
╞═════════════════╪═══════╡
│ 8.8.8.8         ┆ true  │
│ 1.1.1.1         ┆ false │
│ 2606:4700::1111 ┆ true  │
└─────────────────┴───────┘
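
To illustrate the idea behind trie-based network lookups, here is a rough sketch of a plain binary prefix trie built with the standard library's ipaddress module. It is deliberately simpler than an LC-Trie (no level or path compression) and is not the extension's code; the PrefixTrie class and its method names are hypothetical.

```python
import ipaddress

class PrefixTrie:
    """Plain (uncompressed) binary trie keyed on network prefix bits."""

    def __init__(self):
        # One trie per IP version so IPv4 and IPv6 prefixes never mix.
        self.roots = {4: {}, 6: {}}

    def add(self, cidr):
        net = ipaddress.ip_network(cidr, strict=False)
        node = self.roots[net.version]
        value = int(net.network_address)
        # Walk the prefix bits from most significant to least significant.
        for i in range(net.prefixlen):
            bit = (value >> (net.max_prefixlen - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["end"] = True  # marks the end of a stored network prefix

    def contains(self, ip):
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            return False  # unparseable input matches nothing
        node = self.roots[addr.version]
        value = int(addr)
        for i in range(addr.max_prefixlen):
            if "end" in node:
                return True  # some stored prefix covers this address
            bit = (value >> (addr.max_prefixlen - 1 - i)) & 1
            node = node.get(bit)
            if node is None:
                return False
        return "end" in node

trie = PrefixTrie()
for network in ['8.8.8.0/24', '2606:4700::/32']:
    trie.add(network)

matches = [trie.contains(ip) for ip in ['8.8.8.8', '1.1.1.1', '2606:4700::1111']]
print(matches)
```

Each lookup walks at most one bit per level, so its cost depends on the address length rather than on how many networks are stored. An LC-Trie reduces the number of steps further by compressing levels.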

GeoIP enrichment

Using MaxMind's GeoLite2-ASN.mmdb and GeoLite2-City.mmdb databases, IPTools provides offline enrichment of network ownership and geolocation.

ip.geoip.full returns a Polars struct containing all available metadata parameters. If you just want the ASN and AS organization, you can use ip.geoip.asn.

>>> import polars as pl
>>> import polars_iptools as ip

>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df.with_columns([ip.geoip.full(pl.col("ip")).alias("geoip")])

shape: (4, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ geoip                           │
│ ---             ┆ ---                             │
│ str             ┆ struct[11]                      │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {15169,"GOOGLE","","NA","","",… │
│ 192.168.1.1     ┆ {0,"","","","","","","",0.0,0.… │
│ 2606:4700::1111 ┆ {13335,"CLOUDFLARENET","","","… │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘

>>> df.with_columns([ip.geoip.asn(pl.col("ip")).alias("asn")])
shape: (4, 2)
┌─────────────────┬───────────────────────┐
│ ip              ┆ asn                   │
│ ---             ┆ ---                   │
│ str             ┆ str                   │
╞═════════════════╪═══════════════════════╡
│ 8.8.8.8         ┆ AS15169 GOOGLE        │
│ 192.168.1.1     ┆                       │
│ 2606:4700::1111 ┆ AS13335 CLOUDFLARENET │
│ 999.abc.def.123 ┆                       │
└─────────────────┴───────────────────────┘
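
If you need the AS number and organization as separate values, the string shown above can be split in plain Python. This is only a sketch: it assumes the "AS<number> <organization>" layout seen in the example output, and split_asn is a hypothetical helper, not part of polars_iptools.

```python
def split_asn(value):
    """Split 'AS15169 GOOGLE' into (15169, 'GOOGLE'); return (None, None) if it does not fit."""
    if not value:
        return (None, None)  # empty string or None: no ASN information
    number, _, org = value.partition(" ")
    if not number.startswith("AS") or not number[2:].isdigit():
        return (None, None)  # unexpected layout, leave it unparsed
    return (int(number[2:]), org)

parsed = [split_asn(v) for v in ["AS15169 GOOGLE", "", "AS13335 CLOUDFLARENET"]]
print(parsed)
```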

Spur enrichment

Spur is a commercial service that provides "data to detect VPNs, residential proxies, and bots". One of its offerings is a feed in MaxMind mmdb format containing at most 2,000,000 of the "busiest" Anonymous or Anonymous+Residential IPs.

ip.spur.full returns a Polars struct containing all available metadata parameters.

>>> import polars as pl
>>> import polars_iptools as ip

>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "999.abc.def.123"]})
>>> df.with_columns([ip.spur.full(pl.col("ip")).alias("spur")])

shape: (3, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ spur                            │
│ ---             ┆ ---                             │
│ str             ┆ struct[7]                       │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {0.0,"","","","","",null}       │
│ 192.168.1.1     ┆ {0.0,"","","","","",null}       │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘

Environment Configuration

IPTools uses two MaxMind databases: GeoLite2-ASN.mmdb and GeoLite2-City.mmdb. You only need these files if you call the geoip functions.

Set the MAXMIND_MMDB_DIR environment variable to tell the extension where these files are located.

export MAXMIND_MMDB_DIR=/path/to/your/mmdb/files
# or, for Windows users
set MAXMIND_MMDB_DIR=c:\path\to\your\mmdb\files

If the environment variable is not set, polars_iptools will check two other common locations (on Mac/Linux):

/usr/local/share/GeoIP
/opt/homebrew/var/GeoIP
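
The lookup order described above can be sketched in Python as follows. This is an illustration, not the extension's actual resolution code: find_mmdb_dir and FALLBACK_DIRS are hypothetical names, and the details (for example, whether the configured directory is validated) are assumptions.

```python
import os
from pathlib import Path

# Fallback locations listed above (Mac/Linux).
FALLBACK_DIRS = ["/usr/local/share/GeoIP", "/opt/homebrew/var/GeoIP"]

def find_mmdb_dir(env=None, fallbacks=FALLBACK_DIRS):
    """Return the directory to search for .mmdb files, or None if nothing is found."""
    env = os.environ if env is None else env
    configured = env.get("MAXMIND_MMDB_DIR")
    if configured:
        # The environment variable takes priority over the fallbacks.
        return Path(configured)
    for candidate in fallbacks:
        if Path(candidate).is_dir():
            return Path(candidate)
    return None

print(find_mmdb_dir())
```

Running this before calling the geoip functions is a quick way to see which directory your current environment points to.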

Spur Environment

If you're a Spur customer, export the feed as spur.mmdb and specify its location using the SPUR_MMDB_DIR environment variable.

export SPUR_MMDB_DIR=/path/to/spur/mmdb
# or, for Windows users
set SPUR_MMDB_DIR=c:\path\to\spur\mmdb

Credit

Developing this extension was super easy thanks to Marco Gorelli's tutorial and cookiecutter template.
