This is a pre-production deployment of Warehouse. Changes made here affect the production instance of PyPI (pypi.python.org).
Help us improve Python packaging - Donate today!

Merge two pandas DataFrames using closest match

Project Description

## merge_closest ### Merge column(s) into DataFrame using closest-without-going-over lookup.

The merge_closest function mimics Excel’s VLOOKUP function in approximate match (range lookup) mode, with added benefits of ensuring the lookup table is sorted and merging any subset of columns from the lookup table. It’s similar to a left join based on the lookup_field value that is closest to the data_field value without going over. Only the first matching row from lookup is merged; if lookup_field contains duplicates, it’s up to the user to drop duplicates in advance as appropriate. For any row in data whose data_field value is less than all values in lookup_field, lookup values in result will be missing.

By default, all columns in lookup other than lookup_field will be merged with data. A specific list of columns from lookup to include or exclude can be passed. If both include_cols and exclude_cols are provided, exclude_cols is ignored. To include lookup_field in result, lookup_field must be passed in include_cols, perhaps as include_cols=lookup.columns.

### This is my first Python package and first GitHub repo…

…and any comments or suggestions are most welcome.

### Use

from merge_closest import merge_closest

merge_closest(data, lookup, data_field, lookup_field,
include_cols=None, exclude_cols=None, presorted=False)

### Parameters

`data` : pandas DataFrame

`lookup` : pandas DataFrame

`data_field` : name of column in data

`lookup_field` : name of column in lookup

`include_cols` : list, pandas Index, or single value; columns from
lookup to include in result. If neither include_cols nor exclude_cols is provided, all columns from lookup except lookup_field are included in result.
`exclude_cols` : list, pandas Index, or single value; columns from
lookup to exclude from result. By default, lookup_field will be excluded even if not in exclude_cols. To include lookup_field, pass it in include_cols. If both include_cols and exclude_cols are provided, exclude_cols is ignored.
`presorted` : boolean; unless set to True, lookup will be sorted
ascending on lookup_field prior to merging.

### Returns

`result` : pandas DataFrame

Release History

Release History

This version
History Node

0.1b160919-3

Download Files

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name & Checksum SHA256 Checksum Help Version File Type Upload Date
merge_closest-0.1b160919-3.tar.gz (3.1 kB) Copy SHA256 Checksum SHA256 Source Sep 20, 2016

Supported By

WebFaction WebFaction Technical Writing Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Rackspace Rackspace Cloud Servers DreamHost DreamHost Log Hosting