A Python library for near deduplication and record linkage.

Introduction

Liken is a library providing enhanced deduplication tooling for DataFrames.

The key features are:

  • Near deduplication
  • Ready-to-use deduplication strategies
  • Record linkage and canonicalization
  • Rules-based deduplication
  • Pandas, Polars and PySpark support
  • Customizable in pure Python
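To make "near deduplication" concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not Liken's implementation or API: the `similarity` and `near_dedupe` helpers, the sample addresses, and the 0.85 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def near_dedupe(values: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first occurrence of each group of near-identical strings.

    Illustrative helper only (not part of Liken); the threshold is an
    assumed value, not a recommended default.
    """
    kept: list[str] = []
    for value in values:
        # Drop the value if it is close enough to something already kept.
        if not any(similarity(value, k) >= threshold for k in kept):
            kept.append(value)
    return kept


addresses = [
    "12 Main Street, Springfield",
    "12 Main Street, Springfeild",  # typo of the first entry
    "12 main street, springfield",  # same address, different case
    "98 Ocean Avenue, Shelbyville",
]

unique_addresses = near_dedupe(addresses)
print(unique_addresses)
```

An exact-match deduplication would keep all four rows here, because no two strings are byte-for-byte identical. Comparing by similarity instead collapses the typo and the case variant into the first entry. This pairwise loop is quadratic in the number of rows, so it only illustrates the concept and would not scale to large DataFrames.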

A flexible API

Check out the API documentation.

Installation

pip install liken

Example

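from liken import Dedupe, fuzzy

# `df` is an existing pandas, Polars or PySpark DataFrame.
lk = Dedupe(df)

# Apply the fuzzy-matching strategy.
lk.apply(fuzzy())

# Drop rows whose "address" values are near-duplicates.
df = lk.drop_duplicates("address")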

License

This project is licensed under the Apache-2.0 License. See the LICENSE file for more details.
