A Python library for near deduplication and record linkage.
Introduction
Liken is a library providing enhanced deduplication tooling for DataFrames.
The key features are:
- Near deduplication
- Ready-to-use deduplication strategies
- Record linkage and canonicalization
- Rules-based deduplication
- Pandas, Polars and PySpark support
- Customizable in pure Python
A flexible API
Check out the API Documentation.
Installation
pip install liken
Example
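from liken import Dedupe, fuzzy
# df is an existing pandas, Polars or PySpark DataFrame with an "address" column
lk = Dedupe(df)
# Apply the ready-to-use fuzzy deduplication strategy
lk.apply(fuzzy())
# Drop rows whose "address" values are near duplicates
df = lk.drop_duplicates("address")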
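To make the idea of near deduplication concrete, here is a minimal sketch that uses only the Python standard library. It is not how Liken is implemented and does not use Liken's API; the helper names `similarity` and `drop_near_duplicates` and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def drop_near_duplicates(values: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first occurrence of each group of near-duplicate strings.

    Hypothetical helper for illustration only; not part of liken.
    """
    kept: list[str] = []
    for value in values:
        # Keep the value only if it is not close to anything already kept.
        if all(similarity(value, seen) < threshold for seen in kept):
            kept.append(value)
    return kept


addresses = [
    "12 Main Street, Springfield",
    "12 main street, springfield",   # differs only in case
    "12 Main Stret, Springfield",    # typo
    "98 Ocean Avenue, Shelbyville",
]
deduped = drop_near_duplicates(addresses)
```

An exact-match deduplication would keep the first three addresses as distinct rows, whereas the similarity threshold treats them as one record. The pairwise comparison here is quadratic in the number of kept values, so it is only suitable for small inputs.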
License
This project is licensed under the Apache-2.0 License. See the LICENSE file for more details.
File details
Details for the file liken-0.4.2.tar.gz.
File metadata
- Download URL: liken-0.4.2.tar.gz
- Upload date:
- Size: 26.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.7 {"installer":{"name":"uv","version":"0.10.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3b4f80ee403ad7d734935fa22ac5125848d0a3bea4a85ccde186957c3ef86af8 |
| MD5 | 255edf1ab3490bc0e1c54b8c15e47a7b |
| BLAKE2b-256 | 3b3e8741edfbc9f92fb71ce58f68ad3f3e22257e30623416a9a5cb800ef90c9f |
File details
Details for the file liken-0.4.2-py3-none-any.whl.
File metadata
- Download URL: liken-0.4.2-py3-none-any.whl
- Upload date:
- Size: 32.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.7 {"installer":{"name":"uv","version":"0.10.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d7711a5021c3dd9b05c8a45db6bae6f035f0c01f60fa9d4b7deea19e0f7ca440 |
| MD5 | 76605d7cb98ed1b8b333e3b32c869332 |
| BLAKE2b-256 | 4fd2ce7aebe77e00decb078a1d83a8e6e6607a628994b88a5c17022cdbc76084 |