
A Python package for cleansing and matching text

Project description

What is it?

A package for reducing text to alphabetic characters only, reducing text to digits only, cleaning names, splitting names, cleaning NIK values (the Indonesian national identity number), validating the NIK format, and scoring text similarity.

Installation

pip install claming

How to use

Import package

from claming import Cleansing, Matching

Create the instances

clean = Cleansing()
match = Matching()

Alphabet only

Parameter case_sensitive: upper, lower, capitalize or title; the default is capitalize.

clean.alfabeth_only('+62 adalah kode negara Indonesia', case_sensitive='capitalize')
# Result
# Adalah kode negara indonesia
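To make the behavior concrete, here is a minimal standard-library sketch of what an alphabet-only filter might do. It is not claming's implementation: the name alphabet_only_sketch and the whitespace handling are assumptions, chosen only so that the sketch reproduces the result shown above.

```python
import re

def alphabet_only_sketch(text, case_sensitive="capitalize"):
    """Hypothetical stand-in, not claming's code: keep letters and spaces only."""
    if case_sensitive not in {"upper", "lower", "capitalize", "title"}:
        raise ValueError(f"unsupported case_sensitive: {case_sensitive!r}")
    letters = re.sub(r"[^A-Za-z\s]", "", text)   # drop digits and punctuation
    collapsed = " ".join(letters.split())        # trim and collapse whitespace
    return getattr(collapsed, case_sensitive)()  # e.g. str.capitalize

print(alphabet_only_sketch("+62 adalah kode negara Indonesia"))
# Adalah kode negara indonesia
```

Note that capitalize lowercases everything after the first character, which is why "Indonesia" comes back as "indonesia"; title would keep a capital on every word.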

Number only

Parameter output_type: int or str; the default is int.

clean.number_only("+6281234123412", output_type='int')
# Result
# 6281234123412
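The choice between int and str matters for values with leading zeros. The sketch below is a hypothetical stand-in (number_only_sketch is not part of claming) that assumes the filter simply keeps the ASCII digits of the input.

```python
import re

def number_only_sketch(text, output_type="int"):
    """Hypothetical stand-in, not claming's code: keep only the digits."""
    digits = re.sub(r"[^0-9]", "", str(text))
    if output_type == "str":
        return digits
    if output_type == "int":
        return int(digits)  # raises ValueError when no digits remain
    raise ValueError(f"unsupported output_type: {output_type!r}")

print(number_only_sketch("+6281234123412"))          # 6281234123412
print(number_only_sketch("0812-3412", "str"))        # 08123412
print(number_only_sketch("0812-3412", "int"))        # 8123412
```

Under this assumption, str is the safer choice for phone numbers or identifiers written with a leading zero, because converting to int discards it.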

Clean name

Parameter case_sensitive: upper, lower, capitalize or title; the default is upper.

clean.clean_name(' John D3ve.r  Smith')
# Result
# {'input': ' John D3ve.r  Smith', 'output': 'JOHN DVER SMITH'}

Clean NIK

Parameter output_type: int or str; the default is str.

clean.clean_nik(3212300808080003)
# Result
# {'input': '3212300808080003', 'output': '3212300808080003', 'description': 'NIK format is correct'}
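The rules claming uses to decide whether a NIK format is correct are not described here. As a rough illustration only, the sketch below assumes the commonly documented NIK layout: 16 digits made up of a 6-digit region code, a 6-digit birth date in DDMMYY order (with 40 added to the day for women), and a 4-digit serial number. The function name nik_format_ok is hypothetical.

```python
import re

def nik_format_ok(nik):
    """Hypothetical format check, not claming's code."""
    digits = re.sub(r"[^0-9]", "", str(nik))
    if len(digits) != 16:
        return False
    day = int(digits[6:8])     # DD part of the birth date
    month = int(digits[8:10])  # MM part of the birth date
    if day > 40:               # assumption: women's NIKs store day + 40
        day -= 40
    return 1 <= day <= 31 and 1 <= month <= 12

print(nik_format_ok(3212300808080003))    # True
print(nik_format_ok("3212300813080003"))  # False, month field is 13
```

This only checks plausibility of the fields. It does not confirm that the region code exists or that the NIK belongs to a real person.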

Split name

Parameter case_sensitive: upper, lower, capitalize or title; the default is upper.
Parameter num_split: number of parts, 2 or 3; the default is 3.
When num_split is 2, the first name is the first word and the last name is everything from the second word to the last word.
When num_split is 3, the first name is the first word, the middle name is the second word, and the last name is everything from the third word to the last word.

clean.split_name(' John D3ve.r  Smith', num_split=3)
# Result
# {'original_name': ' John D3ve.r  Smith', 'full_name': 'JOHN DVER SMITH', 'first_name': 'JOHN', 'middle_name': 'DVER', 'last_name': 'SMITH'}

clean.split_name(' John D3ve.r  Smith', num_split=2)
# Result
# {'original_name': ' John D3ve.r  Smith', 'full_name': 'JOHN DVER SMITH', 'first_name': 'JOHN', 'last_name': 'DVER SMITH'}
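The splitting rule can be sketched with plain string operations. This is an illustration under assumptions, not claming's code: split_name_sketch is a hypothetical name, the cleaning step is a simple letters-only filter, and returning empty strings for missing parts is a guess about short names.

```python
import re

def split_name_sketch(name, num_split=3):
    """Hypothetical stand-in, not claming's code."""
    if num_split not in (2, 3):
        raise ValueError("num_split must be 2 or 3")
    # Keep letters and spaces, collapse whitespace, then uppercase.
    cleaned = " ".join(re.sub(r"[^A-Za-z\s]", "", name).split()).upper()
    words = cleaned.split()
    result = {"original_name": name, "full_name": cleaned}
    result["first_name"] = words[0] if words else ""
    if num_split == 3:
        result["middle_name"] = words[1] if len(words) > 1 else ""
        result["last_name"] = " ".join(words[2:])  # third word onward
    else:
        result["last_name"] = " ".join(words[1:])  # second word onward
    return result

print(split_name_sketch("Maria Clara de Jesus", num_split=3))
# {'original_name': 'Maria Clara de Jesus', 'full_name': 'MARIA CLARA DE JESUS', 'first_name': 'MARIA', 'middle_name': 'CLARA', 'last_name': 'DE JESUS'}
```

With a four-word name, the last name absorbs every word after the middle name, which matches the "until the last word" rule described above.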

Text similarity

match.exact_match('JOHN DVER SMITH', 'JOHN DVER SMITH')
# Result
# {'first_text': 'JOHN DVER SMITH', 'second_text': 'JOHN DVER SMITH', 'score': 1, 'max_score': 1}

match.levenshtein_match('JOHN DVER SMITH', 'JOHN DVER SMITH')
# Result
# {'first_text': 'JOHN DVER SMITH', 'second_text': 'JOHN DVER SMITH', 'score': 1.0, 'max_score': 1}
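For readers unfamiliar with Levenshtein scoring, the sketch below shows the classic edit distance and one common way to turn it into a similarity between 0 and 1. How claming normalizes its score is not stated here, so the formula 1 - distance / longest_length is an assumption, and both function names are hypothetical.

```python
def levenshtein_distance(a, b):
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def levenshtein_score_sketch(a, b):
    """Assumed normalization: 1.0 for identical texts, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest

print(levenshtein_score_sketch("JOHN DVER SMITH", "JOHN DVER SMITH"))  # 1.0
print(levenshtein_score_sketch("JOHN", "JOHM"))                        # 0.75
```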

match.part_exact_match('JOHN DVER SMITH', 'JOHN DVER SMITH')
# Result
# {'first_text': 'JOHN DVER SMITH', 'second_text': 'JOHN DVER SMITH', 'score': 1.0, 'max_score': 1}

match.part_levenshtein_match('JOHN DVER SMITH', 'JOHN DVER SMITH')
# Result
# {'first_text': 'JOHN DVER SMITH', 'second_text': 'JOHN DVER SMITH', 'score': 3.0, 'max_score': 3}

match.all_method_match('JOHN DVER SMITH', 'JOHN DVER SMITH')
# Result
# {'first_text': 'JOHN DVER SMITH', 'second_text': 'JOHN DVER SMITH', 'first_text_clean': 'JOHN DVER SMITH', 'second_text_clean': 'JOHN DVER SMITH', 'exact_match': {'score': 1, 'max_score': 1}, 'levenshtein': {'score': 1.0, 'max_score': 1}, 'part_exact_match': {'score': 3, 'max_score': 3}, 'part_levenshtein': {'score': 3.0, 'max_score': 3}}
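The "part" methods report a max_score of 3 for three-word names in the all_method_match output, which suggests word-by-word scoring. The sketch below shows one way such a score could be built. It is an assumption, not claming's algorithm: part_score_sketch is a hypothetical name, words are compared position by position, and the default per-word scorer is plain equality.

```python
def part_score_sketch(first_text, second_text, scorer=None):
    """Hypothetical word-by-word scoring, not claming's code."""
    if scorer is None:
        # Default per-word scorer: 1.0 for an exact match, else 0.0.
        scorer = lambda a, b: 1.0 if a == b else 0.0
    first_words = first_text.split()
    second_words = second_text.split()
    # Assumption: the maximum is the word count of the longer text.
    max_score = max(len(first_words), len(second_words))
    # zip pairs words by position and ignores unmatched trailing words.
    score = sum(scorer(a, b) for a, b in zip(first_words, second_words))
    return {"score": score, "max_score": max_score}

print(part_score_sketch("JOHN DVER SMITH", "JOHN DVER SMITH"))
# {'score': 3.0, 'max_score': 3}
print(part_score_sketch("JOHN DVER SMITH", "JOHN SMITH"))
# {'score': 1.0, 'max_score': 3}
```

Passing a different scorer, such as a per-word Levenshtein similarity, would give a graded score instead of an all-or-nothing match per word. The second call also shows a weakness of positional pairing: a dropped middle name misaligns every word after it.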

This version

1.5

