
easymdm is an open-source master data management (MDM) system, useful for consolidating user data.

Project description


Prerequisite

Define the configuration details in a YAML file like the examples below, then pass the file's name and location to the CLI as shown under CLI Run.

YAML Construct Readme

priority_rule:
  conditions:
    - column: priority_score
      value: 5  # Selects records with exactly priority_score = 5
    - column: confidence_level
      value: 100  # If multiple records have priority_score=5, picks those with confidence_level=100

survivorship:
  rules:
    - column: last_updated
      strategy: most_recent  
    - column: source_id
      strategy: source_priority 
      source_order: ["erp", "crm"]
    - column: address
      strategy: longest_string 
    - column: priority_score
      strategy: highest_value
    - column: confidence_level
      strategy: lowest_value
    - column: quality_rating
      strategy: greater_than_threshold
      threshold: 75
    - column: quality_rating
      strategy: less_than_threshold
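
The rules above can be read as two steps: narrow the candidate records with the priority conditions, then choose one surviving value per column. The following is a minimal Python sketch of that reading, not easymdm's actual implementation. The function names `apply_priority_rule` and `pick_value` and the sample records are hypothetical, and it assumes that a condition matching no record is skipped rather than emptying the candidate set. The threshold strategies are left out because their exact behaviour is not described here.

```python
from datetime import date


def apply_priority_rule(records, conditions):
    """Narrow the candidates one condition at a time (hypothetical helper).

    Assumption: a condition that matches no record is skipped, so the
    candidate set never becomes empty.
    """
    candidates = list(records)
    for cond in conditions:
        matched = [r for r in candidates if r.get(cond["column"]) == cond["value"]]
        if matched:
            candidates = matched
        if len(candidates) == 1:
            break
    return candidates


def pick_value(records, rule):
    """Return the surviving value of rule["column"] under rule["strategy"]."""
    values = [r[rule["column"]] for r in records if r.get(rule["column"]) is not None]
    if not values:
        return None
    strategy = rule["strategy"]
    if strategy in ("most_recent", "highest_value"):
        return max(values)
    if strategy == "lowest_value":
        return min(values)
    if strategy == "longest_string":
        return max(values, key=len)
    if strategy == "source_priority":
        order = rule["source_order"]
        ranked = [v for v in values if v in order]
        # The source listed earliest in source_order wins.
        return min(ranked, key=order.index) if ranked else None
    raise ValueError(f"strategy not covered by this sketch: {strategy}")


# Hypothetical sample records for illustration only.
records = [
    {"source_id": "crm", "last_updated": date(2024, 1, 5),
     "address": "12 Main St", "priority_score": 5, "confidence_level": 90},
    {"source_id": "erp", "last_updated": date(2023, 11, 2),
     "address": "12 Main Street, Apt 4", "priority_score": 5, "confidence_level": 100},
    {"source_id": "crm", "last_updated": date(2022, 6, 30),
     "address": "12 Main Street", "priority_score": 3, "confidence_level": 100},
]
conditions = [
    {"column": "priority_score", "value": 5},
    {"column": "confidence_level", "value": 100},
]

preferred = apply_priority_rule(records, conditions)
newest = pick_value(records, {"column": "last_updated", "strategy": "most_recent"})
source = pick_value(records, {"column": "source_id", "strategy": "source_priority",
                              "source_order": ["erp", "crm"]})
```

With these sample records, the two conditions leave only the "erp" record, which mirrors the comments in the priority_rule block.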

Sample YAML

sqlite:
  - DB_PATH: 'D:\path\to\database\'
    DB_NAME: 'mydatabase.db'

blocking:
  columns:
    - first_name
    - last_name
similarity:
  - column: first_name
    method: jarowinkler
  - column: middle_name
    method: jarowinkler
  - column: last_name
    method: jarowinkler
  - column: address
    method: levenshtein
  - column: city
    method: jarowinkler
  - column: zip_code
    method: exact

thresholds:
  review: 0.6
  auto_merge: 0.8

survivorship:
  rules:
    - column: Last_Updated_On
      strategy: most_recent # longest_string

priority_rule:
  conditions:
    - column: original
      value: 1
    - column: Address
      value: '*STREET*'  # quoted, because an unquoted leading * starts a YAML alias
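
The blocking, similarity and thresholds sections of the sample suggest a pipeline: group records by the blocking columns, score pairs within a group, then compare the score against the two thresholds. The sketch below illustrates that idea in plain Python and is not easymdm's own code. All function names are hypothetical. It assumes that blocking keys are the trimmed, lower-cased column values, that per-column scores are combined by a plain average, and that both thresholds are inclusive. Jaro-Winkler is omitted for brevity, so only the levenshtein and exact methods are covered.

```python
from collections import defaultdict


def block_records(records, columns):
    """Group records that share the same normalized blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        key = tuple(str(rec.get(c, "")).strip().lower() for c in columns)
        blocks[key].append(rec)
    return blocks


def levenshtein(a, b):
    """Edit distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Scale the edit distance to a 0..1 similarity."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def exact_similarity(a, b):
    return 1.0 if a == b else 0.0


METHODS = {"levenshtein": levenshtein_similarity, "exact": exact_similarity}


def pair_score(a, b, similarity_config):
    """Average the per-column similarities (the averaging is an assumption)."""
    scores = [METHODS[item["method"]](str(a[item["column"]]), str(b[item["column"]]))
              for item in similarity_config]
    return sum(scores) / len(scores)


def classify(score, review=0.6, auto_merge=0.8):
    """Map a score to an action using the thresholds from the sample YAML."""
    if score >= auto_merge:
        return "auto_merge"
    if score >= review:
        return "review"
    return "no_match"


# Hypothetical records for illustration only.
a = {"first_name": "Ann", "last_name": "Lee", "address": "12 Main St", "zip_code": "10001"}
b = {"first_name": "ann ", "last_name": "LEE", "address": "12 Main Street", "zip_code": "10001"}
config = [{"column": "address", "method": "levenshtein"},
          {"column": "zip_code", "method": "exact"}]

blocks = block_records([a, b], ["first_name", "last_name"])
decision = classify(pair_score(a, b, config))
```

Here both records land in the same block, and their averaged score falls above the auto_merge threshold of 0.8.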

CLI Run

uv run roar --help

For flat file
> uv run roar --source file --name D:\path\to_your_file\123.csv --config D:\path\to_your_config

Local Test Run

uv run .\src\easymdm\cli.py --source file --name .\sample\testdata.csv --config .\sample\testdata.yaml --outpath .\sample\
uv run .\src\easymdm\cli.py --source duckdb --name .\sample\testdata.csv --config .\sample\testdata.yaml --outpath .\sample\

Pylint Action

    if: contains(github.event.head_commit.message, 'CheckCodeQuality')


