tulona

A tool to compare data from different sources.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

mrinalsardar

These details have not been verified by PyPI

Project description

A utility to compare tables, espacially useful to perform validations for migration projects.

Testing
Package
Meta

Functionality

The basic functionality of tulona is to compare datasets, write them into Excel files and highlight the mismatches.

Connection Profiles

Connection profiles is a yaml file that will store credentials and other details to connect to the databases/data sources.

It must be setup in profiles.yml file and it must be placed under $HOME/.tulona dierctory. Create a directory named .tulona under your home directory and place profiles.yml under it.

This is what a sample profiles.yml looks like:

integration_project: # project_name
  profiles:
    pgdb:
      type: postgres
      host: localhost
      port: 5432
      database: postgres
      username: postgres
      password: postgres
    mydb:
      type: mysql
      host: localhost
      port: 3306
      database: db
      username: user
      password: password
    snowflake:
      type: snowflake
      account: snowflake_account
      warehouse: dev_x_small
      role: dev_role
      database: dev_stage
      schema: user_schema
      user: dev_user
      private_key: 'rsa_key.p8'
      private_key_passphrase: 444444
    mssql:
      type: mssql
      connection_string: 'mssql+pymssql://<username>:<password>@<freetds_name>/?charset=utf8'
    bigquery:
      type: bigquery
      method: service_account
      project: training-338516
      key_file: "/path/to/service_account/training-338516-362fa3727bae.json"

Project Config File

Project config file stores the properties of the tables that need to be compared. It must be created in tulona-project.yml file and this file can be placed anywhere and that directory will be considered project root directory. Which means that the output` folder will be created under that directory where all results will be stored. It’s always a good idea to create an empty directory and store tulona-project.yml under it.

This is how a tulona-project.yml file looks like:

version: '2.0'
name: integration_project
config_version: 1

outdir: output # optional

# Datasource names must be unique
datasources:
  employee_postgres:
    connection_profile: pgdb
    database: postgresdb
    schema: corporate
    table: employee
    primary_key: Employee_ID
    exclude_columns:
      - Email
      - Name
    compare_column: Employee_ID
  employee_mysql:
    connection_profile: mydb
    schema: corporate
    table: employee
    primary_key: Employee_ID
    exclude_columns:
      - Phone_Number
    compare_column: Employee_ID
  person_postgres:
    connection_profile: pgdb
    database: postgresdb
    schema: corporate
    table: people_composite_key
    primary_key:
      - ID_1
      - ID_2
    # exclude_columns:
    #   - name
    compare_column:
      - ID_1
      - ID_2
  person_mysql:
    connection_profile: mydb
    schema: corporate
    table: people_composite_key
    primary_key:
      - ID_1
      - ID_2
    # exclude_columns:
    #   - phone_number
    compare_column:
      - ID_1
      - ID_2
  postgresdb_postgres:
    connection_profile: pgdb
    database: postgresdb
  none_mysql:
    connection_profile: mydb
  postgresdb_postgres_schema:
    connection_profile: pgdb
    database: postgresdb
    schema: corporate_copy
  none_mysql_schema:
    connection_profile: mydb
    schema: corporate
  employee_postgres_query:
    connection_profile: pgdb
    database: postgresdb
    schema: corporate
    query: select * from postgresdb.corporate.employee
    primary_key: Employee_ID
    exclude_columns:
      - name
    compare_column: Employee_ID
  employee_mysql_query:
    connection_profile: mydb
    schema: corporate
    query: select * from corporate.employee
    primary_key: Employee_ID
    exclude_columns:
      - phone_number
    compare_column: Employee_ID
  employee_postgres_query_tab:
    connection_profile: pgdb
    database: postgresdb
    schema: corporate
    table: employee
    query: select * from postgresdb.corporate.employee
    primary_key: Employee_ID
    exclude_columns:
      - name
    compare_column: Employee_ID
  employee_mysql_query_tab:
    connection_profile: mydb
    schema: corporate
    table: employee
    query: select * from corporate.employee
    primary_key: Employee_ID
    exclude_columns:
      - phone_number
    compare_column: Employee_ID
  cust_bq:
    connection_profile: bigquery
    project: training-338516
    dataset: dummy_fashion_retail
    table: customers
    primary_key: customer_id
    compare_column:
      - customer_id
  cust_snow:
    connection_profile: snowflake
    database: training
    schema: dummy_fashion_retail
    table: customers
    primary_key: customer_id
    compare_column:
      - customer_id


# List of task configs(Dict)
# Depending on the accepted params, task config can have different params
# The value for that `task` key is the name of the command you want to run
task_config:
  - task: ping
    datasources:
      - person_postgres
      - none_mysql
      - employee_mysql_query

  - task: profile
    datasources:
      - employee_postgres
      - employee_mysql
    compare: true

  - task: profile
    datasources:
      - person_postgres
      - person_mysql

  - task: compare-row
    datasources:
      - employee_postgres
      - employee_mysql
    sample_count: 30

  - task: compare-row
    datasources:
      - employee_postgres
      - employee_mysql

  - task: compare-row
    datasources:
      - employee_postgres_query
      - employee_mysql_query

  - task: compare-column
    datasources:
      - employee_postgres
      - employee_mysql

  - task: compare-column
    datasources:
      - person_postgres
      - person_mysql
    composite: false # If it's false, specifying it is optional

  - task: compare-column
    datasources:
      - person_postgres
      - person_mysql
    composite: true

  - task: compare
    datasources:
      - employee_postgres
      - employee_mysql
    composite: true

  - task: compare
    datasources:
      - person_postgres
      - person_mysql
    composite: true
    sample_count: 30

  - task: scan
    datasources:
      - postgresdb_postgres_schema

  - task: scan
    datasources:
      - postgresdb_postgres
      - none_mysql
    compare: false

  - task: scan
    datasources:
      - postgresdb_postgres_schema
      - none_mysql_schema
    compare: true

  - task: scan
    datasources:
      - postgresdb_postgres
      - none_mysql
    compare: true
  - task: compare
    datasources:
      - employee_postgres_query_tab
      - employee_mysql_query_tab
  - task: compare
    datasources:
      - cust_bq
      - cust_snow

Features

Executing tulona or tulona -h or tulona –help returns available commands. If you don’t setup task_config, all commands take one mandatory parameter, –datasources, a comma separated list of names of datasources from project config file (tulona-project.yml).

Tulona has following commands available:

ping: To test connectivity to the databases for the datasources. Sample command:
- To ping one data source pass the name to the –datasources parameter:
  
  tulona ping --datasources employee_postgres
- More than one datasources can be passed to the –datasources parameter separated by commas:
  
  tulona ping --datasources employee_postgres,employee_mysql
- To ping all the datasources, just skip the –datasources parameter:
  
  tulona ping
profile: To extract and compare metadata of two sources/tables. It includes metadata from information_schema related to the tables and some column level metrics (min, max, average, count & distinct_count). Note that specifying database, schema and table is required for profile to work regardless the use of query. Sample commands:
- Profiling without –compare flag. It will write metadata and metrics about different sources/tables in different sheets/tabs in the excel file (not a comparison view):
  
  tulona profile --datasources employee_postgres,employee_mysql
- Profiling with –compare flag. It will produce a comparison view (side by side):
  
  tulona profile --compare --datasources employee_postgres,employee_mysql
- Sample output will be something like this:
compare-row: To compare sample data from two sources/tables/queries. It will create a comparative view of all common columns from both sources/tables side by side (like: id_ds1 <-> id_ds2) and highlight mismatched values in the output excel file. By default it compares 20 common rows from both tables (subject to availabillity) but the number can be overridden with the command line argument –sample-count. Command samples:
- Command without –sample-count parameter:
  
  tulona compare-row --datasources employee_postgres,employee_mysql
- Command with –sample-count parameter:
  
  tulona compare-row --sample-count 50 --datasources employee_postgres,employee_mysql
- Compare queries instead of tables, useful when you want to compare resutls of two queries:
  
  tulona compare-row --datasources employee_postgres_query,employee_mysql_query
- Sample output will be something like this:
compare-column: To compare columns from tables from two sources/tables. This is expecially useful when you want see if all the rows from one table/source is present in the other one by comparing the primary/unique key. The result will be an excel file with extra primary/unique keys from both sides. If both have the same set of primary/unique keys, essentially means they have the same rows, excel file will be empty. Command samples:
- Column[s] to compare is[are] specified in tulona-project.yml file as part of datasource configs, with compare_column property. Sample command:
  
  tulona compare-column --datasources employee_postgres,employee_mysql
- Compare multiples columns as composite key (combination of column values will be compared) with additional –composite flag:
  
  tulona compare-column --composite --datasources employee_postgres,employee_mysql
- Sample output will be something like this:
compare: To prepare a comparison report for evrything together. To executed this command just swap the command from any of the above commands with compare. It will prepare comparison of everything and write them into different sheets of a single excel file. Sample command:

tulona compare --datasources employee_postgres,employee_mysql
scan: To scan and compare databases or schemas in terms of metadata and tables present if you want to compare all tables and don’t want to set up datasource config for all of them. Sample commands:
- Scan without comparing:
  
  tulona scan --datasources postgresdb_postgres_schema,none_mysql_schema
- Scan and compare:
  
  tulona scan --compare --datasources postgresdb_postgres_schema,none_mysql_schema
run: To execute all the tasks defined in the task_config section. Sample command:

tulona run

If you setup task_config, there is no need to pass the –datasources parameter. In that case the following command (to compare some datasoruces):

tulona compare --datasources employee_postgres,employee_mysql

will become this:

tulona compare

and it will run all the compare tasks defined in the task_config section. From our example project config file above, it will run 2 compare tasks.

Also setting up task_config can be greatly benificial as you can set up different instance of same/different tasks with different config to execute in one go with the run command.

Please look at the sample project config from above to understand how to set up task_config property.

To know more about any specific command, execute tulona <command> -h.

Supported Data Platforms

Platform	Adapter Name	Supported Auth Mechanism
Postgres	postgres	Connection string, Password
MySQL	mysql	Connection string, Password
Snowflake	snowflake	Password, Key pair, SSO (Externalbrowser)
Microsoft SQL Server	mssql	Connection string (sqlalchemy[pymssql or pyodbc])
BigQuery	bigquery	Service Account Json Key

Development Environment Setup

For live installation execute pip install -e “.[dev]”.

Build Wheel Executable

Execute python -m build.

Install Wheel Executable File

Execute pip install <wheel-file.whl>

Create A Pull Request

To create a PR, follow the below steps:

Bump the version: bump-my-version bump major/minor/patch
Commit bumped changes (current version -> bumped version): git commit -am “Bumped from vx.x.x -> vx.x.x”
Push bumped changes: git push
Tag it (bumped version): git tag -a vx.x.x -m “vx.x.x”
Push the tag (deploys to TestPyPI): git push origin vx.x.x
Install in from TestPyPI and test (optional)
Create a PR from your branch into main in github

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

mrinalsardar

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.12.0

Jan 18, 2026

0.11.0

Aug 5, 2024

0.10.1

Jul 5, 2024

0.10.0

Jun 11, 2024

0.9.1

Jun 9, 2024

0.8.2

May 23, 2024

0.8.1

May 18, 2024

0.8.0 yanked

May 16, 2024

Reason this release was yanked:

bug in row comparison

0.7.12

May 15, 2024

0.7.11

May 12, 2024

0.7.10

May 12, 2024

0.7.8

May 11, 2024

0.7.6

May 10, 2024

0.7.5

May 4, 2024

0.7.4

Apr 29, 2024

0.7.1

Apr 28, 2024

0.6.3

Apr 26, 2024

0.5.4

Apr 14, 2024

0.5.2

Apr 14, 2024

0.5.1

Apr 13, 2024

0.5.0

Apr 13, 2024

0.4.0

Apr 12, 2024

0.3.3

Apr 11, 2024

0.3.2

Apr 10, 2024

0.3.1

Apr 9, 2024

0.3.0

Apr 9, 2024

0.2.6

Apr 7, 2024

0.2.5

Apr 7, 2024

0.2.4

Apr 7, 2024

0.2.3

Apr 6, 2024

0.2.2

Apr 6, 2024

0.2.1

Apr 5, 2024

0.2.0

Apr 3, 2024

0.1.1

Apr 1, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tulona-0.12.0.tar.gz (36.3 kB view details)

Uploaded Jan 18, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

tulona-0.12.0-py3-none-any.whl (40.2 kB view details)

Uploaded Jan 18, 2026 Python 3

File details

Details for the file tulona-0.12.0.tar.gz.

File metadata

Download URL: tulona-0.12.0.tar.gz
Upload date: Jan 18, 2026
Size: 36.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tulona-0.12.0.tar.gz
Algorithm	Hash digest
SHA256	`ed49c5ebf993d508c38d680619a764b28b659004af34ad239281e936375b48f7`
MD5	`761034aab8d2e339285fe0de826b1295`
BLAKE2b-256	`e0effca102cec39152e0c9657b791a86c56d90671d0b24076bfac63df5a23d8e`

See more details on using hashes here.

Provenance

The following attestation bundles were made for tulona-0.12.0.tar.gz:

Publisher: publish.yaml on mrinalsardar/tulona

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tulona-0.12.0.tar.gz
- Subject digest: ed49c5ebf993d508c38d680619a764b28b659004af34ad239281e936375b48f7
- Sigstore transparency entry: 833592449
- Sigstore integration time: Jan 18, 2026
Source repository:
- Permalink: mrinalsardar/tulona@e6175aaa4bb02e5d97f5414bc7e7e78b53745af9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/mrinalsardar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@e6175aaa4bb02e5d97f5414bc7e7e78b53745af9
- Trigger Event: pull_request

File details

Details for the file tulona-0.12.0-py3-none-any.whl.

File metadata

Download URL: tulona-0.12.0-py3-none-any.whl
Upload date: Jan 18, 2026
Size: 40.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tulona-0.12.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ce1ba10ccecb42272845378a946e60471386d3e03827f3b1cbe57f1653d03f20`
MD5	`db5a5db5d87f997bedeb849a9f1e1ebd`
BLAKE2b-256	`3a0663a1ca2fb384b73eed3bfe24e7586232046bc92606cd0923b88f73405e75`

See more details on using hashes here.

Provenance

The following attestation bundles were made for tulona-0.12.0-py3-none-any.whl:

Publisher: publish.yaml on mrinalsardar/tulona

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tulona-0.12.0-py3-none-any.whl
- Subject digest: ce1ba10ccecb42272845378a946e60471386d3e03827f3b1cbe57f1653d03f20
- Sigstore transparency entry: 833592452
- Sigstore integration time: Jan 18, 2026
Source repository:
- Permalink: mrinalsardar/tulona@e6175aaa4bb02e5d97f5414bc7e7e78b53745af9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/mrinalsardar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@e6175aaa4bb02e5d97f5414bc7e7e78b53745af9
- Trigger Event: pull_request

tulona 0.12.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

Functionality

Connection Profiles

Project Config File

Features

Supported Data Platforms

Development Environment Setup

Build Wheel Executable

Install Wheel Executable File

Create A Pull Request

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance