A package with a simple 1D-DTW implementation for sentence alignment.

DTW-Sentence-Alignment

A simple, low-dependency package for aligning sentences by maximizing a chosen similarity metric.

Overview

DTW-Sentence-Alignment is a Python package for aligning two lists of sentences with the Dynamic Time Warping (DTW) algorithm. Sentences can be compared with a custom similarity function or with one of the predefined metrics, and the alignment is the path that maximizes the total score. Unlike other implementations, the alignment path does not have to start at (0,0) or end at (n,m), so unmatched sentences at the beginning or end of either list can be left out.
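To make the idea of a score-maximizing alignment with free start and end points concrete, here is a minimal standalone sketch. It is not the package's implementation: the function names `word_overlap` and `align_free_ends`, the `threshold` parameter, and the choice to subtract a threshold from each similarity (so that poor matches lower the score) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def word_overlap(a: str, b: str) -> float:
    """Toy similarity (hypothetical): Jaccard overlap of word sets, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def align_free_ends(
    xs: Sequence[str],
    ys: Sequence[str],
    sim: Callable[[str, str], float] = word_overlap,
    threshold: float = 0.5,  # assumed: similarities below this count against a path
) -> Tuple[List[Tuple[int, int]], float]:
    """Return the best-scoring DTW-style path and its score.

    The path may start and end at any cell instead of (0, 0) and (n, m).
    """
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        return [], 0.0

    # best[i][j] is the best score of any path that ends at cell (i, j).
    best = [[0.0] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    top, top_cell = float("-inf"), None

    for i in range(n):
        for j in range(m):
            gain = sim(xs[i], ys[j]) - threshold
            # Option 1: start a fresh path at this cell (free starting point).
            prev_score, prev_cell = 0.0, None
            # Option 2: extend a path from one of the three DTW predecessors.
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0 and best[pi][pj] > prev_score:
                    prev_score, prev_cell = best[pi][pj], (pi, pj)
            best[i][j] = gain + prev_score
            back[i][j] = prev_cell
            # Free ending point: remember the best cell seen anywhere.
            if best[i][j] > top:
                top, top_cell = best[i][j], (i, j)

    if top <= 0:
        return [], 0.0  # nothing scored above the threshold

    # Walk the back-pointers from the best end cell to the start of its path.
    path = []
    cell = top_cell
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return path, top
```

With `xs = ["no match here", "alpha beta", "gamma delta", "trailing junk"]` and `ys = ["alpha beta", "gamma delta"]`, the sketch pairs index 1 with 0 and index 2 with 1, and skips the unmatched first and last entries of `xs`.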

Installation

To install the package, use pip:

pip install dtwsa

Usage

Here's a basic example of how to use the package:

from dtwsa import SentenceAligner
from dtwsa.metrics import WER_similarity

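# Create the aligner. NOTE: the original example never defines `aligner`;
# passing the similarity function as the first constructor argument is an
# assumption, so check the package source for the exact signature.
aligner = SentenceAligner(WER_similarity)

# Sentences to align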
list_1 = [
    "Something which does not match",
    "Matching sentence number one",
    "Something which does not match",
    "Another matching sentence",
    "Something which does not match",
    "Random Sentence which should match",
    "This should be matched with something",
    "Yet another matching sentence",
    "Random Sentence which should match",
    "This should be matched with something",
    "Yet another matching sentence",
    "Something which does not match",
    "Something which does not match",
    "Something that matches again",
    "Something which does not match",
]

list_2 = [
    "Something which does not match",
    "Matching sentence number one",
    "Another matching sentence",
    "Random Sentence which should match",
    "This should be matched with something",
    "Yet another matching sentence",
    "Random Sentence which should match",
    "This should be matched with something",
    "Yet another matching sentence",
    "Something that matches again",
    "Something leftover",
]

alignment, score = aligner.align_sentences(list_1, list_2)

print(f"Alignment: {alignment}")
print(f"Score: {score}")

# Plot the alignment
aligner.visualize_alignment(list_1, list_2)

Features

  • Flexible sentence alignment using custom similarity functions
  • Predefined metrics like Word Error Rate (WER) similarity
  • Simple API for easy integration
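As an illustration of what a custom similarity function could look like, the sketch below computes a WER-based similarity from a word-level edit distance. It assumes a similarity function takes two strings and returns a float where higher means more similar; that interface, and the names `word_edit_distance` and `wer_similarity_sketch`, are hypothetical and are not taken from the package's `WER_similarity`.

```python
from typing import List


def word_edit_distance(ref_words: List[str], hyp_words: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_similarity_sketch(reference: str, hypothesis: str) -> float:
    """Hypothetical similarity: 1 - WER, clipped to [0, 1]."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        # WER is undefined for an empty reference; treat two empty
        # sentences as identical and anything else as a mismatch.
        return 1.0 if not hyp else 0.0
    wer = word_edit_distance(ref, hyp) / len(ref)
    return max(0.0, 1.0 - wer)
```

Because WER can exceed 1 when the hypothesis is much longer than the reference, the result is clipped at 0 so the value stays in a fixed range.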

TODO

  1. Improve efficiency of the alignment algorithm
  2. Add new metrics for sentence comparison (e.g., BLEU score, cosine similarity)
