Alignment tool based on fast_align

systran-align is a small alignment tool based on fast_align.


pip install systran-align


import systran_align

Generating alignment probabilities

    input_path: str,
    forward_probs_path: str,
    backward_probs_path: str,
    verbose: bool = False,
    iterations: int = 5,
    favor_diagonal: bool = False,
    beam_threshold: float = -4,
    diagonal_tension: float = 4,
    optimize_tension: bool = False,
    variational_bayes: bool = False,
    alpha: float = 0.01,
    no_null_word: bool = False,
    prob_align_null: float = 0.08,
    thread_buffer_size: int = 10000,


  • input_path: text file where each line is a source-target example in the format <source> ||| <target>
  • forward_probs_path: binary file containing forward probabilities
  • backward_probs_path: binary file containing backward probabilities
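
As an illustration of the input format described above, here is a minimal sketch that writes sentence pairs to a file with one <source> ||| <target> example per line. The helper name write_parallel_file and the file name are hypothetical and not part of systran-align. The sketch assumes tokens are joined with single spaces, as fast_align expects, and it skips pairs with an empty side as a precaution.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Separator between the source and target side of each training example.
SEPARATOR = " ||| "


def write_parallel_file(
    pairs: Iterable[Tuple[List[str], List[str]]],
    path: str,
) -> int:
    """Write tokenized sentence pairs, one '<source> ||| <target>' per line.

    Hypothetical helper (not part of systran-align). Tokens are joined with
    single spaces, which is an assumption based on the fast_align convention.
    Returns the number of lines written.
    """
    lines = []
    for source, target in pairs:
        # Skip pairs with an empty side rather than writing a malformed line.
        if not source or not target:
            continue
        lines.append(" ".join(source) + SEPARATOR + " ".join(target))
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


# Example usage (hypothetical file name):
# write_parallel_file([(["the", "cat"], ["le", "chat"])], "train.txt")
```

The resulting file can then be passed as input_path.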

Computing alignments

aligner = systran_align.Aligner(
    forward_probs_path: str,
    backward_probs_path: str,
)

# result is a dict with fields:
# * alignments
# * forward_log_prob
# * backward_log_prob
result = aligner.align(
    source: List[str],
    target: List[str],
)

# Batch alternative:
results = aligner.align_batch(
    source: List[List[str]],
    target: List[List[str]],
)
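
The calls above take token lists, so text stored in the <source> ||| <target> format has to be split first. The following is a minimal sketch of that step. The helpers split_pair and align_lines are hypothetical and not part of systran-align, whitespace tokenization is an assumption, and the aligner argument is any object that exposes align(source, target) as documented above.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_pair(line: str) -> Tuple[List[str], List[str]]:
    """Split one '<source> ||| <target>' line into two token lists.

    Assumes whitespace-separated tokens. Raises ValueError if the
    '|||' separator is missing.
    """
    source, target = line.rstrip("\n").split("|||", 1)
    return source.split(), target.split()


def align_lines(aligner: Any, lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Call aligner.align once per line and collect the result dicts.

    Hypothetical helper: 'aligner' is expected to behave like the
    Aligner object documented above.
    """
    results = []
    for line in lines:
        source, target = split_pair(line)
        results.append(aligner.align(source, target))
    return results


# Example usage (paths are placeholders):
# import systran_align
# aligner = systran_align.Aligner("forward.probs", "backward.probs")
# with open("train.txt", encoding="utf-8") as f:
#     results = align_lines(aligner, f)
```

For larger inputs, align_batch may be the better fit, since it accepts all source and target token lists in one call.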

