Translationese analyzer

Translationese Analyzer is an interactive application for analyzing translationese in translations from English into Russian.

Translationese refers to the unique features of translated texts that differentiate them from original, non-translated texts written in the target language. By employing comparative analysis across translated and non-translated corpora, the program identifies specific linguistic indicators associated with this phenomenon.


Main Features

1. Text preprocessing

  • Removing references and weblinks.
  • Adjusting text length according to the user's choice.
  • Ensuring correct sentence segmentation.
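The three preprocessing steps above can be illustrated with a minimal sketch. It is not the package's actual implementation: the regular expressions, the function name preprocess, and the max_sentences parameter are all assumptions made for illustration, and the sentence splitter is deliberately naive (it will mis-split after abbreviations such as "г.").

```python
import re

# Hypothetical patterns; the package's real cleaning rules may differ.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
REF_RE = re.compile(r"\[\d+(?:\s*[,\-]\s*\d+)*\]")  # e.g. [1], [2, 3], [4-6]
# Split after ., !, ? or … when followed by whitespace and an uppercase
# Latin/Cyrillic letter or an opening quotation mark.
SENT_END_RE = re.compile(r'(?<=[.!?…])\s+(?=[A-ZА-ЯЁ«"])')


def preprocess(text, max_sentences=None):
    """Strip links and bracketed references, then split into sentences."""
    text = URL_RE.sub("", text)
    text = REF_RE.sub("", text)
    # Remove the space left behind before punctuation, e.g. "году ." -> "году."
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    text = re.sub(r"\s+", " ", text).strip()
    sentences = SENT_END_RE.split(text)
    if max_sentences is not None:
        # "Adjusting text length": keep only the first N sentences.
        sentences = sentences[:max_sentences]
    return sentences


if __name__ == "__main__":
    sample = (
        "Перевод был выполнен в 2020 году [1]. "
        "Подробнее: https://example.com/page Это второе предложение! "
        "А это третье?"
    )
    for sentence in preprocess(sample, max_sentences=2):
        print(sentence)
```

Limiting the text by whole sentences rather than by characters keeps the segmentation intact, which is one plausible reason to apply the length adjustment after splitting.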

Input Options:

  • Direct Input: Paste text using Ctrl+V in the console.
  • File Input: Place .txt files in one of the following folders:
    • auth_texts for non-translated texts.
    • mt_texts for machine translations.
    • ht_texts for human translations.

Note:
For the File Input option, the program reads files from the folders auth_texts, mt_texts, and ht_texts in the root of your working directory. Place your .txt file in the folder that matches its text type before starting the analysis.

You can create these folders yourself in advance. If they do not exist when you select this option, the program creates them automatically, even if it finds no file to analyze.
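If you prefer to prepare the input folders yourself, a small helper script along these lines should work. The folder names come from the list above; the function names, the example file name, and the choice of UTF-8 encoding are assumptions for illustration, not part of the package.

```python
from pathlib import Path

# Folder names as documented above.
INPUT_FOLDERS = ("auth_texts", "mt_texts", "ht_texts")


def ensure_input_folders(root):
    """Create the three input folders under root if they are missing."""
    folders = [Path(root) / name for name in INPUT_FOLDERS]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def add_text(root, folder_name, file_name, text):
    """Write text as a .txt file into one of the input folders."""
    if folder_name not in INPUT_FOLDERS:
        raise ValueError(f"unknown input folder: {folder_name}")
    ensure_input_folders(root)
    target = Path(root) / folder_name / file_name
    # UTF-8 is an assumption; check which encoding your texts use.
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Hypothetical example: store a machine translation for analysis.
    path = add_text(Path.cwd(), "mt_texts", "sample.txt", "Пример текста.")
    print(f"Saved {path}")
```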

Processed Texts:
After the text is processed, it will be automatically saved in one of the following directories:

  • auth_ready for non-translated texts.
  • mt_ready for machine-translated texts.
  • ht_ready for human-translated texts.

These directories will be created automatically in the root of your working directory if they do not already exist.

2. Indicator Analysis

The program analyzes texts across five groups of translationese characteristics:

  • Simplification: translated texts tend to be structurally and lexically simpler than non-translated texts.
  • Normalization: translated texts tend to use more standardized grammatical structures and fixed expressions.
  • Explicitation: translated texts tend to state explicitly what is implicit in the source text.
  • Interference: features of the source language are transferred into the translation.
  • Other translationese indicators: additional features that fall outside the four groups above.

When users run the program, they will see all available translationese indicators displayed in the interface for detailed analysis and exploration.
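To give a sense of what an indicator can look like, here is a minimal sketch of two measures commonly associated with simplification: type-token ratio (lexical variety) and mean sentence length. These are illustrative choices only. The list of indicators the program actually computes is shown in its interface, and the function names and tokenizer below are assumptions.

```python
import re

# Runs of Unicode letters, optionally joined by hyphens ("по-русски").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def tokenize(text):
    """Lowercased word tokens; punctuation and digits are ignored."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def type_token_ratio(text):
    """Distinct word forms divided by total word count (0.0 if empty)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(sentences):
    """Average number of word tokens per sentence (0.0 if no sentences)."""
    lengths = [len(tokenize(sentence)) for sentence in sentences]
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    translated = "Кот видит кота и кот спит"
    print(f"TTR: {type_token_ratio(translated):.3f}")
    print(mean_sentence_length(["Кот спит.", "Собака громко лает."]))
```

Type-token ratio depends strongly on text length, so it is only comparable between texts of similar size. Comparing such values between a translated and a non-translated corpus is the kind of contrast the indicator groups describe.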

3. Text Metadata Passport

The program lets you create and view a metadata profile for each text.

4. Corpora and Individual Texts Information Display

  • Displays detailed metrics for a selected text.
  • Summarizes and shows data across the entire corpus.
  • Compares the gathered data across all corpora.

5. Morphological and Syntactic Annotation

The program annotates texts at both the morphological and the syntactic level for further analysis.


Installation and Use

1. Requirements:

  • Python (>=3.9)
  • Development Environment (e.g., PyCharm Community Edition)

If you are new to Python, refer to the Python Installation Guide.

2. Installation Instructions

  1. Set up a virtual environment:

    • Windows:
      python -m venv venv
      
    • macOS/Linux (any Python 3.9 or later):
      python3 -m venv venv
      
  2. Activate the virtual environment:

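    • Windows (Command Prompt):
      venv\Scripts\activate.bat

    • Windows (PowerShell):
      venv\Scripts\Activate.ps1

    • Windows (Git Bash):
      source venv/Scripts/activate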
      
    • macOS/Linux:
      source venv/bin/activate
      
  3. Install the package:
    Inside the virtual environment, run:

    pip install translationese_analyzer
    
  4. Set up the main script for execution:
    Create a Python file (e.g., main.py) in your project directory with the following content:

    from translationese_analyzer import start_analysis
    
    start_analysis()
    
  5. Run the script:
    It is strongly recommended to run the script using the "RUN" button in your IDE (e.g., PyCharm). This ensures that the full text for analysis is correctly captured, especially when pasting large texts.
    If you prefer to use the terminal, execute the following command:

    python main.py
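If the script fails with an import error, it can help to confirm that the package is installed in the virtual environment that is currently active. The check below is a minimal sketch using only the standard library; the helper name installed_version is an assumption, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("translationese_analyzer")
    if found is None:
        print("translationese_analyzer is not installed in this environment")
    else:
        print(f"translationese_analyzer {found} is installed")
```

Run it with the same interpreter you use for main.py, since each virtual environment has its own set of installed packages.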
    

Enjoy Using the Program!

We hope you find this application helpful in analyzing translationese features in translated texts. If you have any questions, issues, or feedback, please feel free to reach out via email:

Olesya Serova
Email: serovaolesyau@gmail.com

Built distribution: translationese_analyzer-0.0.1-py3-none-any.whl (331.0 kB)
