Line-Level Segmentation using an ARU-Net architecture

LineSegmentation

This project contains the code necessary to perform line-level segmentation in TensorFlow 2. The provided scripts can be used to train the model and to run inference with it.

This project can be used by cloning the repository and running the scripts manually. However, it is also available on Anaconda Cloud and can be used in any Conda environment.

Dependencies

  • Python 3.x
  • TensorFlow 2.x
  • Numpy
  • Pillow
  • Pandas
  • Matplotlib
  • Tqdm
  • Scikit-learn

A .yaml file for each supported platform has been included that specifies the necessary dependencies. A conda environment for MacOS/Windows/Linux can be created and activated by running the following commands:

conda env create -f environment_linux.yaml  # or environment_macos.yaml, environment_windows.yaml
conda activate lineseg_env
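
If environment creation is scripted, the platform-specific file can be selected automatically. The following is a minimal sketch that assumes the three file names mentioned above; environment_file_for and conda_create_command are hypothetical helpers for illustration and are not part of this project.

```python
import platform

# File names taken from the command above. The keys are the values that
# platform.system() returns on each operating system.
ENVIRONMENT_FILES = {
    "Linux": "environment_linux.yaml",
    "Darwin": "environment_macos.yaml",
    "Windows": "environment_windows.yaml",
}


def environment_file_for(system=None):
    """Return the environment file name for the given (or current) platform."""
    system = system or platform.system()
    try:
        return ENVIRONMENT_FILES[system]
    except KeyError:
        raise ValueError(f"No environment file for platform: {system!r}")


def conda_create_command(system=None):
    """Build the argument list for `conda env create` on the platform."""
    return ["conda", "env", "create", "-f", environment_file_for(system)]
```

The returned list could be passed to subprocess.run if the command should be executed from Python.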

Usage with Provided Scripts

Using the code available in this repository, you have access to the train.py and inference.py scripts.

Train

Training can be done with the following command:

python train.py <TRAIN_CONFIG_FILE>

The train configuration file contains all the settings needed to train a line segmentation model. To train your own model, simply modify the configuration file arguments. Explanations of the arguments are given below:

Configuration File Arguments:

  • train_csv_path: The path to the train images in the dataset
  • val_csv_path: The path to the validation images in the dataset. If this parameter is not set, the training set will be split according to the train_size parameter.
  • train_size: The ratio used to determine the size of the train/validation split. If train_size is set to 0.8, then the training set will contain 80% of the data, and the validation set 20%. The dataset is not shuffled before being split.
  • model_out: The path for where to store the model weights after training
  • model_in: The path to the pre-trained model weights
  • img_size: The height and width of the image after it has been resized
  • epochs: The number of epochs (times through the training set) to train
  • batch_size: The number of images in a mini-batch
  • learning_rate: The learning rate the optimizer uses during training
  • shuffle_size: The number of images that will be loaded into memory and shuffled during the training process. In most cases, this number shouldn't change. However, if you are running into memory constraints, you can lower this number. A shuffle_size of 0 results in no shuffling.
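
To make the train_size and shuffle_size semantics concrete, here is a plain-Python sketch. It is not the project's implementation, which may rely on TensorFlow's dataset utilities instead; split_train_val and buffered_shuffle are hypothetical names, and the exact rounding used for the split is an assumption.

```python
import random


def split_train_val(samples, train_size):
    """Split samples in their original order, without shuffling.

    The first part becomes the training set and the rest the validation set.
    """
    if not 0.0 < train_size <= 1.0:
        raise ValueError("train_size must be in the interval (0, 1]")
    n_train = int(len(samples) * train_size)  # assumed rounding: truncate
    return samples[:n_train], samples[n_train:]


def buffered_shuffle(samples, shuffle_size, seed=None):
    """Yield samples shuffled through a buffer of at most shuffle_size items.

    Only shuffle_size samples are held in memory at once, so a smaller value
    uses less memory but mixes the data less thoroughly.
    """
    if shuffle_size <= 0:
        # A shuffle_size of 0 results in no shuffling.
        yield from samples
        return
    rng = random.Random(seed)
    buffer = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) >= shuffle_size:
            # Emit a randomly chosen element once the buffer is full.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer
```

For example, split_train_val(list(range(10)), 0.8) puts the first eight items in the training set and the last two in the validation set, which is why an ordered dataset may need to be shuffled on disk beforehand.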

Inference

Using the inference.py script, you can perform inference with a pre-trained model.

Inference can be performed by running the following command:

python inference.py <INFERENCE_CONFIG_FILE>

The inference configuration file contains all the settings needed to perform inference with a line segmentation model. To perform inference with your own model, simply modify the configuration file arguments. Explanations of the arguments are given below:

Configuration File Arguments:

  • img_path: The path to the directory of images to be inferred
  • img_path_subdirs: Whether or not to include images located in subdirectories of the img_path
  • out_path: The path to the directory where segmented line snippets will be stored
  • model_in: The path to the pre-trained model weights
  • coordinate_naming: Whether or not to save the coordinate information in each line snippet's name
  • save_raw: Whether or not to save the raw output of the semantic segmentation model
  • raw_path: The path to the directory where the raw output images will be stored
  • img_size: The size to which all images will be resized for inference
  • batch_size: The size of the mini-batch used during inference
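
The effect of img_path_subdirs can be illustrated with a small sketch. This is not the project's code: list_images is a hypothetical helper, and the set of image extensions is an assumption, since the accepted file types are not stated here.

```python
from pathlib import Path

# Assumed extensions for illustration; the file types the project accepts
# are not specified in this document.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(img_path, include_subdirs=False):
    """Return image files in img_path, optionally searching subdirectories."""
    root = Path(img_path)
    # rglob walks the whole directory tree; glob stays at the top level.
    candidates = root.rglob("*") if include_subdirs else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

With include_subdirs=False only files directly inside img_path are returned, which corresponds to leaving img_path_subdirs disabled.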

Usage with Conda Package

Perhaps the easiest way to access the code is to install the conda package that is available on Anaconda Cloud. No cloning of this repository is necessary.

conda install -c byu-handwriting-lab lineseg

Code can then be accessed like any normal Python package. For example, to use the segmentation model, you could write something like this:

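from lineseg.model import ARUNet
from lineseg.seg import segment_from_predictions
# Assumption: `ds` (used below) is the package's dataset module; adjust the import if the module is named differently
import lineseg.dataset as ds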

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

model = ARUNet()

# Load some pretrained weights
model_weights_path = './some/path/to/model/weights'
model.load_weights(model_weights_path)

# Set the input image directory, the output directory, and the image size
path_to_images = '/path/to/images/to/be/inferred/'
output_path = 'path/to/save/text/line/snippets'
img_size = (1024, 1536)

dataset = ds.get_encoded_inference_dataset_from_img_path(path_to_images, img_size)

# Run the images through the segmentation model
for image, img_name in dataset:
    output = model(image)
    prediction = tf.argmax(output, axis=3)

    # Show the raw model output
    plt.imshow(tf.squeeze(prediction))
    plt.pause(0.01)

    # Segment individual lines based on model output
    segment_from_predictions(image, prediction, img_name, output_path, plot_images=True)
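
The tf.argmax(output, axis=3) step above turns per-class scores into one class label per pixel. The following NumPy sketch shows the same operation on a tiny made-up array; the two-class layout (background and text line) is an assumption for illustration, since the number of classes the model outputs is not stated here.

```python
import numpy as np

# Made-up scores with shape (batch, height, width, classes) = (1, 2, 3, 2).
scores = np.array([[
    [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]],
    [[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]],
]])

# Pick the highest-scoring class for every pixel. On ties, argmax returns
# the lowest class index.
prediction = np.argmax(scores, axis=3)

# Drop the batch dimension to get a 2-D label map that can be displayed.
mask = np.squeeze(prediction, axis=0)
```

The resulting mask has one integer label per pixel, which is the kind of array that plt.imshow displays in the example above.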

Build the Conda Package to be uploaded to Anaconda Cloud

This project can be packaged with Anaconda and uploaded to Anaconda Cloud. Packaging relies on setup.py and meta.yaml, and these files may need slight modifications if the dependencies of the code base change. The project can be packaged using the following conda-build command:

conda-build ./conda.recipe -c defaults -c conda-forge

Make sure the lineseg environment has been built and activated before you run the conda-build command.

conda env create -f environment_linux.yaml  # or environment_macos.yaml, environment_windows.yaml
conda activate lineseg_env

Once the project has been packaged, the packaged file can be uploaded to Anaconda Cloud (Anaconda-Client is required):

anaconda upload -u BYU-Handwriting-Lab <FILENAME>
