
Visual font recognition (VFR) library

Project description

fontina

fontina is a PyTorch library that helps train models for Visual Font Recognition (VFR) and run inference with them.

Feature highlights

Using fontina

Installing the dependencies

Starting from a cloned repository directory:

# Create a virtual environment: this uses venv but any system
# would work!
python -m venv .venv

# Activate the virtual environment: this depends on the OS. See
# the two options below.
# .venv/Scripts/activate # Windows
source .venv/bin/activate # Linux

# Install the dependencies needed to use fontina.
pip install .

Note: Windows users must manually install the CUDA-based version of PyTorch, as pip will only install the CPU version on this platform. See PyTorch Get Started for the specific command to run, which should be something along the lines of pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117.
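
To double-check the environment before moving on, a quick sanity check of the PyTorch install (and of CUDA availability, where a GPU is expected) only needs the public torch API; nothing here is specific to fontina:

# Quick sanity check of the PyTorch install (not fontina-specific).
import torch

print("PyTorch version:", torch.__version__)
# Expect True on machines where a CUDA-enabled build of PyTorch is installed.
print("CUDA available:", torch.cuda.is_available())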

(Optional) - Installing development dependencies

The following dependencies are only needed to develop fontina.

# Install the developer dependencies.
pip install .[linting]

# Run linting and tests!
make lint
make test

Generating a synthetic dataset

If needed, the model can be trained on synthetic data. fontina provides a synthetic dataset generator that follows part of the recommendations from the DeepFont paper to make synthetic samples look closer to real-world data. To use the generator:

  1. Make a copy of configs/sample.yaml, e.g. configs/mymodel.yaml
  2. Open configs/mymodel.yaml and tweak the fonts section:
fonts:

  # ...

  # Force a uniform white background for the generated images.
  # backgrounds_path: "assets/backgrounds"

  samples_per_font: 1000

  # Fill in the paths of the fonts that need to be used to generate
  # the data.
  classes:
    - name: Test Font
      path: "assets/fonts/test/Test.ttf"
    - name: Other Test Font
      path: "assets/fonts/test2/Test2.ttf"
  3. Run the generation:
fontina-generate -c configs/mymodel.yaml -o outputs/font-images/mymodel

After this completes, there should be one directory per configured font in outputs/font-images/mymodel.
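
As a quick sanity check, the generated layout can be inspected with a few lines of Python. The path below matches the command above; the assumption that each font directory directly contains the rendered image files is mine:

# Count the generated samples for each configured font (illustrative only).
from pathlib import Path

dataset_root = Path("outputs/font-images/mymodel")
for font_dir in sorted(p for p in dataset_root.iterdir() if p.is_dir()):
    num_samples = sum(1 for f in font_dir.iterdir() if f.is_file())
    print(f"{font_dir.name}: {num_samples} samples")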

Training

fontina currently only supports training a DeepFont-like architecture. The training process has two major steps: unsupervised training of the stacked autoencoders and supervised training of the full network.

Before starting, make a copy of configs/sample.yaml, e.g. configs/mymodel.yaml (or use the existing one that was created for the dataset generation step).
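
To make the two steps concrete, here is a minimal, self-contained PyTorch sketch of a DeepFont-like model: an encoder trained as part of a stacked convolutional autoencoder in Part 1, then frozen and topped with a classifier head in Part 2. The layer sizes, the 96x96 crops and the class count are illustrative assumptions and do not reflect fontina's actual architecture.

# Illustrative DeepFont-like two-stage model (not fontina's real implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SCAE(nn.Module):
    """Stacked convolutional autoencoder, trained without labels (Part 1)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, kernel_size=2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class FontClassifier(nn.Module):
    """Full network (Part 2): frozen SCAE encoder plus a trainable classifier head."""

    def __init__(self, scae, num_classes):
        super().__init__()
        self.encoder = scae.encoder
        # Mirrors the "only_autoencoder: False" stage freezing the stacked autoencoders.
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.head(self.encoder(x))


# Part 1 optimizes a reconstruction loss; Part 2 optimizes cross-entropy on font labels.
x = torch.randn(4, 1, 96, 96)  # a batch of grayscale crops (sizes are made up)
scae = SCAE()
reconstruction_loss = F.mse_loss(scae(x), x)

classifier = FontClassifier(scae, num_classes=2)
logits = classifier(x)  # shape (4, 2)
classification_loss = F.cross_entropy(logits, torch.tensor([0, 1, 0, 1]))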

Part 1 - Unsupervised training

  1. Open configs/mymodel.yaml and tweak the training section:
training:
  # Set this to True to train the stacked autoencoders.
  only_autoencoder: True

  # Don't use an existing checkpoint for the unsupervised training.
  # scae_checkpoint_file: "outputs/models/autoenc/good-checkpoint.ckpt"

  data_root: "outputs/font-images/mymodel"

  # The directory that will contain the model checkpoints.
  output_dir: "outputs/models/mymodel-scae"

  # The size of the batch to use for training.
  batch_size: 128

  # The initial learning rate to use for training.
  learning_rate: 0.01

  epochs: 20

  # Whether or not to use a fraction of the data to run a
  # test cycle on the trained model.
  run_test_cycle: True
  2. Then run the training with:
python src/fontina/train.py -c configs/mymodel.yaml
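
Part 2 needs the path of the best autoencoder checkpoint, so it can help to inspect what was written to output_dir before moving on. Assuming the files are standard PyTorch Lightning checkpoints (which record at least the epoch counter alongside the weights), a quick look could be:

# List the autoencoder checkpoints and the epoch each one was saved at
# (assumes standard PyTorch Lightning .ckpt files; paths match the config above).
from pathlib import Path

import torch

checkpoint_dir = Path("outputs/models/mymodel-scae")
for ckpt_path in sorted(checkpoint_dir.glob("**/*.ckpt")):
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    print(ckpt_path, "- saved at epoch", ckpt.get("epoch"))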

Part 2 - Supervised training

  1. Open configs/mymodel.yaml (or create a new one!) and tweak the training section:
training:
  # This will freeze the stacked autoencoders.
  only_autoencoder: False

  # Pick the best checkpoint from the unsupervised training.
  scae_checkpoint_file: "outputs/models/mymodel-scae/good-checkpoint.ckpt"

  data_root: "outputs/font-images/mymodel"

  # The directory that will contain the model checkpoints.
  output_dir: "outputs/models/mymodel-full"

  # The size of the batch to use for training.
  batch_size: 128

  # The initial learning rate to use for training.
  learning_rate: 0.01

  epochs: 20

  # Whether or not to use a fraction of the data to run a
  # test cycle on the trained model.
  run_test_cycle: True
  2. Then run the training with:
fontina-train -c configs/mymodel.yaml

(Optional) - Monitor performance using TensorBoard

fontina automatically records training-run metrics in a TensorBoard-compatible format. The recorded data can be visualized by pointing TensorBoard at the logs directory:

tensorboard --logdir=lightning_logs

Inference

Once training is complete, the resulting model can be used to run inference.

fontina-predict -n 6 -w "outputs/models/mymodel-full/best_checkpoint.ckpt" -i "assets/images/test.png"
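
For context, DeepFont-style models are commonly evaluated by classifying several fixed-size crops of the input and averaging the per-crop probabilities. The sketch below only illustrates that idea, reusing the toy FontClassifier from the training section above; it is not how fontina-predict is implemented:

# Illustrative patch-averaged prediction (not fontina's actual inference code).
import torch


def predict_font(model, image, patch_size=96, num_patches=10):
    """Average softmax scores over random fixed-size crops of a grayscale image tensor."""
    model.eval()
    _, h, w = image.shape  # image: (1, H, W) with values in [0, 1]
    scores = []
    with torch.no_grad():
        for _ in range(num_patches):
            top = torch.randint(0, h - patch_size + 1, (1,)).item()
            left = torch.randint(0, w - patch_size + 1, (1,)).item()
            patch = image[:, top:top + patch_size, left:left + patch_size]
            logits = model(patch.unsqueeze(0))  # add the batch dimension
            scores.append(logits.softmax(dim=-1))
    return torch.cat(scores).mean(dim=0)  # averaged per-class probabilities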

AdobeVFR Pre-trained model

The AdobeVFR dataset is currently available for download from Dropbox, here. The license for using and distributing the dataset is available here; it states:

This dataset ('Licensed Material') is made available to the scientific community for non-commercial research purposes such as academic research, teaching, scientific publications or personal experimentation.

Since the model is trained on that dataset, it retains the same spirit and the same license applies: the released model can only be used for non-commercial purposes.

How to train

  1. Download the dataset to assets/AdobeVFR
  2. Unpack assets/AdobeVFR/Raw Image/VFR_real_u/scrape-wtf-new.zip in that directory so that the assets/AdobeVFR/Raw Image/VFR_real_u/scrape-wtf-new/ path exists
  3. Run fontina-train -c configs/adobe-vfr-autoencoder.yaml. This will take a long while, but progress can be checked with TensorBoard (see the previous sections) during training.
  4. Change configs/adobe-vfr.yaml so that scae_checkpoint_file points to the best checkpoint from step (3).
  5. Run fontina-train -c configs/adobe-vfr.yaml. This will take a long while (but less than the unsupervised training round).

Downloading the models

While only the full model is needed, the stand-alone autoencoder model is being released as well.

Note: The pre-trained model achieves a validation loss of 0.3523 and an accuracy of 0.8855 after 14 epochs. Unfortunately, the test performance on VFR_real_test is much worse, with a top-1 accuracy of 0.05. I'm releasing the model in the hope that somebody can help me fix this 😊😅

