Profile of filter

timetiles

Last released Mar 24, 2026

SDK for building TimeTiles scrapers — CSV output helper

split-folders

Last released Jan 28, 2026

Split folders with files (e.g. images) into training, validation and test (dataset) folders.

clean-text

Last released Jan 28, 2026

Functions to preprocess and normalize text.

pd3f

Last released Apr 3, 2021

Reconstruct the original continuous text from PDFs with language models

dehyphen

Last released Sep 15, 2020

Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF

pd3f-flair

Last released Sep 15, 2020

Flair's language models without unnecessary dependencies

hyperhyper

Last released Mar 7, 2020

Python Library to Construct Word Embeddings for Small Data

german

Last released Nov 16, 2019

Preprocess German texts for serious NLP.

german-lemmatizer

Last released Jul 30, 2019

A Python package (using a Docker image under the hood) to lemmatize German texts.

text-classification-keras

Last released Apr 28, 2019

Text Classification Library for Keras

get-wayback-machine

Last released Dec 14, 2018

Fetch a URL via the latest Wayback Machine Snapshot

get-retries

Last released Oct 25, 2018

Adding retries to Requests.get() with exponential backoff

mw-category-members

Last released Sep 9, 2018

Using MediaWiki's API, retrieve pages that belong to a given category

deep-plots

Last released Jul 30, 2018

Visualize Your Deep Learning Training in Static Graphics

Johannes Filter

14 projects