Textfier: Text-Based Modifiers

Welcome to Textfier.

Dealing with text is rarely a trivial task. Hence, this package provides a more straightforward interface to tackle text-based tasks and modifications. Built on top of Hugging Face's Transformers, Textfier is a wrapper focusing on the specific tasks we are currently researching.

Use Textfier if you need a library or wish to:

  • Implement or use pre-defined tasks;
  • Mix-and-match different approaches to solve problems;
  • Modify text, because it is fun.

Read the docs at textfier.readthedocs.io.

Textfier is compatible with: Python 3.6+.


Package guidelines

  1. The very first information you need is in the very next section.
  2. Installing is also easy, if you wish to read the code and follow along.
  3. Note that there might be some additional steps in order to use our solutions.
  4. If there is a problem, please do not hesitate to contact us.

Getting started: 60 seconds with Textfier

First of all, we have examples. Yes, they are commented. Just browse to examples/, choose your subpackage, and follow the example. We have high-level examples for most tasks we could think of.

Alternatively, if you wish to learn even more, please take a minute:

Textfier is based on the following structure, and you should pay attention to its tree:

- textfier
    - core
        - dataset
        - runner
        - task
    - stream
        - cleaner
        - tokenizer
    - tasks
        - language_modeling
        - named_entity_recognition
        - question_answering
        - seq2seq
        - sequence_classification
    - utils
        - loader
        - logging
        - metrics

Core

The core is the parent of everything. It holds the parent classes that define the basis of our structure, providing the variables and methods that help to construct the other modules.
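The parent-and-child pattern described above can be sketched in a few lines. This is only an illustration of the idea: `BaseTask`, `UppercaseTask`, `run`, and `describe` are hypothetical names chosen for this example and are not taken from Textfier's actual classes.

```python
from abc import ABC, abstractmethod


class BaseTask(ABC):
    """Hypothetical parent class: holds shared state and defines the interface."""

    def __init__(self, name):
        # Shared attribute that every child task inherits.
        self.name = name

    @abstractmethod
    def run(self, text):
        """Child classes must implement the actual text modification."""

    def describe(self):
        # Shared helper available to every child without re-implementation.
        return f"{self.__class__.__name__}(name={self.name!r})"


class UppercaseTask(BaseTask):
    """Hypothetical child class, used only to show the inheritance pattern."""

    def run(self, text):
        return text.upper()


task = UppercaseTask("shout")
print(task.describe())
print(task.run("hello"))
```

The parent cannot be instantiated on its own, so every concrete module is forced to supply its own `run` while still reusing the shared helpers.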

Stream

Every pipeline has its first step, right? The stream package provides the primary methods to clean and tokenize data.
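To make the clean-then-tokenize step concrete, here is a minimal sketch in plain Python. It does not use Textfier's own cleaner or tokenizer; the `clean` and `tokenize` functions are hypothetical stand-ins built only on the standard library.

```python
import re
import unicodedata


def clean(text):
    """Normalize Unicode, lowercase, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split text into word tokens and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


tokens = tokenize(clean("  Hello,   World!  "))
print(tokens)  # ['hello', ',', 'world', '!']
```

Real tokenizers for pre-trained models usually work on subword units, so this word-level split is only meant to show where the step sits in a pipeline.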

Tasks

Pre-defined tasks provide an easier framework for loading pre-trained models. Hence, this package serves as a wrapper around the loading of pre-trained models from Hugging Face's Transformers.
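As a rough idea of what wrapping pre-trained model loading can look like, the sketch below maps some of the task names from the tree to Transformers `pipeline` identifiers. The mapping and the `load_task` helper are assumptions made for this example, not Textfier's API; in particular, pairing `language_modeling` with `fill-mask` is a guess, and `seq2seq` is left out because its pipeline identifier varies between Transformers versions.

```python
# Hypothetical mapping from task names to Transformers pipeline identifiers.
PIPELINE_NAMES = {
    "language_modeling": "fill-mask",  # assumption: masked language modeling
    "named_entity_recognition": "token-classification",
    "question_answering": "question-answering",
    "sequence_classification": "text-classification",
}


def load_task(task, model=None):
    """Return a Transformers pipeline for a task name.

    Raises KeyError for a task name that is not in PIPELINE_NAMES.
    """
    pipeline_name = PIPELINE_NAMES[task]
    # Imported lazily so the mapping can be inspected without Transformers installed.
    from transformers import pipeline

    # With model=None, Transformers falls back to its default model for the task.
    return pipeline(pipeline_name, model=model)
```

Calling `load_task("sequence_classification")` requires Transformers to be installed and downloads a default model on first use, so it is not executed here.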

Utils

This is a utility package. Common things shared across the application should be implemented here. It is better to implement something once and use it as you wish than to re-implement the same thing over and over again.


Installation

We believe that everything has to be easy, not tricky or daunting. Textfier aims to be the go-to package you need, from the very first installation to your daily implementation tasks. Just run the following in your preferred Python environment (raw, conda, virtualenv, whatever):

pip install textfier

Alternatively, if you prefer to install the bleeding-edge version, please clone this repository and use:

pip install .
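After either install route, one way to confirm that the package is visible to your environment is to query its installed version. The sketch below uses only the standard library; note that `importlib.metadata` requires Python 3.8 or newer, and the `installed_version` helper is a name made up for this example.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("textfier"))
```

A result of `None` means the package was not found in the active environment, which usually points to installing into a different interpreter or virtual environment.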

Environment configuration

Note that some environments may need additional setup steps. If any are required, their details are listed here.

Ubuntu

No specific additional commands needed.

Windows

No specific additional commands needed.

MacOS

No specific additional commands needed.


Support

We do our best, but we inevitably make mistakes. If you ever need to report a bug or a problem, or just want to talk to us, please do so! We will do our best to be available at this repository or at gustavo.rosa@unesp.br.


