
Data loaders and abstractions for text and NLP


LanguageFlow



Requirements

Install dependencies

$ pip install future tox
$ pip install python-crfsuite==0.9.5
$ pip install Cython
$ pip install -U fasttext --no-cache-dir --no-deps --force-reinstall
$ pip install xgboost==0.82

Installation

$ pip install languageflow

Components

  • Transformers: NumberRemover, CountVectorizer, TfidfVectorizer

  • Models: SGDClassifier, XGBoostClassifier, KimCNNClassifier, FastTextClassifier, CRF
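
The transformer names above suggest a familiar preprocessing chain: strip numbers, then turn text into count or TF-IDF features. The following is a minimal standard-library sketch of that idea only. The functions `remove_numbers`, `count_vectorize`, and `tfidf` are hypothetical illustrations, not LanguageFlow's classes or API, whose signatures this README does not document.

```python
import math
import re
from collections import Counter


def remove_numbers(text):
    """Drop digit runs (a stand-in for what a NumberRemover-style step might do)."""
    return re.sub(r"\d+", "", text)


def tokenize(text):
    """Lowercase and split on whitespace after removing numbers."""
    return remove_numbers(text).lower().split()


def count_vectorize(docs):
    """Return the sorted vocabulary and one term-count row per document."""
    token_lists = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in token_lists for tok in toks})
    rows = []
    for toks in token_lists:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf(rows):
    """Weight counts by a plain log(N / df) inverse document frequency."""
    n_docs = len(rows)
    n_terms = len(rows[0]) if rows else 0
    # Document frequency: number of documents in which each term occurs.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)] for row in rows]


docs = ["Transfer fee 11000 VND", "Card fee 50000 VND"]
vocab, counts = count_vectorize(docs)
weights = tfidf(counts)
print(vocab)
print(counts)
```

Terms that appear in every document (here `fee` and `vnd`) receive zero weight under this unsmoothed formula; library implementations often smooth the IDF term, so their values will differ.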

Data

Download a dataset with the download command

$ languageflow download DATASET

List all datasets

$ languageflow list
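
When several datasets are needed, the two commands above can be scripted. The sketch below only builds the argument lists shown in this README (`languageflow download DATASET` and `languageflow list`) and can hand them to `subprocess`. The helper names are hypothetical, and it assumes the `languageflow` executable is on `PATH`.

```python
import shutil
import subprocess

CLI = "languageflow"  # executable name taken from the commands above


def list_command():
    """Argument list for `languageflow list`."""
    return [CLI, "list"]


def download_command(dataset):
    """Argument list for `languageflow download DATASET`."""
    return [CLI, "download", dataset]


def run(command):
    """Run a command if the CLI is installed; return its exit code, or None if it is missing."""
    if shutil.which(CLI) is None:
        print("languageflow not found; would run:", " ".join(command))
        return None
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for name in ["UTS2017_BANK", "VNTC"]:
        print(" ".join(download_command(name)))
```

Replace the `print` call with `run(download_command(name))` to execute the downloads.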

Datasets

The datasets module currently contains:

  • Tagged: VLSP2018-NER, VTB-CHUNK*, VLSP2016-NER*, VLSP2013-POS*, VLSP2013-WTK*

  • Categorized: AIVIVN2019_SA*, VLSP2018_SA*, UTS2017_BANK, VLSP2016_SA*, VNTC

  • Plaintext: VNESES, VNTQ_SMALL, VNTQ_BIG

Caution (*): For datasets under a closed license, you must provide the download URL yourself.
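
The asterisk convention above can be captured in a small lookup, which is handy for checking ahead of time whether a download will need a URL. This is an illustrative sketch built only from the lists in this README. `DATASETS`, `requires_url`, and `closed_license` are hypothetical names, not part of LanguageFlow, and how the URL is passed to the download command is not specified here.

```python
# Dataset names copied from the lists above; True marks a closed-license (*) dataset.
DATASETS = {
    "Tagged": {
        "VLSP2018-NER": False,
        "VTB-CHUNK": True,
        "VLSP2016-NER": True,
        "VLSP2013-POS": True,
        "VLSP2013-WTK": True,
    },
    "Categorized": {
        "AIVIVN2019_SA": True,
        "VLSP2018_SA": True,
        "UTS2017_BANK": False,
        "VLSP2016_SA": True,
        "VNTC": False,
    },
    "Plaintext": {
        "VNESES": False,
        "VNTQ_SMALL": False,
        "VNTQ_BIG": False,
    },
}


def requires_url(name):
    """Return True if the named dataset is marked closed-license (*) above."""
    for group in DATASETS.values():
        if name in group:
            return group[name]
    raise KeyError(f"unknown dataset: {name}")


def closed_license():
    """Return the sorted names of all closed-license datasets."""
    return sorted(
        name
        for group in DATASETS.values()
        for name, closed in group.items()
        if closed
    )


print(requires_url("UTS2017_BANK"))
print(closed_license())
```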

Example

Download the UTS2017_BANK dataset

$ languageflow download UTS2017_BANK

Use the UTS2017_BANK dataset

>>> from languageflow.data_fetcher import DataFetcher, NLPData
>>> corpus = DataFetcher.load_corpus(NLPData.UTS2017_BANK_SA)
>>> print(corpus)
CategorizedCorpus: 1780 train + 197 dev + 494 test sentences
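
The printed summary gives the split sizes. The short sketch below parses that exact line to show the proportions. It is plain string handling and does not call LanguageFlow; `split_sizes` is a hypothetical helper.

```python
import re

# The summary line printed in the example above.
summary = "CategorizedCorpus: 1780 train + 197 dev + 494 test sentences"


def split_sizes(line):
    """Extract the train/dev/test counts from a corpus summary line."""
    pairs = re.findall(r"(\d+) (train|dev|test)", line)
    return {name: int(count) for count, name in pairs}


sizes = split_sizes(summary)
total = sum(sizes.values())
for name, count in sizes.items():
    print(f"{name}: {count} ({count / total:.1%})")
```

For this corpus the 2471 sentences divide into roughly 72% train, 8% dev, and 20% test.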

History

1.1.7 (2018-04-12)

  • Automatic deployment with Travis CI and PyPI

  • Fix dependency hell

1.1.6 (2017-12-26)

  • Add data module to handle data downloading and data preprocessing

  • Add many new models: SGDClassifier, XGBoostClassifier, FastTextClassifier, CRF

  • Add new feature: LanguageBoard

  • Automatic continuous integration with travis-ci

  • Build docs with readthedocs.org

1.1.5 (2017-12-11)

  • Refactor project to integrate with underthesea experiment

0.1.0 (2017-09-18)

  • First release on PyPI.
