
text processing functions for small or big data files



textTinyPy


The textTinyPy package provides text processing functions for small or big data files. The source code is written in C++11 and wrapped in Python using Cython. It is tested on Linux (Debian) with Python 2.7, and there is currently one limitation:

  • there is no support for Chinese, Japanese, Korean, Thai, or other languages with ambiguous word boundaries.


The functionality of textTinyPy is explained in the blog post


Details for the parameters of each class can be found in the package documentation


The package works properly only if the following requirements are installed:


System Requirements:


Python Requirements:

  • Cython>=0.23.5

  • pandas>=0.21.0

  • scipy>=0.13.0

  • numpy>=1.11.2

  • future>=0.15.2
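
The minimum versions listed above can be checked before installing. The following is a minimal sketch, not part of textTinyPy: it assumes an interpreter with `importlib.metadata` (Python 3.8 or newer, whereas the package itself is tested on Python 2.7), and the helper names `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical. The version parsing is deliberately simplistic and does not implement full PEP 440 ordering.

```python
from importlib import metadata  # standard library in Python 3.8+

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "Cython": "0.23.5",
    "pandas": "0.21.0",
    "scipy": "0.13.0",
    "numpy": "1.11.2",
    "future": "0.15.2",
}


def version_tuple(version):
    """Turn '1.11.2' into (1, 11, 2); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(name, status)
```

Run as a script, this prints one line per requirement with its status, which makes it easy to see what still has to be installed or upgraded.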


The package can be installed from PyPI using:

pip install textTinyPy


To upgrade, use:

pip install -U textTinyPy


Report bugs and issues at https://github.com/mlampros/textTinyPy/issues


Installation of System Requirements on Linux (Debian):


The installation requires gcc 4.8 or newer (the installed version can be checked in a console using: gcc --version).

If gcc is older than 4.8, continue with step 1; otherwise go to step 2.
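
The decision between step 1 and step 2 can also be scripted. The sketch below is an illustration under stated assumptions, not something shipped with textTinyPy: it assumes the first line of `gcc --version` has the usual form (for example `gcc (Debian 4.9.2-10) 4.9.2`), and the function names `parse_gcc_version`, `needs_gcc_upgrade`, and `installed_gcc_version` are hypothetical.

```python
import re
import subprocess

MINIMUM_GCC = (4, 8)  # minimum version stated above


def parse_gcc_version(output):
    """Extract (major, minor) from the first line of `gcc --version` output."""
    lines = output.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Drop the parenthesised distribution tag, which contains its own numbers.
    cleaned = re.sub(r"\([^)]*\)", "", first_line)
    match = re.search(r"(\d+)\.(\d+)", cleaned)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_gcc_upgrade(version, minimum=MINIMUM_GCC):
    """Return True if the given (major, minor) version is below the minimum."""
    return version < minimum


def installed_gcc_version():
    """Run `gcc --version` and parse its output; raises if gcc is not found."""
    result = subprocess.run(
        ["gcc", "--version"], capture_output=True, text=True, check=True
    )
    return parse_gcc_version(result.stdout)


if __name__ == "__main__":
    version = installed_gcc_version()
    if version is None:
        print("could not determine the gcc version")
    elif needs_gcc_upgrade(version):
        print("gcc is older than 4.8: continue with step 1")
    else:
        print("gcc is recent enough: go to step 2")
```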


1.: installation of gcc-4.9 and g++-4.9

sudo add-apt-repository ppa:ubuntu-toolchain-r/test -y

sudo apt-get update

sudo apt-get install gcc-4.9

sudo apt-get install g++-4.9

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 90

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 90

sudo update-alternatives --install /usr/bin/gcov gcov /usr/bin/gcov-4.9 90


2.: installation of boost version 1.55 (including boost-locale and boost-system)

sudo add-apt-repository ppa:boost-latest/ppa -y

sudo apt-get update

sudo apt-get install libboost1.55-dev libboost-filesystem1.55-dev libboost-locale1.55-dev


3.: installation of armadillo (including the requirements for Debian and Fedora)


armadillo requirements – Debian only

sudo apt-get install cmake libopenblas-dev libblas-dev libarpack++2-dev liblapack-dev


armadillo requirements – Fedora only

yum install cmake openblas-devel lapack-devel arpack-devel SuperLU-devel


armadillo installation version 7.600.2

wget http://sourceforge.net/projects/arma/files/armadillo-7.600.2.tar.xz

tar xf armadillo-7.600.2.tar.xz

cd armadillo-7.600.2/

cmake .

make

sudo make install




Download files


Source Distribution

textTinyPy-0.0.4.tar.gz (3.8 MB)

Uploaded Source

Built Distribution

textTinyPy-0.0.4-py2.7-linux-x86_64.egg (3.5 MB)

Uploaded Source

File details

Details for the file textTinyPy-0.0.4.tar.gz.

File metadata

  • Download URL: textTinyPy-0.0.4.tar.gz
  • Size: 3.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for textTinyPy-0.0.4.tar.gz
Algorithm Hash digest
SHA256 05fe8c4d9ffa1414a89ad9fcf49fc56fb229f805dc1e8da79283a878f98006ec
MD5 370e9e0a6ada3ffdce3ffc67a6ae157a
BLAKE2b-256 e7347d6a4f8a35deca0eac2342871e65b87f6d5a70eb55726a79651e35770d60
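
A downloaded copy of textTinyPy-0.0.4.tar.gz can be compared against the SHA256 digest listed above. This is a minimal sketch using only the standard `hashlib` module; the helper names `sha256_of_file` and `matches_expected` and the default file path are assumptions made for illustration.

```python
import hashlib
import sys

# SHA256 digest listed above for textTinyPy-0.0.4.tar.gz.
EXPECTED_SHA256 = "05fe8c4d9ffa1414a89ad9fcf49fc56fb229f805dc1e8da79283a878f98006ec"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # The default path is an assumption: the archive in the current directory.
    path = sys.argv[1] if len(sys.argv) > 1 else "textTinyPy-0.0.4.tar.gz"
    print("OK" if matches_expected(path) else "MISMATCH")
```

The same functions work for the egg file if the expected digest is replaced with the one listed for that file.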


File details

Details for the file textTinyPy-0.0.4-py2.7-linux-x86_64.egg.


File hashes

Hashes for textTinyPy-0.0.4-py2.7-linux-x86_64.egg
Algorithm Hash digest
SHA256 7040c9b83d20993bdb116305d9a2e1fb21b54229d6d895b0f9843cc8d40f22ea
MD5 0155aca707ec20d6bb18bfb4f5386249
BLAKE2b-256 20c8de913cbb957471ff16318160b473520f6f5d4a4101092017c6f3acb69ac9
