text processing functions for small or big data files
Project description
textTinyPy
The textTinyPy package consists of text processing functions for small or big data files. The source code is written in C++11 and wrapped in Python using Cython. It is tested on Linux (Debian) with Python 2.7 and currently has one limitation:
there is no support for Chinese, Japanese, Korean, Thai, or other languages with ambiguous word boundaries.
The functionality of textTinyPy is explained in the blog post.
Details of the parameters of each class can be found in the package documentation.
The package will work properly only if the following requirements are installed:
System Requirements:
boost (boost >= 1.55)
armadillo (armadillo >= 0.7.5)
a C++11 compiler
OpenMP for parallelization (optional)
Python Requirements:
Cython>=0.23.5
pandas>=0.21.0
scipy>=0.13.0
numpy>=1.11.2
future>=0.15.2
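The minimum versions listed above can be checked before installing. The following is a minimal sketch, not part of textTinyPy: the names `MINIMUM_VERSIONS`, `version_tuple` and `check_requirements` are hypothetical helpers, and the check assumes each package exposes a `__version__` attribute under the import name shown.

```python
import importlib

# Minimum versions copied from the Python requirements list above.
# Keys are import names (assumed to match the names in that list).
MINIMUM_VERSIONS = {
    "Cython": "0.23.5",
    "pandas": "0.21.0",
    "scipy": "0.13.0",
    "numpy": "1.11.2",
    "future": "0.15.2",
}


def version_tuple(version):
    """Turn '1.11.2' into (1, 11, 2), padding with zeros to three parts.

    Non-numeric suffixes such as 'rc1' or 'a11' are ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {name: status}; status is 'ok', 'too old', 'missing' or 'unknown'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            # The module imported but does not advertise a version.
            report[name] = "unknown"
        elif version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    for name, status in sorted(check_requirements().items()):
        print("%-8s %s" % (name, status))
```

Running the script prints one line per requirement. The version comparison is deliberately simple and only looks at the leading numeric parts, so treat an "ok" for a pre-release build with some caution.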
The package can be installed from PyPI using:
pip install textTinyPy
To upgrade, use:
pip install -U textTinyPy
To report bugs or issues, use the following link: https://github.com/mlampros/textTinyPy/issues
Installation of System Requirements on Linux (Debian):
The installation requires gcc 4.8 or newer (this can be checked in a console using: gcc --version).
If gcc is older than 4.8, continue with step 1; otherwise go to step 2.
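The version check can also be scripted. The sketch below is an illustration only: `parse_gcc_version`, `needs_gcc_upgrade` and `installed_gcc_output` are hypothetical helper names, and the parsing is a heuristic that takes the last dotted number on the first line of the `gcc --version` output, which is an assumption about how distributions format that line.

```python
import re
import subprocess

MINIMUM_GCC = (4, 8)  # minimum gcc version stated in the installation notes


def parse_gcc_version(version_output):
    """Extract (major, minor) from the first line of `gcc --version` output.

    Returns None when no dotted version number is found.
    """
    first_line = version_output.splitlines()[0] if version_output else ""
    # Heuristic: distributions often print their own package version in
    # parentheses first, so the last dotted number is taken as the gcc version.
    matches = re.findall(r"(\d+)\.(\d+)(?:\.\d+)?", first_line)
    if not matches:
        return None
    major, minor = matches[-1]
    return int(major), int(minor)


def needs_gcc_upgrade(version_output, minimum=MINIMUM_GCC):
    """Return True when step 1 (installing gcc-4.9) appears to be needed."""
    version = parse_gcc_version(version_output)
    if version is None:
        # Assumption: a missing or unparsable gcc is treated as "too old".
        return True
    return version < minimum


def installed_gcc_output():
    """Run `gcc --version`; return an empty string if gcc is unavailable."""
    try:
        raw = subprocess.check_output(["gcc", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return ""
    return raw.decode("utf-8", "replace")


if __name__ == "__main__":
    if needs_gcc_upgrade(installed_gcc_output()):
        print("gcc is missing or older than 4.8: continue with step 1")
    else:
        print("gcc is 4.8 or newer: go to step 2")
```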
1.: installation of gcc-4.9 and g++-4.9
sudo add-apt-repository ppa:ubuntu-toolchain-r/test -y
sudo apt-get update
sudo apt-get install gcc-4.9
sudo apt-get install g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 90
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 90
sudo update-alternatives --install /usr/bin/gcov gcov /usr/bin/gcov-4.9 90
2.: installation of boost version 1.55 (including boost-locale and boost-system)
sudo add-apt-repository ppa:boost-latest/ppa -y
sudo apt-get update
sudo apt-get install libboost1.55-dev libboost-filesystem1.55-dev libboost-locale1.55-dev
3.: installation of armadillo (including the requirements for Debian and Fedora)
armadillo requirements – Debian only
sudo apt-get install cmake libopenblas-dev libblas-dev libarpack++2-dev liblapack-dev
armadillo requirements – Fedora only
yum install cmake openblas-devel lapack-devel arpack-devel SuperLU-devel
armadillo installation version 7.600.2
wget http://sourceforge.net/projects/arma/files/armadillo-7.600.2.tar.xz
tar xf armadillo-7.600.2.tar.xz
cd armadillo-7.600.2/
cmake .
make
sudo make install
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file textTinyPy-0.0.4.tar.gz.
File metadata
- Download URL: textTinyPy-0.0.4.tar.gz
- Upload date:
- Size: 3.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 05fe8c4d9ffa1414a89ad9fcf49fc56fb229f805dc1e8da79283a878f98006ec
MD5 | 370e9e0a6ada3ffdce3ffc67a6ae157a
BLAKE2b-256 | e7347d6a4f8a35deca0eac2342871e65b87f6d5a70eb55726a79651e35770d60
File details
Details for the file textTinyPy-0.0.4-py2.7-linux-x86_64.egg.
File metadata
- Download URL: textTinyPy-0.0.4-py2.7-linux-x86_64.egg
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7040c9b83d20993bdb116305d9a2e1fb21b54229d6d895b0f9843cc8d40f22ea
MD5 | 0155aca707ec20d6bb18bfb4f5386249
BLAKE2b-256 | 20c8de913cbb957471ff16318160b473520f6f5d4a4101092017c6f3acb69ac9
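A downloaded file can be compared against the SHA256 digests listed above. This is a minimal sketch using only the standard-library `hashlib` module; `EXPECTED_SHA256`, `sha256_of_file` and `verify_download` are hypothetical names introduced for illustration, and the script assumes the files sit in the current directory.

```python
import hashlib
import os

# SHA256 digests copied from the file-hash tables above.
EXPECTED_SHA256 = {
    "textTinyPy-0.0.4.tar.gz":
        "05fe8c4d9ffa1414a89ad9fcf49fc56fb229f805dc1e8da79283a878f98006ec",
    "textTinyPy-0.0.4-py2.7-linux-x86_64.egg":
        "7040c9b83d20993bdb116305d9a2e1fb21b54229d6d895b0f9843cc8d40f22ea",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    for filename, expected in sorted(EXPECTED_SHA256.items()):
        if not os.path.exists(filename):
            print("%s: not found in the current directory" % filename)
        elif verify_download(filename, expected):
            print("%s: digest matches" % filename)
        else:
            print("%s: digest MISMATCH" % filename)
```

A mismatch usually means an incomplete or altered download, in which case the file should be fetched again before installing.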