TWItter STock market Machine Learning package
TwistML
Disclaimer
This package is still very much under development.
At this point most of the intended functionality is in place, but documentation is still very spotty.
Installation
You can use pip to install TwistML like so:
$ pip install twistml
Please make sure you have numpy, scipy and gensim installed as well. I have opted out of adding them to install_requires, as this caused problems in my own tests on Windows machines. (For numpy the problem is described here.) These packages will therefore not be installed automatically by pip.
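Because pip will not pull in these dependencies, it can help to check for them before first use. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical and not part of TwistML.

```python
from importlib.util import find_spec


def missing_packages(names=("numpy", "scipy", "gensim")):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Please install: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

Any package reported as missing can then be installed with pip in the usual way.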
Known Issues & Planned Improvements
Implement a DateRange class and replace all occurrences of fromdate, todate, dateformat.
Implement find_files() without date ranges at all. It should be possible to simply process all files within a directory (also recursively).
TwistML currently assumes raw Twitter data to be available as one JSON file per day. Make sure the Internet Archive’s file scheme is supported as well.
Add support for hourly time resolution instead of daily only.
Evaluation subpackage can only deal with binary classification. Possibly explore adding multiclass.
The way logging is currently set up is weird and should be reworked.
gensim’s LabeledSentence is deprecated, use TaggedDocument instead
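To illustrate the planned find_files() behaviour without date ranges, here is a rough sketch of collecting every matching file in a directory, optionally recursing into subdirectories. It is not TwistML's implementation: the name find_all_files, its parameters and the default ".json" extension are all assumptions made for this example.

```python
import os


def find_all_files(directory, extension=".json", recursive=True):
    """Return a sorted list of paths under `directory` ending in `extension`.

    Hypothetical helper: name and signature are assumptions, not TwistML API.
    """
    matches = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(extension):
                matches.append(os.path.join(root, name))
        if not recursive:
            dirs.clear()  # stop os.walk from descending into subdirectories
    return sorted(matches)
```

Sorting the result keeps the processing order stable across platforms, which matters when file names encode dates.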
Changes
Version 0.2.4
ATTENTION: Some of these may break existing code!!
renamed combine_tweets.py to combine.py
added support for stacking of features
classification targets are now 0 / 1 instead of -1 / 1
added toydata module -> create some toydata for testing
added F1-Score to classification evaluation
added additional window functions: window_stack and window_element_avg
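Code that still produces the old -1 / 1 targets may need its labels converted before it works with this version. A minimal sketch of such a conversion follows; the helper name to_zero_one is hypothetical and not part of TwistML.

```python
def to_zero_one(targets):
    """Map legacy -1 / 1 classification targets to 0 / 1.

    Hypothetical helper, not part of TwistML. Raises KeyError for any
    value other than -1 or 1, so unexpected labels are not silently kept.
    """
    mapping = {-1: 0, 1: 1}
    return [mapping[t] for t in targets]


print(to_zero_one([-1, 1, 1, -1]))  # prints [0, 1, 1, 0]
```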
Version 0.2.3
Improved long_description generation
Fixed CHANGES.rst
Version 0.2.2
Added sentiment features based on TextBlob sentiments
Version 0.2.1
Added functionality for complex category subsets to tml-generate-features
Also improved documentation for tml-generate-features (on cmd line as well as docstring)
improved test coverage
Version 0.2.0
Changed Development Status to Alpha
Removed Sentence2Vec as that functionality is included in current gensim versions’ Doc2Vec class
Added Changelog