A modular toolkit for analytics.
Project description
# newslynx
**This is still a WIP and we should be officially open-sourcing the codebase in late June/July 2015. For now, please read [the report](http://towcenter.org/research/the-newslynx-impact-tracker-produced-these-key-ideas/) we published for the [TowCenter](http://towcenter.org) on our prototype.**
## (Re)Setting up the dev environment
#### Install `newslynx`, prefrerably in a virtual environment.
```
git clone https://github.com/newslynx/newslynx.git
cd newslynx
python setup.py install
```
If you want to actively work on the codebase, install in `editable` mode:
```
git clone https://github.com/newslynx/newslynx.git
cd newslynx
pip install --editable .
```
#### Install the dependencies (instructions for Mac OS X, for now):
Install `redis`:
```
brew install redis
```
**NOTE** We recommend using [Postgres APP](http://postgresapp.com/). However, if you prefer the `brew` distribution, make sure to install it with plpythonu.
```
brew install postgresql --build-from-source --with-python
```
(Re)create a `postgresql` database
```
dropdb newslynx
createdb newslynx
```
#### Set your configurations
- fill out [`example_config/config.yaml`](example_config/config.yaml) and move it to `~/.newslynx/config.yaml`
- follow the [SQLAlchemy Docs](http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html#database-urls) for details on how to configure your `sqlalchemy_database_uri`.
- Modify default recipes and tags in [`example_config/defaults/recipes.yaml`](example_config/defaults/recipes.yaml) and [`example_config/defaults/tags.yaml`](example_config/defaults/tags.yaml). These tags and recipes will be created everytime a new organization is added. If you'd simply like to use our defaults, type `make defaults`. This will move these files to `~/.newslynx/defaults`.
#### Start the redis server
Open another shell and run:
```
redis-server
```
#### Initialize the database, super user, and instlall built-in sous chefs.
```
newslynx init
```
#### Start the server
- In debug mode: `newslynx debug`
- Debug mode with errors: `newslynx debug --raise-errors`
- Production `guniorn` server: `./run`
#### Start the task workers
```
./start_workers
```
stop the tasks workers
```
./stop_workers
```
#### Start the cron daemon
```
newslynx cron
```
#### IGNORE THIS ERROR:
This is a result of our extensive use of `gevent`. We haven't yet figured out how to properly suppress this error. See more details [here](http://stackoverflow.com/questions/8774958/keyerror-in-module-threading-after-a-successful-py-test-run).
```
Exception KeyError: KeyError(4332017936,) in <module 'threading' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
```
## TODO
- [ ] Fix Bulk Loading process.
- [ ] Re-implement SousChefs
- [x] RSS Feeds => Thing
- [x] Google Analytics => Metric
- [ ] Google Alerts => Event
- [x] Social Shares => Metric
- [ ] Homepage Promotions => Metric
- [x] Twitter Promotions => Metric
- [ ] Facebook Promotions => Metric
- [x] Twitter List => Event
- [x] Twitter User => Event
- [ ] Facebook Page => Event
- [x] Reddit => Event
- [ ] HackerNews => Event
- [ ] Implement New SousChefs
- [ ] IFTTT integrations
- [ ] Wordpress Publish => Thing
- TK
- [ ] Regex Thing URL => Tag
- [ ] Search Things => Tag
- [ ] Meltwater Emails => Event
- [ ] Newsletter Email Promotions => Metric
- [ ] Calculated Metric? SQL API.
- [x] Implement Recipe scheduler
- [ ] Implement Admin Panel
- [ ] Migrate Core Prototype Users.
- [ ] Automate Deployment
- [ ] App Integration
- [ ] Document, Document, Document
## References
#### API Design
* [http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api](http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api)
#### Crosstab in Postgres
* [http://www.postgresonline.com/journal/archives/14-CrossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html](http://www.postgresonline.com/journal/archives/14-CrossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html)
#### filling in zeros for a timezeries
* [http://stackoverflow.com/questions/346132/postgres-how-to-return-rows-with-0-count-for-missing-data]
#### fetching column names from table
* [http://www.postgresql.org/message-id/AANLkTilsjTAXyN5DghR3M2U4c8w48UVxhov4-8igMpd1@mail.gmail.com](http://www.postgresql.org/message-id/AANLkTilsjTAXyN5DghR3M2U4c8w48UVxhov4-8igMpd1@mail.gmail.com)
#### timeseries tips
* [http://no0p.github.io/postgresql/2014/05/08/timeseries-tips-pg.html](http://no0p.github.io/postgresql/2014/05/08/timeseries-tips-pg.html)
#### Getting bigger with flask (+ dynamic subdomains):
* [http://maximebf.com/blog/2012/11/getting-bigger-with-flask/#.VVYvUZNVhBc](http://maximebf.com/blog/2012/11/getting-bigger-with-flask/#.VVYvUZNVhBc)
#### Nonblocking with flask, gevent, + psycopg2
* [https://github.com/kljensen/async-flask-sqlalchemy-example](https://github.com/kljensen/async-flask-sqlalchemy-example)
#### Rate Limiting in Flask.
* [http://flask.pocoo.org/snippets/70/](http://flask.pocoo.org/snippets/70/)
#### Postgres Search Configuration
* [http://sqlalchemy-searchable.readthedocs.org/en/latest/configuration.html](http://sqlalchemy-searchable.readthedocs.org/en/latest/configuration.html)
## License
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
**This is still a WIP and we should be officially open-sourcing the codebase in late June/July 2015. For now, please read [the report](http://towcenter.org/research/the-newslynx-impact-tracker-produced-these-key-ideas/) we published for the [TowCenter](http://towcenter.org) on our prototype.**
## (Re)Setting up the dev environment
#### Install `newslynx`, prefrerably in a virtual environment.
```
git clone https://github.com/newslynx/newslynx.git
cd newslynx
python setup.py install
```
If you want to actively work on the codebase, install in `editable` mode:
```
git clone https://github.com/newslynx/newslynx.git
cd newslynx
pip install --editable .
```
#### Install the dependencies (instructions for Mac OS X, for now):
Install `redis`:
```
brew install redis
```
**NOTE** We recommend using [Postgres APP](http://postgresapp.com/). However, if you prefer the `brew` distribution, make sure to install it with plpythonu.
```
brew install postgresql --build-from-source --with-python
```
(Re)create a `postgresql` database
```
dropdb newslynx
createdb newslynx
```
#### Set your configurations
- fill out [`example_config/config.yaml`](example_config/config.yaml) and move it to `~/.newslynx/config.yaml`
- follow the [SQLAlchemy Docs](http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html#database-urls) for details on how to configure your `sqlalchemy_database_uri`.
- Modify default recipes and tags in [`example_config/defaults/recipes.yaml`](example_config/defaults/recipes.yaml) and [`example_config/defaults/tags.yaml`](example_config/defaults/tags.yaml). These tags and recipes will be created everytime a new organization is added. If you'd simply like to use our defaults, type `make defaults`. This will move these files to `~/.newslynx/defaults`.
#### Start the redis server
Open another shell and run:
```
redis-server
```
#### Initialize the database, super user, and instlall built-in sous chefs.
```
newslynx init
```
#### Start the server
- In debug mode: `newslynx debug`
- Debug mode with errors: `newslynx debug --raise-errors`
- Production `guniorn` server: `./run`
#### Start the task workers
```
./start_workers
```
stop the tasks workers
```
./stop_workers
```
#### Start the cron daemon
```
newslynx cron
```
#### IGNORE THIS ERROR:
This is a result of our extensive use of `gevent`. We haven't yet figured out how to properly suppress this error. See more details [here](http://stackoverflow.com/questions/8774958/keyerror-in-module-threading-after-a-successful-py-test-run).
```
Exception KeyError: KeyError(4332017936,) in <module 'threading' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
```
## TODO
- [ ] Fix Bulk Loading process.
- [ ] Re-implement SousChefs
- [x] RSS Feeds => Thing
- [x] Google Analytics => Metric
- [ ] Google Alerts => Event
- [x] Social Shares => Metric
- [ ] Homepage Promotions => Metric
- [x] Twitter Promotions => Metric
- [ ] Facebook Promotions => Metric
- [x] Twitter List => Event
- [x] Twitter User => Event
- [ ] Facebook Page => Event
- [x] Reddit => Event
- [ ] HackerNews => Event
- [ ] Implement New SousChefs
- [ ] IFTTT integrations
- [ ] Wordpress Publish => Thing
- TK
- [ ] Regex Thing URL => Tag
- [ ] Search Things => Tag
- [ ] Meltwater Emails => Event
- [ ] Newsletter Email Promotions => Metric
- [ ] Calculated Metric? SQL API.
- [x] Implement Recipe scheduler
- [ ] Implement Admin Panel
- [ ] Migrate Core Prototype Users.
- [ ] Automate Deployment
- [ ] App Integration
- [ ] Document, Document, Document
## References
#### API Design
* [http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api](http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api)
#### Crosstab in Postgres
* [http://www.postgresonline.com/journal/archives/14-CrossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html](http://www.postgresonline.com/journal/archives/14-CrossTab-Queries-in-PostgreSQL-using-tablefunc-contrib.html)
#### filling in zeros for a timezeries
* [http://stackoverflow.com/questions/346132/postgres-how-to-return-rows-with-0-count-for-missing-data]
#### fetching column names from table
* [http://www.postgresql.org/message-id/AANLkTilsjTAXyN5DghR3M2U4c8w48UVxhov4-8igMpd1@mail.gmail.com](http://www.postgresql.org/message-id/AANLkTilsjTAXyN5DghR3M2U4c8w48UVxhov4-8igMpd1@mail.gmail.com)
#### timeseries tips
* [http://no0p.github.io/postgresql/2014/05/08/timeseries-tips-pg.html](http://no0p.github.io/postgresql/2014/05/08/timeseries-tips-pg.html)
#### Getting bigger with flask (+ dynamic subdomains):
* [http://maximebf.com/blog/2012/11/getting-bigger-with-flask/#.VVYvUZNVhBc](http://maximebf.com/blog/2012/11/getting-bigger-with-flask/#.VVYvUZNVhBc)
#### Nonblocking with flask, gevent, + psycopg2
* [https://github.com/kljensen/async-flask-sqlalchemy-example](https://github.com/kljensen/async-flask-sqlalchemy-example)
#### Rate Limiting in Flask.
* [http://flask.pocoo.org/snippets/70/](http://flask.pocoo.org/snippets/70/)
#### Postgres Search Configuration
* [http://sqlalchemy-searchable.readthedocs.org/en/latest/configuration.html](http://sqlalchemy-searchable.readthedocs.org/en/latest/configuration.html)
## License
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
newslynx-0.1.2.tar.gz
(263.1 kB
view hashes)
Built Distribution
newslynx-0.1.2.macosx-10.10-intel.exe
(465.6 kB
view hashes)
Close
Hashes for newslynx-0.1.2.macosx-10.10-intel.exe
Algorithm | Hash digest | |
---|---|---|
SHA256 | b33d355928e04da6c48be447a803ba8a3e3d62e2bea7a9c34ba910f1a70de38b |
|
MD5 | 2ad31a6f80eea64cd39d4303bcc2ed69 |
|
BLAKE2b-256 | 8ff78c59c78ecadc05f45e696b8edb14b3e29ea9f86f6fc286ea840c00be3b15 |