Skip to main content

A webapp to query fedmsg history

Project description

datagrepper
===========

A webapp to retrieve historical information about messages on the `fedmsg
<http://fedmsg.com>`_ bus. It is a JSON api for the `datanommer
<https://github.com/fedora-infra/datanommer/>`_ message store.

Production Instance
-------------------

https://apps.fedoraproject.org/datagrepper/

Hacking on datagrepper
----------------------

Prerequisites
~~~~~~~~~~~~~
* virtualenvwrapper
* Postgresql

Install postgresql and virtualenvwrapper::

$ sudo yum install -y postgresql-server python-virtualenvwrapper

Setting up the stack
~~~~~~~~~~~~~~~~~~~~

Use a virtualenv::

$ mkvirtualenv datagrepper
$ workon datagrepper

Install dependencies::

$ pip install -r requirements.txt
$ pip install psycopg2

Set up the fedmsg consumer for the job runner::

$ python setup.py develop

Configuring Postgresql (and getting some data)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In normal operations, the `datanommer
<https://github.com/fedora-infra/datanommer>`_ consumer daemon will be
running somewhere and continuously stuff each new `fedmsg
<http://fedmsg.com>`_ message that it sees into a postgres DB. If you're
just sitting down to hack on datagrepper, that won't be your situation
so you'll need a dump of the database.

.. note:: If you've tried installing postgres before and think you've
messed it up, you'll need to blow away the old databases with
``$ rm -rf /var/lib/pgsql``

Install postgres (and fedmsg, while we're at it)::

$ sudo yum install -y postgresql-server fedmsg
$ sudo postgresql-setup initdb

Make sure postgres is set to allow connections over tcp/ip using password
authentication. Edit the ``/var/lib/pgsql/data/pg_hba.conf``. You might
find a line like this::

host all all 127.0.0.1/32 ident

Instead of that line, you need one that looks like this::

host all all 127.0.0.1/32 md5

----

Become yourself again (not the ``postgres`` user) and start up postgres::

$ sudo systemctl restart postgresql.service

Become the postgres user (again) and run the psql command. Use that psql
shell to setup the DB, the user, and privileges::

$ sudo su - postgres
$ psql
# create database datanommer;
# create user datanommer with password 'bunbunbun';
# grant all privileges on database datanommer to datanommer;
# \q

Back in the bash shell (but still as the `postgres` user), grab a DB dump and
restore it::

$ wget http://infrastructure.fedoraproject.org/infra/db-dumps/datanommer.dump.xz
$ xzcat datanommer.dump.xz | psql datanommer

Last step, run datagrepper
~~~~~~~~~~~~~~~~~~~~~~~~~~

You have to configure your development datagrepper instance to talk to
postgres (by default, it looks for a sqlite database). Edit
``fedmsg.d/example-datagrepper.py`` and give it these contents:

.. code-block:: python

config = {
'datanommer.enabled': False,
'datanommer.sqlalchemy.url': 'postgresql+psycopg2://datanommer:bunbunbun@localhost:5432/datanommer',
'fedmsg.consumers.datagrepper-runner.enabled': True,
}

As your normal old user self, run the development server::

$ workon datagrepper
$ python runserver.py

In a browser, visit http://localhost:5000 to see the docs.

You can quick test that you can get data by running::

$ sudo yum install -y httpie
$ http get localhost:5000/raw delta==1000000 rows_per_page==1

Running the job runner
~~~~~~~~~~~~~~~~~~~~~~

Without starting ``fedmsg-hub``, the job runner won't actually run jobs::

$ workon datagrepper
$ fedmsg-hub

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datagrepper-0.5.1.tar.gz (114.2 kB view details)

Uploaded Source

File details

Details for the file datagrepper-0.5.1.tar.gz.

File metadata

  • Download URL: datagrepper-0.5.1.tar.gz
  • Upload date:
  • Size: 114.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for datagrepper-0.5.1.tar.gz
Algorithm Hash digest
SHA256 4470ddaa8889148021e70d41c865d0f8d5f3ee4a0ca8424d2b7b9bc3466367cf
MD5 5cb0304d7e78139dded6a1cdd0c4ba3b
BLAKE2b-256 6a2b9672877d8d073a4ed3be4d1befd0dfd682272fca418b3691011b38761190

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page