Skip to main content

An open-source tool to pull and cache data from RapidPro servers.

Project description

rapidpro-pull is an open-source command-line tool for pulling data from RapidPro servers, printing it in the JSON format and caching it in local or remote relational databases (any database supported by SQLAlchemy - e.g. PostgreSQL, Oracle, SQLite). It has been developed for UNICEF to create a foundation for a new ETL subsystem for IMAMD but it is a standalone tool which can be used independently.

Intallation and usage

To install and get started:

$ pip install rapidpro-pull
$ rapidpro-pull --help


rapidpro-pull --flow-runs --api-token=<api-token> [--address=<address>]
                          [--before=<before> --after=<after>]
                          [--with-contacts --with-flows]

rapidpro-pull --flows --api-token=<api-token> [--address=<address>]
                          [--before=<before> --after=<after>]
                          [--uuid=<uuid> ...]

rapidpro-pull --contacts --api-token=<api-token> [--address=<address>]
                          [--before=<before> --after=<after>]
                          [--uuid=<uuid> ...]

rapidpro-pull --help

--flow-runs                        download flow runs
--flows                            download flows
--contacts                         download contacts
-a, --address=<address>            a RapidPro server [default:]
-t, --api-token=<api-token>        a RapidPro API token

-h, --help                         display this help and exit

--before=<before>                  download all older than ISO 8601 date/time
--after=<after>                    download all newer than ISO 8601 date/time

--uuid=<uuid>                      fetch objects matching UUID(s) (repeatable)

--with-flows                       download associated flows, too
--with-contacts                    download associated contacts, too

--cache=<database-url>             use database-url as cache (store retrieved
                                   objects in cache; retrieve objects from
                                   cache instead of downloading from RapidPro
                                   when possible)


rapidpro-pull –api-token=a-token -flow-runs >all-flow-runs.json
Use a RapidPro API token a-token to download all flow runs and save them into a JSON file all-flow-runs.json.
rapidpro-pull -t a-token –address –flow-runs
Use a RapidPro API token a-token to download all flow runs from an alternative RapidPro service over HTTPS and print them in the JSON format.
rapidpro-pull -t a-token –flow-runs –with-flows –cache=sqlite:////tmp/rp.db
Use token a-token to download all flow runs and their associated flows. Do not download flows already cached in the provided SQLite database in file /tmp/rp.db and do not overwrite cached flow runs. Add all newly downloaded objects to the database for later processing.
rapidpro-pull –flow-runs –with-flows –with-contacts –api-token=a-token
Use a RapidPro API token a-token to download all flow runs together with all associated flows and contacts
rapidpro-pull -t a-token –flows –after 2016-01-01T12:12:12.596000Z
Use token a-token to download all flows newer than 2016-01-01T12:12:12.596Z.
rapidpro-pull -t a-token –contacts –uuid=a –uuid=b –uuid=c
Use token a-token to download contacts with UUIDs a, b or c.


Working on rapidpro-pull requires the installation of a small number of development dependencies (in addition to the dependencies required to just run the program). These dependencies are listed in tests_require in the file but one does not need to install them by hand unless one chooses to invoke the project test runners manually (see: alternative ways to run tests). In order to get started one may want to do the following:

$ # Create a virtualenv and activate it, e.g.: mkvirtualenv rapidpro-pull
$ git clone
$ cd rapidpro-pull
$ pip install --editable .

To use the alternative ways of running tests one needs to explicitly install the aforementioned additional dependencies (this step is optional and not required to run tests):

$ pip install rapidpro-pull[development]

The project has been developed using the BDD / outside-in TDD approach and there are two separate groups of tests: features and scenarios describing the high-level/system behaviour using the Gherkin syntax (and, underneath, Python), and the low-level unit tests (the author is not a mockist but a classicist which means that mocking and stubbing is used where it seems to make sense instead of everywhere ;) ). The provided unit tests ensure 100% code coverage (statement + branch). Apart from the coverage reports printed after each execution of unit tests, one can view the latest HTML report stored in the htmlcov directory.

The functional tests (features/scenarios) are found in the features/ directory. To execute them:

$ python behave_test  #  please use -b to pass arguments to behave
$ behave  #  an alternative way of running tests, please see: behave --help

The unit tests are found in the tests/ directory. To execute them:

$ python pytest  #  please use -p to pass arguments to py.test
$ python test  #  an alias for pytest
$ py.test  #  an alternative way of running tests, please see: py.test -h


Please feel free to use this project issue tracker where appropriate, fork this repository and generate pull requests. The author can also be contacted via e-mail: Tomasz J. Kotarba <>.

Special Thanks

Special thanks to Robert Johnston (a crusading saint of UNICEF, always ready to fight dragons to save those in need) without whom this project would never be.

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for rapidpro-pull, version 1.0.1
Filename, size File type Python version Upload date Hashes
Filename, size rapidpro_pull-1.0.1-py2.py3-none-any.whl (13.4 kB) File type Wheel Python version py2.py3 Upload date Hashes View
Filename, size rapidpro-pull-1.0.1.tar.gz (9.2 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page