Mongo Redis Queue
Mongo Redis Queue - A distributed worker task queue in Python
/!\ MRQ is not yet ready for public use. Soon!
MRQ was first developed at http://pricingassistant.com and its initial feature set matches the needs of worker queues with heterogeneous jobs (IO-bound & CPU-bound, lots of small tasks & a few large ones).
The main features of MRQ are:
- Simple code: We originally switched from Celery to RQ because Celery’s code was incredibly complex and obscure ([Slides](http://www.slideshare.net/sylvinus/why-and-how-pricing-assistant-migrated-from-celery-to-rq-parispy-2)). MRQ should be as easy to understand as RQ and even easier to extend.
- Great dashboard: Have visibility and control over everything: queued jobs, current jobs, worker status, …
- Per-job logs: Get the log output of each task separately in the dashboard
- Gevent worker: IO-bound tasks can be done in parallel in the same UNIX process for maximum throughput
- Supervisord integration: CPU-bound tasks can be split across several UNIX processes with a single command-line flag
- Job management: You can retry, requeue, or cancel jobs from the code or the dashboard.
- Performance: Bulk job queueing, easy job profiling
- Easy configuration: Every aspect of MRQ is configurable through command-line flags or a configuration file
- Job routing: Like Celery, jobs can have default queues, timeouts, and TTL values.
- Thorough testing: Edge-cases like worker interrupts, Redis failures, … are tested inside a Docker container.
- Built-in scheduler: Schedule tasks by interval or by time of day
- Greenlet tracing: See how much time was spent in each greenlet to debug CPU-intensive jobs.
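To make the job routing feature above more concrete, here is a minimal sketch of how per-task defaults for queue, timeout, and TTL could be resolved. It is not MRQ's actual API or configuration format: `TaskDefaults`, `TASK_DEFAULTS`, `resolve`, the task paths, and all default values are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaskDefaults:
    """Hypothetical per-task settings; the values are illustrative only."""
    queue: str = "default"
    timeout: int = 3600             # seconds a job is allowed to run
    result_ttl: int = 7 * 24 * 3600  # seconds a stored result is kept


# Hypothetical routing table mapping task paths to their defaults.
TASK_DEFAULTS = {
    "tasks.crawl.FetchPage": TaskDefaults(queue="io", timeout=60),
    "tasks.stats.Recompute": TaskDefaults(queue="cpu", timeout=1800, result_ttl=3600),
}


def resolve(task_path, **overrides):
    """Return the settings for a task: its defaults, with per-job overrides applied."""
    base = TASK_DEFAULTS.get(task_path, TaskDefaults())
    return replace(base, **overrides)


io_job = resolve("tasks.crawl.FetchPage")
urgent_job = resolve("tasks.crawl.FetchPage", timeout=5)
unknown_job = resolve("tasks.misc.Unknown")
```

The idea is that a caller only names the task; the queue, timeout, and TTL come from the table unless a specific job overrides them.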
On a MacBook Pro, we see 1300 jobs/second in a single worker process when running very simple jobs that store their results, a setup meant to measure the overhead of MRQ. However, what we are really measuring there is MongoDB’s write performance.
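The measurement described above can be sketched as a loop that runs trivial jobs, stores each result, and divides the job count by the elapsed time. This is only an illustration of the method, not the benchmark MRQ uses: `measure_throughput` and `trivial_job` are hypothetical names, and an in-memory dict stands in for MongoDB, so the resulting number is not comparable to the figure quoted above.

```python
import time


def measure_throughput(run_job, store_result, n_jobs=10_000):
    """Run n_jobs trivial jobs, store each result, and return jobs/second."""
    start = time.perf_counter()
    for job_id in range(n_jobs):
        store_result(job_id, run_job(job_id))
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return n_jobs / max(elapsed, 1e-9)


def trivial_job(job_id):
    """A job that does almost nothing, so the loop measures overhead."""
    return job_id + 1


# Stand-in result store (assumption): a dict instead of a MongoDB collection.
results = {}
rate = measure_throughput(trivial_job, results.__setitem__, n_jobs=1000)
print(f"{rate:.0f} jobs/second")
```

Because the job itself costs almost nothing, the result is dominated by whatever `store_result` does, which is why the real benchmark ends up reflecting the write speed of the result store.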
Testing is done inside a Docker container for maximum repeatability. We don’t use Travis-CI or friends because we need to be able to kill our process dependencies (MongoDB, Redis, …) on demand.
Therefore you need to [install Docker](https://www.docker.io/gettingstarted/#h_installation) to run the tests. If you’re not on an OS that supports Docker natively, don’t forget to start up your VM and SSH into it.
` $ make test `
You can also open a shell inside the Docker container (just like you would enter a virtualenv) with:
` $ make docker ` (if it wasn't built before)
` $ make ssh `
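The reason given above for running tests in a container is the need to kill process dependencies on demand. A minimal sketch of that idea, using only the standard library, is shown below. It is not taken from MRQ's test suite: `start_dependency` and `kill_dependency` are hypothetical helpers, and a sleeping Python child process stands in for a real MongoDB or Redis server.

```python
import subprocess
import sys


def start_dependency():
    """Start a stand-in for a dependency such as MongoDB or Redis (assumption)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def kill_dependency(proc):
    """Kill the dependency abruptly and return its exit code."""
    proc.kill()
    proc.wait(timeout=10)
    return proc.returncode


proc = start_dependency()
alive_before = proc.poll() is None   # True while the process is still running
returncode = kill_dependency(proc)   # non-zero, since the process was killed
```

A real test would then check how the worker behaves once the dependency is gone, for example that it reconnects or marks the job as failed.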
Use in your application
Add MRQ to your environment
Earlier in its development, MRQ was tested successfully on PyPy, but we are waiting for better PyPy+gevent support to continue working on it, as performance was worse than on CPython.
Useful third-party utils
- Max Retries
- MongoDB/Redis disconnect tests in more contexts (long-running queries, …)
- Full linting
- Code coverage
- Public docs
- Task progress
- ETAs / Lag stats for each queue + graphs
- uniquestarted/uniquequeued via bulk sets?
- Base cleaning/retry tasks: move
- Current greenlet traces in dashboard
- Move monitoring in a thread to protect against CPU-intensive tasks
- Bulk queues
- Full PyPy support
- Search in dashboard
- JS libraries used in the Dashboard:
… as well as all the Python modules in requirements.txt!