# MRQ (Mongo Redis Queue)

A distributed worker task queue in Python
/!\ MRQ is not yet ready for public use. Soon!
MRQ was first developed at [Pricing Assistant](http://pricingassistant.com) and its initial feature set matches the needs of worker queues with heterogeneous jobs (IO-bound & CPU-bound, lots of small tasks & a few large ones).
The main features of MRQ are:
- Simple code: We originally switched from Celery to RQ because Celery’s code was incredibly complex and obscure ([Slides](http://www.slideshare.net/sylvinus/why-and-how-pricing-assistant-migrated-from-celery-to-rq-parispy-2)). MRQ should be as easy to understand as RQ and even easier to extend.
- Great dashboard: Have visibility and control over everything: queued jobs, current jobs, worker status, …
- Per-job logs: Get the log output of each task separately in the dashboard
- Gevent worker: IO-bound tasks can be done in parallel in the same UNIX process for maximum throughput
- Supervisord integration: CPU-bound tasks can be split across several UNIX processes with a single command-line flag
- Job management: You can retry, requeue or cancel jobs from the code or the dashboard
- Performance: Bulk job queueing, easy job profiling
- Easy configuration: Every aspect of MRQ is configurable through command-line flags or a configuration file
- Job routing: Like Celery, jobs can have default queues, timeout and TTL values
- Thorough testing: Edge cases like worker interrupts, Redis failures, … are tested inside a Docker container
- Builtin scheduler: Schedule tasks by interval or by time of day
- Greenlet tracing: See how much time was spent in each greenlet to debug CPU-intensive jobs
- Integrated memory leak debugger: Track down jobs leaking memory and find the leaks with objgraph
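The core model behind these features can be sketched in a few lines of plain Python. This is not MRQ's actual API (all names below are illustrative): jobs live in a store (MongoDB in MRQ), a queue (Redis in MRQ) holds pending work, and a worker loop pops jobs and runs tasks:

```python
from collections import deque

class Task:
    """Base class: subclasses implement run(params)."""
    def run(self, params):
        raise NotImplementedError

class Add(Task):
    """A trivial job: add two numbers from its params dict."""
    def run(self, params):
        return params["a"] + params["b"]

QUEUE = deque()   # stand-in for the Redis queue of pending jobs
JOBS = {}         # stand-in for the MongoDB jobs collection

def queue_job(task_cls, params):
    """Persist a job document and push it on the queue."""
    job_id = len(JOBS)
    JOBS[job_id] = {"status": "queued", "result": None}
    QUEUE.append((job_id, task_cls, params))
    return job_id

def work_once():
    """One iteration of a worker loop: pop a job, run it, store the result."""
    job_id, task_cls, params = QUEUE.popleft()
    JOBS[job_id] = {"status": "success", "result": task_cls().run(params)}

job_id = queue_job(Add, {"a": 1, "b": 2})
work_once()
print(JOBS[job_id])  # {'status': 'success', 'result': 3}
```

In MRQ the job document also carries the data needed for retries, requeues and the dashboard views; this sketch keeps only the queue/store/worker split.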
To measure the overhead of MRQ, we queue very simple jobs that just store their result: on a MacBook Pro we see 1300 jobs/second in a single worker process, though at that rate what we are really measuring is MongoDB’s write performance.
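The same methodology works on any machine: run a batch of no-op jobs and divide by the elapsed time. A minimal sketch in plain Python (no MRQ involved, so this only times the call loop itself; absolute numbers will differ):

```python
import time

def noop_job():
    return True  # the simplest possible job, so timing reflects pure overhead

def measure_throughput(n_jobs=100_000):
    """Run n_jobs no-op jobs and return the observed jobs/second."""
    start = time.time()
    for _ in range(n_jobs):
        noop_job()
    elapsed = time.time() - start
    return n_jobs / elapsed

rate = measure_throughput()
print("%.0f jobs/second" % rate)
```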
## Hunting memory leaks
Memory leaks can be a big issue with gevent workers because several tasks share the same Python process.
Thankfully, MRQ provides tools to track down such issues. Memory usage of each worker is graphed in the dashboard, making it easy to see if memory leaks are happening.
When a worker has a steadily growing memory usage, here are the steps to find the leak:
- Check which jobs are running on this worker and try to isolate which of them is leaking and on which queue
- Start a dedicated worker with `--trace_memory --gevent 1` on the same queue: this starts a worker doing one job at a time with memory profiling enabled. After each job you should see a report of leaked object types.
- Find the most distinctive type in the list (usually not `list` or `dict`) and restart the worker with `--trace_memory --gevent 1 --trace_memory_type=XXX --trace_memory_output_dir=memdbg` (after creating the memdbg directory).
- There you will find a graph for each task, generated by [objgraph](https://mg.pov.lt/objgraph/), which is incredibly helpful for tracking down the leak.
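The type report that memory tracing produces can be approximated with the standard `gc` module. This is a simplified, hypothetical version of the idea (MRQ itself relies on objgraph): count live objects by type before and after a job, then diff the counters:

```python
import gc
from collections import Counter

def count_types():
    """Count live objects by type, like a memory-tracing worker does after each job."""
    gc.collect()
    return Counter(type(o).__name__ for o in gc.get_objects())

class Leaky:
    pass

_cache = []  # simulates a module-level cache that keeps growing

def leaky_job():
    _cache.append(Leaky())  # the object survives the job: a leak

before = count_types()
leaky_job()
after = count_types()

# The diff points at the leaking type; a name like 'Leaky' is far more
# telling than generic types such as 'list' or 'dict'.
diff = after - before
print(diff.most_common(3))
```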
## Tests

Testing is done inside a Docker container for maximum repeatability. We don’t use Travis-CI or friends because we need to be able to kill our process dependencies (MongoDB, Redis, …) on demand.
Therefore you need to [install Docker](https://www.docker.io/gettingstarted/#h_installation) to run the tests. If you’re not on an OS that natively supports Docker, don’t forget to start your VM and SSH into it.
```
$ make test
```
You can also open a shell inside the Docker container (just like you would enter a virtualenv) with:

```
$ make docker  # only needed if the image wasn't built before
$ make ssh
```
## Use in your application
- Install MRQ in your application's virtualenv: `pip install mrq==0.0.6`
- Then you can run `mrq-worker` and `mrq-dashboard`
- To run a task you can use `mrq-run`. If you add the `--async` option, the task will be enqueued to be run later by a worker
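Conceptually, the `--async` flag only changes *when* the task runs. A minimal sketch of the distinction (illustrative names, not MRQ's API):

```python
from collections import deque

PENDING = deque()  # stand-in for the Redis queue a worker would consume

def run_task(func, params, use_async=False):
    """Like mrq-run: either execute now, or (with --async) enqueue for a worker."""
    if use_async:
        PENDING.append((func, params))
        return None  # a worker will pick it up later
    return func(params)

double = lambda params: params["x"] * 2

result = run_task(double, {"x": 21})        # synchronous: runs immediately
run_task(double, {"x": 5}, use_async=True)  # async: just enqueued

print(result, len(PENDING))  # 42 1
```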
Earlier in its development, MRQ was tested successfully on PyPy, but performance was worse than on CPython, so we are waiting for better PyPy+gevent support before continuing that work.
## TODO
- Max Retries
- MongoDB/Redis disconnect tests in more contexts (long-running queries, …)
- Full linting
- Code coverage
- Public docs
- Task progress
- uniquestarted/uniquequeued via bulk sets?
- Base cleaning/retry tasks: move
- Current greenlet traces in dashboard
- Move monitoring in a thread to protect against CPU-intensive tasks
- Bulk queues
- Full PyPy support
- Search in dashboard
## Credits

JS libraries used in the Dashboard:

… as well as all the Python modules in requirements.txt!