gaspar eventlet zmq parallel worker
Gaspar is a library for creating small, simple TCP daemons that parallelize CPU intensive work with a simple and low-overhead request/response pattern.
It does this by forking off a number of worker processes, using eventlet to handle incoming requests and the 0MQ push/pull message pattern to load-balance work across the worker processes and receive responses.
Gaspar uses the terms producer and consumer, respectively, for the process that receives incoming requests and the processes that actually handle those requests. In the 0MQ documentation, these are called ventilator and sink, and various other terms are used throughout distributed systems literature. To use Gaspar, you only need to create a producer and a consumer, and then start the producer:
>>> import gaspar
>>> def echo(message):
...     return message
...
>>> consumer = gaspar.Consumer(handler=echo)
>>> producer = gaspar.Producer(consumer, 10123)
>>> producer.start()
The start call will block. At this point you have a server listening on port 10123 that receives requests, sends them to a number of workers (by default, the number of CPUs on your machine), and replies with whatever the echo handler returns.
Gaspar’s default Producer takes requests on a simple naked TCP port. These requests are:
- a 4-byte unsigned int in network-byte order (struct.pack('!I', N))
- a string of that length
The reply is simply a string, after which the server closes the connection. The convenience function gaspar.client.request("host:port", message) will send a request and return the reply synchronously. It uses the basic socket libraries, so you can “green” it safely with eventlet or gevent’s monkey patching methods.
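The wire format described above can be sketched with nothing but the standard library. This is a minimal illustration and not Gaspar's own client code: the names frame and raw_request are hypothetical, and payloads are assumed to be byte strings.

```python
import socket
import struct


def frame(payload: bytes) -> bytes:
    """Prefix payload with its length as a 4-byte unsigned int in network byte order."""
    return struct.pack("!I", len(payload)) + payload


def raw_request(host: str, port: int, payload: bytes) -> bytes:
    """Send one framed request, then read the reply until the server closes the socket.

    Hypothetical helper: it follows the protocol as described in the text,
    not the actual implementation of gaspar.client.request.
    """
    with socket.create_connection((host, port)) as sock:
        sock.sendall(frame(payload))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the peer closed the connection, reply is complete
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Against the echo daemon from the first example, raw_request("localhost", 10123, b"hello") would be expected to return b"hello". Because the reply carries no length prefix, the client has to read until end-of-stream rather than stopping after a fixed number of bytes.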
gaspar.client also provides a function called pack which takes a string and returns a new string with the 4-byte message length prepended. If you are using a gaspar daemon with async frameworks that are not greenlet based, you can use this to cover that aspect of the client protocol.
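As an illustration of a client that is not greenlet based, here is a sketch using asyncio. It builds the length prefix directly with struct so that it runs without Gaspar installed; in real code that is the step gaspar.client.pack would cover. The names async_request, _stand_in_echo and demo are hypothetical, and the stand-in server only mimics the protocol described above so the example can run on its own.

```python
import asyncio
import struct


async def async_request(host: str, port: int, payload: bytes) -> bytes:
    """Send one length-prefixed request and read the reply until EOF."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        writer.write(struct.pack("!I", len(payload)) + payload)
        await writer.drain()
        return await reader.read()  # read() with no size reads until the peer closes
    finally:
        writer.close()
        await writer.wait_closed()


async def _stand_in_echo(reader, writer):
    """Stand-in for a Gaspar daemon (assumption): read one framed request, echo it, close."""
    (length,) = struct.unpack("!I", await reader.readexactly(4))
    body = await reader.readexactly(length)
    writer.write(body)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def demo() -> bytes:
    """Start the stand-in server on a free local port and make one request."""
    server = await asyncio.start_server(_stand_in_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await async_request("127.0.0.1", port, b"ping")
```

Running asyncio.run(demo()) exercises the full round trip against the stand-in server; pointing async_request at a real daemon only requires changing the host and port.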
Gaspar requests and responses are just strings. There is no standard way to serialize multiple arguments or return multiple values. Because the wrong calling semantics could defeat the purpose of farming work out to such a daemon, these details are left to the Consumer implementation and to client-side postprocessing of responses.
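One possible convention, shown here only as a sketch, is to encode calls and results as JSON. Nothing in it is part of Gaspar: json_handler, encode_call, decode_reply and power are hypothetical names, and JSON is just one of many serialization choices.

```python
import json


def json_handler(func):
    """Wrap func so it takes a JSON-encoded call and returns a JSON-encoded result."""
    def handler(message):
        call = json.loads(message)
        result = func(*call.get("args", []), **call.get("kwargs", {}))
        return json.dumps({"result": result})
    return handler


def encode_call(*args, **kwargs):
    """Client side: serialize positional and keyword arguments into one string."""
    return json.dumps({"args": args, "kwargs": kwargs})


def decode_reply(reply):
    """Client side: extract the return value from a JSON reply."""
    return json.loads(reply)["result"]


@json_handler
def power(base, exponent=2):
    # Hypothetical stand-in for CPU-intensive work
    return base ** exponent
```

The wrapped function could then be passed as the handler, as in gaspar.Consumer(handler=power), with the client sending encode_call(3, exponent=4) and applying decode_reply to the response. A compact binary format may suit large payloads better, since encoding and decoding cost is paid on every request.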
Although the technologies in use (TCP and 0MQ) would allow for daemons to be spread across systems, this wasn’t an original design goal of Gaspar and it is not currently supported.
why shouldn’t I use celery?
The major “advantages” of Gaspar over Celery are its small size, conceptual simplicity, and infrastructureless operation. The purpose of Gaspar was to make it very easy to remove CPU-bound work from a tight event-based I/O loop (like eventlet, gevent, tornado, et al.), turn it into I/O wait, and spread that work across multiple cores.
Celery serves a much broader range of purposes, is a lot more sophisticated, and has features like delayed and recurrent execution that Gaspar lacks. If you have a number of assorted tasks you need to execute asynchronously, Celery is very good at this. If you have an asynchronous worker that has a few very CPU-intensive tasks that are blocking the event loop, Gaspar allows you to farm that work out to daemons with very little code.