
More Threads! Simpler and faster threading.

Project description

Module threads

The main distinctions between this library and Python’s threading library are:

  1. Multi-threaded queues do not use serialization - Serialization is great in the general case, where you may also be communicating between processes, but it is needless overhead for single-process multi-threading. It is left to the programmer to ensure the messages put on the queue are not changed, which is not an onerous demand.

  2. Shutdown order is deterministic and explicit - Python’s threading library lacks strict conventions for controlled and orderly shutdown. These conventions eliminate the need for interrupt() and abort(), both of which are unstable idioms when resources are involved. Each thread can shut down on its own terms, but is expected to do so expediently.

  • All threads are required to accept a please_stop signal, and are expected to test it in a timely manner and exit when signalled.

  • All threads have a parent - The parent is responsible for ensuring their children get the please_stop signal, and are dead, before stopping themselves. This responsibility is baked into the thread spawning process, so you need not deal with it unless you want to (see the sketch after this list).

  3. Uses Signals to simplify logical dependencies among multiple threads, events, and timeouts.

  4. Logging and profiling are integrated - Logging and exception handling are seamlessly integrated: logs are handled centrally and are thread safe, parent threads have access to uncaught child-thread exceptions, and the cProfile profiler properly aggregates results from the multiple threads.
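
Here is a minimal sketch of that life cycle. The mo_threads import path, the Thread.run() spawning call, and the stop()/join() methods are assumptions on my part, illustrating the conventions described above rather than quoting this project’s docs:

    from mo_threads import Thread, Till   # assumed import path

    def worker(please_stop):
        # every thread accepts a please_stop Signal and is expected to exit promptly
        while not please_stop:
            # DO SOME WORK, then sleep a little, waking early if asked to stop
            (please_stop | Till(seconds=1)).wait()

    thread = Thread.run("my worker", worker)   # assumed spawning call; provides please_stop

    # ... main work ...

    thread.stop()   # parent sends please_stop to the child
    thread.join()   # parent confirms the child is dead before stopping itself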

What’s it used for

A good amount of time is spent waiting for underlying C libraries and OS services to respond to network and file access requests. Multiple threads can make your code faster, despite the GIL, when dealing with those requests. For example, by moving logging off the main thread, we can get up to a 15% increase in overall speed because the main thread no longer waits for disk writes or remote logging posts. Please note, this level of speed improvement can only be realized if there is no serialization happening at the multi-threaded queue.
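
As an illustration of that pattern, here is a minimal sketch of off-thread logging built from the Lock and Thread primitives described below; slow_write() is a hypothetical placeholder for the disk write or remote post, and the Thread.run() call is an assumption:

    from mo_threads import Lock, Thread   # assumed import path

    lock = Lock()
    pending = []    # log lines waiting to be written

    def log_writer(please_stop):
        # drain pending lines off the main thread
        while not please_stop:
            with lock:
                while not pending and not please_stop:
                    lock.wait(seconds=1)
                batch = list(pending)
                del pending[:]
            if batch:
                slow_write(batch)   # hypothetical disk write or remote post

    Thread.run("log writer", log_writer)   # assumed spawning call

    def log(line):
        # called from the main thread; returns without waiting on I/O
        with lock:
            pending.append(line)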

Asynch vs. Actors

My personal belief is that actors are easier to reason about than asynch tasks. Mixing regular methods and co-routines (with their yield from pollution) is dangerous because:

  1. calling styles between methods and co-routines can be easily confused

  2. actors can use blocking methods, co-routines cannot

  3. there is no way to manage resource priority with co-routines

  4. stack traces are lost with co-routines

Synchronization Primitives

There are three major aspects of a synchronization primitive:

  • Resource - Monitors and locks can only be owned by one thread at a time

  • Binary - The primitive has only two states

  • Irreversible - The state of the primitive can only be set, or advanced, never reversed

The last, irreversibility, is very useful, but ignored in many threading libraries. Irreversibility allows us to model progression: threads can poll for progress, or be notified of progress.
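
For example, the Signal class described below is irreversible; a thread may poll its state without blocking, or ask to be notified when it advances. A minimal sketch using the calls shown later in this document:

    from mo_threads import Signal   # assumed import path

    is_done = Signal()

    # poll: a non-blocking check of the current state
    if is_done:
        print("already done")

    # be notified: register a callback that runs once upon go(),
    # or block with is_done.wait_for_go()
    is_done.on_go(lambda: print("done"))

    is_done.go()   # advance the state; it can never return to False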

These three aspects can be combined to give us 8 synchronization primitives:

  R B I   Primitive
  - - -   Semaphore
  - B -   Binary Semaphore
  R - -   Monitor
  R B -   Lock
  - - I   Progress (not implemented)
  - B I   Signal
  R - I   ?limited usefulness?
  R B I   ?limited usefulness?

The Lock

Locks are the same as threading monitors, with two differences:

  1. The wait() method will always acquire the lock before returning. This is an important feature: knowing that every line in the locked block runs while holding the lock makes the code easier to reason about.

  2. Exiting a lock via __exit__() will always signal any waiting thread to resume immediately. This ensures no signals are missed, and every thread gets an opportunity to react to possible change.

    from mo_threads import Lock   # assumed import path

    lock = Lock()
    todo = []

    # please_stop is the Signal given to this thread by its spawner
    while not please_stop:
        with lock:
            while not todo:
                # wait() releases the lock, then re-acquires it before returning
                lock.wait(seconds=1)
            # DO SOME WORK

In this example, we look for items on the todo list, and if there are none, we wait for a second. During that time, other threads can acquire the lock and add todo items. Upon releasing the lock, our example code immediately resumes to see what’s available, waiting again if nothing is found.
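
The producer side of this exchange is symmetric: add work while holding the same lock, and the waiting consumer resumes on exit. A minimal sketch (new_work is a placeholder):

    with lock:
        todo.append(new_work)
    # leaving the "with" block signals the waiting thread, which re-acquires
    # the lock and finds the new item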

The Signal and Till Classes

The Signal class (https://github.com/klahnakoski/pyLibrary/blob/dev/pyLibrary/thread/signal.py) is like a binary semaphore that can be signalled only once. It can be signalled by any thread. Subsequent signals have no effect. Any thread can wait on a Signal; and once signalled, all waits are unblocked, including all subsequent waits. Its current state can be accessed by any thread without blocking. Signal is used to model thread-safe state advancement. It initializes to False, and when signalled (with go()) becomes True. It cannot be reversed.

    is_done = Signal()
    yield is_done
    # DO WORK
    is_done.go()

You can attach methods to a Signal; they will be run, just once, upon go():

    is_done = Signal()
    is_done.on_go(lambda: print("done"))
    return is_done

You may also wait on a Signal, which will block the current thread until the Signal is a go:

    is_done = worker_thread.stopped
    is_done.wait_for_go()
    print("worker thread is done")

The Till class (https://github.com/klahnakoski/pyLibrary/blob/dev/pyLibrary/thread/till.py) is used to represent timeouts. It can serve as a sleep() replacement:

    Till(seconds=20).wait()
    Till(till=Date("21 Jan 2016").unix).wait()

Because Signals are first class, they can be passed around and combined with other Signals. For example, using logical or (|):

    def worker(please_stop):
        while not please_stop:
            # DO WORK
            pass

    user_cancel = get_user_cancel_signal()
    worker(user_cancel | Till(seconds=360))

Signals can also be combined using logical and (&):

    (workerA.stopped & workerB.stopped).wait()
    print("both threads are done")

