Ad-hoc Test EXecutor
Project description
ATEX = Ad-hoc Test EXecutor
A collection of Python APIs to provision operating systems, collect and execute FMF-style tests, gather and organize their results, and generate reports from those results.
The name comes from an approach (fairly unique in the FMF/TMT ecosystem) that
allows provisioning a pool of systems and scheduling tests on them as one would
on an ad-hoc pool of thread/process workers - once a worker becomes free,
it receives a test to run.
This is in contrast to statically splitting a large list of N tests across
M workers (N/M tests each), which incurs significant time penalties because
tests have widely varying runtimes.
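The difference between the two scheduling strategies can be illustrated with a small simulation. This is only a sketch, not ATEX code: the function names and the runtimes are made up for illustration, and real tests would of course run on real machines rather than being summed up.

```python
import heapq

def static_split_makespan(runtimes, workers):
    """Total wall time when the list is cut into `workers` contiguous chunks."""
    chunk = -(-len(runtimes) // workers)  # ceil(N / M)
    return max(sum(runtimes[i:i + chunk]) for i in range(0, len(runtimes), chunk))

def adhoc_makespan(runtimes, workers):
    """Total wall time when each free worker takes the next test in the list."""
    # each heap entry is the time at which one worker becomes free
    free_at = [0] * workers
    heapq.heapify(free_at)
    for runtime in runtimes:
        earliest = heapq.heappop(free_at)      # the worker that frees up first
        heapq.heappush(free_at, earliest + runtime)
    return max(free_at)

# hypothetical runtimes in minutes: one slow test, seven fast ones
runtimes = [60, 5, 5, 5, 5, 5, 5, 5]
static = static_split_makespan(runtimes, workers=2)
adhoc = adhoc_makespan(runtimes, workers=2)
print(static, adhoc)  # 75 60
```

With the static split, the worker that got the slow test also has to run three fast ones afterwards, while the other worker sits idle. With the ad-hoc pool, the second worker picks up all the fast tests while the first is busy.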
Above all, this project is meant to be a toolbox, not a silver-platter solution.
Use its Python APIs to build a CLI tool for your specific use case.
The CLI tool provided here is just for demonstration / testing, not for serious
use - we want to avoid huge modular CLIs for Every Possible Scenario. That's
the job of the Python API. Any CLI should be simple by nature.
THIS PROJECT IS HEAVILY WIP, THINGS WILL MOVE AROUND, CHANGE AND OTHERWISE BREAK. DO NOT USE IT (for now).
License
Unless specified otherwise, any content within this repository is distributed under the GNU GPLv3 license; see the COPYING.txt file for details.
Parallelism and cleanup
There are effectively 3 methods of running things in parallel in Python:
- threading.Thread (and related concurrent.futures classes)
- multiprocessing.Process (and related concurrent.futures classes)
- asyncio
and there is no clear winner (in terms of cleanup on SIGTERM or Ctrl-C):
- Thread has signal handlers only in the main thread and is unable to interrupt any running threads without super ugly workarounds like sleep(1) in every thread, checking some "pls exit" variable
- Process is too heavyweight and makes sharing native Python objects hard, but it does handle signals in each process individually
- asyncio handles interrupting perfectly (every try/except/finally completes just fine, KeyboardInterrupt is raised in every async context), but async Python is still (as of 3.14) too weird and unsupported
  - asyncio effectively re-implements subprocess with a slightly different API, same with asyncio.Transport and derivatives reimplementing socket
  - 3rd party libraries like requests or urllib3 don't support it, one needs to resort to spawning these in separate threads anyway
  - same with os.* functions and syscalls
  - everything exposed via API needs to have 2 copies - async and non-async, making it unbearable
  - other stdlib bugs, e.g. "large" reads returning BlockingIOError sometimes
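For reference, the "pls exit" workaround mentioned for threading.Thread could look roughly like the following. It is a minimal sketch using threading.Event as the shared flag; the names worker, stop and ticks are illustrative and not part of ATEX.

```python
import threading

def worker(stop, ticks):
    # Event.wait() doubles as an interruptible sleep: it returns True as soon
    # as the flag is set, or False once the timeout expires
    while not stop.wait(timeout=0.01):
        ticks.append(1)  # stand-in for one small unit of real work

stop = threading.Event()
ticks = []
thread = threading.Thread(target=worker, args=(stop, ticks), daemon=True)
thread.start()

# the main thread (e.g. from a signal handler) asks the worker to exit
stop.set()
thread.join(timeout=2)
```

The drawback is visible even in this tiny example: every blocking operation inside the thread has to be chopped into short pieces that re-check the flag, which is not possible for calls that block for a long time on their own.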
The approach chosen by this project was to use threading.Thread, and
implement thread safety for classes and their functions that need it.
For example:
    import subprocess
    import threading
    import time

    class MachineReserver:
        def __init__(self):
            # RLock guards self.job / self.proc against a concurrent .abort()
            self.lock = threading.RLock()
            self.job = None
            self.proc = None

        def reserve(self, ...):
            try:
                ...
                job = schedule_new_job_on_external_service()
                with self.lock:
                    self.job = job
                ...
                # poll without holding the lock, so .abort() is never blocked
                while not reserved(self.job):
                    time.sleep(60)
                ...
                with self.lock:
                    self.proc = subprocess.Popen(["ssh", f"{user}@{host}", ...])
                ...
                return machine
            except Exception:
                self.abort()
                raise

        def abort(self):
            # callable from any thread; repeated calls are no-ops
            with self.lock:
                if self.job:
                    cancel_external_service(self.job)
                    self.job = None
                if self.proc:
                    self.proc.kill()
                    self.proc = None
Here, .reserve() is expected to be called in a long-running thread that
provisions a new machine on some external service, waits for it to be installed
and reserved, opens an ssh session to it and returns the machine.
Equally, .abort() can be called from another thread to clean up any
non-Python resources (external jobs, processes, temporary files, etc.).
At that point we don't care what happens to .reserve(): it will probably fail
with some exception, but that does no harm.
Here is where daemon=True threads come in handy - we can simply call .abort()
from a KeyboardInterrupt (or SIGTERM) handler in the main thread, and just
exit, automatically killing any leftover threads that are uselessly sleeping.
(Realistically, we might want to spawn new threads to run many .abort()s in
parallel, but the main thread can wait for those just fine.)
It is not perfect, but it's probably the best Python can do.
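A rough sketch of that shutdown path follows. FakeReserver, abort_all and run are hypothetical names used only for illustration, and the Ctrl-C is simulated by a function that raises KeyboardInterrupt, so the example finishes immediately.

```python
import threading
import time

class FakeReserver:
    """Hypothetical stand-in for MachineReserver; talks to no real service."""
    def __init__(self):
        self.lock = threading.RLock()
        self.aborted = False

    def reserve(self):
        time.sleep(10)  # pretend to wait for a machine to be provisioned

    def abort(self):
        with self.lock:
            self.aborted = True  # real code would cancel jobs, kill processes

def abort_all(reservers):
    # run every .abort() in its own thread; the main thread waits for them
    threads = [threading.Thread(target=r.abort) for r in reservers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def run(reservers, wait):
    for r in reservers:
        threading.Thread(target=r.reserve, daemon=True).start()
    try:
        wait()  # the main thread blocks here until done or interrupted
    except KeyboardInterrupt:
        abort_all(reservers)
    # returning lets the interpreter exit; daemon threads that are still
    # sleeping in .reserve() are simply dropped

def fake_ctrl_c():
    raise KeyboardInterrupt  # simulates the user pressing Ctrl-C

reservers = [FakeReserver() for _ in range(3)]
run(reservers, fake_ctrl_c)
```

Note that SIGTERM does not raise KeyboardInterrupt by default; to take the same path, the main thread would need to install a handler with signal.signal() that raises an exception or calls abort_all() itself.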
Note that races can still occur between a resource being reserved and written
to self.* for .abort() to free, so resource de-allocation is not 100%
guaranteed, but single-threaded interrupting has the same issue.
Do have fallbacks (e.g. max reserve times on the external service).
Also note that .reserve() and .abort() could also be called by a context
manager as __enter__ and __exit__, i.e. by a non-threaded caller (running
everything in the main thread).
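One possible reading of the context manager idea is a thin wrapper around the reserver. The sketch below assumes a hypothetical Reservation class and a FakeReserver that only records calls; neither is taken from ATEX.

```python
class FakeReserver:
    """Hypothetical stand-in that only records which methods were called."""
    def __init__(self):
        self.calls = []

    def reserve(self):
        self.calls.append("reserve")
        return "machine"

    def abort(self):
        self.calls.append("abort")

class Reservation:
    """Maps .reserve() / .abort() onto the with-statement protocol."""
    def __init__(self, reserver):
        self.reserver = reserver

    def __enter__(self):
        return self.reserver.reserve()

    def __exit__(self, exc_type, exc, tb):
        self.reserver.abort()
        return False  # never swallow exceptions raised in the with-body

reserver = FakeReserver()
with Reservation(reserver) as machine:
    used = machine  # run tests on the machine here

# cleanup also happens when the with-body raises
failing = FakeReserver()
try:
    with Reservation(failing):
        raise RuntimeError("test blew up")
except RuntimeError:
    pass
```

Because .abort() is safe to call whether or not the reservation completed, the same method can serve both as the cross-thread cancel and as the ordinary single-threaded cleanup.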
Unsorted notes
TODO: codestyle from contest
- this is not tmt, the goal is to make a python toolbox *for* making runcontest
style tools easily, not to replace those tools with tmt-style CLI syntax
- the whole point is to make usecase-targeted easy-to-use tools that don't
intimidate users with 1 KB long command line, and runcontest is a nice example
- TL;DR - use a modular pythonic approach, not a modular CLI like tmt
- Orchestrator with
  - add_provisioner(<class>, max_workers=1) # will instantiate <class> at most max_workers at a time
  - algo
    - for all provisioner classes, spawns classes*max_workers as new Threads
    - waits for any .reserve() to return
    - creates a new Thread for minitmt, gives it p.get_ssh() details
      - minitmt will
        - establish a SSHConn
        - install test deps, copy test repo over, prepare socket dir on SUT, etc.
        - run the test in the background as
          f=os.open('some/test/log', os.O_WRONLY); subprocess.Popen(..., stdout=f, stderr=f, stdin=subprocess.DEVNULL)
        - read/process Unix sock results in the foreground, non-blocking,
          probably calling some Orchestrator-provided function to store results persistently
        - regularly check Popen proc status, re-accept UNIX sock connection, etc., etc.
      - minitmt also has some Thread-independent way to .cancel(), killing the proc, closing SSHConn, etc.
    - while waiting for minitmt Threads to finish, to re-assign existing Provisioner instances
      to new minitmt Threads, .. Orchestrator uses some logic to select which TestRun
      would be ideal to run next
      - TestRun probably has some "fitness" function that returns some priority number
        when given a Provisioner instance (?) ...
      - something from minitmt would also have access to the Provisioner instance
        - the idea is to allow some logic to set "hey I set up nested VM snapshot on this thing"
          on the Provisioner instance, and if another /hardening/oscap TestRun finds
          a Provisioner instance like that, it would return high priority
      - ...
      - similar to the "fitness"-like function, we need some "applicability" function
        - if TestRun is restricted to RHEL-9 && x86_64, we need it to return True
          for a Provisioner instance that provides RHEL-9 and x86_64, but False otherwise
  - basically Orchestrator has
    - .add_provisioner()
    - .run_test() # called with an exclusively-borrowed Provisioner instance
      - if Provisioner is_alive()==False after .run_test(), instantiate a new one from the same inst.__class__
      - if test failed and reruns > 0, try run_test() again (or maybe re-queue the test)
    - .output_result() # called by run_test() to persistently log a test result
    - .applicable() # return True if a passed TestRun is meant for a passed Platform (Provisioner?)
      - if no TestRun returns True, the Provisioner is .release()d because we don't need it anymore
    - .fitness() # return -inf / 0 / +inf with how much should a passed TestRun run on a Provisioner
    - MAYBE combine applicable() and fitness() into one function, next_test() ?
      - given the free Provisioner and a list of TestRuns, select which should run next on the Provisioner
      - if none is chosen, .release() the Provisioner without replacement, continue
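The applicable() / fitness() / next_test() idea from these notes could be prototyped along the following lines. Everything here is an assumption made for illustration: the dataclass fields, the tag mechanism and the test names other than /hardening/oscap are invented, and the notes leave open where these functions would actually live.

```python
from dataclasses import dataclass, field

@dataclass
class Provisioner:
    distro: str
    arch: str
    tags: set = field(default_factory=set)  # e.g. state left behind by a test

@dataclass
class TestRun:
    name: str
    distro: str
    arch: str
    wants_tag: str = ""  # hypothetical: machine state this test benefits from

    def applicable(self, prov):
        # only run on a machine that provides the required distro and arch
        return (self.distro, self.arch) == (prov.distro, prov.arch)

    def fitness(self, prov):
        # prefer machines that already carry the state this test wants
        return 1 if self.wants_tag in prov.tags else 0

def next_test(prov, queue):
    """Pick and remove the best TestRun for a free Provisioner, or return None."""
    candidates = [t for t in queue if t.applicable(prov)]
    if not candidates:
        return None  # the caller would .release() the Provisioner here
    best = max(candidates, key=lambda t: t.fitness(prov))
    queue.remove(best)
    return best

prov = Provisioner("RHEL-9", "x86_64", {"nested-vm-snapshot"})
queue = [
    TestRun("/static-checks", "RHEL-9", "x86_64"),
    TestRun("/hardening/oscap", "RHEL-9", "x86_64", "nested-vm-snapshot"),
    TestRun("/other", "RHEL-8", "x86_64"),
]
first = next_test(prov, queue)
second = next_test(prov, queue)
third = next_test(prov, queue)
```

Here the machine with the nested VM snapshot gets the oscap test first, then the remaining applicable test, and finally None, because the only test left targets a different distro.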