A Python tester for the 42 push_swap project with controlled disorder generation and performance grading.

ft_ps_tester

A Python-based tester for the 42 push_swap project. It generates controlled random sequences with specific disorder levels, runs your push_swap executable, validates the output, and grades performance against 42 thresholds.

Features

Controlled disorder generation — creates sequences with precise inversion percentages.
Four test modes — simple, medium, complex, and adaptive.
Output validation — simulates operations to verify sorting correctness.
Performance grading — compares operation counts against 42 thresholds (excellent / good / pass / fail).
Failure report — concise summary of timeouts, invalid operations, and limit exceedances.
Big-O complexity analysis (--big-o) — measures algorithmic scaling across progressively larger inputs with per-mode expectations.
Basic & edge-case tests (--basic) — small-N op counts, reversed/sorted/nearly-sorted inputs, error handling, split args, mode specialization, default-flag, benchmark mode, and memory/crash checks.
Configurable number range — values span INT_MIN..INT_MAX by default (--1m, --u1m narrow it); per-call timeouts scale with input size.
Bonus checker test (--bonus) — validates the bonus checker program: error management, OK/KO verdicts (subject examples + generated), and memory/crash checks.

Requirements

Python 3
A compiled push_swap executable that:
- prints operations to fd1 (stdout), and exactly Error\n to fd2 (stderr) on invalid input;
- handles the full 32-bit signed integer range (INT_MIN..INT_MAX).
Optional: valgrind (Linux) / leaks (macOS) for the memory test; a --bench mode in your push_swap for the benchmark checks; a compiled checker next to push_swap for --bonus.

Installation

Tip: If you run into installation errors, try updating pip first:
pip install --upgrade pip
# or
pip3 install --upgrade pip

Option 1: Install from PyPI (recommended)

Using pip:

pip install ft_ps_tester

Using pip3:

pip3 install ft_ps_tester

User-local install (no sudo required — pip):

pip install --user ft_ps_tester

User-local install (no sudo required — pip3):

pip3 install --user ft_ps_tester

Using python3 -m pip:

python3 -m pip install ft_ps_tester

Note: When using --user, the ft_ps_tester command is installed to a user-local bin/ directory (e.g. ~/.local/bin on Linux/macOS, or %APPDATA%\Python\Python3x\Scripts on Windows; python3 -m site --user-base prints the base directory on your system). Make sure this directory is on your PATH, or use the python3 -m execution methods shown below.

Option 2: Install from source

Clone this repository:

git clone https://github.com/italoalmeida0/ft_ps_tester.git
cd ft_ps_tester

Editable / development mode (pip):

pip install -e .

Editable / development mode (pip3):

pip3 install -e .

Editable / development mode (python3 -m pip):

python3 -m pip install -e .

Normal install (pip):

pip install .

Normal install (pip3):

pip3 install .

Normal install (python3 -m pip):

python3 -m pip install .

User-local install from source (pip --user):

pip install --user -e .

User-local install from source (pip3 --user):

pip3 install --user -e .

Option 3: Run with `pipx` (isolated, no install required)

If you have pipx installed, you can run the tester directly without permanently installing it:

pipx run ft_ps_tester ./push_swap

Single test run:

pipx run ft_ps_tester ./push_swap 500 complex

Or install it into an isolated environment:

pipx install ft_ps_tester

Then run normally:

ft_ps_tester ./push_swap

Whichever option you use, make sure your push_swap binary is compiled and executable:

make
chmod +x push_swap

Usage

Full test suite (recommended)

Runs the --basic edge-case suite first, then tests 100 and 500 elements across all four modes (100 tests each).

If the ft_ps_tester command is on your PATH:

ft_ps_tester ./push_swap

If the command is not found (common with --user installs), run via the module:

python3 -m ft_ps_tester ./push_swap

Or using the .cli submodule directly:

python3 -m ft_ps_tester.cli ./push_swap

Or, if your interpreter is invoked as python rather than python3:

python -m ft_ps_tester ./push_swap

From the cloned source directory (no install required):

python3 ft_ps_tester/cli.py ./push_swap

Single test run

Test a specific size and mode (size must be 100 or 500 — smaller sizes are covered by --basic):

ft_ps_tester ./push_swap <size> <mode>

If ft_ps_tester is not on your PATH:

python3 -m ft_ps_tester ./push_swap <size> <mode>

Or via the .cli submodule:

python3 -m ft_ps_tester.cli ./push_swap <size> <mode>

Example:

ft_ps_tester ./push_swap 500 complex

Or:

python3 -m ft_ps_tester ./push_swap 500 complex

Failure reports (`--reports`)

Enable automatic report generation when a test fails (invalid sort, operation limit exceeded, or timeout). Reports are saved in the same directory as the push_swap executable.

Two files are generated per failure:

File suffix	Content
`_ops_N.txt`	Raw output (operations) from `push_swap`
`_nums_N.txt`	Input numbers passed to `push_swap`

The numbering (_1, _2, …) is shared between both files: if either an _ops_N or _nums_N file already exists, the next number is used so both files always share the same suffix.

Example:

ft_ps_tester --reports ./push_swap

If a failure occurs in the 100_simple test, the following files are created next to ./push_swap:

report_100_simple_ops_1.txt
report_100_simple_nums_1.txt

If another failure occurs later (or if one of those files already existed), the next pair becomes:

report_100_simple_ops_2.txt
report_100_simple_nums_2.txt
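
The shared numbering rule can be sketched as follows. This is a minimal illustration and not the tester's own code: `next_report_index` and its parameters are hypothetical names, and the file-name pattern is the one shown in the example above.

```python
from pathlib import Path


def next_report_index(directory, test_name):
    """Return the smallest N for which neither report file exists yet.

    Hypothetical helper: the name and signature are illustrative only.
    File names follow the report_<test>_ops_N.txt / report_<test>_nums_N.txt
    pattern from the example above.
    """
    directory = Path(directory)
    n = 1
    while True:
        ops = directory / f"report_{test_name}_ops_{n}.txt"
        nums = directory / f"report_{test_name}_nums_{n}.txt"
        # If either half of the pair is already taken, skip the number
        # entirely so both new files end up with the same suffix.
        if not ops.exists() and not nums.exists():
            return n
        n += 1
```

For instance, if only report_100_simple_nums_1.txt is present, `next_report_index(".", "100_simple")` returns 2, so the new pair does not collide with the leftover file.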

You can also use --reports with single test runs:

ft_ps_tester --reports ./push_swap 500 complex

Basic & edge-case tests (`--basic`)

Run a fast suite of correctness and edge-case checks. These also run first, automatically, at the start of every full test suite.

ft_ps_tester --basic ./push_swap

The suite covers ten areas:

#	Test	What it checks
1	Small-N op counts	Every permutation of 3 and 5 numbers across all modes (N=3 ≤3 `GOOD` / ≤5 `PASS`; N=5 ≤12 `GOOD` / ≤15 `PASS`).
2	Reversed input	Fully reversed (100% disorder) sequences of 3, 5, 10, 50, 100, 500 must sort correctly.
3	Sorted → 0 ops	Already-sorted 1, 2, 3, 5, 10, 50, 100, 500 must output 0 operations (and no error).
4	Nearly-sorted (1 swap)	10 elements with a single adjacent swap — last↔penultimate and first↔second — must sort correctly.
5	Error handling	Invalid input (`1a`, `1.0`, `6-1`, lone `-`/`+`, `+0`/`-0` duplicate, empty arg in the middle `3 2 "" 1 4 5`, `> INT_MAX`, LONG overflow, duplicates…) → exactly `Error\n` on fd2; valid edges (`INT_MIN`, `+0`) accepted; no args → no output.
6	Split / multi-number args	Whether `1 2 "3 4" 5` and `"1 2 3 4 5"` are accepted (a warning, not a failure, if not).
7	Mode specialization	The matching `--<mode>` should use the fewest ops per number type (10% margin); identical counts across all modes → invalid (flags not differentiated).
8	Default flag (adaptive)	Running with no flag must sort and match `--adaptive` behavior.
9	Benchmark mode (`--bench`)	`fd1` still sorts; the `fd2` report names the strategy and reports the disorder % accurately for random sequences (0–100%, checked against the inversion count).
10	Memory & crashes	Runs `valgrind` (Linux) / `leaks` (macOS) on valid, sorted, error and no-arg inputs; flags leaks, memory errors and segfaults. Skipped if the tool is unavailable.

File descriptors: fd1 (stdout) carries the operations. fd2 (stderr) carries exactly Error\n on invalid input, and the --bench report when benchmark mode is used. An error is detected only when fd2 is exactly Error\n and fd1 is empty.
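
The error rule above (fd2 exactly Error\n, fd1 empty) can be expressed in a few lines. This is only a sketch under stated assumptions: `is_error_result` and `run_and_check_error` are hypothetical names, and the default timeout is an arbitrary value chosen for the example.

```python
import subprocess


def is_error_result(stdout, stderr):
    """True only when fd2 is exactly 'Error' plus a newline and fd1 is empty."""
    return stderr == b"Error\n" and stdout == b""


def run_and_check_error(command, timeout=5.0):
    """Run `command` (a list of arguments) and apply the rule above.

    Hypothetical helper for illustration; the timeout default is an assumption.
    """
    result = subprocess.run(command, capture_output=True, timeout=timeout)
    return is_error_result(result.stdout, result.stderr)
```

A call such as `run_and_check_error(["./push_swap", "1", "1a"])` would then be expected to return True for a conforming push_swap. Comparing raw bytes keeps the check strict: a missing newline, extra text on fd2, or any operation printed on fd1 all count as a mismatch.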

Number range & timeouts

Random numbers are generated across the full 32-bit signed range by default. Narrow the range with a flag (works with every mode, including --big-o):

Flag	Range
(default)	`INT_MIN..INT_MAX`
`--1m`	`-1,000,000..1,000,000`
`--u1m`	`0..1,000,000`

ft_ps_tester --1m ./push_swap          # generate values in -1M..1M
ft_ps_tester --u1m ./push_swap 500 complex

The active range is printed at the top of every run; your push_swap must handle these values.

Each ./push_swap call has a timeout scaled by input size (~5s for small/basic inputs, ~10s at 500, ~15s at 800, capped at 30s). When a timeout occurs it is reported on screen, and the tester suggests trying a narrower range to check whether large/negative values are the cause.
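
The exact scaling formula is not given here. As a rough illustration only, the sketch below draws one possible curve through the figures quoted above (about 5s for small inputs, 10s at 500, 15s at 800, capped at 30s) using piecewise-linear interpolation. The function name `call_timeout` is hypothetical, and treating 100 as the upper end of "small" is an assumption.

```python
def call_timeout(n):
    """Timeout in seconds for an input of n numbers (illustrative only)."""
    # Anchor points from the text: ~5s small, ~10s at 500, ~15s at 800.
    # Using 100 as the boundary of "small" is an assumption of this sketch.
    points = [(100, 5.0), (500, 10.0), (800, 15.0)]
    cap = 30.0
    if n <= points[0][0]:
        return points[0][1]
    # Interpolate linearly between consecutive anchor points.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if n <= x1:
            return y0 + (y1 - y0) * (n - x0) / (x1 - x0)
    # Beyond the last anchor, keep the last slope until the cap is reached.
    (x0, y0), (x1, y1) = points[-2], points[-1]
    return min(cap, y1 + (y1 - y0) * (n - x1) / (x1 - x0))
```

The real tester may use a different curve; the sketch only shows how a size-dependent timeout with a hard cap can be derived from a handful of reference points.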

Bonus checker (`--bonus`)

Add --bonus to any run to also test the bonus checker program. The tester expects a compiled, executable checker in the same directory as your push_swap (if it is missing, the bonus is skipped with a warning).

ft_ps_tester --bonus ./push_swap          # full suite + checker tests
ft_ps_tester --basic --bonus ./push_swap  # basic tests + checker tests

It checks, per the subject's checker evaluation:

Group	What it checks
Error management	non-numeric / duplicate / `> INT_MAX` args, a non-existent instruction, an instruction with surrounding spaces → `Error\n` on fd2; no args → no output.
False tests (`KO`)	valid operations that do not sort the stack must print `KO` (subject example `sa pb rrr` + generated cases).
Right tests (`OK`)	valid operations that do sort the stack must print `OK` (subject examples + sequences generated by your `push_swap`, verified to sort).
Memory & crashes	`valgrind`/`leaks` and segfault detection on the checker.

Operation sequences are simulated internally to know the expected OK/KO, so the checker is verified against an independent oracle (not against the binary 42 provides).
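
A minimal sketch of such an oracle is shown below. It is not the tester's implementation: the function name `simulate` is hypothetical, and the first element of the input is assumed to be the top of stack a. The eleven instructions follow the standard push_swap semantics.

```python
from collections import deque


def simulate(numbers, operations):
    """Apply push_swap operations and return "OK", "KO", or "Error".

    numbers: initial contents of stack a, first element on top.
    operations: iterable of instruction names such as "sa" or "rrr".
    """
    a, b = deque(numbers), deque()

    def swap(s):
        if len(s) > 1:
            s[0], s[1] = s[1], s[0]

    def push(src, dst):
        if src:
            dst.appendleft(src.popleft())

    def rotate(s):          # top element moves to the bottom
        s.rotate(-1)

    def reverse_rotate(s):  # bottom element moves to the top
        s.rotate(1)

    table = {
        "sa": lambda: swap(a),
        "sb": lambda: swap(b),
        "ss": lambda: (swap(a), swap(b)),
        "pa": lambda: push(b, a),
        "pb": lambda: push(a, b),
        "ra": lambda: rotate(a),
        "rb": lambda: rotate(b),
        "rr": lambda: (rotate(a), rotate(b)),
        "rra": lambda: reverse_rotate(a),
        "rrb": lambda: reverse_rotate(b),
        "rrr": lambda: (reverse_rotate(a), reverse_rotate(b)),
    }
    for op in operations:
        if op not in table:
            return "Error"  # unknown instruction
        table[op]()
    # Sorted means: a in ascending order from the top, and b empty.
    return "OK" if not b and list(a) == sorted(a) else "KO"
```

With the input 3 2 1 0, `simulate([3, 2, 1, 0], "rra pb sa rra pa".split())` gives "OK", while `simulate([3, 2, 1, 0], "sa pb rrr".split())` gives "KO" because one element is left in stack b.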

Big-O Complexity Analysis (`--big-o`)

⚠️ Informational only — never use this to fail anyone. The subject defines the complexity model (four strategies; complexity measured in the number of push_swap operations generated — not time, nor classical array complexity) but does not specify how to validate or test it (no sizes, thresholds, or method). The execution-time figures here are shown for reference only and are not part of the subject's model — the Overall verdict is based on operation count alone. Treat all results as a rough indicator, not a pass/fail criterion. The tool prints this same warning at the top of the analysis.

Run a dedicated complexity analysis that measures how your algorithm scales with input size. This mode runs 100 tests per size per mode across progressively larger sequences:

Sizes tested: 50, 100, 200, 400, 800

For each size, the tester:

Generates sequences with disorder appropriate to the mode
Runs 3 warm-up executions (not measured) to stabilize system caches
Runs 100 measured executions and tracks:
- Average number of operations
- Average execution time
- Growth ratio between consecutive sizes
Validates that the output is correctly sorted

If any test fails to sort or times out, the failure is reported and that size/mode combination is marked as failed.

Classification criteria:

Operations and time are each classified independently, using thresholds that scale with the input size n (operations drive the verdict; time is informational only):

Operations:

Complexity	Formula (max ops)	Avg growth ratio
`O(n)`	`1.0 * n`	≤ 2.2x
`O(n log n)`	`1.14 * n * log₂(n)`	≤ 3.0x
`O(n sqrt(n))`	`1.09 * n * sqrt(n)`	≤ 3.5x
`O(n²)`	`0.152 * n²`	≤ 4.0x
`O(n³)`	`0.00095 * n³`	≤ 8.0x
`O(>n³)`	> `0.00095 * n³`	> 8.0x

Time (ms):

Complexity	Formula (max ms)	Avg growth ratio
`O(n)`	`0.05 * n`	≤ 2.2x
`O(n log n)`	`0.08 * n * log₂(n)`	≤ 3.0x
`O(n sqrt(n))`	`0.12 * n * sqrt(n)`	≤ 3.5x
`O(n²)`	`0.25 * n²`	≤ 4.0x
`O(n³)`	`1.0 * n³`	≤ 8.0x

Example: At n=800, O(n²) max ops = 0.152 * 800² = 97,280. The coefficients are calibrated for push_swap output patterns.
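
The Operations table can be evaluated with a short sketch like the one below. The coefficients and ratio limits are copied from the table; the function names are hypothetical. Two points are assumptions of this sketch, since the text does not specify them: the average growth ratio is taken as the mean of the ratios between consecutive sizes, and the two criteria are kept as separate functions because how the tester combines them is not stated.

```python
import math

# (label, max-ops formula, max average growth ratio), from the Operations table.
OPS_CLASSES = [
    ("O(n)", lambda n: 1.0 * n, 2.2),
    ("O(n log n)", lambda n: 1.14 * n * math.log2(n), 3.0),
    ("O(n sqrt(n))", lambda n: 1.09 * n * math.sqrt(n), 3.5),
    ("O(n^2)", lambda n: 0.152 * n ** 2, 4.0),
    ("O(n^3)", lambda n: 0.00095 * n ** 3, 8.0),
]


def class_by_formula(n, ops):
    """First class whose max-ops bound at size n is not exceeded."""
    for label, bound, _ in OPS_CLASSES:
        if ops <= bound(n):
            return label
    return "O(>n^3)"


def class_by_ratio(avg_ratio):
    """First class whose average growth-ratio limit is not exceeded."""
    for label, _, max_ratio in OPS_CLASSES:
        if avg_ratio <= max_ratio:
            return label
    return "O(>n^3)"


def average_growth_ratio(avg_ops_by_size):
    """Mean of ops[i + 1] / ops[i] over consecutive sizes (an assumption)."""
    pairs = zip(avg_ops_by_size, avg_ops_by_size[1:])
    ratios = [later / earlier for earlier, later in pairs]
    return sum(ratios) / len(ratios)
```

For example, 7732 operations at n=800 fall under the `O(n log n)` bound (about 8795), while 75000 operations at the same size exceed the `O(n sqrt(n))` bound (about 24664) and land in `O(n^2)`.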

Expected complexity per mode:

Mode	Expected complexity	Description
`simple`	≤ `O(n²)`	Nearly sorted — should stay within n²
`medium`	≤ `O(n sqrt(n))`	Moderate disorder — better than n²
`complex`	`O(n log n)`	Heavily shuffled — optimal sort expected
`adaptive`	`O(n²)` down to `O(n log n)`	Should adapt based on disorder level

Usage:

ft_ps_tester --big-o ./push_swap

Or via module:

python3 -m ft_ps_tester --big-o ./push_swap

Example output:

================================================================================
  BIG-O COMPLEXITY ANALYSIS
================================================================================

>> Big-O Analysis | Mode: SIMPLE
  Size |  Tests |    Avg Ops |  Ops Ratio | Avg Time(ms) | Time Ratio | Status
----------------------------------------------------------------------------------
    50 |    100 |        450 |        N/A |         2.10 |        N/A | PASS
   100 |    100 |        980 |      2.18x |         4.50 |      2.14x | PASS
   ...
----------------------------------------------------------------------------------
Ops Complexity:  O(n^2) (Avg ratio: 3.29x, Max ops at n=800: 75000)
Time Complexity: O(n log n) (Avg ratio: 2.15x, Max time(ms) at n=800: 35.20)

================================================================================
  BIG-O SUMMARY
================================================================================

Mode       | Ops Big-O    | Ops Status | Time Big-O   | Time (info) | Overall | Expected
-------------------------------------------------------------------------------------------------------------------
SIMPLE     | O(n log n)   | OK         | O(n^2)       | info        | PASS    | <= O(n^2)
MEDIUM     | O(n log n)   | OK         | O(n^2)       | info        | PASS    | <= O(n sqrt(n))
COMPLEX    | O(n log n)   | OK         | O(n^2)       | info        | PASS    | O(n log n)
ADAPTIVE   | O(n log n)   | OK         | O(n^2)       | info        | PASS    | O(n^2) down to O(n log n)

Details by mode:
  SIMPLE   | Ops:  Avg ratio: 2.40x, Max ops at n=800: 7732
           | Time: Avg ratio: 3.90x, Max time(ms) at n=800: 206.76
  MEDIUM   | Ops:  Avg ratio: 2.40x, Max ops at n=800: 7732
           | Time: Avg ratio: 3.90x, Max time(ms) at n=800: 206.76
  COMPLEX  | Ops:  Avg ratio: 2.40x, Max ops at n=800: 7732
           | Time: Avg ratio: 3.90x, Max time(ms) at n=800: 206.76
  ADAPTIVE | Ops:  Avg ratio: 2.40x, Max ops at n=800: 7732
           | Time: Avg ratio: 3.90x, Max time(ms) at n=800: 206.76

Reference (operation-count complexity — the subject's metric):
  Simple  : Expected <= O(n^2)  (nearly sorted)
  Medium  : Expected <= O(n sqrt(n))
  Complex : Expected O(n log n) (optimal comparison sort)
  Adaptive: Expected O(n^2) down to O(n log n) (should adapt to disorder)

Note: Overall reflects OPERATIONS only — the subject's metric. Time is informational.

Modes / Flags

Your push_swap must support the following flags (passed as --<mode> before the numbers):

Mode	Disorder range	Description
`simple`	15.0% – 19.9%	Nearly sorted sequences
`medium`	20.0% – 49.9%	Moderately shuffled sequences
`complex`	50.0% – 55.0%	Heavily shuffled sequences
`adaptive`	15.0% – 55.0%	Random disorder across the full spectrum

Running with no flag must behave like --adaptive (checked by test 8). An optional --bench flag (combined with a mode, e.g. --bench --simple) enables benchmark mode: operations still go to fd1, while a report is written to fd2 (stderr) containing at least the strategy name and the input's disorder %.

Note: If your push_swap does not implement these flags, the tester will still work if your program ignores unknown flags and simply sorts the provided numbers. However, for accurate mode-based testing, your push_swap should parse and use the flag to adjust its algorithm. The mode-specialization test treats identical operation counts across all modes as invalid (the flags must actually run different algorithms).
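
To make the disorder ranges concrete, the sketch below measures disorder and builds a sequence with an exact number of inversions. It assumes disorder is the share of inverted pairs among all pairs, which matches the inversion-count wording used elsewhere in this document; the tester's actual generator is not shown here and may work differently. Both function names are hypothetical, and the input values are assumed to be distinct.

```python
import random


def disorder_percent(seq):
    """Percentage of pairs (i < j) that are inverted, i.e. seq[i] > seq[j]."""
    n = len(seq)
    if n < 2:
        return 0.0
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j]
    )
    return 100.0 * inversions / (n * (n - 1) // 2)


def sequence_with_inversions(values, inversions, rng=random):
    """Return a permutation of `values` with exactly `inversions` inverted pairs.

    Builds a Lehmer code: code[i] is the number of later elements that are
    smaller than the element at position i.
    """
    pool = sorted(values)
    n = len(pool)
    if not 0 <= inversions <= n * (n - 1) // 2:
        raise ValueError("inversion count out of range")
    code = [0] * n
    remaining = inversions
    # Spread the inversions at random; position i can hold at most n - 1 - i.
    while remaining:
        i = rng.randrange(n - 1)
        if code[i] < n - 1 - i:
            code[i] += 1
            remaining -= 1
    # Decode: taking the code[i]-th smallest unused value leaves exactly
    # code[i] smaller values to its right.
    return [pool.pop(c) for c in code]
```

With 100 values there are 4950 pairs, so `sequence_with_inversions(range(100), 1000)` has a disorder of about 20.2%, which falls in the `medium` range of the table above.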

Grading Thresholds

Size	Excellent	Good	Pass
3	—	≤ 3	≤ 5
5	—	≤ 12	≤ 15
100	< 700	< 1500	≤ 2000
500	< 5500	< 8000	≤ 12000

Sizes 3 and 5 are graded by the --basic suite (GOOD / PASS only); sizes 100 and 500 by the full performance suite.
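
The table translates directly into a small lookup. The sketch below is illustrative only: `grade` is a hypothetical name, and the strict and inclusive comparisons are taken from the `<` and `≤` signs in the table.

```python
def grade(size, ops):
    """Grade an operation count for one of the sizes in the table above."""
    if size in (3, 5):
        # Sizes 3 and 5 have no EXCELLENT grade; both limits are inclusive.
        good, passing = {3: (3, 5), 5: (12, 15)}[size]
        if ops <= good:
            return "GOOD"
        return "PASS" if ops <= passing else "FAIL"
    # Sizes 100 and 500: EXCELLENT and GOOD are strict, PASS is inclusive.
    excellent, good, passing = {
        100: (700, 1500, 2000),
        500: (5500, 8000, 12000),
    }[size]
    if ops < excellent:
        return "EXCELLENT"
    if ops < good:
        return "GOOD"
    return "PASS" if ops <= passing else "FAIL"
```

For example, `grade(100, 450)` is "EXCELLENT", `grade(100, 1200)` is "GOOD", and `grade(500, 12001)` is "FAIL".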

Results are shown with color-coded grades:

EXCELLENT — green
GOOD — blue
PASS — yellow
FAIL — red

Example Output

Number range: INT_MIN..INT_MAX [-2147483648, 2147483647] — your push_swap must support these values.
   Range flags: default INT_MIN..INT_MAX, --1m (-1M..1M), --u1m (0..1M)
   Timeouts: scaled by input size (~5s small, ~10s @500, ~15s @800).

Running FULL TEST SUITE for ./push_swap

[ basic & edge-case tests run first — see the --basic section ]

>> Testing Size: 100 | Mode: SIMPLE  ..................................................
>> Testing Size: 100 | Mode: MEDIUM  ..................................................
>> Testing Size: 100 | Mode: COMPLEX ..................................................
>> Testing Size: 100 | Mode: ADAPTIVE..................................................
>> Testing Size: 500 | Mode: SIMPLE  ..................................................
>> Testing Size: 500 | Mode: MEDIUM  ..................................................
>> Testing Size: 500 | Mode: COMPLEX ..................................................
>> Testing Size: 500 | Mode: ADAPTIVE..................................................

========================================================================================
PERFORMANCE SUMMARY
========================================================================================
SIZE   | MODE     | MAX (GRADE)        | MIN (GRADE)        | AVG (GRADE)        | FAILS
----------------------------------------------------------------------------------------
100    | SIMPLE   | 450 (EXCELLENT)    | 320 (EXCELLENT)    | 380 (EXCELLENT)    | 0
100    | MEDIUM   | 1200 (GOOD)        | 900 (GOOD)         | 1050 (GOOD)        | 0
...

License

This project is licensed under the MIT License.

Release history

Version	Release date
1.0.8 (this version)	Jun 4, 2026
1.0.7	Jun 3, 2026
1.0.6	May 27, 2026
1.0.5	May 27, 2026
1.0.4	May 27, 2026
1.0.3	May 26, 2026
1.0.2	May 26, 2026
1.0.1	May 23, 2026
1.0.0	May 23, 2026

Download files

File	Type	Size	Uploaded	Uploaded via
`ft_ps_tester-1.0.8.tar.gz`	Source distribution	34.1 kB	Jun 4, 2026	twine/6.2.0 CPython/3.14.5
`ft_ps_tester-1.0.8-py3-none-any.whl`	Built distribution (Python 3)	28.2 kB	Jun 4, 2026	twine/6.2.0 CPython/3.14.5

Neither file was uploaded using Trusted Publishing.

Hashes for ft_ps_tester-1.0.8.tar.gz

Algorithm	Hash digest
SHA256	`7bcfa648cfb7f55846aa599f3dc0f8a751553654545341e1a7ed41ecd8624aba`
MD5	`d5157fd7de45b4d1a1705f64c70902e5`
BLAKE2b-256	`5c053c2c316b6512962ebadb4c1175768675ac970b7c304374e5d93d7bdefc25`

Hashes for ft_ps_tester-1.0.8-py3-none-any.whl

Algorithm	Hash digest
SHA256	`d28ee0c0e79229a827d648ecd57e75b5f5e0c050c415acc438ccc9cb69eed455`
MD5	`a3a8b5f1dd70e6f5c9e1c01b02d587ed`
BLAKE2b-256	`5d0ef04cf6c95bc331338dee29664d77da16a2f4757d9648f548f7db1e546774`
