CI Runs

Job reruns

CI platforms expose a "rerun" button on jobs. DAGZ handles a rerun differently depending on the result of the previous attempt:

  • Rerun of a failed job: only the tests that failed in the previous attempt are rerun. The rest are skipped as redundant. Useful for flake hunting and for re-verifying a fix without paying for the full suite.
  • Rerun of a passing job: the entire suite reruns without selection. The previous job already covered the affected tests; a deliberate rerun signals you want a fresh full pass.

Inspecting results in the dashboard

The DAGZ dashboard surfaces every CI run as a job.

Jobs list

The jobs list at <DAGZ_URL>/jobs shows status, duration, pass/fail counts, flaky counts, and compute savings for recent runs. A running job's result chips break down by state: blue for unfinished tests, yellow for tests that failed and are pending a retry, green for passed, and red for tests that failed all retries.

Job result chips with a running row

Tests and logs

Each Job ID is a link. A single click on the job ID lands directly on the job's result page; for failing jobs the Errors tab opens with the failing tests listed. Clicking a failed test shows its logs in the panel below:

Job errors view, with log panel

Download agent context

The ✨ button on each failed test row packages the test's logs, history, and surrounding context into a bundle for an AI coding agent to investigate and fix. It is equivalent to running zb span-logs <span-id> from the command line.

Agent context download button

Inspecting log results

DAGZ writes test progress, a per-worker status snapshot, and a job-end report to the console. The console output links to the DAGZ dashboard, and the dashboard links back to the CI run.

Progress logs

DAGZ prints a progress log line for every test as it finishes, with the test's result and duration:

00:00:04.190 █████████ [36+98+32+1=167] FAILED 0.5s lag=+0.635s subprocs/test_subprocs.py::test_main_func_mod %cpu=3% worker #3
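If you want to post-process these lines (for example, to pull failures out of a raw CI log), a regular expression is usually enough. The sketch below is inferred from the single sample line above, so treat the pattern as an assumption rather than a stable format. The meaning of the individual bracketed counters is not documented here, so they are kept as a plain list.

```python
import re

# Pattern inferred from the one sample line; not an official format definition.
PROGRESS_RE = re.compile(
    r"^(?P<elapsed>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\S+\s+"                                      # progress bar glyphs
    r"\[(?P<counts>\d+(?:\+\d+)*)=(?P<total>\d+)\]\s+"
    r"(?P<result>[A-Z]+)\s+"
    r"(?P<duration>[\d.]+)s\s+"
    r"lag=(?P<lag>[+-]?[\d.]+)s\s+"
    r"(?P<test>.+?)\s+"
    r"%cpu=(?P<cpu>\d+)%\s+"
    r"worker #(?P<worker>\d+)$"
)


def parse_progress_line(line: str):
    """Return the fields of a progress log line as a dict, or None if it does not match."""
    m = PROGRESS_RE.match(line.strip())
    if m is None:
        return None
    return {
        "elapsed": m["elapsed"],
        "counts": [int(n) for n in m["counts"].split("+")],
        "total": int(m["total"]),
        "result": m["result"],
        "duration_s": float(m["duration"]),
        "lag_s": float(m["lag"]),
        "test": m["test"],
        "cpu_percent": int(m["cpu"]),
        "worker": int(m["worker"]),
    }


SAMPLE = (
    "00:00:04.190 █████████ [36+98+32+1=167] FAILED 0.5s lag=+0.635s "
    "subprocs/test_subprocs.py::test_main_func_mod %cpu=3% worker #3"
)
parsed = parse_progress_line(SAMPLE)
```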

Status logs

While the run is in progress, the scheduler prints the workers' status every 30 seconds.

00:01:30.668 Worker #0 [Active] progress=7908+73846/81754 actual_delta=-0.5s last_finished=1.0s q=Regular batch #1 time_left=237.1s stolen=0,0finished reconnects=0 pandas/tests/arithmetic/test_datetime64.py::TestDatetime64DateOffsetArithmetic::test_dt64arr_add_sub_DateOffsets[Series-us-US/Central-5-True-CBMonthEnd]
00:01:30.668 Worker #1 [Active] progress=15369+37374/52743 actual_delta=+2.4s last_finished=0.7s q=Regular batch #1 time_left=240.1s stolen=0,0finished reconnects=0 pandas/tests/extension/test_masked.py::TestMaskedArrays::test_loc_series[BooleanDtype]
00:01:30.668 Worker #2 [Active] progress=7275+40253/47528 actual_delta=-0.1s last_finished=0.1s q=Regular batch #1 time_left=235.7s stolen=0,0finished reconnects=0 pandas/tests/groupby/test_raises.py::test_groupby_raises_category[by7-True-std-method]
00:01:30.668 Worker #3 [Active] progress=52+2414/2466 actual_delta=-2.4s last_finished=6.9s q=Regular batch #1 time_left=245.1s stolen=0,0finished reconnects=0 pandas/tests/window/test_numba.py::TestTableMethod::test_table_method_rolling_methods[False-True-arithmetic_numba_supported_operators2-1]

One line per worker, in worker order. Fields after the worker number:

Field                  Meaning
[State]                Worker state: Active, Idle, WorkSent, Disconnected, Failed, NotConnected.
progress=A+B/C         A tests finished, B remaining, C total assigned to this worker.
actual_delta=±X.Xs     Cumulative deviation from the plan. Positive = slower than planned.
last_finished=Y.Ys     Seconds since the last test on this worker finished. A growing number means the current test is taking long.
q=NAME batch #N        Active queue and batch number.
time_left=Z.Zs         Estimated time to drain all queues on this worker.
stolen=L,Mfinished     L tasks were stolen from other workers; M of those have finished.
reconnects=K           Reconnect count for this worker.

The trailing text is the test currently running on the worker.
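For ad-hoc analysis of a CI log, the status line can be split into its fields with a regular expression. This is a minimal sketch based on the sample lines in this section; the pattern and the dictionary keys are assumptions, not a DAGZ API.

```python
import re

# Pattern inferred from the sample status lines; not an official format definition.
STATUS_RE = re.compile(
    r"^(?P<elapsed>\S+) Worker #(?P<worker>\d+) \[(?P<state>\w+)\] "
    r"progress=(?P<finished>\d+)\+(?P<remaining>\d+)/(?P<total>\d+) "
    r"actual_delta=(?P<delta>[+-]?[\d.]+)s "
    r"last_finished=(?P<last_finished>[\d.]+)s "
    r"q=(?P<queue>.+?) batch #(?P<batch>\d+) "
    r"time_left=(?P<time_left>[\d.]+)s "
    r"stolen=(?P<stolen>\d+),(?P<stolen_finished>\d+)finished "
    r"reconnects=(?P<reconnects>\d+)"
    r"(?: (?P<current_test>.+))?$"
)

INT_FIELDS = ("worker", "finished", "remaining", "total", "batch",
              "stolen", "stolen_finished", "reconnects")
FLOAT_FIELDS = ("delta", "last_finished", "time_left")


def parse_status_line(line: str):
    """Return the fields of a worker status line as a dict, or None if it does not match."""
    m = STATUS_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    for name in INT_FIELDS:
        fields[name] = int(fields[name])
    for name in FLOAT_FIELDS:
        fields[name] = float(fields[name])
    return fields


SAMPLE = (
    "00:01:30.668 Worker #3 [Active] progress=52+2414/2466 actual_delta=-2.4s "
    "last_finished=6.9s q=Regular batch #1 time_left=245.1s stolen=0,0finished "
    "reconnects=0 pandas/tests/window/test_numba.py::TestTableMethod::"
    "test_table_method_rolling_methods[False-True-arithmetic_numba_supported_operators2-1]"
)
status = parse_status_line(SAMPLE)
```

A parsed `last_finished` value that keeps growing across snapshots points at a long-running test on that worker, as described in the table above.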

Job report

At the end of the session, every process (scheduler and each worker) prints the same report. In a parallel CI step this means the report lands in every parallel job's logs. Any one of them gets you a link to the dashboard without having to find the scheduler's log.

When the run passes, the report prints in green:

*** DAGZ SESSION END: 2026-05-17 09:35:34 Asia/Jerusalem ***
*** See j0517.50 | https://dagz.example.com/jobs/43732fd6-a9d6-4a92-8960-ba30b8b3fe3f ***
*** Summary of all 6 workers on 1 nodes (master=1/1) *** 239941/239941 PASSED, 1593 xfail *** 7146 SKIPPED ***

When any test fails, the report prints in red and lists the failures above the summary:

***
FAILED: pandas/tests/io/test_http_headers.py::test_request_headers[json] | AssertionError: expected 200, got 502
FAILED: pandas/tests/groupby/test_raises.py::test_groupby_raises_category[by7-True-std-method] | TimeoutError: ...
FAILED: pandas/tests/window/test_numba.py::TestTableMethod::test_table_method_rolling_methods | ValueError: ...
*** DAGZ SESSION END: 2026-05-17 09:35:34 Asia/Jerusalem ***
*** See j0517.50 | https://dagz.example.com/jobs/43732fd6-a9d6-4a92-8960-ba30b8b3fe3f?spanTypes=failures ***
*** Summary of all 6 workers on 1 nodes (master=1/1) *** 3/239941 FAILED *** 232792 PASSED, 1593 xfail *** 7146 SKIPPED ***

On failure, the dashboard URL deep-links to the failures view (?spanTypes=failures); on success it links to the job overview.
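A script that post-processes CI logs can pick the dashboard URL out of the report and tell the two cases apart by the query string. The helper below is a hypothetical sketch using only the Python standard library; it assumes the report contains a single URL, as in the samples in this section.

```python
import re
from urllib.parse import parse_qs, urlparse


def dashboard_link(report_text: str):
    """Return (url, is_failures_view) for the first URL in the report, or None."""
    m = re.search(r"https?://\S+", report_text)
    if m is None:
        return None
    url = m.group(0)
    query = parse_qs(urlparse(url).query)
    # A failing run deep-links with ?spanTypes=failures; a passing run has no query.
    return url, query.get("spanTypes") == ["failures"]


failed_line = "*** See j0517.50 | https://dagz.example.com/jobs/43732fd6-a9d6-4a92-8960-ba30b8b3fe3f?spanTypes=failures ***"
passed_line = "*** See j0517.50 | https://dagz.example.com/jobs/43732fd6-a9d6-4a92-8960-ba30b8b3fe3f ***"

failed_link = dashboard_link(failed_line)
passed_link = dashboard_link(passed_line)
```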

j0517.50 is the short job ID: the 50th job on May 17.
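As an illustration, the short ID can be decoded like this. The `jMMDD.N` layout is inferred from the single example above, so treat the pattern as an assumption; the function name is hypothetical.

```python
import re


def parse_short_job_id(short_id: str):
    """Split a short job ID such as 'j0517.50' into (month, day, sequence)."""
    m = re.fullmatch(r"j(\d{2})(\d{2})\.(\d+)", short_id)
    if m is None:
        raise ValueError(f"not a short job ID: {short_id!r}")
    month, day, sequence = (int(part) for part in m.groups())
    return month, day, sequence


decoded = parse_short_job_id("j0517.50")
```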

DAGZ prints the dashboard URL inline in the job report. CI platforms render it as a clickable link, so any CI log that includes the report has a one-click path to the job in the dashboard.

The dashboard reverses the link: each job page shows the originating CI run, so navigating from a failure in DAGZ back to the CI logs is a single click.

How it works

Parallel execution

DAGZ executes test suites in parallel, spreading the work across multiple nodes (machines) and multiple workers on each node.

DAGZ's test scheduler optimizes test distribution using:

  • Previous run durations, taken from the baselines generated for DAGZ test selection.
  • Fixture sharing - tests that need the same setup tend to land on the same worker.
  • Work stealing - idle workers steal tests from busy ones when the plan drifts from reality.
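To make the first input concrete, here is a minimal sketch of duration-based planning using the classic greedy "longest test first, least-loaded worker" heuristic. It is not DAGZ's scheduler: it ignores fixture sharing and work stealing, and the function name and the `durations` mapping are hypothetical.

```python
import heapq
from typing import Dict, List


def plan_by_duration(durations: Dict[str, float], n_workers: int) -> Dict[int, List[str]]:
    """Assign tests to workers so that the planned loads stay roughly balanced."""
    # Min-heap of (planned load in seconds, worker index).
    heap = [(0.0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    plan: Dict[int, List[str]] = {worker: [] for worker in range(n_workers)}

    # Place the longest tests first; each goes to the currently least-loaded worker.
    for test, seconds in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        plan[worker].append(test)
        heapq.heappush(heap, (load + seconds, worker))
    return plan


previous_durations = {"test_a": 5.0, "test_b": 3.0, "test_c": 2.0, "test_d": 2.0}
plan = plan_by_duration(previous_durations, n_workers=2)
```

A static plan like this is only as good as the recorded durations, which is why the list above also includes work stealing as a runtime correction.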

For installing the plugin on CI workers and pointing it at your team deployment, see CI Integration.

For the hardware-side reasons that actual durations drift from the plan (CPU throttling, hybrid cores, power limits), see Scheduling.

Process roles

DAGZ manages four types of test processes:

  1. Scheduler/Master: coordinates the entire run, plans work distribution, and collects results.
  2. Vassal: manages the worker pool on a single machine.
  3. Workers: run tests sequentially, collect code signals, and report results back to the scheduler.
  4. Sub-processes: spawned by tests, automatically associated with the calling test.

On Linux, workers and vassals are forked processes. They share memory and can be recycled if they exceed memory limits.

CI job joining

When multiple pytest --dagz processes start in the same parallel CI step, they automatically join into a single job. The first node to come up becomes the scheduler; other nodes become vassals and connect back to the scheduler.