test(kanban): cover redeliver-on-cycle + flip stale unsub-on-abnormal-event tests

Follow-up to the previous commit's notifier behavior change. Two test fixes:

1. `tests/gateway/test_kanban_notifier.py` gains
   `test_notifier_redelivers_same_kind_on_dispatch_cycle` — pins the new
   contract directly: a task that crashes, gets reclaimed, and crashes
   again notifies the user BOTH times. Before #21398 the second crash
   silently dropped because the subscription was already deleted.

2. `tests/hermes_cli/test_kanban_notify.py::
   test_notifier_unsubs_after_abnormal_events[gave_up|crashed|timed_out]`
   is flipped. Those tests were added in the salvage of #22941 and
   asserted the OLD behavior (subscription deleted after gave_up /
   crashed / timed_out). They're now obsolete — the new contract is
   "subscription survives a non-final terminal event so retries reach
   the user." Updated docstring + asserts; the cursor-advance check is
   added to confirm the dedup mechanism still works.

The `test_notifier_unsubs_after_completed_event` test stays untouched
because `completed` IS still a terminal event that triggers unsub
(the task hits `done` status, which is handled by the `task_terminal`
branch in the notifier loop).
This commit is contained in:
Teknium 2026-05-10 14:26:55 -07:00
parent a96dd54872
commit 787e3c368c
2 changed files with 82 additions and 2 deletions

View file

@ -76,7 +76,13 @@ async def test_notifier_unsubs_after_completed_event(kanban_home):
@pytest.mark.parametrize('kind', ["gave_up", "crashed", "timed_out"])
async def test_notifier_unsubs_after_abnormal_events(kind, kanban_home):
"""
Event kind of gave_up, crashed, time_out would be cover, and remove subscription
Event kinds gave_up / crashed / timed_out send a notification but DO
NOT delete the subscription. The dispatcher may respawn the task and
fire the same event kind again (e.g. a worker that crashes, gets
reclaimed, and crashes a second time); the user must hear about the
second event too. Subscriptions are removed only when the task hits
a truly final status (done / archived) see the comment on
TERMINAL_KINDS in gateway/run.py and PR #21398.
"""
import hermes_cli.kanban_db as kb
from gateway.run import GatewayRunner
@ -114,15 +120,27 @@ async def test_notifier_unsubs_after_abnormal_events(kind, kanban_home):
timeout=10.0,
)
# The user is notified about the abnormal event...
fake_adapter.send.assert_called_once()
assert kind.replace('_', ' ') in fake_adapter.send.call_args[0][1]
# ...but the subscription survives so a respawn-then-same-event cycle
# reaches the user too. The cursor (last_event_id) advanced inside
# the same write txn as the claim, so the same event won't re-fire.
conn = kb.connect()
try:
subs = kb.list_notify_subs(conn, tid)
finally:
conn.close()
assert subs == [], "Subscription should be unsub after abnormal crash"
assert len(subs) == 1, (
f"Subscription should survive {kind!r} so the next cycle of the "
f"same event reaches the user; got {subs!r}"
)
assert int(subs[0]["last_event_id"]) >= 1, (
"Cursor should have advanced past the delivered event "
"(claim_unseen_events_for_sub advances atomically inside the "
"same write txn as the read)."
)
@pytest.mark.asyncio