hermes-agent/hermes_cli
Teknium ac51c4c1ad
feat(kanban): per-task max_retries override (#20263 follow-up, supersedes #20972) (#21330)
Adds a per-task override for the consecutive-failure circuit breaker,
so individual tasks can opt out of the global ``kanban.failure_limit``
without dragging everyone else with them.

Resolution order (now three tiers):
  1. per-task ``max_retries`` (new, this commit)
  2. caller-supplied ``failure_limit`` — the gateway threads
     ``kanban.failure_limit`` from config here
  3. ``DEFAULT_FAILURE_LIMIT`` (2)

Changes:
- ``tasks.max_retries INTEGER`` column + migration for existing DBs
  (NULL = no override, matches pre-column behavior).
- ``Task.max_retries`` field + ``from_row`` plumbing.
- ``create_task(..., max_retries=N)`` kwarg.
- ``_record_task_failure`` reads the per-task value first and records
  ``limit_source`` + ``effective_limit`` on the ``gave_up`` event so
  operators can see which tier won.
- CLI: ``hermes kanban create --max-retries N`` (rejects ``< 1``).
- CLI: ``hermes kanban show`` surfaces the effective threshold +
  source (``(task)``, ``(config kanban.failure_limit)``, ``(default)``).
- CLI: ``_task_to_dict`` includes ``max_retries`` in ``--json`` output.

Key design choice vs. the earlier #20972 attempt:
- No new config key. The existing ``kanban.failure_limit`` (landed in
  #21183) is the dispatcher-tier source — no silent break for users
  who already tuned it.
- No ``!=`` sentinel for "is config set" (which would misfire when
  config equals the default). The tier-winner is determined purely
  by "is per-task override set" — the dispatcher always wins when
  per-task is NULL, regardless of whether the caller passed the
  default or a configured value.

E2E verified across four scenarios: default-only (trips at 2),
config-only (trips at caller's value), per-task-only beats default
(trips at task value), per-task beats larger config (trips at task
value). ``gave_up`` event metadata correctly records ``limit_source``
and ``effective_limit`` in all cases.

Tests:
- ``test_per_task_max_retries_overrides_dispatcher_limit`` — task=1
  beats caller=10.
- ``test_per_task_max_retries_allows_more_than_default`` — task=5
  does not trip at caller=default of 2.
- ``test_max_retries_none_falls_through_to_dispatcher_limit`` — None
  honors caller's config value (4), records ``limit_source=dispatcher``.

Full kanban trio (db + core + cli + tools + dashboard-plugin): 342
passed, no regressions.

Supersedes: #20972 (@jelrod27) — credit in PR close comment.
Ref: #20263 (tangentially — the reporter asked about adapter API
drift, not retry caps, but the CLI discussion there is what
surfaced the original ask).
2026-05-07 07:29:02 -07:00
..
__init__.py fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash 2026-05-03 16:58:25 -07:00
_parser.py fix: add dashboard to CLI help epilogue and Docker CI smoke test 2026-05-07 06:16:23 -07:00
auth.py fix(auth): keep Spotify logout from resetting model config 2026-05-07 05:53:14 -07:00
auth_commands.py feat(nous): persist Nous OAuth across profiles via shared token store (#19712) 2026-05-04 04:54:55 -07:00
azure_detect.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
backup.py fix(backup): floor pre-update backup_keep to 1 so the new backup survives 2026-05-04 05:07:13 -07:00
banner.py fix(banner): show correct update status on nix-built hermes (#17550) 2026-04-30 07:03:00 +05:30
browser_connect.py fix(browser): address Copilot review on /browser connect 2026-04-28 22:11:10 -07:00
callbacks.py fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902) 2026-04-14 16:11:37 -07:00
checkpoints.py feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
claw.py fix(claw): handle missing dir in _scan_workspace_state 2026-05-05 06:08:14 -07:00
cli_output.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
codex_models.py feat(codex): add gpt-5.5 and wire live model discovery into picker (#14720) 2026-04-23 13:32:43 -07:00
colors.py feat: respect NO_COLOR env var and TERM=dumb (#4079) 2026-03-30 17:07:21 -07:00
commands.py feat(curator): add hermes curator list-archived command (#21236) 2026-05-07 05:46:51 -07:00
completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
config.py feat(gateway): generic plugin hooks for env enablement + cron delivery 2026-05-07 07:15:44 -07:00
copilot_auth.py fix(oauth,gateway): monotonic deadlines for polling/timeout loops 2026-05-07 05:09:39 -07:00
cron.py feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709) 2026-05-04 12:31:01 -07:00
curator.py feat(curator): add hermes curator list-archived command (#21236) 2026-05-07 05:46:51 -07:00
curses_ui.py fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
debug.py fix(debug): redact log content at upload time in hermes debug share 2026-05-03 11:42:20 -07:00
default_soul.py fix: reset default SOUL.md to baseline identity text (#3159) 2026-03-26 01:34:27 -07:00
dingtalk_auth.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
doctor.py fix(doctor): retry DashScope China endpoint 2026-05-07 05:55:06 -07:00
dump.py refactor(env): use shared Hermes dotenv loader 2026-05-05 10:13:13 -07:00
env_loader.py refactor: consolidate symlink-safe atomic replace into shared helper 2026-04-28 04:58:22 -07:00
fallback_cmd.py feat(cli): add 'hermes fallback' command to manage fallback providers (#16052) 2026-04-26 06:19:04 -07:00
gateway.py fix(docker): refuse root gateway runs in official image 2026-05-07 05:59:25 -07:00
goals.py feat: /goal — persistent cross-turn goals (Ralph loop) (#18262) 2026-04-30 23:10:20 -07:00
hooks.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
kanban.py feat(kanban): per-task max_retries override (#20263 follow-up, supersedes #20972) (#21330) 2026-05-07 07:29:02 -07:00
kanban_db.py feat(kanban): per-task max_retries override (#20263 follow-up, supersedes #20972) (#21330) 2026-05-07 07:29:02 -07:00
kanban_diagnostics.py fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410) 2026-05-05 13:55:37 -07:00
logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
main.py fix(update): migrate config in non-interactive updates 2026-05-07 06:04:28 -07:00
mcp_config.py fix(mcp): give 'mcp add --command' a distinct argparse dest 2026-05-07 05:17:03 -07:00
memory_setup.py fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows 2026-05-02 01:40:31 -07:00
model_catalog.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
model_normalize.py fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802) 2026-05-06 09:08:33 -07:00
model_switch.py fix(model_switch): live model discovery for custom_providers in /model picker 2026-05-07 05:21:26 -07:00
models.py feat(models): add paid tencent/hy3-preview route on OpenRouter (#21077) 2026-05-07 06:34:48 -07:00
nous_subscription.py feat(web): add SearXNG as a native search-only backend 2026-05-06 10:05:29 -07:00
oneshot.py fix(tui): honor launch toolsets (#17623) 2026-04-29 16:55:27 -07:00
pairing.py fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325) 2026-05-07 07:18:21 -07:00
platforms.py feat: complete plugin platform parity — all 12 integration points 2026-04-29 21:56:51 -07:00
plugins.py feat: add transform_llm_output plugin hook 2026-05-07 05:46:05 -07:00
plugins_cmd.py feat(dashboard): add Plugins page with enable/disable, auth status, install/remove 2026-04-30 20:29:37 -04:00
profiles.py feat(profiles): --no-skills flag for empty profile creation (#20986) 2026-05-07 04:34:38 -07:00
providers.py fix: prevent bare 'custom' slug in model.provider (#17478) 2026-04-30 04:32:11 -07:00
pty_bridge.py fix(pty): default TERM for resize probes 2026-05-04 02:38:54 -07:00
relaunch.py remove relaunch_chat 2026-04-29 20:33:29 -07:00
runtime_provider.py fix(credential_pool): resolve key mix-up when custom providers share base_url 2026-05-07 05:27:41 -07:00
setup.py fix(gateway): don't dead-end setup wizard when only system-scope unit is installed 2026-05-06 15:58:02 -07:00
skills_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
skills_hub.py feat(skills): install skills from a direct HTTP(S) URL (#16323) 2026-04-26 20:57:10 -07:00
skin_engine.py fix(tui): honor skin highlight colors (#20895) 2026-05-06 14:01:56 -07:00
slack_cli.py fix(paths): route achievements plugin + profile-tui through HERMES_HOME 2026-04-30 23:21:54 -07:00
status.py fix(status): add missing popular provider API keys to hermes status display 2026-05-04 05:14:13 -07:00
timeouts.py refactor(timeouts): drop redundant ImportError in except clause 2026-04-26 20:48:20 -07:00
tips.py docs: refresh stale platform/LOC/test counts; clarify gateway vs plugin platforms 2026-05-05 13:45:47 -07:00
tools_config.py feat(web): add SearXNG as a native search-only backend 2026-05-06 10:05:29 -07:00
uninstall.py feat(uninstall): offer to remove named profiles when uninstalling from default 2026-04-18 19:18:13 -07:00
vercel_auth.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
voice.py fix(tui): restore voice push-to-talk parity (#20897) 2026-05-06 15:49:59 -07:00
web_server.py feat(dashboard): support serving under URL prefix via X-Forwarded-Prefix 2026-05-07 06:39:18 -07:00
webhook.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00