Commit graph

2255 commits

Author SHA1 Message Date
Brooklyn Nicholson
aa583cb14e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 17:51:40 -05:00
helix4u
2b60478fc2 fix(kimi): force kimi-for-coding temperature to 0.6 2026-04-17 15:49:14 -07:00
Teknium
c6fd2619f7
fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)
Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit
path (status_code on the exception, Retry-After preserved via error.response)
instead of being opaque RuntimeErrors. User sees a one-line capacity message
instead of a 500-char JSON dump.

Changes
- CodeAssistError grows status_code / response / retry_after / details attrs.
  _extract_status_code in error_classifier picks up status_code and classifies
  429 as FailoverReason.rate_limit, so fallback_providers triggers the same
  way it does for SDK errors. run_agent.py line ~10428 already walks
  error.response.headers for Retry-After — preserving the response means that
  path just works.
- _gemini_http_error parses the Google error envelope (error.status +
  error.details[].reason from google.rpc.ErrorInfo, retryDelay from
  google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404
  model-not-found each produce a human-readable message; unknown shapes fall
  back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and
  agent/model_metadata.py — Google returned 404 for it today in local repro.
  Kept gemma-4-31b-it (capacity-constrained but not retired).

Validation
|                           | Before                         | After                                     |
|---------------------------|--------------------------------|-------------------------------------------|
| Error message             | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error      | None (opaque RuntimeError)     | 429                                       |
| Classifier reason         | unknown (string-match fallback) | FailoverReason.rate_limit                |
| Retry-After honored       | ignored                        | extracted from RetryInfo or header        |
| gemma-4-26b-it picker     | advertised (404s on Google)    | removed                                   |

Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found,
Retry-After header fallback, malformed body, and classifier integration.
Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full
tests/hermes_cli (2203 tests) green.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 15:34:12 -07:00
Teknium
d2206c69cc fix(qqbot): add back-compat for env var rename; drop qrcode core dep
Follow-up to WideLee's salvaged PR #11582.

Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename:
  - gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL
    with a one-shot deprecation warning so users on the old name aren't
    silently broken.
  - cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new
    name; _get_home_target_chat_id falls back to the legacy name via a
    _LEGACY_HOME_TARGET_ENV_VARS table.
  - hermes_cli/status.py + hermes_cli/setup.py: honor both names when
    displaying or checking for missing home channels.
  - hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in
    _EXTRA_ENV_KEYS so .env sanitization still recognizes them.

Scope cleanup:
  - Drop qrcode from core dependencies and requirements.txt (remains in
    messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades
    gracefully when qrcode is missing, printing a 'pip install qrcode' tip
    and falling back to URL-only display.
  - Restore @staticmethod on QQAdapter._detect_message_type (it doesn't
    use self). Revert the test change that was only needed when it was
    converted to an instance method.
  - Reset uv.lock to origin/main; the PR's stale lock also included
    unrelated changes (atroposlib source URL, hermes-agent version bump,
    fastapi additions) that don't belong.

Verified E2E:
  - Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the
    legacy name; deprecation warning logs once.
  - Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name,
    no warning.
  - Both set: new name wins on both surfaces.

Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).
2026-04-17 15:31:14 -07:00
WideLee
103beea7a6 fix(qqbot): fix test failures after package refactor
- Re-export _ssrf_redirect_guard from __init__.py
- Fix _parse_json @staticmethod using self._log_tag
- Update test_detect_message_type to call as instance method
- Fix mock.patch path for httpx.AsyncClient in adapter submodule
2026-04-17 15:31:14 -07:00
Teknium
31e7276474
fix(gateway): consolidate per-session cleanup; close SessionDB on shutdown (#11800)
Three closely-related fixes for shutdown / lifecycle hygiene.

1. _release_running_agent_state(session_key) helper
   ----------------------------------------------------
   Per-running-agent state lived in three dicts that drifted out of sync
   across cleanup sites:
     self._running_agents       — AIAgent per session_key
     self._running_agents_ts    — start timestamp per session_key
     self._busy_ack_ts          — last busy-ack timestamp per session_key

   Inventory before this PR:
     8 sites: del self._running_agents[key]
       — only 1 (stale-eviction) cleaned all three
       — 1 cleaned _running_agents + _running_agents_ts only
       — 6 cleaned _running_agents only

   Each missed entry was a (str, float) tuple per session per gateway
   lifetime — small, persistent, accumulates across thousands of
   sessions over months.  Per-platform leaks compounded.

   This change adds a single helper that pops all three dicts in
   lockstep, and replaces every bare 'del self._running_agents[key]'
   site with it.  Per-session state that PERSISTS across turns
   (_session_model_overrides, _voice_mode, _pending_approvals,
   _update_prompt_pending) is intentionally NOT touched here — those
   have their own lifecycles tied to user actions, not turn boundaries.

2. _running_agents_ts cleared in _stop_impl
   ----------------------------------------
   Was being missed alongside _running_agents.clear(); now included.

3. SessionDB close() in _stop_impl
   ---------------------------------
   The SQLite WAL write lock stayed held by the old gateway connection
   until Python actually exited — causing 'database is locked' errors
   when --replace launched a new gateway against the same file.  We
   now explicitly close both self._db and self.session_store._db
   inside _stop_impl, with try/except so a flaky close on one doesn't
   block the other.

Tests
-----
tests/gateway/test_session_state_cleanup.py — 10 cases covering:
  * helper pops all three dicts atomically
  * idempotent on missing/empty keys
  * preserves other sessions
  * tolerates older runners without _busy_ack_ts attribute
  * thread-safe under concurrent release
  * regression guard: scans gateway/run.py and fails if a future
    contributor reintroduces 'del self._running_agents[...]'
    outside docstrings
  * SessionDB close called on both holders during shutdown
  * shutdown tolerates missing session_store
  * shutdown tolerates close() raising on one db (other still closes)

Broader gateway suite: 3108 passed (vs 3100 on baseline) — failure
delta is +8 net passes; the 10 remaining failures are pre-existing
cross-test pollution / missing optional deps (matrix needs olm,
signal/telegram approval flake, dingtalk Mock wiring), all reproduce
on stashed baseline.
2026-04-17 15:18:23 -07:00
Teknium
036dacf659
feat(telegram): auto-wrap markdown tables in code blocks (#11794)
Telegram's MarkdownV2 has no table syntax — pipes get backslash-escaped
and tables render as noisy unaligned text.  format_message now detects
GFM-style pipe tables (header row + delimiter row + optional body) and
wraps them in ``` fences before the existing MarkdownV2 conversion runs.
Telegram renders fenced code blocks as monospace preformatted text with
columns intact.

Tables already inside an existing code block are left alone.  Plain
prose with pipes, lone '---' horizontal rules, and non-table content
are unaffected.

Closes the recurring community request to stop having to ask the agent
to re-render tables as code blocks manually.
2026-04-17 14:27:26 -07:00
Teknium
3207b9bda0
test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)
Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:

## 1. Retry backoff mocks

- tests/run_agent/conftest.py (NEW): autouse fixture mocks
  jittered_backoff to 0.0 so the `while time.time() < sleep_end`
  busy-loop exits immediately. No global time.sleep mock (would
  break threading tests).
- test_anthropic_error_handling, test_413_compression,
  test_run_agent_codex_responses, test_fallback_model: per-file
  fixtures mock time.sleep / asyncio.sleep for retry / compression
  paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
  to 0.05s via a per-test shim (background writer-thread retries
  sleep 2s after errors; tests don't care about exact duration).
  Plus replace arbitrary time.sleep(N) waits with short polling
  loops bounded by deadline.

## 2. Subprocess sleeps in production code

- test_update_gateway_restart: mock time.sleep. Production code
  does time.sleep(3) after `systemctl restart` to verify the
  service survived. Tests mock subprocess.run \u2014 nothing actually
  restarts \u2014 so the wait is dead time.

## 3. Network / IMDS timeouts (biggest single win)

- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
  AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
  to IMDS (169.254.169.254) when no AWS creds are set. Any test
  hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
  test_status, test_setup_copilot_acp, anything that touches
  provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
  resolve_runtime_provider which was doing real network auto-detect
  (~4s). Tests don't care about provider resolution \u2014 the agent
  is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
  into 2 tests by checking both injection AND no-leak in the same
  subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).

## Validation

| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |

No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.

Supersedes PR #11779 (those changes are included here).
2026-04-17 14:21:22 -07:00
Teknium
eb07c05646
fix(gateway): prune stale SessionStore entries to bound memory + disk (#11789)
SessionStore._entries grew unbounded.  Every unique
(platform, chat_id, thread_id, user_id) tuple ever seen was kept in
RAM and rewritten to sessions.json on every message.  A Discord bot
in 100 servers x 100 channels x ~100 rotating users accumulates on
the order of 10^5 entries after a few months; each sessions.json
write becomes an O(n) fsync.  Nothing trimmed this — there was no
TTL, no cap, no eviction path.

Changes
-------
* SessionStore.prune_old_entries(max_age_days) — drops entries whose
  updated_at is older than the cutoff.  Preserves:
    - suspended entries (user paused them via /stop for later resume)
    - entries with an active background process attached
  Pruning is functionally identical to a natural reset-policy expiry:
  SQLite transcript stays, session_key -> session_id mapping dropped,
  returning user gets a fresh session.

* GatewayConfig.session_store_max_age_days (default 90; 0 disables).
  Serialized in to_dict/from_dict, coerced from bad types / negatives
  to safe defaults.  No migration needed — missing field -> 90 days.

* _session_expiry_watcher calls prune_old_entries once per hour
  (first tick is immediate).  Uses the existing watcher loop so no
  new background task is created.

Why not more aggressive
-----------------------
90 days is long enough that legitimate long-idle users (seasonal,
vacation, etc.) aren't surprised — pruning just means they get a
fresh session on return, same outcome they'd get from any other
reset-policy trigger.  Admins can lower it via config; 0 disables.

Tests
-----
tests/gateway/test_session_store_prune.py — 17 cases covering:
  * entry age based on updated_at, not created_at
  * max_age_days=0 disables; negative coerces to 0
  * suspended + active-process entries are skipped
  * _save fires iff something was removed
  * disk JSON reflects post-prune state
  * thread safety against concurrent readers
  * config field roundtrips + graceful fallback on bad values
  * watcher gate logic (first tick prunes, subsequent within 1h don't)

119 broader session/gateway tests remain green.
2026-04-17 13:48:49 -07:00
Teknium
f362083c64 fix(providers): complete NVIDIA NIM parity with other providers
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).

Gaps closed:

- hermes_cli/main.py:
  - Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
    selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
    provider flow (previously fell through silently).
  - Add 'nvidia' to `hermes chat --provider` argparse choices so the
    documented test command (`hermes chat --provider nvidia --model ...`)
    parses successfully.

- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
  OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
  auto-added to the subprocess env blocklist.

- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
  `hermes doctor` probes https://integrate.api.nvidia.com/v1/models.

- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
  `hermes dump` credential masking.

- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
  with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.

- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
  so all Nemotron variants get 128K context via substring match (rather
  than falling back to MINIMUM_CONTEXT_LENGTH).

- hermes_cli/models.py: Fix hallucinated model ID
  'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
  (verified against live integrate.api.nvidia.com/v1/models catalog).
  Expand curated list from 5 to 9 agentic models mapping to OpenRouter
  defaults per provider-guide convention: add qwen3.5-397b-a17b,
  deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.

- cli-config.yaml.example: Document 'nvidia' provider option.

- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
  for CI attribution.

E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
2026-04-17 13:47:46 -07:00
asurla
3b569ff576 feat(providers): add native NVIDIA NIM provider
Adds NVIDIA NIM as a first-class provider: ProviderConfig in
auth.py, HermesOverlay in providers.py, curated models
(Nemotron plus other open source models hosted on
build.nvidia.com), URL mapping in model_metadata.py, aliases
(nim, nvidia-nim, build-nvidia, nemotron), and env var tests.

Docs updated: providers page, quickstart table, fallback
providers table, and README provider list.
2026-04-17 13:47:46 -07:00
Brooklyn Nicholson
bd09e42eac Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:44:57 -05:00
Teknium
cc3aa76675
build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)
#4b1567f4 (anthhub) added qrcode to the messaging extra for Weixin's
QR login. The same package is needed by:

  * hermes_cli/dingtalk_auth.py — QR device-flow auth shipped in #11574
  * gateway/platforms/feishu.py:3962 — Feishu QR login

These extras are independent of [messaging] (users can install
hermes-agent[dingtalk] or hermes-agent[feishu] without [messaging]),
so the dep needs to be declared on each.

Pin matches anthhub's choice (>=7.0,<8) for consistency. The all
extra inherits from all three, so it picks up qrcode transitively.

Adds parallel tests to tests/test_project_metadata.py — same shape
as test_messaging_extra_includes_qrcode_for_weixin_setup.

Refs #9431.
2026-04-17 13:31:53 -07:00
Teknium
2ff1ef6ae6
fix(surrogates): sanitize reasoning/reasoning_content/reasoning_details fields (#11628)
Byte-level reasoning models (xiaomi/mimo-v2-pro, kimi, glm) can emit lone
surrogates in reasoning output. The proactive sanitizer walked content/
name/tool_calls but not extra fields like reasoning or the nested
reasoning_details array. Surrogates in those fields survived the
proactive pass, crashed json.dumps() in the OpenAI SDK, and the recovery
block's _sanitize_messages_surrogates(messages) call also didn't check
those fields — so 'found' was False, no retry happened, and after 3
attempts the user saw:

  API call failed after 3 retries. 'utf-8' codec can't encode characters
  in position N-M: surrogates not allowed

Changes:
- _sanitize_messages_surrogates: walk any extra string fields (reasoning,
  reasoning_content, etc.) and recurse into nested dict/list values
  (reasoning_details). Mirrors _sanitize_messages_non_ascii coverage
  added in PR #10537.
- _sanitize_structure_surrogates: new recursive walker, mirror of
  _sanitize_structure_non_ascii but for surrogate recovery.
- UnicodeEncodeError recovery block: also sanitize api_messages,
  api_kwargs, and prefill_messages (not just the canonical messages
  list — the API-copy carries reasoning_content transformed from
  reasoning and that's what the SDK actually serializes). Always
  retry on detected surrogate errors, not only when we found
  something to strip — gate on error type per PR #10537's pattern.

Tests: extended tests/cli/test_surrogate_sanitization.py with
coverage for reasoning, reasoning_content, reasoning_details (flat
and deeply nested), structure walker, and an integration case that
reproduces the exact api_messages shape that was crashing.
2026-04-17 13:30:47 -07:00
Henkey
cb883f9e97 fix(acp): improve zed integration 2026-04-17 13:29:26 -07:00
Brooklyn Nicholson
d5b9db8b4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:13:36 -05:00
Teknium
d0e1388ca9
fix(tests): make AIAgent constructor calls self-contained (#11755)
* fix(tests): make AIAgent constructor calls self-contained (no env leakage)

Tests in tests/run_agent/ were constructing AIAgent() without passing
both api_key and base_url, then relying on leaked state from other
tests in the same xdist worker (or process-level env vars) to keep
provider resolution happy. Under hermetic conftest + pytest-split,
that state is gone and the tests fail with 'No LLM provider configured'.

Fix: pass both api_key and base_url explicitly on 47 AIAgent()
construction sites across 13 files. AIAgent.__init__ with both set
takes the direct-construction path (line 960 in run_agent.py) and
skips the resolver entirely.

One call site (test_none_base_url_passed_as_none) left alone — that
test asserts behavior for base_url=None specifically.

This is a prerequisite for any future matrix-split or stricter
isolation work, and lands cleanly on its own.

Validation:
- tests/run_agent/ full: 760 passed, 0 failed (local)
- Previously relied on cross-test pollution; now self-contained

* fix(tests): update opencode-go model order assertion to match kimi-k2.5-first

commit 78a74bb promoted kimi-k2.5 to first position in model suggestion
lists but didn't update this test, which has been failing on main since.
Reorder expected list to match the new canonical order.
2026-04-17 12:32:03 -07:00
Brooklyn Nicholson
0dd5055d59 fix(tui): first-run setup preflight + actionable no-provider panel
- tui_gateway: new `setup.status` RPC that reuses CLI's
  `_has_any_provider_configured()`, so the TUI can ask the same question
  the CLI bootstrap asks before launching a session
- useSessionLifecycle: preflight `setup.status` before both `newSession`
  and `resumeById`, and render a clear "Setup Required" panel when no
  provider is configured instead of booting a session that immediately
  fails with `agent init failed`
- createGatewayEventHandler: drop duplicate startup resume logic in
  favor of the preflighted `resumeById`, and special-case the
  no-provider agent-init error as a last-mile fallback to the same
  setup panel
- add regression tests for both paths
2026-04-17 10:58:01 -05:00
Brooklyn Nicholson
5b386ced71 fix(tui): approval flow + input ergonomics + selection perf
- tui_gateway: route approvals through gateway callback (HERMES_GATEWAY_SESSION/
  HERMES_EXEC_ASK) so dangerous commands emit approval.request instead of
  silently falling through the CLI input() path and auto-denying
- approval UX: dedicated PromptZone between transcript and composer, safer
  defaults (sel=0, numeric quick-picks, no Esc=deny), activity trail line,
  outcome footer under the cost row
- text input: Ctrl+A select-all, real forward Delete, Ctrl+W always consumed
  (fixes Ctrl+Backspace at cursor 0 inserting literal w)
- hermes-ink selection: swap synchronous onRender() for throttled
  scheduleRender() on drag, and only notify React subscribers on presence
  change — no more per-cell paint/subscribe spam
- useConfigSync: silence config.get polling failures instead of surfacing
  'error: timeout: config.get' in the transcript
2026-04-17 10:37:48 -05:00
Brooklyn Nicholson
1f37ef2fd1 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
Young Sherlock
8dcd08d8bb Fix Weixin media uploads and refresh lockfile 2026-04-17 06:50:36 -07:00
anthhub
4b1567f425 fix(packaging): include qrcode in messaging extra 2026-04-17 06:50:36 -07:00
Teknium
3f3d8a7b24 fix(discord): strip mention syntax from auto-thread names
Previously a message like `<@&1490963422786093149> help` would spawn a
thread literally named `<@&1490963422786093149> help`, exposing raw
Discord mention markers in the thread list. Only user mentions
(`<@id>`) were being stripped upstream — role mentions (`<@&id>`) and
channel mentions (`<#id>`) leaked through.

Fix: strip all three mention patterns in `_auto_create_thread` before
building the thread name. Collapse runs of whitespace left by the
removal. If the entire content was mention-only, fall back to 'Hermes'
instead of an empty title.

Fixes #6336.

Tests: two new regression guards in test_discord_slash_commands.py
covering mixed-mention content and mention-only content.
2026-04-17 06:46:52 -07:00
sgaofen
32a694ad5f fix(discord): fall back when auto-thread creation fails 2026-04-17 06:46:52 -07:00
OwenYWT
f5dc4e905d fix(discord): skip auto-threading reply messages 2026-04-17 06:46:52 -07:00
Matteo De Agazio
93fe4b357d fix(discord): free-response channels skip auto-threading
Free-response channels already bypassed the @mention gate so users could
chat inline with the bot, but auto-threading still fired on every
message — spinning off a thread per message and defeating the
lightweight-chat purpose.

Fix: fold `is_free_channel` into `skip_thread` so threading is skipped
whenever the channel is in DISCORD_FREE_RESPONSE_CHANNELS (via env or
discord.free_response_channels in config.yaml).

Net change: one line in _handle_message + one regression test.

Partially addresses #9399. Authored by @Hypn0sis (salvaged from PR #9650;
the bundled 'smart' auto-thread mode from that PR was dropped in favor
of deterministic true/false semantics).
2026-04-17 06:46:52 -07:00
Teknium
8d7b7feb0d
fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction (#11565)
* fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction

The per-session AIAgent cache was unbounded. Each cached AIAgent holds
LLM clients, tool schemas, memory providers, and a conversation buffer.
In a long-lived gateway serving many chats/threads, cached agents
accumulated indefinitely — entries were only evicted on /new, /model,
or session reset.

Changes:
- Cache is now an OrderedDict so we can pop least-recently-used entries.
- _enforce_agent_cache_cap() pops entries beyond _AGENT_CACHE_MAX_SIZE=64
  when a new agent is inserted. LRU order is refreshed via move_to_end()
  on cache hits.
- _sweep_idle_cached_agents() evicts entries whose AIAgent has been idle
  longer than _AGENT_CACHE_IDLE_TTL_SECS=3600s. Runs from the existing
  _session_expiry_watcher so no new background task is created.
- The expiry watcher now also pops the cache entry after calling
  _cleanup_agent_resources on a flushed session — previously the agent
  was shut down but its reference stayed in the cache dict.
- Evicted agents have _cleanup_agent_resources() called on a daemon
  thread so the cache lock isn't held during slow teardown.

Both tuning constants live at module scope so tests can monkeypatch
them without touching class state.

Tests: 7 new cases in test_agent_cache.py covering LRU eviction,
move_to_end refresh, cleanup thread dispatch, idle TTL sweep,
defensive handling of agents without _last_activity_ts, and plain-dict
test fixture tolerance.

* tweak: bump _AGENT_CACHE_MAX_SIZE 64 -> 128

* fix(gateway): never evict mid-turn agents; live spillover tests

The prior commit could tear down an active agent if its session_key
happened to be LRU when the cap was exceeded.  AIAgent.close() kills
process_registry entries for the task, tears down the terminal
sandbox, closes the OpenAI client (sets self.client = None), and
cascades .close() into any active child subagents — all fatal if
the agent is still processing a turn.

Changes:
- _enforce_agent_cache_cap and _sweep_idle_cached_agents now look at
  GatewayRunner._running_agents and skip any entry whose AIAgent
  instance is present (identity via id(), so MagicMock doesn't
  confuse lookup in tests).  _AGENT_PENDING_SENTINEL is treated
  as 'not active' since no real agent exists yet.
- Eviction only considers the LRU-excess window (first size-cap
  entries).  If an excess slot is held by a mid-turn agent, we skip
  it WITHOUT compensating by evicting a newer entry.  A freshly
  inserted session (zero cache history) shouldn't be punished to
  protect a long-lived one that happens to be busy.
- Cache may therefore stay transiently over cap when load spikes;
  a WARNING is logged so operators can see it, and the next insert
  re-runs the check after some turns have finished.

New tests (TestAgentCacheActiveSafety + TestAgentCacheSpilloverLive):
- Active LRU entry is skipped; no newer entry compensated
- Mixed active/idle excess window: only idle slots go
- All-active cache: no eviction, WARNING logged, all clients intact
- _AGENT_PENDING_SENTINEL doesn't block other evictions
- Idle-TTL sweep skips active agents
- End-to-end: active agent's .client survives eviction attempt
- Live fill-to-cap with real AIAgents, then spillover
- Live: CAP=4 all active + 1 newcomer — cache grows to 5, no teardown
- Live: 8 threads racing 160 inserts into CAP=16 — settles at 16
- Live: evicted session's next turn gets a fresh agent that works

30 tests pass (13 pre-existing + 17 new).  Related gateway suites
(model switch, session reset, proxy, etc.) all green.

* fix(gateway): cache eviction preserves per-task state for session resume

The prior commits called AIAgent.close() on cache-evicted agents, which
tears down process_registry entries, terminal sandbox, and browser
daemon for that task_id — permanently. Fine for session-expiry (session
ended), wrong for cache eviction (session may resume).

Real-world scenario: a user leaves a Telegram session open for 2+ hours,
idle TTL evicts the cached AIAgent, user returns and sends a message.
Conversation history is preserved via SessionStore, but their terminal
sandbox (cwd, env vars, bg shells) and browser state were destroyed.

Fix: split the two cleanup modes.

  close()               Full teardown — session ended. Kills bg procs,
                        tears down terminal sandbox + browser daemon,
                        closes LLM client. Used by session-expiry,
                        /new, /reset (unchanged).

  release_clients()     Soft cleanup — session may resume. Closes
                        LLM client only. Leaves process_registry,
                        terminal sandbox, browser daemon intact
                        for the resuming agent to inherit via
                        shared task_id.

Gateway cache eviction (_enforce_agent_cache_cap, _sweep_idle_cached_agents)
now dispatches _release_evicted_agent_soft on the daemon thread instead
of _cleanup_agent_resources. All session-expiry call sites of
_cleanup_agent_resources are unchanged.

Tests (TestAgentCacheIdleResume, 5 new cases):
- release_clients does NOT call process_registry.kill_all
- release_clients does NOT call cleanup_vm / cleanup_browser
- release_clients DOES close the LLM client (agent.client is None after)
- close() vs release_clients() — semantic contract pinned
- Idle-evicted session's rebuild with same session_id gets same task_id

Updated test_cap_triggers_cleanup_thread to assert the soft path fires
and the hard path does NOT.

35 tests pass in test_agent_cache.py; 67 related tests green.
2026-04-17 06:36:34 -07:00
Jorge
86f02d8d71 refactor(cli): align model picker viewport with PR #11260 vocabulary
Match the row-budget naming introduced in PR #11260 for the approval and
clarify panels: rename chrome_reserve=14 into reserved_below=6 (input
chrome below the panel) + panel_chrome=6 (this panel's borders, blanks,
and hint row) + min_visible=3 (floor on visible items). Same arithmetic
as before, but a reviewer reading both files now sees the same handle.

Compact-chrome mode is intentionally not adopted — that pattern fits the
"fixed mandatory content might overflow" shape of approval/clarify
(solved by truncating with a marker), whereas the picker's overflow is
already handled by the scrolling viewport.
2026-04-17 06:33:21 -07:00
Jorge
5fbe16635b fix(cli): scroll the /model picker viewport so long catalogs aren't clipped
The /model picker rendered every choice into a prompt_toolkit Window
with no max height. Providers with many models (e.g. Ollama Cloud's 36+)
overflowed the terminal, clipping the bottom border and the last items.

- Add HermesCLI._compute_model_picker_viewport() to slide a scroll
  offset that keeps the cursor on screen, sized from the live terminal
  rows minus chrome reserved for input/status/border.
- Render only the visible slice in _get_model_picker_display() and
  persist the offset on _model_picker_state across redraws.
- Bind ESC (eager) to close the picker, matching the Cancel button.
- Cover the viewport math with 8 unit tests in
  tests/hermes_cli/test_model_picker_viewport.py.
2026-04-17 06:33:21 -07:00
Teknium
f64241ed90 feat(cron+tests): extend origin fallback to email/dingtalk/qqbot + fix Weixin test mocks
Cron origin fallback extension (builds on #9193's _HOME_TARGET_ENV_VARS):
adds the three remaining origin-fallback-eligible platforms that have
home channel env vars configured in gateway/config.py but use non-generic
env var names:

- email    → EMAIL_HOME_ADDRESS   (non-standard suffix)
- dingtalk → DINGTALK_HOME_CHANNEL
- qqbot    → QQ_HOME_CHANNEL      (non-standard prefix: QQ_ not QQBOT_)

Picks up the completeness intent of @Xowiek's PR #11317 using the
architecturally-correct dict-based lookup from #9193, so platforms with
non-standard env var names actually resolve instead of silently missing.
Extended the parametrized regression test to cover the new three.

Weixin test mock alignment (builds on #10091's _send_session split):
Three test sites added in Batch 1 (TestWeixinSendImageFileParameterName)
and Batch 3 (TestWeixinVoiceSending) mocked only adapter._session, but
#10091 switched the send paths to check self._send_session. Added the
companion setter so the tests stay green with the session split in place.
2026-04-17 06:26:43 -07:00
bde3249023
b46db048c3 fix(cron): align home target env lookup 2026-04-17 06:26:43 -07:00
bde3249023
f696b4745a fix(cron): restore origin fallback for feishu home channels 2026-04-17 06:26:43 -07:00
Ubuntu
5ca52bae5b fix(gateway/weixin): split poll/send sessions, reuse live adapter for cron & send_message
- gateway/platforms/weixin.py:
  - Split aiohttp.ClientSession into _poll_session and _send_session
  - Add _LIVE_ADAPTERS registry so send_weixin_direct() reuses the connected gateway adapter instead of creating a competing session
  - Fixes silent message loss when gateway is running (iLink token contention)

- cron/scheduler.py:
  - Support comma-separated deliver values (e.g. 'feishu,weixin') for multi-target delivery
  - Delay pconfig/enabled check until standalone fallback so live adapters work even when platform is not in gateway config

- tools/send_message_tool.py:
  - Synthesize PlatformConfig from WEIXIN_* env vars when gateway config lacks a weixin entry
  - Fall back to WEIXIN_HOME_CHANNEL env var for home channel resolution

- tests/gateway/test_weixin.py:
  - Update mocks to include _send_session
2026-04-17 06:26:43 -07:00
Teknium
c60b6dc317 test(dingtalk): cover get_connected_platforms + null platform_toolsets
Follow-ups to the salvaged commits in this PR:

* gateway/config.py — strip trailing whitespace from youngDoo's diff
  (line 315 had ~140 trailing spaces).

* hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})`
  with `config.get("platform_toolsets") or {}`. Handles the case where the
  YAML key is present but explicitly null (parses as None, previously
  crashed with AttributeError on the next line's .get(platform)).
  Cherry-picked from yyq4193's #9003 with attribution.

* tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms
  covering DingTalk via extras, via env vars, disabled, and missing creds.

* tests/hermes_cli/test_tools_config.py — regression test for the null
  platform_toolsets edge case.

* scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP.

Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>
2026-04-17 06:26:18 -07:00
kagura-agent
47a0dd1024 fix(dingtalk): fire-and-forget message processing & session_webhook fallback
Fixes #11463: DingTalk channel receives messages but fails to reply
with 'No session_webhook available'.

Two changes:

1. **Fire-and-forget message processing**: process() now dispatches
   _on_message as a background task via asyncio.create_task instead of
   awaiting it. This ensures the SDK ACK is returned immediately,
   preventing heartbeat timeouts and disconnections when message
   processing takes longer than the SDK's ACK deadline.

2. **session_webhook extraction fallback**: If ChatbotMessage.from_dict()
   fails to map the sessionWebhook field (possible across SDK versions),
   the handler now falls back to extracting it directly from the raw
   callback data dict using both 'sessionWebhook' and 'session_webhook'
   key variants.

Added 3 tests covering webhook extraction, fallback behavior, and
fire-and-forget ACK timing.
2026-04-17 06:26:18 -07:00
Teknium
d404849351
test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577)
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh

Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.

## Layer 1 — hermetic conftest.py (tests/conftest.py)

Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.

Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.

Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.

Also:
  - Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
    code reading ~/.hermes/* directly can't touch the real dir.
  - Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
    match CI's deterministic runtime.

The old _isolate_hermes_home fixture name is preserved as an alias so
any test that yields it explicitly still works.

## Layer 2 — scripts/run_tests.sh canonical runner

'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
  - Unsets all credential env vars (belt-and-suspenders for callers
    who bypass conftest — e.g. IDE integrations)
  - Pins TZ/LANG/PYTHONHASHSEED
  - Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
    a 20-core workstation surfaces test-ordering flakes CI will never
    see, causing the infamous 'passes in CI, fails locally' drift)
  - Finds the venv in .venv, venv, or main checkout's venv
  - Passes through arbitrary pytest args

Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.

## Remove 3 module-level dotenv stubs that broke test isolation

tests/hermes_cli/test_{arcee,xiaomi,api_key}_provider.py each had a
module-level:

    if 'dotenv' not in sys.modules:
        fake_dotenv = types.ModuleType('dotenv')
        fake_dotenv.load_dotenv = lambda *a, **kw: None
        sys.modules['dotenv'] = fake_dotenv

This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.

dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.

## Validation

- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
  failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
  tests/test_hermes_logging.py combined: 123 passed (the caplog
  regression tests from PR #11453 still pass)
- Local full run shows no F/E clusters in the 0-55% range that were
  previously present before the conftest hardening

## Background

See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.

* fix(conftest): don't redirect HOME — it broke CI subprocesses

PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.

Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.

Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
2026-04-17 06:09:09 -07:00
Teknium
e5b880264b fix(discord): harden DISCORD_ALLOWED_ROLES and cover gateway layer
Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`):

1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())`
   so test fixtures that build the adapter via `object.__new__`
   (skipping __init__) don't crash with AttributeError.
   See AGENTS.md pitfall #17 — same pattern as gateway.run.

2. New 3-case regression coverage in test_discord_bot_auth_bypass.py:
   - role-only config bypasses the gateway 'no allowlists' branch
   - roles + users combined still authorizes user-allowlist matches
   - the role bypass does NOT leak to other platforms (Telegram, etc.)

3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord
   auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from
   a previous test in the session can't flip later 'should-reject' tests
   into false-pass.

Required because the bare cherry-pick of #9873 only added the adapter-
level role check — it didn't cover the gateway-level _is_user_authorized,
which still rejected role-only setups via the 'no allowlists configured'
branch.
2026-04-17 05:48:26 -07:00
Teknium
7d888ab49c test(discord): regression guard for DISCORD_ALLOW_BOTS auth bypass
Six test cases covering:
- DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security)
- DISCORD_ALLOW_BOTS unset → same as 'none'
- Humans still checked against allowlist even with allow_bots=all
- Bot bypass is Discord-specific — doesn't leak to other platforms

Guards against a regression where the is_bot bypass in _is_user_authorized
gets moved, removed, or accidentally extended to other platforms.
2026-04-17 05:42:04 -07:00
Teknium
d7fb435e0e
fix(discord): flat /skill command with autocomplete — fits 8KB limit trivially (#11580)
Closes #11321, closes #10259.

## Problem

The nested /skill command group (category subcommand groups + skill
subcommands) serialized to ~14KB with the default 75-skill catalog,
exceeding Discord's ~8000-byte per-command registration payload. The
entire tree.sync() rejected with error 50035 — ALL slash commands
including the 27 base commands failed to register.

## Fix

Replace the nested Group layout with a single flat Command:

    /skill name:<autocomplete> args:<optional string>

Autocomplete options are fetched dynamically by Discord when the user
types — they do NOT count against the per-command registration budget.
So this single command registers at ~200 bytes regardless of how many
skills exist. Scales to thousands of skills with no size calculations,
no splitting, no hidden skills.

UX improvements:
- Discord live-filters by user's typed prefix against BOTH name and
  description, so '/skill pdf' finds 'ocr-and-documents' via its
  description. More discoverable than clicking through category menus.
- Unknown skill name → ephemeral error pointing user at autocomplete.
- Stable alphabetical ordering across restarts.

## Why not the other proposed approaches

Three prior PRs tried to fit within the 8KB limit by modifying the
nested layout:

- #10214 (njiangk): truncated all descriptions to 'Run <name>' and
  category descriptions to 'Skills'. Works but destroys slash picker UX.
- #11385 (LeonSGP43): 40-char description clamp + iterative
  trim-largest-category fallback. Works but HIDES skills the user can
  no longer invoke via slash — functional regression.
- #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups.
  Preserves all skills but pollutes the slash namespace with 20
  top-level commands.

All three work around the symptom. The flat autocomplete design
dissolves the problem — there is no payload-size pressure to manage.

## Tests

tests/gateway/test_discord_slash_commands.py — 5 new test cases replace
the 3 old nested-structure tests:

- flat-not-nested structure assertion
- empty skills → no command registered
- callback dispatches the right cmd_key by name
- unknown name → ephemeral error, no dispatch
- large-catalog regression guard (500 skills) — command payload stays
  under 500 bytes regardless

E2E validated against real discord.py 2.7.1:
- Command registers as discord.app_commands.Command (not Group).
- Autocomplete filters by name AND description (verified across several
  queries including description-only matches like 'pdf' → OCR skill).
- 500-skill catalog returns max 25 results per autocomplete query
  (Discord's hard cap), filtered correctly.
- Choice labels formatted as 'name — description' clamped to 100 chars.
2026-04-17 05:19:14 -07:00
Teknium
13f2d997b0 test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
  * _api_post — network error mapping, errcode-nonzero mapping, success path
  * begin_registration — 2-step chain, missing-nonce/device_code/uri
    error cases
  * wait_for_registration_success — success path, missing-creds guard,
    on_waiting callback invocation
  * render_qr_to_terminal — returns False when qrcode missing, prints
    when available
  * Configuration — BASE_URL default + override, SOURCE default

Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.

Also adds meng93 to scripts/release.py AUTHOR_MAP.
2026-04-17 05:08:07 -07:00
Berny Linville
6ee65b4d61 fix(weixin): preserve native markdown rendering
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering

Closes #10308
2026-04-17 05:01:29 -07:00
Patrick Wang
4ed6e4c1a5 refactor(weixin): drop pilk dependency from voice fallback 2026-04-17 05:01:29 -07:00
Patrick Wang
649f38390c fix: force Weixin voice fallback to file attachments 2026-04-17 05:01:29 -07:00
Patrick Wang
678b69ec1b fix(weixin): use Tencent SILK encoding for voice replies 2026-04-17 05:01:29 -07:00
Teknium
53da34a4fc
fix(discord): route attachment downloads through authenticated bot session (#11568)
Three open issues — #8242, #6587, #11345 — all trace to the same root
cause: the image / audio / document download paths in
`DiscordAdapter._handle_message` used plain, unauthenticated HTTP to
fetch `att.url`. That broke in three independent ways:

  #8242  cdn.discordapp.com attachment URLs increasingly require the
         bot session to download; unauthenticated httpx sees 403
         Forbidden, image/voice analysis fail silently.

  #6587  Some user environments (VPNs, corporate DNS, tunnels) resolve
         cdn.discordapp.com to private-looking IPs. Our is_safe_url()
         guard correctly blocks them as SSRF risks, but the user
         environment is legitimate — image analysis and voice STT die.

  #11345 The document download path skipped is_safe_url() entirely —
         raw aiohttp.ClientSession.get(att.url) with no SSRF check,
         inconsistent with the image/audio branches.

Unified fix: use `discord.Attachment.read()` as the primary download
path on all three branches. `att.read()` routes through discord.py's
own authenticated HTTPClient, so:

  - Discord CDN auth is handled (#8242 resolved).
  - Our is_safe_url() gate isn't consulted for the attachment path at
    all — the bot session handles networking internally (#6587 resolved).
  - All three branches now share the same code path, eliminating the
    document-path SSRF gap (#11345 resolved).

Falls back to the existing cache_*_from_url helpers (image/audio) or an
SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable
or fails — preserves defense-in-depth for any future payload-schema
drift that could slip a non-CDN URL into att.url.

New helpers on DiscordAdapter:
  - _read_attachment_bytes(att)  — safe att.read() wrapper
  - _cache_discord_image(att, ext)     — primary + URL fallback
  - _cache_discord_audio(att, ext)     — primary + URL fallback
  - _cache_discord_document(att, ext)  — primary + SSRF-gated aiohttp fallback

Tests:
  - tests/gateway/test_discord_attachment_download.py — 12 new cases
    covering all three helpers: primary path, fallback on missing
    .read(), fallback on validator rejection, SSRF guard on document
    fallback, aiohttp fallback happy-path, and an E2E case via
    _handle_message confirming cache_image_from_url is never invoked
    when att.read() succeeds.
  - All 11 existing document-handling tests continue to pass via the
    aiohttp fallback path (their SimpleNamespace attachments have no
    .read(), which triggers the fallback — now SSRF-gated).

Closes #8242, closes #6587, closes #11345.
2026-04-17 04:59:03 -07:00
LehaoLin
504e7eb9e5 fix(gateway): wait for reconnection before dropping WebSocket sends
When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily
loses its connection, send() now polls is_connected for up to 15s
instead of immediately returning a non-retryable failure. If the
auto-reconnect completes within the window, the message is delivered
normally. On timeout, the SendResult is marked retryable=True so the
base class retry mechanism can attempt re-delivery.

Same treatment applied to _send_media().

Adds 4 async tests covering:
- Successful send after simulated reconnection
- Retryable failure on timeout
- Immediate success when already connected
- _send_media reconnection wait

Fixes #11163
2026-04-17 04:22:40 -07:00
dieutx
995177d542 fix(gateway): honor QQ_GROUP_ALLOWED_USERS in runner auth 2026-04-17 04:22:40 -07:00
Pedro Gonzalez
590c9964e1 Fix QQ voice attachment SSRF validation 2026-04-17 04:22:40 -07:00
yeyitech
a97b08e30c fix: allow trusted QQ CDN benchmark IP resolution 2026-04-17 04:22:40 -07:00
Teknium
aca81ac7bb test(dingtalk): cover require_mention + allowed_users gating
Adds 16 regression tests for the gating logic introduced in the
salvaged commit:

  * TestAllowedUsersGate — empty/wildcard/case-insensitive matching,
    staff_id vs sender_id, env var CSV population
  * TestMentionPatterns — compilation, case-insensitivity, invalid
    regex is skipped-not-raised, JSON env var, newline fallback
  * TestShouldProcessMessage — DM always accepted, group gating via
    require_mention / is_in_at_list / wake-word pattern / free_response_chats

Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks
unmapped emails).
2026-04-17 04:21:49 -07:00