Commit graph

122 commits

Author SHA1 Message Date
Teknium
2aa983e2f2
feat(gateway): recognize .pdf in MEDIA: tag extraction (#13683)
PDFs emitted by tools (report generators, document exporters, etc.) now
deliver as native attachments when wrapped in MEDIA: — same as images,
audio, and video.

Bare .pdf paths are intentionally NOT added to extract_local_files(), so
the agent can still reference PDFs in text without auto-sending them.
2026-04-21 13:48:10 -07:00
unlinearity
155b619867 fix(agent): normalize socks:// env proxies for httpx/anthropic
WSL2 / Clash-style setups often export ALL_PROXY=socks://127.0.0.1:PORT. httpx and the Anthropic SDK reject that alias and expect socks5://, so agent startup failed early with "Unknown scheme for proxy URL" before any provider request could proceed.

Add shared normalize_proxy_url()/normalize_proxy_env_vars() helpers in utils.py and route all proxy entry points through them:
  - run_agent._get_proxy_from_env
  - agent.auxiliary_client._validate_proxy_env_urls
  - agent.anthropic_adapter.build_anthropic_client
  - gateway.platforms.base.resolve_proxy_url

Regression coverage:
  - run_agent proxy env resolution
  - auxiliary proxy env normalization
  - gateway proxy URL resolution

Verified with:
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 /home/nonlinear/.hermes/hermes-agent/venv/bin/pytest -o addopts='' -p pytest_asyncio.plugin tests/run_agent/test_create_openai_client_proxy_env.py tests/agent/test_proxy_and_url_validation.py tests/gateway/test_proxy_mode.py

39 passed.
2026-04-21 05:52:46 -07:00
alt-glitch
1010e5fa3c refactor: remove redundant local imports already available at module level
Sweep ~74 redundant local imports across 21 files where the same module
was already imported at the top level. Also includes type fixes and lint
cleanups on the same branch.
2026-04-21 00:50:58 -07:00
JP Lew
9fdfb09aed fix(telegram): cache inbound videos and accept mp4 uploads 2026-04-20 05:10:23 -07:00
kshitijk4poor
4b6ff0eb7f fix: tighten gateway interrupt salvage follow-ups
Follow-up on top of the helix4u #12388 cherry-picks:
- make deferred post-delivery callbacks generation-aware end-to-end so
  stale runs cannot clear callbacks registered by a fresher run for the
  same session
- bind callback ownership to the active session event at run start and
  snapshot that generation inside base adapter processing so later event
  mutation cannot retarget cleanup
- pass run_generation through proxy mode and drop stale proxy streams /
  final results the same way local runs are dropped
- centralize stop/new interrupt cleanup into one helper and replace the
  open-coded branches with shared logic
- unify internal control interrupt reason strings via shared constants
- remove the return from base.py's finally block so cleanup no longer
  swallows cancellation/exception flow
- add focused regressions for generation forwarding, proxy stale
  suppression, and newer-callback preservation

This addresses all review findings from the initial #12388 review while
keeping the fix scoped to stale-output/typing-loop interrupt handling.
2026-04-19 03:03:57 -07:00
helix4u
8466268ca5 fix(gateway): keep typing loop overrides backward-compatible 2026-04-19 03:03:57 -07:00
helix4u
150382e8b7 fix(gateway): stop typing loops on session interrupt 2026-04-19 03:03:57 -07:00
Teknium
62ce6a38ae
fix(gateway): cancel_background_tasks must drain late-arrivals (#12471)
During gateway shutdown, a message arriving while
cancel_background_tasks is mid-await (inside asyncio.gather) spawns
a fresh _process_message_background task via handle_message and adds
it to self._background_tasks.  The original implementation's
_background_tasks.clear() at the end of cancel_background_tasks
dropped the reference; the task ran untracked against a disconnecting
adapter, logged send-failures, and lingered until it completed on
its own.

Fix: wrap the cancel+gather in a bounded loop (MAX_DRAIN_ROUNDS=5).
If new tasks appeared during the gather, cancel them in the next
round.  The .clear() at the end is preserved as a safety net for
any task that appeared after MAX_DRAIN_ROUNDS — but in practice the
drain stabilizes in 1-2 rounds.

Tests: tests/gateway/test_cancel_background_drain.py — 3 cases.
- test_cancel_background_tasks_drains_late_arrivals: spawn M1, start
  cancel, inject M2 during M1's shielded cleanup, verify M2 is
  cancelled.
- test_cancel_background_tasks_handles_no_tasks: no-op path still
  terminates cleanly.
- test_cancel_background_tasks_bounded_rounds: baseline — single
  task cancels in one round, loop terminates.

Regression-guard validated: against the unpatched implementation,
the late-arrival test fails with exactly the expected message
('task leaked').  With the fix it passes.

Blast radius is shutdown-only; the audit classified this as MED.
Shipping because the fix is small and the hygiene is worth it.

While investigating the audit's other MEDs (busy-handler double-ack,
Discord ExecApprovalView double-resolve, UpdatePromptView
double-resolve), I verified all three were false positives — the
check-and-set patterns have no await between them, so they're
atomic on single-threaded asyncio.  No fix needed for those.
2026-04-19 01:48:42 -07:00
Teknium
3a6351454b
fix(gateway): close pending-drain and late-arrival races in base adapter (#12371)
Two related race conditions in gateway/platforms/base.py that could
produce duplicate agent runs or silently drop messages. Neither is
specific to any one platform — all adapters inherit this logic.

R5 (HIGH) — duplicate agent spawn on turn chain
  In _process_message_background, the pending-drain path deleted
  _active_sessions[session_key] before awaiting typing_task.cancel()
  and then recursively awaiting _process_message_background for the
  queued event. During the typing_task await, a fresh inbound message
  M3 could pass the Level-1 guard (entry now missing), set its own
  Event, and spawn a second _process_message_background for the same
  session_key — two agents running simultaneously, duplicate responses,
  duplicate tool calls.

  Fix: keep the _active_sessions entry populated and only clear() the
  Event. The guard stays live, so any concurrent inbound message takes
  the busy-handler path (queue + interrupt) as intended.

R6 (MED-HIGH) — message dropped during finally cleanup
  The finally block has two await points (typing_task, stop_typing)
  before it deletes _active_sessions. A message arriving in that
  window passes the guard (entry still live), lands in
  _pending_messages via the busy-handler — and then the unconditional
  del removes the guard with that message still queued. Nothing
  drains it; the user never gets a reply.

  Fix: before deleting _active_sessions in finally, pop any late
  pending_messages entry and spawn a drain task for it. Only delete
  _active_sessions when no pending is waiting.

Tests: tests/gateway/test_pending_drain_race.py — three regression
cases. Validated: without the fix, two of the three fail exactly
where the races manifest (duplicate-spawn guard loses identity,
late-arrival 'LATE' message not in processed list).
2026-04-18 19:32:26 -07:00
Teknium
45acd9beb5
fix(gateway): ignore redelivered /restart after PTB offset ACK fails (#11940)
When a Telegram /restart fires and PTB's graceful-shutdown `get_updates`
ACK call times out ("When polling for updates is restarted, updates may
be received twice" in gateway.log), the new gateway receives the same
/restart again and restarts a second time — a self-perpetuating loop.

Record the triggering update_id in `.restart_last_processed.json` when
handling /restart.  On the next process, reject a /restart whose
update_id <= the recorded one as a stale redelivery.  5-minute staleness
guard so an orphaned marker can't block a legitimately new /restart.

- gateway/platforms/base.py: add `platform_update_id` to MessageEvent
- gateway/platforms/telegram.py: propagate `update.update_id` through
  _build_message_event for text/command/location/media handlers
- gateway/run.py: write dedup marker in _handle_restart_command;
  _is_stale_restart_redelivery checks it before processing /restart
- tests/gateway/test_restart_redelivery_dedup.py: 9 new tests covering
  fresh restart, redelivery, staleness window, cross-platform,
  malformed-marker resilience, and no-update_id (CLI) bypass

Only active for Telegram today (the one platform with monotonic
cross-session update ordering); other platforms return False from
_is_stale_restart_redelivery and proceed normally.
2026-04-17 21:17:33 -07:00
pedh
4459913f40 feat(dingtalk): AI Cards streaming, emoji reactions, and media handling
Cherry-picked from #10985 by pedh, adapted to current main:

* Keeps main's full group-chat gating (require_mention + allowed_users +
  free_response_chats + mention_patterns) — PR's simpler subset dropped.
* Keeps main's fire-and-forget process() dispatch + session_webhook
  fallback for SDK >= 0.24.
* Picks up PR's REQUIRES_EDIT_FINALIZE capability flag on
  BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through
  stream_consumer.  Default False so Telegram/Slack/Discord/Matrix stay
  on the zero-overhead fast path.
* DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow
  (tool-progress + final response) with sibling auto-close driven by
  reply_to, idempotent 🤔Thinking → 🥳Done swap, $alibabacloud-dingtalk$
  for media URL resolution (replaces raw HTTP that was 403-ing).
* pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 +
  alibabacloud-dingtalk>=2.0.0 + qrcode.

Closes #10991

Co-authored-by: pedh
2026-04-17 19:26:53 -07:00
Xowiek
511ed4dacc fix(gateway): bypass active-session guard for gateway-handled slash commands 2026-04-17 18:58:03 -07:00
Brooklyn Nicholson
1f37ef2fd1 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
gnanam1990
0f4403346d fix(discord): DISCORD_ALLOW_BOTS=mentions/all now works without DISCORD_ALLOWED_USERS
Fixes #4466.

Root cause: two sequential authorization gates both independently rejected
bot messages, making DISCORD_ALLOW_BOTS completely ineffective.

Gate 1 — `discord.py` `on_message`:
    _is_allowed_user ran BEFORE the bot filter, so bot senders were dropped
    before the DISCORD_ALLOW_BOTS policy was ever evaluated.

Gate 2 — `gateway/run.py` _is_user_authorized:
    The gateway-level allowlist check rejected bot IDs with 'Unauthorized
    user: <bot_id>' even if they passed Gate 1.

Fix:

  gateway/platforms/discord.py — reorder on_message so DISCORD_ALLOW_BOTS
  runs BEFORE _is_allowed_user. Bots permitted by the filter skip the
  user allowlist; non-bots are still checked.

  gateway/session.py — add is_bot: bool = False to SessionSource so the
  gateway layer can distinguish bot senders.

  gateway/platforms/base.py — expose is_bot parameter in build_source.

  gateway/platforms/discord.py _handle_message — set is_bot=True when
  building the SessionSource for bot authors.

  gateway/run.py _is_user_authorized — when source.is_bot is True AND
  DISCORD_ALLOW_BOTS is 'mentions' or 'all', return True early. Platform
  filter already validated the message at on_message; don't re-reject.

Behavior matrix:

  | Config                                     | Before  | After   |
  | DISCORD_ALLOW_BOTS=none (default)          | Blocked | Blocked |
  | DISCORD_ALLOW_BOTS=all                     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions + @mention     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions, no mention    | Blocked | Blocked |
  | Human in DISCORD_ALLOWED_USERS             | Allowed | Allowed |
  | Human NOT in DISCORD_ALLOWED_USERS         | Blocked | Blocked |

Co-authored-by: Hermes Maintainer <hermes@nousresearch.com>
2026-04-17 05:42:04 -07:00
Brooklyn Nicholson
7f1204840d test(tui): fix stale mocks + xdist flakes in TUI test suite
All 61 TUI-related tests green across 3 consecutive xdist runs.

tests/tui_gateway/test_protocol.py:
- rename `get_messages` → `get_messages_as_conversation` on mock DB (method
  was renamed in the real backend, test was still stubbing the old name)
- update tool-message shape expectation: `{role, name, context}` matches
  current `_history_to_messages` output, not the legacy `{role, text}`

tests/hermes_cli/test_tui_resume_flow.py:
- `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes
  setup" before `_launch_tui` was ever reached; 3 tests stubbed
  `_resolve_last_session` + `_launch_tui` but not the gate
- factored a `main_mod` fixture that stubs `_has_any_provider_configured`,
  reused by all three tests

tests/test_tui_gateway_server.py:
- `test_config_set_personality_resets_history_and_returns_info` was flaky
  under xdist because the real `_write_config_key` touches
  `~/.hermes/config.yaml`, racing with any other worker that writes
  config. Stub it in the test.
2026-04-16 19:07:49 -05:00
helix4u
5d7d574779 fix(gateway): let /queue bypass active-session guard 2026-04-16 16:36:40 -07:00
Brooklyn Nicholson
3746c60439 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 18:25:49 -05:00
Siddharth Balyan
d38b73fa57
fix(matrix): E2EE and migration bugfixes (#10860)
* - make buffered streaming
- fix path naming to expand `~` for agent.
- fix stripping of matrix ID to not remove other mentions / localports.

* fix(matrix): register MembershipEventDispatcher for invite auto-join

The mautrix migration (#7518) broke auto-join because InternalEventType.INVITE
events are only dispatched when MembershipEventDispatcher is registered on the
client. Without it, _on_invite is dead code and the bot silently ignores all
room invites.

Closes #10094
Closes #10725
Refs: PR #10135 (digging-airfare-4u), PR #10732 (fxfitz)

* fix(matrix): preserve _joined_rooms reference for CryptoStateStore

connect() reassigned self._joined_rooms = set(...) after initial sync,
orphaning the reference captured by _CryptoStateStore at init time.
find_shared_rooms() returned [] forever, breaking Megolm session rotation
on membership changes.

Mutate in place with clear() + update() so the CryptoStateStore reference
stays valid.

Refs #8174, PR #8215

* fix(matrix): remove dual ROOM_ENCRYPTED handler to fix dedup race

mautrix auto-registers DecryptionDispatcher when client.crypto is set.
The adapter also registered _on_encrypted_event for the same event type.
_on_encrypted_event had zero awaits and won the race to mark event IDs
in the dedup set, causing _on_room_message to drop successfully decrypted
events from DecryptionDispatcher. The retry loop masked this by re-decrypting
every message ~4 seconds later.

Remove _on_encrypted_event entirely. DecryptionDispatcher handles decryption;
genuinely undecryptable events are logged by mautrix and retried on next
key exchange.

Refs #8174, PR #8215

* fix(matrix): re-verify device keys after share_keys() upload

Matrix homeservers treat ed25519 identity keys as immutable per device.
share_keys() can return 200 but silently ignore new keys if the device
already exists with different identity keys. The bot would proceed with
shared=True while peers encrypt to the old (unreachable) keys.

Now re-queries the server after share_keys() and fails closed if keys
don't match, with an actionable error message.

Refs #8174, PR #8215

* fix(matrix): encrypt outbound attachments in E2EE rooms

_upload_and_send() uploaded raw bytes and used the 'url' key for all
rooms. In E2EE rooms, media must be encrypted client-side with
encrypt_attachment(), the ciphertext uploaded, and the 'file' key
(with key/iv/hashes) used instead of 'url'.

Now detects encrypted rooms via state_store.is_encrypted() and
branches to the encrypted upload path.

Refs: PR #9822 (charles-brooks)

* fix(matrix): add stop_typing to clear typing indicator after response

The adapter set a 30-second typing timeout but never cleared it.
The base class stop_typing() is a no-op, so the typing indicator
lingered for up to 30 seconds after each response.

Closes #6016
Refs: PR #6020 (r266-tech)

* fix(matrix): cache all media types locally, not just photos/voice

should_cache_locally only covered PHOTO, VOICE, and encrypted media.
Unencrypted audio/video/documents in plaintext rooms were passed as MXC
URLs that require authentication the agent doesn't have, resulting
in 401 errors.

Refs #3487, #3806

* fix(matrix): detect stale OTK conflict on startup and fail closed

When crypto state is wiped but the same device ID is reused, the
homeserver may still hold one-time keys signed with the previous
identity key. Identity key re-upload succeeds but OTK uploads fail
with "already exists" and a signature mismatch. Peers cannot
establish new Olm sessions, so all new messages are undecryptable.

Now proactively flushes OTKs via share_keys() during connect() and
catches the "already exists" error with an actionable log message
telling the operator to purge the device from the homeserver or
generate a fresh device ID.

Also documents the crypto store recovery procedure in the Matrix
setup guide.

Refs #8174

* docs(matrix): improve crypto recovery docs per review

- Put easy path (fresh access token) first, manual purge second
- URL-encode user ID in Synapse admin API example
- Note that device deletion may invalidate the access token
- Add "stop Synapse first" caveat for direct SQLite approach
- Mention the fail-closed startup detection behavior
- Add back-reference from upgrade section to OTK warning

* refactor(matrix): cleanup from code review

- Extract _extract_server_ed25519() and _reverify_keys_after_upload()
  to deduplicate the re-verification block (was copy-pasted in two
  places, three copies of ed25519 key extraction total)
- Remove dead code: _pending_megolm, _retry_pending_decryptions,
  _MAX_PENDING_EVENTS, _PENDING_EVENT_TTL — all orphaned after
  removing _on_encrypted_event
- Remove tautological TestMediaCacheGate (tested its own predicate,
  not production code)
- Remove dead TestMatrixMegolmEventHandling and
  TestMatrixRetryPendingDecryptions (tested removed methods)
- Merge duplicate TestMatrixStopTyping into TestMatrixTypingIndicator
- Trim comment to just the "why"
2026-04-17 04:03:02 +05:30
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Peter Berthelsen
9a9b8cd1e4 fix: keep rapid telegram follow-ups from getting cut off 2026-04-16 02:44:00 -07:00
Greer Guthrie
33ff29dfae fix(gateway): defer background review notifications until after main reply
Background review notifications ("💾 Skill created", "💾 Memory updated")
could race ahead of the main assistant reply in chat, making it look like
the agent stopped after creating a skill.

Gate bg-review notifications behind a threading.Event + pending queue.
Register a release callback on the adapter's _post_delivery_callbacks dict
so base.py's finally block fires it after the main response is delivered.

The queued-message path in _run_agent pops and calls the callback directly
to prevent double-fire.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
Closes #10541
2026-04-15 17:23:15 -07:00
Brooklyn Nicholson
097702c8a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 19:11:07 -05:00
Teknium
0d05bd34f8 feat: extend channel_prompts to Telegram, Slack, and Mattermost
Extract resolve_channel_prompt() shared helper into
gateway/platforms/base.py. Refactor Discord to use it.
Wire channel_prompts into Telegram (groups + forum topics),
Slack (channels), and Mattermost (channels).

Config bridging now applies to all platforms (not just Discord).
Added channel_prompts defaults to telegram/slack/mattermost
config sections.

Docs added to all four platform pages with platform-specific
examples (topic inheritance for Telegram, channel IDs for Slack,
etc.).
2026-04-15 16:31:28 -07:00
Brenner Spear
2fbdc2c8fa feat(discord): add channel_prompts config
Add native Discord channel_prompts support with parent forum fallback,
ephemeral runtime injection, config migration updates, docs, and tests.
2026-04-15 16:31:28 -07:00
Brooklyn Nicholson
371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
Teknium
2546b7acea fix(gateway): suppress duplicate replies on interrupt and streaming flood control
Three fixes for the duplicate reply bug affecting all gateway platforms:

1. base.py: Suppress stale response when the session was interrupted by a
   new message that hasn't been consumed yet. Checks both interrupt_event
   and _pending_messages to avoid false positives. (#8221, #2483)

2. run.py (return path): Remove response_previewed guard from already_sent
   check. Stream consumer's already_sent alone is authoritative — if
   content was delivered via streaming, the duplicate send must be
   suppressed regardless of the agent's response_previewed flag. (#8375)

3. run.py (queued-message path): Same fix — already_sent without
   response_previewed now correctly marks the first response as already
   streamed, preventing re-send before processing the queued message.

The response_previewed field is still produced by the agent (run_agent.py)
but is no longer required as a gate for duplicate suppression. The stream
consumer's already_sent flag is the delivery-level truth about what the
user actually saw.

Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.
2026-04-15 03:42:24 -07:00
Brooklyn Nicholson
7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Teknium
9e992df8ae
fix(telegram): use UTF-16 code units for message length splitting (#8725)
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.

Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
2026-04-12 19:06:20 -07:00
Brooklyn Nicholson
ec553fdb49 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 17:15:41 -05:00
Teknium
04c1c5d53f
refactor: extract shared helpers to deduplicate repeated code patterns (#7917)
* refactor: add shared helper modules for code deduplication

New modules:
- gateway/platforms/helpers.py: MessageDeduplicator, TextBatchAggregator,
  strip_markdown, ThreadParticipationTracker, redact_phone
- hermes_cli/cli_output.py: print_info/success/warning/error, prompt helpers
- tools/path_security.py: validate_within_dir, has_traversal_component
- utils.py additions: safe_json_loads, read_json_file, read_jsonl,
  append_jsonl, env_str/lower/int/bool helpers
- hermes_constants.py additions: get_config_path, get_skills_dir,
  get_logs_dir, get_env_path

* refactor: migrate gateway adapters to shared helpers

- MessageDeduplicator: discord, slack, dingtalk, wecom, weixin, mattermost
- strip_markdown: bluebubbles, feishu, sms
- redact_phone: sms, signal
- ThreadParticipationTracker: discord, matrix
- _acquire/_release_platform_lock: telegram, discord, slack, whatsapp,
  signal, weixin

Net -316 lines across 19 files.

* refactor: migrate CLI modules to shared helpers

- tools_config.py: use cli_output print/prompt + curses_radiolist (-117 lines)
- setup.py: use cli_output print helpers + curses_radiolist (-101 lines)
- mcp_config.py: use cli_output prompt (-15 lines)
- memory_setup.py: use curses_radiolist (-86 lines)

Net -263 lines across 5 files.

* refactor: migrate to shared utility helpers

- safe_json_loads: agent/display.py (4 sites)
- get_config_path: skill_utils.py, hermes_logging.py, hermes_time.py
- get_skills_dir: skill_utils.py, prompt_builder.py
- Token estimation dedup: skills_tool.py imports from model_metadata
- Path security: skills_tool, cronjob_tools, skill_manager_tool, credential_files
- Non-atomic YAML writes: doctor.py, config.py now use atomic_yaml_write
- Platform dict: new platforms.py, skills_config + tools_config derive from it
- Anthropic key: new get_anthropic_key() in auth.py, used by doctor/status/config/main

* test: update tests for shared helper migrations

- test_dingtalk: use _dedup.is_duplicate() instead of _is_duplicate()
- test_mattermost: use _dedup instead of _seen_posts/_prune_seen
- test_signal: import redact_phone from helpers instead of signal
- test_discord_connect: _platform_lock_identity instead of _token_lock_identity
- test_telegram_conflict: updated lock error message format
- test_skill_manager_tool: 'escapes' instead of 'boundary' in error msgs
2026-04-11 13:59:52 -07:00
Brooklyn Nicholson
b04248f4d5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor
# Conflicts:
#	gateway/platforms/base.py
#	gateway/run.py
#	tests/gateway/test_command_bypass_active_session.py
2026-04-11 11:39:47 -05:00
Kenny Xie
ecfae98152 fix(gateway): address restart review feedback 2026-04-10 21:18:34 -07:00
Kenny Xie
3163731289 fix(gateway): drain in-flight work before restart 2026-04-10 21:18:34 -07:00
entropidelic
989b950fbc fix(security): enforce API_SERVER_KEY for non-loopback binding
Add is_network_accessible() helper using Python's ipaddress module to
robustly classify bind addresses (IPv4/IPv6 loopback, wildcards,
mapped addresses, hostname resolution with DNS-failure-fails-closed).

The API server connect() now refuses to start when the bind address is
network-accessible and no API_SERVER_KEY is set, preventing RCE from
other machines on the network.

Co-authored-by: entropidelic <entropidelic@users.noreply.github.com>
2026-04-10 16:51:44 -07:00
Teknium
360b21ce95
fix(gateway): reject file paths in get_command() + file-drop tests (#7356)
Gateway get_command() now rejects paths containing /. Also adds 28 _detect_file_drop regression tests. From #6978 (@ygd58) and #6963 (@betamod).
2026-04-10 13:06:02 -07:00
Teknium
76a1e6e0fe feat(discord): add channel_skill_bindings for auto-loading skills per channel
Simplified implementation of the feature from PR #6842 (RunzhouLi).
Allows Discord channels/forum threads to auto-bind skills via config:

    discord:
      channel_skill_bindings:
        - id: "123456"
          skills: ["skill-a", "skill-b"]

The run.py auto-skill loader now handles both str and list[str],
loading multiple skills in order and concatenating their payloads.
Forum threads inherit their parent channel's bindings.

Co-authored-by: RunzhouLi <RunzhouLi@users.noreply.github.com>
2026-04-10 05:19:26 -07:00
Teknium
7663c98c1e fix: make safe_url_for_log public, add SSRF redirect guards to base.py cache helpers
Follow-up to Dusk1e's PR #7120 (Slack send_image redirect guard):
- Rename _safe_url_for_log -> safe_url_for_log (drop underscore) since
  it is now imported cross-module by the Slack adapter
- Add _ssrf_redirect_guard httpx event hook to cache_image_from_url()
  and cache_audio_from_url() in base.py — same pattern as vision_tools
  and the Slack adapter fix
- Update url_safety.py docstring to reflect broader coverage
- Add regression tests for image/audio redirect blocking + safe passthrough
2026-04-10 05:04:28 -07:00
Evi Nova
0b143f2ea3 fix(gateway): validate Slack image downloads before caching
Slack may return an HTML sign-in/redirect page instead of actual media
bytes (e.g. expired token, restricted file access). This adds two layers
of defense:

1. Content-Type check in slack.py rejects text/html responses early
2. Magic-byte validation in base.py's cache_image_from_bytes() rejects
   non-image data regardless of source platform

Also adds ValueError guards in wecom.py and email.py so the new
validation doesn't crash those adapters.

Closes #6829
2026-04-10 03:53:09 -07:00
Tranquil-Flow
429da6cbce fix(gateway): route /background through active-session bypass
When /background was sent during an active run, it was not in the
platform adapter's bypass list and fell through to the interrupt path
instead of spawning a parallel background task.

Add "background" to the active-session command bypass in the platform
adapter, and add an early return in the gateway runner's running-agent
guard to route /background to _handle_background_command() before it
reaches the default interrupt logic.

Fixes #6827
2026-04-10 03:52:00 -07:00
Kenny Xie
4f2f09affa fix(gateway): avoid false failure reactions on restart cancellation 2026-04-10 03:52:00 -07:00
Austin Pickett
f805323517 chore: merge main 2026-04-09 20:00:34 -04:00
Brooklyn Nicholson
99fd3b518d feat: add /copy and /agents 2026-04-09 17:19:36 -05:00
Teknium
6f8e426275 fix: add SOCKS proxy support, DISCORD_PROXY env var, and send_message proxy coverage
Follow-up improvements on top of the shared resolver from PR #6562:

- Add platform_env_var parameter to resolve_proxy_url() so DISCORD_PROXY
  takes priority over generic HTTPS_PROXY/ALL_PROXY env vars
- Add SOCKS proxy support via aiohttp_socks.ProxyConnector with rdns=True
  (critical for GFW/Shadowrocket/Clash users — issue #6649)
- proxy_kwargs_for_bot() returns connector= for SOCKS, proxy= for HTTP
- proxy_kwargs_for_aiohttp() returns split (session_kw, request_kw) for
  standalone aiohttp sessions
- Add proxy support to send_message_tool.py (Discord REST, Slack, SMS)
  for cron job delivery behind proxies (from PR #2208)
- Add proxy support to Discord image/document downloads
- Fix duplicate import sys in base.py
2026-04-09 14:19:06 -07:00
Zheng Li
88dbbfe982 feat(gateway): unified proxy support for Discord and Telegram with macOS auto-detection
- Add resolve_proxy_url() to base.py — shared by all platform adapters
- Check HTTPS_PROXY / HTTP_PROXY / ALL_PROXY env vars first
- Fall back to macOS system proxy via scutil --proxy (zero-config)
- Pass proxy= to discord.py commands.Bot() for gateway connectivity
- Refactor telegram_network.py to use shared resolver
- Update test fixtures to accept proxy kwarg
2026-04-09 14:19:06 -07:00
Kira
e1b0b135cb fix(discord): accept .log attachments and raise document size limit 2026-04-09 02:26:33 -07:00
xingkongliang
1d8d4f28ae fix(gateway): prevent background process notifications from triggering false pairing requests
When a background process with notify_on_complete=True finishes, the
gateway injects a synthetic MessageEvent to notify the session. This
event was constructed without user_id, causing _is_user_authorized()
to reject it and — for DM-origin sessions — trigger the pairing flow,
sending "Hi~ I don't recognize you yet!" with a pairing code to the
chat owner.

Add an `internal` flag to MessageEvent that bypasses authorization
checks for system-generated synthetic events. Only the process watcher
sets this flag; no external/adapter code path can produce it.

Includes 4 regression tests covering the fix and the normal pairing path.
2026-04-08 23:01:04 -07:00
Teknium
469cd16fe0
fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944)
Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1).

Changes:
- Use hmac.compare_digest for API key comparison (timing attack prevention)
- Apply provider env var blocklist to Docker containers (credential leakage)
- Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559)
- Add SSRF protection via is_safe_url to ALL platform adapters:
  base.py (cache_image_from_url, cache_audio_from_url),
  discord, slack, telegram, matrix, mattermost, feishu, wecom
  (Signal and WhatsApp protected via base.py helpers)
- Update tests: mock is_safe_url in Mattermost download tests
- Add security tests for tar extraction (traversal, symlinks, safe files)
2026-04-07 17:28:37 -07:00
Teknium
125e5ef089 fix: extend caption substring fix to all platforms
Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)

The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).
2026-04-07 14:08:59 -07:00
Teknium
ab0c1e58f1
fix: pause typing indicator during approval waits (#5893)
When the agent waits for dangerous-command approval, the typing
indicator (_keep_typing loop) kept refreshing. On Slack's Assistant
API this is critical: assistant_threads_setStatus disables the
compose box, preventing users from typing /approve or /deny.

- Add _typing_paused set + pause/resume methods to BasePlatformAdapter
- _keep_typing skips send_typing when chat_id is paused
- _approval_notify_sync pauses typing before sending approval prompt
- _handle_approve_command / _handle_deny_command resume typing after

Benefits all platforms — no reason to show 'is thinking...' while
the agent is idle waiting for human input.
2026-04-07 11:04:50 -07:00
Teknium
d0ffb111c2
refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821)
Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.

Changes by category:

Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
  (vs availability checks which were left alone)

Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
  source, new_server_names, verify, pconfig, default_terminal,
  result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
  (12 instances of add_parser() where result was never used)

Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
  surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
  module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction

Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports

Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
  they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values

Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
  x.startswith('a') or x.startswith('b') consolidated to
  x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
  truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
  send_message_tool.py

Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls

Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
2026-04-07 10:25:31 -07:00