hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-30 06:41:51 +00:00

Author	SHA1	Message	Date
briandevans	756900723a	fix(agent): add qwen and deepseek to TOOL_USE_ENFORCEMENT_MODELS Qwen3.x and DeepSeek-V3.x default to chatty/hallucinatory tool use without enforcement steering — agents narrate "calling tool X" without actually emitting a tool call, or run partial loops. Both model families fit the same failure pattern TOOL_USE_ENFORCEMENT_GUIDANCE was already injected for (gpt, codex, gemini, gemma, grok, glm). Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com> Squashed salvage of: - `403e567ce` fix(agent): add qwen and deepseek to TOOL_USE_ENFORCEMENT_MODELS - `9433eabe7` test(agent): use realistic qwen-plus identifier in enforcement test Fixes #28079.	2026-05-18 20:06:49 -07:00
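A minimal sketch of the kind of name-based gate this commit extends, assuming enforcement is decided by a case-insensitive substring match on the model name (the grok commit below describes the match as "any model whose name contains 'grok'"). `ENFORCEMENT_FAMILIES` and `needs_tool_use_enforcement` are hypothetical names used for illustration; the real TOOL_USE_ENFORCEMENT_MODELS contents and lookup may differ.

```python
# Hypothetical stand-in for TOOL_USE_ENFORCEMENT_MODELS, built from the
# families named in the commit message. Not the project's actual constant.
ENFORCEMENT_FAMILIES = (
    "gpt", "codex", "gemini", "gemma", "grok", "glm", "qwen", "deepseek",
)


def needs_tool_use_enforcement(model_name: str) -> bool:
    """Return True when the model name contains any listed family substring."""
    lowered = model_name.lower()
    return any(family in lowered for family in ENFORCEMENT_FAMILIES)
```

A substring match keeps provider-prefixed slugs and bare names on the same path, at the cost of matching any unrelated name that happens to contain one of the fragments.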
teknium1	4a3f13b47b	perf(prompt-cache): date-only timestamp + loud gateway-DB roundtrip logging The system prompt's 'Conversation started:' line carried minute precision (%I:%M %p), making it byte-unstable across every rebuild path. Within a CLI session the in-memory cache held, but on the gateway path (fresh AIAgent per turn → restore from session DB), any silent failure in the read or write path dropped the cache stem and forced a full re-prefill on every subsequent turn. Local prefix-caching backends (llama.cpp / vLLM) saw this as KV-cache invalidation; remote prefix-caching providers saw it as an Anthropic-style cache miss. Three changes: 1. Date-only timestamp ('Sunday, May 17, 2026' instead of '... 03:42 PM'). System prompt now byte-stable for the full day. The model can still query exact time via tools when it actually needs it. Credit: @iamfoz (PR #20451). 2. Loud logging on session DB write failures. The update_system_prompt call used to log at DEBUG, hiding disk-full / locked-database / schema drift behind a silent fall-through that forced fresh rebuilds on every subsequent turn. Now WARN with the session id and exception so persistent issues show up in agent.log without verbose mode. 3. Three-way stored-state distinction on read. The previous 'session_row.get("system_prompt") or None' collapsed three states into one (missing row / null column / empty string). Now we tell them apart and WARN when a continuing session lands on null/empty (which means the previous turn's write never persisted — every subsequent turn rebuilds and the prefix cache misses every time). The restore block is extracted into _restore_or_build_system_prompt() so the prefix-cache path can be unit-tested in isolation. E2E proof: fresh AIAgent constructed for turn 2 across a minute-boundary sleep restores byte-identical bytes from the session DB. NULL stored prompt fires the new warning. Date-only timestamp survives the rebuild path. All on real SessionDB, no mocks. 
Tests: - tests/agent/test_system_prompt_restore.py (10 new tests) - tests/run_agent/test_run_agent.py::TestBuildSystemPrompt:: test_datetime_is_date_only_not_minute_precision Closes #20451 (date-only), #18547 (prefix stabilization), #8689 (stabilize timestamp across compression), #15866 (timestamp caching question), #8687 (compression timestamp), #27339 (claim #3: live timestamp in cached system prompt). Co-authored-by: Martyn Forryan <9133432+iamfoz@users.noreply.github.com>	2026-05-17 23:20:37 -07:00
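The two ideas in this commit can be sketched as follows, assuming a plain dict-like session row. `conversation_started_line`, `StoredPrompt`, and `classify_stored_prompt` are hypothetical names; the actual logic lives in `_restore_or_build_system_prompt()` and may be structured differently.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


def conversation_started_line(now: datetime) -> str:
    # Date only: no hour or minute, so the line is identical for the whole day.
    # %A and %B follow the process locale, and %d zero-pads single-digit days.
    return "Conversation started: " + now.strftime("%A, %B %d, %Y")


class StoredPrompt(Enum):
    MISSING_ROW = "missing_row"  # no session row at all (new session)
    NULL = "null"                # row exists, column is NULL
    EMPTY = "empty"              # row exists, column is an empty string
    PRESENT = "present"          # a stored prompt can be restored verbatim


def classify_stored_prompt(session_row: Optional[dict]) -> StoredPrompt:
    """Keep the three failure states apart instead of collapsing them with `or None`."""
    if session_row is None:
        return StoredPrompt.MISSING_ROW
    value = session_row.get("system_prompt")
    if value is None:
        return StoredPrompt.NULL
    if value == "":
        return StoredPrompt.EMPTY
    return StoredPrompt.PRESENT


# Two rebuilds on the same day, hours apart, produce the same bytes.
morning = datetime(2026, 5, 17, 9, 5)
afternoon = datetime(2026, 5, 17, 15, 42)
same_day_stable = conversation_started_line(morning) == conversation_started_line(afternoon)
```

A caller could then warn only on NULL or EMPTY for a continuing session, since those are the states that indicate a write that never persisted.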
Teknium	9b91377bec	feat(grok): apply OpenAI execution guidance to xAI Grok / xai-oauth models (#27797) Grok models hit the same failure modes that OPENAI_MODEL_EXECUTION_GUIDANCE addresses for GPT/Codex: claiming completion without tool calls ('to be honest, I didn't create the file yet'), suggesting workarounds instead of using existing tools (proposing a folder-based memory system when the memory tool exists), and replying with plans instead of executing. TOOL_USE_ENFORCEMENT_GUIDANCE was already injected for any model whose name contains 'grok' (TOOL_USE_ENFORCEMENT_MODELS). This extends the follow-on family-specific block, OPENAI_MODEL_EXECUTION_GUIDANCE (tool_persistence / mandatory_tool_use / act_dont_ask / prerequisite_checks / verification / missing_context), to grok-named models too. The OPENAI_ prefix is retained for backward compatibility with imports/tests; the docstring and inline comment now note that the body is family-agnostic and the prefix reflects origin, not exclusivity. Tests cover the OpenRouter slug (x-ai/grok-4.3) and the xai-oauth bare name (grok-4.3), plus a negative control on claude. E2E verified against a real AIAgent build of the system prompt for both xai-oauth and openrouter grok models.	2026-05-17 23:00:37 -07:00
Robin Fernandes	20bffa5b37	refactor(auth): mostly cleanups and style changes	2026-05-17 16:56:37 -07:00
Robin Fernandes	0bac7dd05b	refactor(auth): collapse Nous inference fallback controls	2026-05-17 16:56:37 -07:00
soynchux	280c63ce91	fix(mcp): prevent parallel-safe prefix collisions	2026-05-17 11:41:26 -07:00
teknium1	152d42d1a7	Merge origin/main into pr-27248 (resolving run_agent.py = ours) run_agent.py taken from HEAD (the extracted forwarder structure). The 25 run_agent.py fixes that landed on main during the PR's life need to be ported into the agent/* extracted modules in follow-up commits.	2026-05-16 23:16:52 -07:00
teknium1	47823790b0	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards Four fixes from PR #27248 review: 1. __init__ forwarder is now keyword-forwarded (daimon-nous review). Previously the run_agent.AIAgent.__init__ wrapper forwarded all 64 params positionally to agent.agent_init.init_agent, so adding a 65th param on main would require three lockstep edits (signature, init_agent signature, forwarder call) or silently shift every value. Keyword forwarding makes this trivially safe — adding a param now only needs the two signatures and one extra keyword line. 2. Drop dead _ra() in agent/codex_runtime.py (daimon-nous + Copilot). The lazy run_agent reference was defined but never called inside this module — the codex paths use agent.* accessors only. 3. Drop unused imports in agent/codex_runtime.py (Copilot): contextvars, threading, time, uuid, Optional. Carried over from run_agent.py during the original extraction. 4. Tighten three source-introspection test guards (Copilot): - test_memory_nudge_counter_hydration.py — was scanning the concatenated source of run_agent.py + agent/conversation_loop.py and matching self.X or agent.X form. Now asserts the hydration block lives in agent/conversation_loop.py specifically with the agent.X form — the body never moves back, so if it ever drifts a future re-introduction fails the guard. - test_run_agent.py::TestMemoryNudgeCounterPersistence — anchor on agent.iteration_budget = IterationBudget exactly (was just iteration_budget = IterationBudget) so an unrelated identifier ending in iteration_budget can't match. - test_run_agent.py::TestMemoryProviderTurnStart — assert the agent._user_turn_count form directly (the extracted body uses agent.X, not self.X — accepting either was a transitional fudge). - test_jsondecodeerror_retryable.py — scan agent/conversation_loop.py only, not the concatenation. 
Not addressed in this commit: * Pre-existing bugs in agent/tool_executor.py (heartbeat index mismatch when calls are blocked, _current_tool clobber in result loop, blocked-counted-as-completed in spinner summary, dead result_preview computation). These were preserved byte-for-byte from the original _execute_tool_calls_concurrent — worth a separate follow-up PR with proper tests. * _OpenAIProxy.__instancecheck__ concern — pre-existing, not flagged by any of the original test patches (nothing actually does isinstance(x, OpenAI) against the proxy instance). * agent_init.py:949 mem_config potential NameError — pre-existing; only triggers if _agent_cfg.get('memory', {}) itself raises, which it can't with a stock dict. tests/run_agent/ + tests/agent/: 4313 passed, 1 pre-existing test_auxiliary_client failure (unchanged). run_agent.py: 3821 -> 3937 lines (+116 from the keyword-forwarded init call's verbosity). Final: 16083 -> 3937 (-12146, 75% reduction).	2026-05-16 22:55:49 -07:00
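The keyword-forwarding point in fix 1 can be illustrated with a much smaller signature. This is a sketch only: `AIAgentSketch` and the three parameters are hypothetical stand-ins for the 64-parameter `AIAgent.__init__` and `init_agent` pair described above.

```python
def init_agent(agent, model, max_iterations=90, quiet=False):
    """Extracted initializer: receives the instance plus every constructor argument."""
    agent.model = model
    agent.max_iterations = max_iterations
    agent.quiet = quiet


class AIAgentSketch:
    def __init__(self, model, max_iterations=90, quiet=False):
        # Keyword forwarding: each value is bound by name, so adding or
        # reordering parameters in either signature cannot shift values
        # into the wrong slot the way a positional call could.
        init_agent(
            self,
            model=model,
            max_iterations=max_iterations,
            quiet=quiet,
        )
```

With a positional call such as `init_agent(self, model, max_iterations, quiet)`, inserting a new parameter in the middle of one signature but not the other would still run and would silently assign the wrong values.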
teknium1	0530252384	refactor(run_agent): extract run_conversation to agent/conversation_loop.py The 3,877-line run_conversation body — the agent loop itself — moves out of run_agent.py into a dedicated module. AIAgent.run_conversation is now a thin forwarder that delegates to agent.conversation_loop.run_conversation with the AIAgent instance as the first argument. This is the largest single extraction in the run_agent.py refactor. The body keeps all 163 self.X references intact (rewritten as agent.X), all nested closures, all retry/backoff/compression machinery. Symbols that tests or callers patch on run_agent (_set_interrupt, handle_function_call, AIAgent class attrs) are resolved through _ra() inside the extracted module so the patch surface is preserved. Five tests doing inspect.getsource(AIAgent.run_conversation) updated to scan agent.conversation_loop.run_conversation. Two source-introspection tests (TestMemoryNudgeCounterPersistence, TestMemoryProviderTurnStart) updated to accept either self.X (legacy) or agent.X (extracted form) in the matched assertions. Live E2E verified on three model paths: * openai/gpt-5.4 (OpenAI chat completions via OpenRouter) * anthropic/claude-sonnet-4.6 (Anthropic Messages via OpenRouter) * moonshotai/kimi-k2-thinking (reasoning model, reasoning_content path) Plus read_file tool execution, terminal tool, web_search. tests/run_agent/ + tests/agent/: 4313 passed, 1 pre-existing failure (test_auxiliary_client::test_custom_endpoint... — same as on main). run_agent.py: 9800 -> 5944 lines (-3856). Total reduction since baseline: 16083 -> 5944 (-10139, 63%).	2026-05-16 19:26:52 -07:00
teknium1	0430e71ec9	refactor(run_agent): extract streaming API caller (893 LOC) to agent/chat_completion_helpers.py Move _interruptible_streaming_api_call out of run_agent.py — the biggest single method in the file. Body lives next to interruptible_api_call in agent/chat_completion_helpers.py so streaming + non-streaming code share one home. Nested closures (_call_chat_completions, _call_anthropic, the codex stream branch) all come along with the body and still capture the parent function's locals as expected. AIAgent keeps a thin forwarder method. is_local_endpoint added to the import block (used by the stream stale-timeout disable logic). One source-introspection test in TestAnthropicInterruptHandler is updated to scan agent.chat_completion_helpers.interruptible_streaming_api_call instead of AIAgent._interruptible_streaming_api_call. tests/run_agent/ + tests/agent/: 4312 passed (same pre-existing test_auxiliary_client failure). run_agent.py: 12277 -> 11385 lines (-892).	2026-05-16 18:48:22 -07:00
teknium1	4b25619bc4	refactor(run_agent): extract chat-completion helpers to agent/chat_completion_helpers.py Six methods move into a new module — bodies live there, AIAgent keeps thin forwarder methods so call sites and tests are unchanged. * interruptible_api_call — non-streaming API call with interrupt handling * build_api_kwargs — assemble OpenAI / Anthropic / Codex / Bedrock request kwargs * build_assistant_message — normalize assistant message dict (reasoning, tool_calls, codex passthrough fields, alibaba glm-4.7 quirk) * try_activate_fallback — provider fallback chain activation * handle_max_iterations — controlled stop when iteration budget exhausts * cleanup_task_resources — per-turn VM + browser teardown (skipped for persistent environments) Names tests patch on run_agent (cleanup_vm, cleanup_browser) are routed through _ra() so the patch surface is preserved. Two TestAnthropicInterruptHandler source-introspection tests were updated to scan agent.chat_completion_helpers.interruptible_api_call instead of AIAgent._interruptible_api_call — the body lives in the extracted module now. tests/run_agent/ + tests/agent/: 4313 passed (same pre-existing test_auxiliary_client failure). run_agent.py: 13282 -> 12253 lines (-1029).	2026-05-16 18:41:44 -07:00
Maxim Esipov	e51d74ab91	fix(codex): rotate pool on usage limit 429	2026-05-16 16:49:56 -07:00
Teknium	395e9dd9e2	feat: add supports_parallel_tool_calls for MCP servers (#26825) Port from openai/codex#17667: MCP servers can now opt in to parallel tool execution by setting supports_parallel_tool_calls: true in their config. This allows tools from the same server to run concurrently within a single tool-call batch, matching the behavior already available for built-in tools like web_search and read_file. Previously all MCP tools were forced sequential because they weren't in the _PARALLEL_SAFE_TOOLS set. Now _should_parallelize_tool_batch checks is_mcp_tool_parallel_safe(), which looks up the server's config flag. Config example: mcp_servers: docs: command: "docs-server" supports_parallel_tool_calls: true Changes: - tools/mcp_tool.py: Track parallel-safe servers in _parallel_safe_servers set, populated during register_mcp_servers(). Add is_mcp_tool_parallel_safe() public API. - run_agent.py: Add _is_mcp_tool_parallel_safe() lazy-import wrapper. Update _should_parallelize_tool_batch() to check MCP tools against server config. - 11 new tests covering the feature end-to-end. - Updated MCP docs and config reference.	2026-05-16 01:04:28 -07:00
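A rough sketch of the opt-in check described above, assuming the caller already knows which MCP server owns each tool. The explicit `tool_to_server` mapping and the function names are hypothetical; the real code derives the server from the tool name inside `is_mcp_tool_parallel_safe()`.

```python
# Built-in tools the commit message names as already parallel-safe.
PARALLEL_SAFE_BUILTINS = frozenset({"web_search", "read_file"})


def parallel_safe_servers(mcp_servers: dict) -> set:
    """Collect servers whose config explicitly sets supports_parallel_tool_calls: true."""
    return {
        name
        for name, cfg in mcp_servers.items()
        if cfg.get("supports_parallel_tool_calls") is True
    }


def should_parallelize(batch: list, tool_to_server: dict, safe_servers: set) -> bool:
    """A batch runs concurrently only if every tool in it is known to be safe."""
    if len(batch) < 2:
        return False  # nothing to parallelize
    for tool in batch:
        if tool in PARALLEL_SAFE_BUILTINS:
            continue
        server = tool_to_server.get(tool)
        if server is None or server not in safe_servers:
            return False  # unknown tool or server that did not opt in
    return True


config = {
    "docs": {"command": "docs-server", "supports_parallel_tool_calls": True},
    "db": {"command": "db-server"},
}
safe = parallel_safe_servers(config)
owners = {"docs_search": "docs", "docs_fetch": "docs", "db_query": "db"}
```

Defaulting to sequential execution for any server that does not set the flag keeps the previous behavior for existing configs.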
kshitij	db84a78e61	fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342 , #22763 ) (#26320 ) * fix(langfuse): reject placeholder credentials with one-shot warning When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY at a template value like 'placeholder', 'test-key', or 'your-langfuse-key', the Langfuse SDK silently accepts the credentials at construction time and drops every trace at flush time. No warning, no error — just an empty Langfuse dashboard the operator only notices hours later. Add prefix-based validation in _get_langfuse() against the documented 'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side. Anything else fires a single warning naming the offending env var(s) with a log-safe value preview (full string for short placeholders so the operator knows which template they left in place; truncated for long values so a real secret pasted into the wrong field never hits the log), then short-circuits via the existing _INIT_FAILED cache so the warning fires once per process, not once per hook invocation. The check sits after the 'Langfuse is None' SDK-installed guard so hosts without the optional langfuse SDK don't see misleading 'set real keys' hints when the actionable fix is 'pip install langfuse'. Missing credentials remains the documented opt-out path and stays silent — no log noise for unconfigured installs. Fixes #22763 Fixes #23823 * fix(langfuse): use actual API request messages for generation input on_pre_llm_request previously used the messages kwarg alone, which could be None when Hermes passes the payload via request_messages, conversation_history, or user_message instead. Add _coerce_request_messages to pick the first available list across all variants, falling back to a synthetic user message. Generations now show the real outbound payload rather than an empty input. 
* fix(langfuse): record tool call outputs in traces Tool observations showed input (arguments) but output was always undefined. Root cause: when tool_call_id is empty, pre_tool_call stored observations under a unique time-based key that post_tool_call could never reconstruct, so every tool span was closed without output by the _finish_trace sweep. Fix pre/post matching by routing empty-tool_call_id tools through a per-name FIFO queue (pending_tools_by_name) instead of the time-based key. Tools with a tool_call_id continue to use the id-keyed dict. Also: - Preserve OpenAI-style nested function shape in serialized tool calls so Langfuse renders name/arguments correctly - Keep name + tool_call_id on role:tool messages for proper pairing - Backfill tool results onto the matching turn_tool_calls entry so the generation's tool-call record carries the result alongside arguments - Coerce request messages from whichever field the runtime provides (request_messages, messages, conversation_history, user_message) * fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test Self-review of the combined #22345 + #23831 salvage surfaced three issues worth fixing in the same PR rather than as follow-ups: 1. Drop is_first_turn from the pre_api_request hook. The boolean expression `not bool(conversation_history)` was wrong: conversation_history is reassigned to None mid-run after compression (5 sites in run_agent.py), so the value flips False -> True mid-conversation on every post-compression API call. The langfuse plugin never consumed it, so the kwarg was both misleading AND dead. 2. Replace copy.deepcopy(request_messages) with shallow list() copy. The pre_api_request hook contract discards return values (invoke_hook never writes back to api_kwargs), and the langfuse plugin's _serialize_messages already builds its own snapshot dicts via _safe_value. 
A deepcopy on every API call would walk every tool result and base64 image — significant overhead for no real isolation benefit. Shallow copy of the outer list protects against later mutations of api_messages without paying for the inner-dict walk. 3. Rename test_empty_tool_call_id_concurrent_fifo_order -> test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8 threads behind a barrier to actually exercise _STATE_LOCK on the pending_tools_by_name queue. The original test was sequential and only validated Python list semantics; this one validates the lock discipline. 4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED — that function does not exist. Tests reload the module via sys.modules.pop + importlib.import_module instead. Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass. --------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: Brian Conklin <brian@dralth.com>	2026-05-15 05:04:02 -07:00
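The pre/post matching fix can be sketched as below, assuming observations are plain dicts. `ToolSpanTracker` is a hypothetical name, and the lock that the real plugin holds around `pending_tools_by_name` (`_STATE_LOCK`) is omitted here for brevity.

```python
from collections import defaultdict, deque


class ToolSpanTracker:
    """Pair pre/post tool hooks by id when available, else FIFO per tool name."""

    def __init__(self):
        self.by_id = {}
        self.pending_by_name = defaultdict(deque)

    def pre_tool_call(self, name, tool_call_id, span):
        if tool_call_id:
            self.by_id[tool_call_id] = span
        else:
            # No id to reconstruct later, so queue under the tool name.
            self.pending_by_name[name].append(span)

    def post_tool_call(self, name, tool_call_id, output):
        if tool_call_id:
            span = self.by_id.pop(tool_call_id, None)
        else:
            queue = self.pending_by_name.get(name)
            span = queue.popleft() if queue else None
        if span is not None:
            span["output"] = output  # close the observation with its result
        return span
```

The queue assumes that calls to the same tool without an id finish in the order they started; a time-based key, by contrast, can never be rebuilt by the post hook.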
liuhao1024	2b3bf17dfa	fix(kanban): call kanban_block on iteration-budget exhaustion to prevent protocol violation When a kanban worker subprocess hits the iteration budget, the agent loop strips tools and asks the model for a summary. The model cannot call kanban_block itself at that point, so the process exits rc=0 without calling kanban_complete or kanban_block, a protocol violation that the dispatcher treats as a fatal error, giving up after 1 failure and stranding downstream tasks. Fix: after _handle_max_iterations() returns, check HERMES_KANBAN_TASK and call kanban_block with a reason describing the exhaustion. The dispatcher then sees a clean block transition instead of a protocol violation, and the task can be retried or escalated by a human. Fixes #23216 ([Bug] kanban-worker exits cleanly (rc=0) on iteration-budget exhaustion without calling kanban_complete or kanban_block).	2026-05-11 06:44:58 -07:00
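A minimal sketch of the post-exhaustion guard, assuming `kanban_block` can be called with a task id and a reason string. That call signature, the reason text, and the `block_if_kanban_worker` helper are assumptions for illustration; only the `HERMES_KANBAN_TASK` variable comes from the commit message.

```python
import os


def block_if_kanban_worker(kanban_block, environ=None) -> bool:
    """After the iteration budget is exhausted, report a block instead of exiting silently.

    Returns True when a block was reported, False when this is not a kanban worker.
    """
    env = os.environ if environ is None else environ
    task_id = env.get("HERMES_KANBAN_TASK")
    if not task_id:
        return False  # ordinary session: nothing to report
    kanban_block(task_id, "iteration budget exhausted before the task completed")
    return True
```

Passing the callable and environment in explicitly keeps the guard easy to test without a real dispatcher.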
Wesley Simplicio	68854cdcdb	fix(agent): extract thinking from content-list blocks for DeepSeek V4 Pro DeepSeek V4 Pro returns thinking content as typed blocks inside the content array rather than as a top-level reasoning_content field: [{"type": "thinking", "thinking": "..."}, {"type": "output", ...}] _extract_reasoning only handled content as a plain string, so the thinking text was silently dropped. On the next turn the session was replayed without the thinking block, causing: HTTP 400: The content[].thinking in the thinking mode must be passed back to the API. Fix: when content is a list and no structured reasoning field was found, scan for items with type=='thinking' and accumulate their 'thinking' (or 'text') value into reasoning_parts. Structured fields (reasoning, reasoning_content, reasoning_details) still take priority so existing provider behaviour is unchanged. Closes #21944	2026-05-09 13:36:12 -07:00
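The list-scanning fallback can be sketched as follows. `extract_thinking_blocks` is a hypothetical helper, not the project's `_extract_reasoning`, and joining the parts with a newline is an assumption.

```python
def extract_thinking_blocks(content) -> str:
    """Collect text from typed 'thinking' blocks when content is a list of dicts."""
    if not isinstance(content, list):
        return ""  # plain-string content is handled elsewhere
    parts = []
    for item in content:
        if isinstance(item, dict) and item.get("type") == "thinking":
            # Prefer the 'thinking' key, fall back to 'text'.
            text = item.get("thinking") or item.get("text")
            if text:
                parts.append(text)
    return "\n".join(parts)


sample = [
    {"type": "thinking", "thinking": "Check the file first."},
    {"type": "output", "text": "Done."},
]
```

In the commit's design this scan runs only when no structured field (reasoning, reasoning_content, reasoning_details) was found, so existing providers are unaffected.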
Blake Johnson	9076a2e74e	fix(agent): keep Nous GPT-5 fallback on chat completions	2026-05-07 13:04:42 -07:00
LeonSGP43	a78e622dfe	fix(agent): honor configured model max tokens	2026-05-07 06:40:30 -07:00
stormhierta	f648c2e3aa	fix: use max_completion_tokens for GitHub Copilot	2026-05-07 06:14:45 -07:00
kshitijk4poor	20a4f79ed1	feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path Introduces providers/ package — single source of truth for every inference provider. Adding a simple api-key provider now requires one providers/<name>.py file with zero edits anywhere else. What this PR ships: - providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes) - ProviderProfile declarative fields: name, api_mode, aliases, display_name, env_vars, base_url, models_url, auth_type, fallback_models, hostname, default_headers, fixed_temperature, default_max_tokens, default_aux_model - 4 overridable hooks: prepare_messages, build_extra_body, build_api_kwargs_extras, fetch_models - chat_completions.build_kwargs: profile path via _build_kwargs_from_profile, legacy flag path retained for lmstudio/tencent-tokenhub (which have session-aware reasoning probing that doesn't map cleanly to hooks yet) - run_agent.py: profile path for all registered providers; legacy path variable scoping fixed (all flags defined before branching) - Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS, doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER - GeminiProfile: thinking_config translation (native + openai-compat nested) - New tests/providers/ (79 tests covering profile declarations, transport parity, hook overrides, e2e kwargs assembly) Deltas vs original PR (salvaged onto current main): - Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth (were added to main since original PR) - Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their reasoning_effort probing has no clean hook equivalent yet) - Removed lmstudio alias from custom profile (it's a separate provider now) - Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension (resolve_provider special-cases them; adding breaks runtime resolution) - runtime_provider: profile.api_mode only as fallback when URL detection 
finds nothing (was breaking minimax /v1 override) - Preserved main's legacy-path improvements: deepseek reasoning_content preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M beta recovery, etc. - Kept agent/copilot_acp_client.py in place (rejected PR's relocation — main has 7 fixes landed since; relocation would revert them) - _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing test imports Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Closes #14418	2026-05-05 13:40:01 -07:00
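To make the "one file per provider" idea concrete, here is a heavily simplified sketch using a dataclass with a subset of the declarative fields and two of the hooks listed above. The class names, the example provider, its URL and environment variable, and the `build_kwargs` helper are all hypothetical; the real ProviderProfile ABC and `_build_kwargs_from_profile` are likely to differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderProfileSketch:
    # A subset of the declarative fields named in the commit message.
    name: str
    api_mode: str
    base_url: str
    env_vars: tuple = ()
    aliases: tuple = ()
    default_max_tokens: Optional[int] = None

    # Overridable hooks with neutral defaults.
    def prepare_messages(self, messages: list) -> list:
        return messages

    def build_extra_body(self, model: str) -> dict:
        return {}


class ExampleProfile(ProviderProfileSketch):
    """Hypothetical provider that only needs one extra request-body flag."""

    def build_extra_body(self, model: str) -> dict:
        return {"example_flag": True}


def build_kwargs(profile: ProviderProfileSketch, model: str, messages: list) -> dict:
    """Assemble request kwargs purely from the profile (illustrative only)."""
    kwargs = {"model": model, "messages": profile.prepare_messages(messages)}
    if profile.default_max_tokens is not None:
        kwargs["max_tokens"] = profile.default_max_tokens
    extra = profile.build_extra_body(model)
    if extra:
        kwargs["extra_body"] = extra
    return kwargs


example = ExampleProfile(
    name="example",
    api_mode="chat_completions",
    base_url="https://example.invalid/v1",
    env_vars=("EXAMPLE_API_KEY",),
    default_max_tokens=4096,
)
```

Because the transport reads everything from the profile, a provider with no special behavior needs only field values and no hook overrides.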
Leon	19eebf6e0d	fix(openrouter): treat xiaomi models as reasoning-capable	2026-05-05 06:07:44 -07:00
molvikar	cb33c73418	fix(run_agent): gate iteration-limit provider routing to OpenRouter	2026-05-04 01:45:59 -07:00
leavr	ccb5d87076	test: cover max-iterations summary message sanitization	2026-05-04 01:36:27 -07:00
IMHaoyan	bfb704684e	fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode DeepSeek V4 Pro tightened thinking-mode validation and rejects empty-string reasoning_content with HTTP 400: The reasoning content in the thinking mode must be passed back to the API. run_agent.py injected "" at three fallback sites — the tool-call pad in _build_assistant_message and both injection branches of _copy_reasoning_content_for_api (cross-provider poison guard + unconditional thinking pad). All three now emit " " (single space), which satisfies the non-empty check on V4 Pro without leaking fabricated reasoning. Also upgrades stale empty-string placeholders on replay: sessions persisted before this change have reasoning_content="" pinned at creation time; when the active provider enforces thinking-mode echo, the replay path now rewrites "" -> " " so existing users don't 400 on their first V4 Pro turn after updating. Non-thinking providers still round-trip "" verbatim. Updates 9 existing assertions + adds 2 regression tests (stale-placeholder upgrade, non-thinking verbatim preservation). Refs #15250, #17400. Closes #17341.	2026-04-30 23:04:23 -07:00
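The replay-time upgrade amounts to a small conditional rewrite. A sketch, with a hypothetical function name and flag:

```python
def reasoning_content_for_replay(stored: str, provider_enforces_thinking_echo: bool) -> str:
    """Upgrade a stale empty placeholder only for providers that reject empty strings."""
    if stored == "" and provider_enforces_thinking_echo:
        return " "  # single space: non-empty, but carries no fabricated reasoning
    return stored  # real reasoning, and "" for non-thinking providers, pass through verbatim
```

Keeping the rewrite conditional means sessions replayed against non-thinking providers still round-trip byte-for-byte.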
Stephen Schoettler	b29b709a71	fix(agent): sanitize Codex tool-call history summaries	2026-04-30 19:58:46 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * 
ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
刘昊	60c6b07128	fix(cron): keep SOUL.md identity when workdir is unset	2026-04-29 08:10:25 -07:00
Teknium	d63abbc329	fix(agent): persist streamed reasoning_content on assistant turns (#16844 ) (#16892 ) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. 
Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.	2026-04-28 01:19:18 -07:00
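The additive fallback can be sketched as a two-tier choice. This is a simplification under stated assumptions: `reasoning_fields` is a hypothetical helper, and the DeepSeek tool-call pad branch that sits between the two tiers in the real `_build_assistant_message` is deliberately left out.

```python
from typing import Optional


def reasoning_fields(sdk_reasoning_content: Optional[str], reasoning_text: str) -> dict:
    """Decide what to persist as reasoning_content on an assistant message."""
    fields = {}
    if sdk_reasoning_content is not None:
        # Tier 1: a value exposed by the SDK always wins.
        fields["reasoning_content"] = sdk_reasoning_content
    elif reasoning_text:
        # Tier 2 (the new fallback): promote text accumulated from streamed deltas.
        fields["reasoning_content"] = reasoning_text
    # Otherwise the field stays absent, so replay-time handling still applies.
    return fields
```

Leaving the field absent on turns without reasoning is what keeps the read-side ladder live, rather than writing an empty string eagerly.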
Erosika	e553f6f3e4	fix(memory): narrow scrub surface to known wrapper boundaries Reviewer pushback on the original boundary-hardening commits — three overreach points pulled plugin-specific policy into shared core paths: 1. gateway/run.py hardcoded a '## Honcho Context' literal split for vision-LLM output. Plugin-format heading in framework code; could truncate legitimate output naturally containing that header. Drop the literal split; keep generic sanitize_context (the wrapper strip is plugin-agnostic). Plugin-specific cleanup belongs at the provider boundary, not the shared gateway path. 2. run_agent.run_conversation scrubbed user_message and persist_user_message before the conversation loop. User text is sacred — if a user types a literal <memory-context> tag we must not silently delete it. The producer (build_memory_context_block) is the only legitimate emitter; user input should never need the reverse op. 3. _build_assistant_message scrubbed model output before persistence. Same hazard: would silently mutate legitimate documentation/code the model emits containing the literal markers. The streaming scrubber catches real leaks delta-by-delta before content is concatenated; persist-time scrub was redundant belt-and-suspenders. 4. _fire_stream_delta stripped leading newlines from every delta unless a paragraph break flag was set. Mid-stream '\n' is legitimate markdown — lists, code fences, paragraph breaks — and chunk boundaries are arbitrary. Narrow lstrip to the very first delta of the stream only (so stale provider preamble still gets cleaned on turn start, but mid-stream formatting survives). Plus: build_memory_context_block now logs a warning when its defensive sanitize_context strips something — surfaces buggy providers returning pre-wrapped text instead of silently double-fencing. Net architectural change: scrub surface collapses from 8 sites to 3 (StreamingContextScrubber on output deltas, plugin→backend send, build_memory_context_block input-validation). 
Plugin-specific strings stay out of shared runtime paths. User input and persisted assistant output are no longer mutated. Tests: rescoped TestMemoryContextSanitization (helper-correctness only, no source-inspection of removed call sites), updated vision tests to drop '## Honcho Context' literal-split assertions, updated _build_assistant_message persistence test to assert preservation. Added: cross-turn scrubber reset, build_memory_context_block warn-on- violation, mid-stream newline preservation (plain + code fence).	2026-04-27 12:37:33 -07:00
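Point 4 above, stripping leading newlines from the first delta only, can be sketched like this. `FirstDeltaCleaner` is a hypothetical name; the real logic lives in `_fire_stream_delta`.

```python
class FirstDeltaCleaner:
    """Strip leading newlines from the first delta of a stream, and only that one."""

    def __init__(self):
        self._seen_first = False

    def clean(self, delta: str) -> str:
        if not self._seen_first:
            self._seen_first = True
            return delta.lstrip("\n")  # drop stale preamble at turn start
        return delta  # mid-stream newlines are legitimate markdown

    def reset(self) -> None:
        # Call at the start of each new turn so the next stream is cleaned again.
        self._seen_first = False
```

Because chunk boundaries are arbitrary, any delta after the first may legitimately begin with a newline that belongs to a list, a code fence, or a paragraph break.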
dontcallmejames	f1ba4014e1	fix: harden memory-context leak boundaries	2026-04-27 12:37:33 -07:00
akhater	ac57114284	fix(agent): support Azure OpenAI gpt-5.x on chat/completions endpoint Azure OpenAI exposes an OpenAI-compatible endpoint at `{resource}.openai.azure.com/openai/v1` that accepts the standard `openai` Python client. Two issues prevented gpt-5.x models from working: 1. `_max_tokens_param()` only sent `max_completion_tokens` for `api.openai.com` URLs. Azure also requires `max_completion_tokens` for gpt-5.x models. 2. The `codex_responses` upgrade gate unconditionally upgraded gpt-5.x to Responses API. Azure does NOT support the Responses API — it serves gpt-5.x on the regular `/chat/completions` path, causing a 404. Fix: add `_is_azure_openai_url()` that matches `openai.azure.com` URLs. - `_max_tokens_param()` now returns `max_completion_tokens` for Azure. - The `codex_responses` upgrade gate skips Azure so gpt-5.x stays on `chat_completions` where Azure actually serves it. - The fallback-provider api_mode picker also recognises Azure and stays on chat_completions. - Tests cover max_tokens routing, api_mode behaviour, and URL detection. gpt-4.x models on Azure are unaffected (already used chat_completions + max_tokens, which Azure accepts for those models). Salvage of PR #10086 — rewritten against current main where the codex_responses upgrade gate gained copilot-acp / explicit-api_mode exclusions.	2026-04-25 18:48:43 -07:00
Teknium	ea01bdcebe	refactor(memory): remove flush_memories entirely (#15696) The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered. 
What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway.	2026-04-25 08:21:14 -07:00
WildCat Eng Manager	7626f3702e	feat: read prompt caching cache_ttl from config - Load prompt_caching.cache_ttl in AIAgent (5m default, 1h opt-in) - Document DEFAULT_CONFIG and developer guide example - Add unit tests for default, 1h, and invalid TTL fallback Made-with: Cursor	2026-04-24 03:21:29 -07:00
maelrx	e020f46bec	fix(agent): preserve MiniMax context length on delta-only overflow	2026-04-23 14:06:37 -07:00
helix4u	1dfcda4e3c	fix(approval): guard env and config overwrites	2026-04-23 14:05:36 -07:00
Teknium	c345ec9a63	fix(display): strip standalone tool-call XML tags from visible text Port from openclaw/openclaw#67318. Some open models (notably Gemma variants served via OpenRouter) emit tool calls as XML blocks inside assistant content instead of via the structured tool_calls field: <function name="read_file"><parameter name="path">/tmp/x</parameter></function> <tool_call>{"name":"x"}</tool_call> <function_calls>[{...}]</function_calls> Left unstripped, this raw XML leaked to gateway users (Discord, Telegram, Matrix, Feishu, Signal, WhatsApp, etc.) and the CLI, since hermes-agent's existing reasoning-tag stripper handled only <think>/<thinking>/<thought> variants. Extend _strip_think_blocks (run_agent.py) and _strip_reasoning_tags (cli.py) to cover: * <tool_call>, <tool_calls>, <tool_result> * <function_call>, <function_calls> * <function name="..."> ... </function> (Gemma-style) The <function> variant is boundary-gated (only strips when the tag sits at start-of-line or after sentence punctuation AND carries a name="..." attribute) so prose mentions like 'Use <function> declarations in JS' are preserved. Dangling <function name="..."> with no close is intentionally left visible — matches OpenClaw's asymmetry so a truncated streaming tail still reaches the user. Tests: 9 new cases in TestStripThinkBlocks (run_agent) + 9 in new file tests/run_agent/test_strip_reasoning_tags_cli.py. Covers Qwen-style <tool_call>, Gemma-style <function name="...">, multi-line payloads, prose preservation, stray close tags, dangling open tags, and mixed reasoning+tool_call content. Note: this port covers the post-streaming final-text path, which is what gateway adapters and CLI display consume. Extending the per-delta stream filter in gateway/stream_consumer.py to hide these tags live as they stream is a separate follow-up; for now users may see raw XML briefly during a stream before the final cleaned text replaces it. Refs: openclaw/openclaw#67318	2026-04-22 18:12:42 -07:00
helix4u	a7d78d3bfd	fix: preserve reasoning_content on Kimi replay	2026-04-22 04:31:59 -07:00
kshitijk4poor	c832ebd67c	feat: add ResponsesApiTransport + wire all Codex transport paths Add ResponsesApiTransport wrapping codex_responses_adapter.py behind the ProviderTransport ABC. Auto-registered via _discover_transports(). Wire ALL Codex transport methods to production paths in run_agent.py: - build_kwargs: main _build_api_kwargs codex branch (50 lines extracted) - normalize_response: main loop + flush + summary + retry (4 sites) - convert_tools: memory flush tool override - convert_messages: called internally via build_kwargs - validate_response: response validation gate - preflight_kwargs: request sanitization (2 sites) Remove 7 dead legacy wrappers from AIAgent (_responses_tools, _chat_messages_to_responses_input, _normalize_codex_response, _preflight_codex_api_kwargs, _preflight_codex_input_items, _extract_responses_message_text, _extract_responses_reasoning_text). Keep 3 ID manipulation methods still used by _build_assistant_message. Update 18 test call sites across 3 test files to call adapter functions directly instead of through deleted AIAgent wrappers. 24 new tests. 343 codex/responses/transport tests pass (0 failures). PR 4 of the provider transport refactor.	2026-04-21 19:48:56 -07:00
Kian Meng	063bc3c1e2	fix(kimi): send max_tokens, reasoning_effort, and thinking for Kimi/Moonshot Kimi/Moonshot endpoints require explicit parameters that Hermes was not sending, causing 'Response truncated due to output length limit' errors and inconsistent reasoning behavior. Root cause analysis against Kimi CLI source (MoonshotAI/kimi-cli, packages/kosong/src/kosong/chat_provider/kimi.py): 1. max_tokens: Kimi's API defaults to a very low value when omitted. Reasoning tokens share the output budget — the model exhausts it on thinking alone. Send 32000, matching Kimi CLI's generate() default. 2. reasoning_effort: Kimi CLI sends this as a top-level parameter (not inside extra_body). Hermes was not sending it at all because _supports_reasoning_extra_body() returns False for non-OpenRouter endpoints. 3. extra_body.thinking: Kimi CLI uses with_thinking() which sets extra_body.thinking={"type":"enabled"} alongside reasoning_effort. This is a separate control from the OpenAI-style reasoning extra_body that Hermes sends for OpenRouter/GitHub. Without it, the Kimi gateway may not activate reasoning mode correctly. Covers api.kimi.com (Kimi Code) and api.moonshot.ai/cn (Moonshot). Tests: 6 new test cases for max_tokens, reasoning_effort, and extra_body.thinking under various configs.	2026-04-21 05:32:27 -07:00
Teknium	3cba81ebed	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).	2026-04-20 12:23:05 -07:00
kshitijk4poor	e485bc60cd	test(kimi): cover api.moonshot.cn direct-call regressions - add run_agent coverage for the Moonshot China endpoint - add sync/async trajectory compressor coverage for api.moonshot.cn	2026-04-20 00:32:06 -07:00
kshitijk4poor	50d6799389	fix: propagate kimi base-url temperature overrides Follow up salvaged PR #12668 by threading base_url through the remaining direct-call sites so kimi-k2.5 uses temperature=1.0 on api.moonshot.ai and keeps 0.6 on api.kimi.com/coding. Add focused regression tests for run_agent, trajectory_compressor, and mini_swe_runner.	2026-04-19 18:54:35 -07:00
Teknium	aa5bd09232	fix(tests): unstick CI — sweep stale tests from recent merges (#12670) One source fix (web_server category merge) + five test updates that didn't travel with their feature PRs. All 13 failures on the 04-19 CI run on main are now accounted for (5 already self-healed on main; 8 fixed here). Changes - web_server.py: add code_execution → agent to _CATEGORY_MERGE (new singleton section from #11971 broke no-single-field-category invariant). - test_browser_camofox_state: bump hardcoded _config_version 18 → 19 (also from #11971). - test_registry: add browser_cdp_tool (#12369) and discord_tool (#4753) to the expected built-in tool set. - test_run_agent::test_tool_call_accumulation: rewrite fragment chunks — commit `0f778f77` switched streaming name-accumulation from += to = to fix MiniMax/NIM duplication; the test still encoded the old fragment-per-chunk premise. - test_concurrent_interrupt::_Stub: no-op _apply_pending_steer_to_tool_results — #12116 added this call after concurrent tool batches; the hand-rolled stub was missing it. - test_codex_cli_model_picker: drop the two obsolete tests that asserted auto-import from ~/.codex/auth.json into the Hermes auth store. #12360 explicitly removed that behavior (refresh-token reuse races with Codex CLI / VS Code); adoption is now explicit via `hermes auth openai-codex`. Remaining 3 tests in the file (normal path, Claude Code fallback, negative case) still cover the picker. Validation - scripts/run_tests.sh across all 6 affected files + surrounding tests (54 tests total) all green locally.	2026-04-19 12:39:58 -07:00
Teknium	f1fe29d1c3	feat(providers): extend request_timeout_seconds to all client paths Follow-up on top of mvanhorn's cherry-picked commit. Original PR only wired request_timeout_seconds into the explicit-creds OpenAI branch at run_agent.py init; router-based implicit auth, native Anthropic, and the fallback chain were still hardcoded to SDK defaults. - agent/anthropic_adapter.py: build_anthropic_client() accepts an optional timeout kwarg (default 900s preserved when unset/invalid). - run_agent.py: resolve per-provider/per-model timeout once at init; apply to Anthropic native init + post-refresh rebuild + stale/interrupt rebuilds + switch_model + _restore_primary_runtime + the OpenAI implicit-auth path + _try_activate_fallback (with immediate client rebuild so the first fallback request carries the configured timeout). - tests: cover anthropic adapter kwarg honoring; widen mock signatures to accept the new timeout kwarg. - docs/example: clarify that the knob now applies to every transport, the fallback chain, and rebuilds after credential rotation.	2026-04-19 11:23:00 -07:00
helix4u	cd59af17cc	fix(agent): silence quiet_mode in python library use	2026-04-19 00:28:25 -07:00
Tranquil-Flow	ec48ec5530	fix(agent): strip <think> blocks from stored assistant content Inline reasoning tags in an assistant message's content field leak to every downstream consumer: messaging platforms (#8878, #9568), API replay of prior turns, session transcript, CLI recap, generated session titles, and context compression. _extract_reasoning() already captures the reasoning text into msg['reasoning'] separately, so the raw tags in content are redundant. Stripping once at the storage boundary in _build_assistant_message() cleans the content for every downstream path in one place — no per-platform or per-path stripper needed. Measured impact on a real MiniMax M2.7-highspeed session (per @luoyejiaoe-source, #9306): 55% of assistant messages started with <think> blocks, 51/100 session titles were polluted, 16% content-size reduction. 3 new regression tests in TestBuildAssistantMessage: closed-pair strip with reasoning capture, no-think-tag passthrough, and unterminated-block strip. Resolves #8878 and #9568. Originally proposed as PR #9250.	2026-04-18 19:19:24 -07:00
Teknium	9489d1577d	fix(agent): strip unterminated <think> blocks from visible content Providers served via NIM (MiniMax M2.7, some Moonshot/DeepSeek proxies) sometimes drop the closing </think> tag, leaving raw reasoning in the assistant's content field. _strip_think_blocks()'s closed-pair regex is non-greedy so it only matches complete blocks — any orphan <think>...EOF survived the stripper and leaked to users (#8878, #9568, #10408). Adds an unterminated-tag pass that fires when an open reasoning tag sits at a block boundary (start of text or after a newline) with no matching close. Everything from that tag to end of string is stripped. The block-boundary check mirrors gateway/stream_consumer.py's filter so models that mention <think> in prose are not over-stripped. Also makes the closed-pair regexes consistently case-insensitive so <THINK>...</THINK> and <Thinking>...</Thinking> are handled uniformly — previously the mixed-case open tag would bypass the closed-pair pass and be caught by the unterminated-tag pass, taking trailing visible content with it. 6 new regression tests in TestStripThinkBlocks covering: unterminated <think>, unterminated <thought>, multi-line unterminated, line-start orphan with preserved prefix, prose-mention non-regression, mixed-case closed pairs. The implementation is inspired by @luinbytes's PR #10408 report of the NIM/MiniMax symptom. This commit does not include the 💭/🧠 emoji regexes from that PR — those glyphs are Hermes CLI display decorations, not model content markers.	2026-04-18 19:19:24 -07:00
Teknium	d0e1388ca9	fix(tests): make AIAgent constructor calls self-contained (#11755) * fix(tests): make AIAgent constructor calls self-contained (no env leakage) Tests in tests/run_agent/ were constructing AIAgent() without passing both api_key and base_url, then relying on leaked state from other tests in the same xdist worker (or process-level env vars) to keep provider resolution happy. Under hermetic conftest + pytest-split, that state is gone and the tests fail with 'No LLM provider configured'. Fix: pass both api_key and base_url explicitly on 47 AIAgent() construction sites across 13 files. AIAgent.__init__ with both set takes the direct-construction path (line 960 in run_agent.py) and skips the resolver entirely. One call site (test_none_base_url_passed_as_none) left alone — that test asserts behavior for base_url=None specifically. This is a prerequisite for any future matrix-split or stricter isolation work, and lands cleanly on its own. Validation: - tests/run_agent/ full: 760 passed, 0 failed (local) - Previously relied on cross-test pollution; now self-contained * fix(tests): update opencode-go model order assertion to match kimi-k2.5-first commit `78a74bb` promoted kimi-k2.5 to first position in model suggestion lists but didn't update this test, which has been failing on main since. Reorder expected list to match the new canonical order.	2026-04-17 12:32:03 -07:00
Teknium	77bdad5b02	fix(tests): resolve 12 CI failures + 10 errors across 6 root causes (#11040) Group A (3 tests): 'No LLM provider configured' RuntimeError - test_user_message_surrogates_sanitized, test_counters_initialized_in_init, test_openai_prompt_tokens_unchanged - Root cause: AIAgent.__init__ now requires base_url alongside api_key to skip resolve_provider_client() (which returns None when API keys are blanked in CI). Added base_url='http://localhost:1234/v1' to test agent construction. Group B (5 tests): Discord slash command auto-registration - test_auto_registers_missing_gateway_commands, test_auto_registered_command_, test_register_skill_group_ - Root cause: xdist workers that loaded a discord mock WITHOUT app_commands.Command/Group caused _register_slash_commands() to fail silently. Added comprehensive shared discord mock in tests/gateway/conftest.py (same pattern as existing telegram mock). Group C (5 errors): Discord reply mode 'NoneType has no DMChannel' - All TestReplyToText tests - Root cause: FakeDMChannel was not a subclass of real discord.DMChannel, so isinstance() checks in _handle_message failed when running in full suite (real discord installed). Made FakeDMChannel inherit from discord.DMChannel when available. Removed fragile monkeypatch approach. Group D (2 tests): detect_provider_for_model wrong provider - test_openrouter_slug_match (got 'ai-gateway'), test_bare_name_gets_openrouter_slug (got 'copilot') - Root cause: ai-gateway, copilot, and kilocode are multi-vendor aggregators that list other providers' models (OpenRouter-style slugs). They were being matched in Step 1 before OpenRouter. Added all three to _AGGREGATORS set so they're skipped like nous/openrouter. Group E (1 test): model_flow_custom StopIteration - test_model_flow_custom_saves_verified_v1_base_url - Root cause: 'Display name' prompt was added after the test was written. The input iterator had 5 answers but the flow now asks 6 questions. Added 6th empty string answer. 
Group F (1 test): Telegram proxy env assertion - test_uses_proxy_env_for_primary_and_fallback_transports - Root cause: _resolve_proxy_url() now checks TELEGRAM_PROXY first (via resolve_proxy_url('TELEGRAM_PROXY')). Test didn't clear this env var, allowing potential leakage from other tests in xdist workers. Added TELEGRAM_PROXY to the cleanup list.	2026-04-16 06:49:36 -07:00
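
The two-pass reasoning-tag cleanup described in commits ec48ec5530 and 9489d1577d above can be sketched roughly as follows. This is a minimal illustration, not the code in run_agent.py: the function name `strip_think_blocks`, the pattern names, and the final whitespace trim are assumptions made for the example, and only the `<think>`, `<thinking>`, and `<thought>` tags named in the commit messages are covered.

```python
import re

# Tag names taken from the commit messages; everything else here is hypothetical.
_TAGS = r"(?:think|thinking|thought)"

# Pass 1: complete open/close pairs, non-greedy and case-insensitive,
# so <THINK>...</THINK> is handled the same way as <think>...</think>.
_CLOSED = re.compile(rf"<{_TAGS}>.*?</{_TAGS}>", re.IGNORECASE | re.DOTALL)

# Pass 2: an open tag sitting at a block boundary (start of text or right
# after a newline, optionally indented) with no close tag left in the text.
# Everything from that tag to the end of the string is dropped.
_UNTERMINATED = re.compile(
    rf"(?:^|(?<=\n))[ \t]*<{_TAGS}>.*\Z", re.IGNORECASE | re.DOTALL
)


def strip_think_blocks(text: str) -> str:
    """Remove reasoning blocks from visible assistant content (sketch)."""
    text = _CLOSED.sub("", text)        # closed pairs first
    text = _UNTERMINATED.sub("", text)  # then any orphan open tag
    return text.strip()                 # trimming is a choice of this sketch


if __name__ == "__main__":
    print(strip_think_blocks("Answer first.\n<think>orphan reasoning"))
```

Running the closed-pair pass first matters: an orphan check applied to text that still contains complete blocks would cut from the first open tag to the end and take the visible answer with it. A mid-sentence mention such as "Use <think> tags in prose" is left alone because the open tag is not at a block boundary.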

77 commits