fix(gateway): replace os.environ session state with contextvars for concurrency safety

When two gateway messages arrived concurrently, _set_session_env wrote
HERMES_SESSION_PLATFORM/CHAT_ID/CHAT_NAME/THREAD_ID into the process-global
os.environ. Because asyncio tasks share the same process, Message B would
overwrite Message A's values mid-flight, causing background-task notifications
and tool calls to route to the wrong thread/chat.

Replace os.environ with Python's contextvars.ContextVar. Each asyncio task
(and any run_in_executor thread it spawns) gets its own copy of the session
values, so concurrent messages never interfere.

Changes:
- New gateway/session_context.py with ContextVar definitions, set/clear/get
  helpers, and os.environ fallback for CLI/cron/test backward compatibility
- gateway/run.py: _set_session_env returns reset tokens, _clear_session_env
  accepts them for proper cleanup in finally blocks
- All tool consumers updated: cronjob_tools, send_message_tool, skills_tool,
  terminal_tool (both notify_on_complete AND check_interval blocks), tts_tool,
  agent/skill_utils, agent/prompt_builder
- Tests updated for new contextvar-based API
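
A rough sketch of how such a helper module might look, assuming the four
variable names expand to HERMES_SESSION_<FIELD>. Only get_session_env and its
(name, default) call shape appear in the diff; set_session_vars and
clear_session_vars are hypothetical names, and the real
gateway/session_context.py may differ.

```python
import os
from contextvars import ContextVar, Token
from typing import Dict, List

# Assumed variable names; a default of None means "not set in this context".
_VARS: Dict[str, ContextVar] = {
    name: ContextVar(name, default=None)
    for name in (
        "HERMES_SESSION_PLATFORM",
        "HERMES_SESSION_CHAT_ID",
        "HERMES_SESSION_CHAT_NAME",
        "HERMES_SESSION_THREAD_ID",
    )
}


def set_session_vars(**values: str) -> List[Token]:
    """Hypothetical setter: bind values in the current context, return tokens."""
    return [_VARS[name].set(value) for name, value in values.items()]


def clear_session_vars(tokens: List[Token]) -> None:
    """Hypothetical cleanup: restore each variable to its previous value."""
    for token in reversed(tokens):
        token.var.reset(token)


def get_session_env(name: str, default: str = "") -> str:
    """Prefer the context-local value, then os.environ, then the default."""
    var = _VARS.get(name)
    value = var.get() if var is not None else None
    if value is not None:
        return value
    return os.environ.get(name, default)


# Usage: set on entry, reset in a finally block so nothing leaks afterwards.
tokens = set_session_vars(HERMES_SESSION_PLATFORM="telegram")
try:
    inside = get_session_env("HERMES_SESSION_PLATFORM", "cli")
finally:
    clear_session_vars(tokens)
```

The os.environ fallback is what lets CLI, cron, and test callers that never
set a ContextVar keep reading their values from the environment.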

Fixes #7358

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
0xFrank-eth 2026-04-10 16:50:56 -07:00 committed by Teknium
parent dab5ec8245
commit e8034e2f6a
10 changed files with 255 additions and 52 deletions

@@ -212,7 +212,8 @@ def _handle_send(args):
     if isinstance(result, dict) and result.get("success") and mirror_text:
         try:
             from gateway.mirror import mirror_to_session
-            source_label = os.getenv("HERMES_SESSION_PLATFORM", "cli")
+            from gateway.session_context import get_session_env
+            source_label = get_session_env("HERMES_SESSION_PLATFORM", "cli")
             if mirror_to_session(platform_name, chat_id, mirror_text, source_label=source_label, thread_id=thread_id):
                 result["mirrored"] = True
         except Exception:
@@ -1023,7 +1024,8 @@ async def _send_feishu(pconfig, chat_id, message, media_files=None, thread_id=No
 def _check_send_message():
     """Gate send_message on gateway running (always available on messaging platforms)."""
-    platform = os.getenv("HERMES_SESSION_PLATFORM", "")
+    from gateway.session_context import get_session_env
+    platform = get_session_env("HERMES_SESSION_PLATFORM", "")
     if platform and platform != "local":
         return True
     try: