fix: return effective session_id after context compression (#16938)

When context compression rotates the agent's session_id to a new
child session, the API server was still returning the stale parent
session_id in the X-Hermes-Session-Id response header.

This caused external clients to keep sending the old session_id,
loading uncompressed parent history instead of the compressed
continuation.
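
A minimal client-side sketch of why the header matters, under stated
assumptions: track_session is a hypothetical helper, and a plain dict
stands in for real (case-insensitive) HTTP response headers. Only the
X-Hermes-Session-Id header name comes from this commit.

```python
def track_session(current_id, response_headers):
    """Return the session ID to send on the next request (hypothetical helper)."""
    # Adopt the server-reported ID if present; otherwise keep the current one.
    return response_headers.get("X-Hermes-Session-Id", current_id)


session_id = "parent"
# Simulated response headers after a compression-triggered rotation.
session_id = track_session(session_id, {"X-Hermes-Session-Id": "child"})
```

A client that follows the header this way only lands on the compressed
continuation if the server reports the rotated ID.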

Fix: _run_agent() now includes the effective session_id in its
result dict, and the response header uses it, falling back to the
originally provided session_id when the result has none.
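
The fix can be sketched in isolation as follows. This is an
illustration, not the actual server code: FakeAgent, run_agent,
build_headers, and the "output" key are hypothetical stand-ins. Only
the header name, the result["session_id"] key, and the getattr/.get
fallbacks are taken from the diff.

```python
class FakeAgent:
    """Hypothetical stand-in for the real agent."""

    def __init__(self, session_id):
        self.session_id = session_id

    def run(self, compress=False):
        if compress:
            # Simulate context compression rotating to a child session.
            self.session_id = self.session_id + "-child"
        return {"output": "ok"}  # hypothetical payload


def run_agent(agent, session_id, compress=False):
    result = agent.run(compress=compress)
    # Report the session the agent ended on, falling back to the caller's ID.
    result["session_id"] = getattr(agent, "session_id", session_id)
    return result


def build_headers(result, session_id):
    # Prefer the effective session ID; fall back to the provided one.
    return {"X-Hermes-Session-Id": result.get("session_id", session_id)}


rotated = build_headers(
    run_agent(FakeAgent("parent"), "parent", compress=True), "parent"
)
unchanged = build_headers(run_agent(FakeAgent("parent"), "parent"), "parent")
```

With a rotation the header carries the child ID; without one it is
the same ID the caller provided, so existing clients see no change.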
vominh1919 2026-04-28 20:31:02 +07:00 committed by Teknium
parent 34c6f93496
commit 7f735b4db2


@@ -1209,7 +1209,9 @@ class APIServerAdapter(BasePlatformAdapter):
     },
 }
-response_headers = {"X-Hermes-Session-Id": session_id}
+response_headers = {
+    "X-Hermes-Session-Id": result.get("session_id", session_id),
+}
 if gateway_session_key:
     response_headers["X-Hermes-Session-Key"] = gateway_session_key
 return web.json_response(response_data, headers=response_headers)
@@ -2483,6 +2485,10 @@ class APIServerAdapter(BasePlatformAdapter):
         "output_tokens": getattr(agent, "session_completion_tokens", 0) or 0,
         "total_tokens": getattr(agent, "session_total_tokens", 0) or 0,
     }
+    # Include the effective session ID in the result so callers
+    # (e.g. X-Hermes-Session-Id header) can track compression-
+    # triggered session rotations. (#16938)
+    result["session_id"] = getattr(agent, "session_id", session_id)
     return result, usage
 return await loop.run_in_executor(None, _run)