mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-30 01:41:43 +00:00
feat: add Codex fast mode toggle (/fast command)
Add /fast slash command to toggle OpenAI Codex service_tier between
normal and priority ('fast') inference. Only exposed for models
registered in _FAST_MODE_BACKEND_CONFIG (currently gpt-5.4).
- Registry-based backend config for extensibility
- Dynamic command visibility (hidden from help/autocomplete for
non-supported models) via command_filter on SlashCommandCompleter
- service_tier flows through request_overrides from route resolution
- Omit max_output_tokens for Codex backend (rejects it)
- Persists to config.yaml under agent.service_tier
Salvage cleanup: removed simple_term_menu/input() menu (banned),
bare /fast now shows status like /reasoning. Removed redundant
override resolution in _build_api_kwargs — single source of truth
via request_overrides from route.
Co-authored-by: Hermes Agent <hermes@nousresearch.com>
This commit is contained in:
parent
4caa635803
commit
d416a69288
9 changed files with 473 additions and 5 deletions
|
|
@ -648,6 +648,15 @@ def test_preflight_codex_api_kwargs_allows_reasoning_and_temperature(monkeypatch
|
|||
assert result["max_output_tokens"] == 4096
|
||||
|
||||
|
||||
def test_preflight_codex_api_kwargs_allows_service_tier(monkeypatch):
|
||||
agent = _build_agent(monkeypatch)
|
||||
kwargs = _codex_request_kwargs()
|
||||
kwargs["service_tier"] = "priority"
|
||||
|
||||
result = agent._preflight_codex_api_kwargs(kwargs)
|
||||
assert result["service_tier"] == "priority"
|
||||
|
||||
|
||||
def test_run_conversation_codex_replay_payload_keeps_call_id(monkeypatch):
|
||||
agent = _build_agent(monkeypatch)
|
||||
responses = [_codex_tool_call_response(), _codex_message_response("done")]
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue