test: stop testing mutable data — convert change-detectors to invariants (#13363)

Catalog snapshots, config version literals, and enumeration counts are data
that changes as designed. Tests that assert on those values add no
behavioral coverage — they just break CI on every routine update and cost
engineering time to 'fix.'

Replace with invariants where one exists, delete where none does.

Deleted (pure snapshots):
- TestMinimaxModelCatalog (3 tests): 'MiniMax-M2.7 in models' et al
- TestGeminiModelCatalog: 'gemini-2.5-pro in models', 'gemini-3.x in models'
- test_browser_camofox_state::test_config_version_matches_current_schema
  (docstring literally said it would break on unrelated bumps)

Relaxed (keep plumbing check, drop snapshot):
- Xiaomi / Arcee / Kimi moonshot / Kimi coding / HuggingFace static lists:
  now assert 'provider exists and has >= 1 entry' instead of specific names
- HuggingFace main/models.py consistency test: drop 'len >= 6' floor

Dynamicized (follow source, not a literal):
- 3x test_config.py migration tests: raw['_config_version'] ==
  DEFAULT_CONFIG['_config_version'] instead of hardcoded 21

Fixed stale tests against intentional behavior changes:
- test_insights::test_gateway_format_hides_cost: name matches new behavior
  (no dollar figures); remove contradicting '$' in text assertion
- test_config::prefers_api_then_url_then_base_url: flipped per PR #9332;
  rename + update to base_url > url > api
- test_anthropic_adapter: relax assert_called_once() (xdist-flaky) to
  assert called — contract is 'credential flowed through'
- test_interrupt_propagation: add provider/model/_base_url to bare-agent
  fixture so the stale-timeout code path resolves

Fixed stale integration tests against opt-in plugin gate:
- transform_tool_result + transform_terminal_output: write plugins.enabled
  allow-list to config.yaml and reset the plugin manager singleton

Source fix (real consistency invariant):
- agent/model_metadata.py: add moonshotai/Kimi-K2.6 context length
  (262144, same as K2.5). test_model_metadata_has_context_lengths was
  correctly catching the gap.

Policy:
- AGENTS.md Testing section: new subsection 'Don't write change-detector
  tests' with do/don't examples. Reviewers should reject catalog-snapshot
  assertions in new tests.

Covers every test that failed on the last completed main CI run
(24703345583) except test_modal_sandbox_fixes::test_terminal_tool_present
+ test_terminal_and_file_toolsets_resolve_all_tools, which now pass both
alone and with the full tests/tools/ directory (xdist ordering flake that
resolved itself).
This commit is contained in:
Teknium 2026-04-20 23:20:33 -07:00 committed by GitHub
parent 7ab5eebd03
commit 62cbeb6367
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
14 changed files with 113 additions and 80 deletions

View file

@ -459,7 +459,8 @@ class TestCustomProviderCompatibility:
migrate_config(interactive=False, quiet=True)
raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
assert raw["_config_version"] == 21
from hermes_cli.config import DEFAULT_CONFIG
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert raw["providers"]["openai-direct"] == {
"api": "https://api.openai.com/v1",
"api_key": "test-key",
@ -501,7 +502,8 @@ class TestCustomProviderCompatibility:
assert compatible[0]["provider_key"] == "openai-direct"
assert compatible[0]["api_mode"] == "codex_responses"
def test_compatible_custom_providers_prefers_api_then_url_then_base_url(self, tmp_path):
def test_compatible_custom_providers_prefers_base_url_then_url_then_api(self, tmp_path):
"""URL field precedence is base_url > url > api (PR #9332)."""
config_path = tmp_path / "config.yaml"
config_path.write_text(
yaml.safe_dump(
@ -526,7 +528,7 @@ class TestCustomProviderCompatibility:
assert compatible == [
{
"name": "My Provider",
"base_url": "https://api.example.com/v1",
"base_url": "https://base.example.com/v1",
"provider_key": "my-provider",
}
]
@ -606,7 +608,8 @@ class TestInterimAssistantMessageConfig:
migrate_config(interactive=False, quiet=True)
raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
assert raw["_config_version"] == 21
from hermes_cli.config import DEFAULT_CONFIG
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert raw["display"]["tool_progress"] == "off"
assert raw["display"]["interim_assistant_messages"] is True
@ -626,7 +629,8 @@ class TestDiscordChannelPromptsConfig:
migrate_config(interactive=False, quiet=True)
raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
assert raw["_config_version"] == 21
from hermes_cli.config import DEFAULT_CONFIG
assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
assert raw["discord"]["auto_thread"] is True
assert raw["discord"]["channel_prompts"] == {}