hermes-agent

5839 commits 895 branches 10 tags 1.7 GiB

Author	SHA1	Message	Date
Teknium	62cbeb6367	test: stop testing mutable data — convert change-detectors to invariants (#13363) Catalog snapshots, config version literals, and enumeration counts are data that changes as designed. Tests that assert on those values add no behavioral coverage — they just break CI on every routine update and cost engineering time to 'fix.' Replace with invariants where one exists, delete where none does. Deleted (pure snapshots): - TestMinimaxModelCatalog (3 tests): 'MiniMax-M2.7 in models' et al - TestGeminiModelCatalog: 'gemini-2.5-pro in models', 'gemini-3.x in models' - test_browser_camofox_state::test_config_version_matches_current_schema (docstring literally said it would break on unrelated bumps) Relaxed (keep plumbing check, drop snapshot): - Xiaomi / Arcee / Kimi moonshot / Kimi coding / HuggingFace static lists: now assert 'provider exists and has >= 1 entry' instead of specific names - HuggingFace main/models.py consistency test: drop 'len >= 6' floor Dynamicized (follow source, not a literal): - 3x test_config.py migration tests: raw['_config_version'] == DEFAULT_CONFIG['_config_version'] instead of hardcoded 21 Fixed stale tests against intentional behavior changes: - test_insights::test_gateway_format_hides_cost: name matches new behavior (no dollar figures); remove contradicting '$' in text assertion - test_config::prefers_api_then_url_then_base_url: flipped per PR #9332; rename + update to base_url > url > api - test_anthropic_adapter: relax assert_called_once() (xdist-flaky) to assert called — contract is 'credential flowed through' - test_interrupt_propagation: add provider/model/_base_url to bare-agent fixture so the stale-timeout code path resolves Fixed stale integration tests against opt-in plugin gate: - transform_tool_result + transform_terminal_output: write plugins.enabled allow-list to config.yaml and reset the plugin manager singleton Source fix (real consistency invariant): - agent/model_metadata.py: add moonshotai/Kimi-K2.6 context length (262144, same as K2.5). test_model_metadata_has_context_lengths was correctly catching the gap. Policy: - AGENTS.md Testing section: new subsection 'Don't write change-detector tests' with do/don't examples. Reviewers should reject catalog-snapshot assertions in new tests. Covers every test that failed on the last completed main CI run (24703345583) except test_modal_sandbox_fixes::test_terminal_tool_present + test_terminal_and_file_toolsets_resolve_all_tools, which now pass both alone and with the full tests/tools/ directory (xdist ordering flake that resolved itself).	2026-04-20 23:20:33 -07:00
Austin Pickett	720e1c65b2	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
Teknium	e33cb65a98	fix(insights): hide cache read/write and cost metrics from display (#11477) The cache-read, cache-write, and total estimated-cost values shown in /insights (and the per-model Cost column) were unreliable. Hide them from both terminal and gateway renderings. The underlying data pipeline is untouched — sessions still store cache_read_tokens, cache_write_tokens, and estimated_cost_usd; the web server, /usage command, and status bar are unaffected. Only the InsightsEngine display layer is trimmed. Changes: - format_terminal: drop 'Cache read / Cache write' line, drop 'Est. cost' from the Total tokens row, drop per-model 'Cost' column, drop the '* Cost N/A for custom/self-hosted' footnote. - format_gateway: drop cache breakdown from Tokens line, drop 'Est. cost' line, drop per-model cost suffix. - Tests updated to assert these strings are now absent.	2026-04-17 01:02:06 -07:00
Arihant Sethia	857b543543	feat: add skill analytics to the dashboard Expose skill usage in analytics so the dashboard and insights output can show which skills the agent loads and manages over time. This adds skill aggregation to the InsightsEngine by extracting `skill_view` and `skill_manage` calls from assistant tool_calls, computing per-skill totals, and including the results in both terminal and gateway insights formatting. It also extends the dashboard analytics API and Analytics page to render a Top Skills table. Terminology is aligned with the skills docs: - Agent Loaded = `skill_view` events - Agent Managed = `skill_manage` actions Architecture: - agent/insights.py collects and aggregates per-skill usage - hermes_cli/web_server.py exposes `skills` on `/api/analytics/usage` - web/src/lib/api.ts adds analytics skill response types - web/src/pages/AnalyticsPage.tsx renders the Top Skills table - web/src/i18n/{en,zh}.ts updates user-facing labels Tests: - tests/agent/test_insights.py covers skill aggregation and formatting - tests/hermes_cli/test_web_server.py covers analytics API contract including the `skills` payload - verified with `cd web && npm run build` Files changed: - agent/insights.py - hermes_cli/web_server.py - tests/agent/test_insights.py - tests/hermes_cli/test_web_server.py - web/src/i18n/en.ts - web/src/i18n/types.ts - web/src/i18n/zh.ts - web/src/lib/api.ts - web/src/pages/AnalyticsPage.tsx	2026-04-15 06:44:43 +00:00
alt-glitch	96c060018a	fix: remove 115 verified dead code symbols across 46 production files Automated dead code audit using vulture + coverage.py + ast-grep intersection, confirmed by Opus deep verification pass. Every symbol verified to have zero production callers (test imports excluded from reachability analysis). Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py, hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py). Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-04-10 03:44:43 -07:00
Siddharth Balyan	f3006ebef9	refactor(tests): re-architect tests + fix CI failures (#5946) * refactor: re-architect tests to mirror the codebase * Update tests.yml * fix: add missing tool_error imports after registry refactor * fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers, causing test_code_execution tests to hit the Modal remote path. * fix(tests): fix update_check and telegram xdist failures - test_update_check: replace patch("hermes_cli.banner.os.getenv") with monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os directly, it uses get_hermes_home() from hermes_constants. - test_telegram_conflict/approval_buttons: provide real exception classes for telegram.error mock (NetworkError, TimedOut, BadRequest) so the except clause in connect() doesn't fail with "catching classes that do not inherit from BaseException" when xdist pollutes sys.modules. * fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock	2026-04-07 17:19:07 -07:00

Renamed from tests/test_insights.py

6 commits
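
The change-detector policy described in commit 62cbeb6367 can be sketched as follows. This is a minimal illustration, not the repository's code: `DEFAULT_CONFIG`, `PROVIDER_MODELS`, `migrate`, and the provider and model names are hypothetical stand-ins, and the version value 21 is only a placeholder taken from the commit message.

```python
# Hypothetical stand-ins for the real config defaults and model catalog.
DEFAULT_CONFIG = {"_config_version": 21}
PROVIDER_MODELS = {"example-provider": ["model-a", "model-b"]}


def migrate(raw: dict) -> dict:
    """Toy migration: bring any older config up to the current schema version."""
    migrated = dict(raw)
    migrated["_config_version"] = DEFAULT_CONFIG["_config_version"]
    return migrated


def test_migration_reaches_current_version():
    # Invariant: compare against the source of truth, not a hardcoded literal,
    # so a routine schema bump does not break the test.
    raw = migrate({"_config_version": 3})
    assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]


def test_provider_has_models():
    # Invariant: the provider exists and lists at least one model.
    # A change-detector would instead assert that a specific model name is present.
    models = PROVIDER_MODELS.get("example-provider", [])
    assert len(models) >= 1
```

Both tests keep the plumbing check (the migration runs, the catalog is populated) while staying green when the version number or the model list changes as designed.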