0xbyt4
3b43f7267a
fix: count actual tool calls instead of tool-related messages
...
tool_call_count was inaccurate in two ways:
1. Under-counting: an assistant message with N parallel tool calls
(e.g. "kill the light and shut off the fan" = 2 ha_call_service)
only incremented tool_call_count by 1 instead of N.
2. Over-counting: tool response messages (role=tool) also incremented
tool_call_count, double-counting every tool interaction.
Combined: 2 parallel tool calls produced tool_call_count=3 (1 from
assistant + 2 from tool responses) instead of the correct value of 2.
Fix: only count from assistant messages with tool_calls, incrementing
by len(tool_calls) to handle parallel calls correctly. Tool response
messages no longer affect tool_call_count.
This impacts /insights and /usage accuracy for sessions with tool use.
2026-03-07 04:07:52 +03:00
teknium1
a44e041acf
test: strengthen assertions across 7 test files (batch 1)
...
Replaced weak 'is not None' / '> 0' / 'len >= 1' assertions with
concrete value checks across the most flagged test files:
gateway/test_pairing.py (11 weak → 0):
- Code assertions verify isinstance + len == CODE_LENGTH
- Approval results verify dict structure + specific user_id/user_name
- Added code2 != code1 check in rate_limit_expires
test_hermes_state.py (6 weak → 0):
- ended_at verified as float timestamp
- Search result counts exact (== 2, not >= 1)
- Context verified as non-empty list
- Export verified as dict, session ID verified
test_cli_init.py (4 weak → 0):
- max_turns asserts exact value (60)
- model asserts string with provider/name format
gateway/test_hooks.py (2 zero-assert tests → fixed):
- test_no_handlers_for_event: verifies no handler registered
- test_handler_error_does_not_propagate: verifies handler count + return
gateway/test_platform_base.py (9 weak image tests → fixed):
- extract_images tests now verify actual URL and alt_text
- truncate_message verifies content preservation after splitting
cron/test_scheduler.py (1 weak → 0):
- resolve_origin verifies dict equality, not just existence
cron/test_jobs.py (2 weak → 0 + 4 new tests):
- Schedule parsing verifies ISO timestamp type
- Cron expression verifies result is valid datetime string
- NEW: 4 tests for update_job() (was completely untested)
2026-03-05 18:39:37 -08:00
0xbyt4
0ac3af8776
test: add unit tests for 8 untested modules
...
Add comprehensive test coverage for:
- cron/jobs.py: schedule parsing, job CRUD, due-job detection (34 tests)
- tools/memory_tool.py: security scanning, MemoryStore ops, dispatcher (32 tests)
- toolsets.py: resolution, validation, composition, cycle detection (19 tests)
- tools/file_operations.py: write deny list, result dataclasses, helpers (37 tests)
- agent/prompt_builder.py: context scanning, truncation, skills index (24 tests)
- agent/model_metadata.py: token estimation, context lengths (16 tests)
- hermes_state.py: SessionDB SQLite CRUD, FTS5 search, export, prune (28 tests)
Total: 210 new tests, all passing (380 total suite).
2026-02-26 13:27:58 +03:00