``HomeAssistantAdapter._handle_ha_event`` writes the per-entity cooldown
timestamp *before* calling ``_format_state_change``, which is what
actually decides whether the event will be forwarded. For events
where ``old_state == new_state`` (or where ``new_state`` is missing),
the formatter returns ``None`` and the function returns early — but
``self._last_event_time[entity_id]`` has already been advanced.
As a result, a rapid no-op event "uses up" the cooldown window and
suppresses the next genuine state change. Reporter: #12062.
Root cause
----------
``gateway/platforms/homeassistant.py`` lines 286-299::
    # Apply cooldown
    now = time.time()
    last = self._last_event_time.get(entity_id, 0)
    if (now - last) < self._cooldown_seconds:
        return
    self._last_event_time[entity_id] = now   # <- advanced before we know
                                             #    the event forwards
    old_state = event_data.get("old_state", {})
    new_state = event_data.get("new_state", {})
    message = self._format_state_change(entity_id, old_state, new_state)
    if not message:                          # <- no-op / malformed → None,
        return                               #    but cooldown already burned
Fix
---
Keep the cooldown *check* early (so throttled events don't waste time
formatting), but move the cooldown *write* to after ``_format_state_change``
returns a non-empty message. Only events that are actually forwarded
consume the cooldown window.
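A minimal sketch of the reordered logic, assuming a stripped-down stand-in for the adapter. ``CooldownSketch``, its ``handle`` method, the ``now`` parameter and the ``forwarded`` list are hypothetical. Only ``_last_event_time``, ``_cooldown_seconds`` and ``_format_state_change`` are names from the real code, and the formatter body here is a simplification, not the actual implementation:

```python
import time


class CooldownSketch:
    """Hypothetical stand-in for the adapter; not the real class."""

    def __init__(self, cooldown_seconds=60):
        self._cooldown_seconds = cooldown_seconds
        self._last_event_time = {}
        self.forwarded = []  # hypothetical sink standing in for handle_message

    def _format_state_change(self, entity_id, old_state, new_state):
        # Simplified: None when new_state is missing or the state is unchanged.
        if not new_state:
            return None
        if (old_state or {}).get("state") == new_state.get("state"):
            return None
        return f"{entity_id}: {new_state.get('state')}"

    def handle(self, event_data, now=None):
        entity_id = event_data["entity_id"]
        now = time.time() if now is None else now

        # Cooldown *check* stays early so throttled events skip formatting.
        last = self._last_event_time.get(entity_id, 0)
        if (now - last) < self._cooldown_seconds:
            return

        message = self._format_state_change(
            entity_id, event_data.get("old_state"), event_data.get("new_state")
        )
        if not message:
            return  # no-op / malformed: cooldown is left untouched

        # Cooldown *write* happens only once we know the event forwards.
        self._last_event_time[entity_id] = now
        self.forwarded.append(message)
```

With this ordering, a no-op event leaves no entry in ``_last_event_time``, so the next genuine change for the same entity is still forwarded.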
No API / config / public-behaviour change. Two lines effectively
swapped; one comment added.
Reproduction (confirmed on origin/main ``6fb69229``)
----------------------------------------------------
::
    ha = HomeAssistantAdapter(PlatformConfig(enabled=True, token='t', extra={
        'url': 'http://x', 'watch_all': True, 'cooldown_seconds': 60,
    }))
    ha.handle_message = AsyncMock()
    await ha._handle_ha_event({'data': {'entity_id': 'sensor.temp',
        'old_state': {'state': '20'},
        'new_state': {'state': '20', 'attributes': {}}}})
    await ha._handle_ha_event({'data': {'entity_id': 'sensor.temp',
        'old_state': {'state': '20'},
        'new_state': {'state': '21', 'attributes': {}}}})
    assert ha.handle_message.await_count == 1   # fails on main (0)
Side benefit
------------
``_last_event_time`` no longer grows unbounded with entries for
entities that only ever emit no-op events.
Regression coverage
-------------------
``tests/gateway/test_homeassistant.py`` gets a new
``TestCooldownIssue12062`` class with 5 cases:
* ``test_no_op_state_change_does_not_consume_cooldown`` — reporter's
exact scenario.
* ``test_no_op_does_not_write_last_event_time`` — structural pin on
the cooldown map.
* ``test_missing_new_state_does_not_consume_cooldown`` — covers the
other ``_format_state_change → None`` branch.
* ``test_forwarded_event_still_consumes_cooldown`` — preserved-
behaviour canary so the fix can't silently disable cooldown.
* ``test_no_op_then_real_change_across_entities`` — independent
per-entity accounting.
4 of the 5 fail on clean ``origin/main`` with the reporter symptom;
the 5th pins preserved behaviour.
Validation
----------
``source venv/bin/activate && python -m pytest
tests/gateway/test_homeassistant.py -q`` → **50 passed** (45
pre-existing + 5 new).
Broader ``tests/gateway`` under ``-n auto`` → 13 pre-existing
baseline failures (dingtalk card lifecycle, matrix encrypted upload,
approve/deny E2E, whatsapp bridge runtime / xdist flakes). Zero in
``test_homeassistant.py`` or any touched code path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
First pass of test-suite reduction to address flaky CI and bloat.
Removed tests that fall into these change-detector patterns:
1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
that call inspect.getsource() on production modules and grep for string
literals. Break on any refactor/rename even when behavior is correct.
2. Platform enum tautologies (every gateway/test_X.py): assertions like
`Platform.X.value == 'x'` duplicated across ~9 adapter test files.
3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
only verify a key exists in a dict. Data-layout tests, not behavior.
4. Argparse wiring tests (test_argparse_flag_propagation,
test_subparser_routing_fallback): tests that do parser.parse_args([...]) then
assert args.field. Tests Python's argparse, not our code.
5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
cmd_X, call plugins_command with matching action, assert mock called.
Tests the if/elif chain, not behavior.
6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
that mock the external API client, call our function, and assert exact
kwargs. Break on refactor even when behavior is preserved.
7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
tests): tests that patch the module's own helper method, then assert it was called.
Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.
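To illustrate the distinction, here is a small sketch contrasting a removed-style tautology (pattern 2) with a kept-style behavioral test. The ``Platform`` enum below is a hypothetical stand-in and ``parse_platform`` is an invented helper for illustration only; neither is taken from the codebase:

```python
from enum import Enum


class Platform(Enum):  # hypothetical stand-in for the real enum
    HOMEASSISTANT = "homeassistant"


def parse_platform(name):  # hypothetical helper, for illustration only
    """Normalize user input and map it to a Platform member."""
    return Platform(name.strip().lower())


def test_enum_tautology():
    # Removed style: restates the definition and breaks on any rename.
    assert Platform.HOMEASSISTANT.value == "homeassistant"


def test_parse_platform_normalizes_input():
    # Kept style: exercises behavior a user could actually observe.
    assert parse_platform("  HomeAssistant ") is Platform.HOMEASSISTANT
```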
Net reduction: 169 tests removed. 38 empty classes cleaned up.
Collected before: 12,522 tests
Collected after: 12,353 tests
Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.
Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)
A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.
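The drop-by-default decision can be sketched roughly as follows. The function name ``should_forward`` and the order in which the options are checked are assumptions for illustration; only the ``watch_domains``, ``watch_entities`` and ``watch_all`` option names come from this change:

```python
def should_forward(entity_id, watch_domains=(), watch_entities=(), watch_all=False):
    """Return True if a state_changed event for entity_id should reach the agent."""
    if watch_all:
        return True  # explicit opt-in to every event
    if entity_id in watch_entities:
        return True
    domain = entity_id.split(".", 1)[0]  # "light.kitchen" -> "light"
    if domain in watch_domains:
        return True
    return False  # nothing configured, or no filter matched: drop
```

For example, ``should_forward("light.kitchen")`` is ``False`` with no configuration, while ``should_forward("light.kitchen", watch_domains=["light"])`` is ``True``.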
All 49 gateway HA tests + 52 HA tool tests pass.
Improvements to the HA integration merged from PR #184:
- Add ha_list_services tool: discovers available services (actions) per
domain with descriptions and parameter fields. Tells the model what
it can do with each device type (e.g. light.turn_on accepts brightness,
color_name, transition). Closes the gap where the model had to guess
available actions.
- Add HA to hermes tools config: users can enable/disable the homeassistant
toolset and configure HASS_TOKEN + HASS_URL through 'hermes tools' setup
flow instead of manually editing .env.
- Fix should-fix items from code review:
- Remove sys.path.insert hack from gateway adapter
- Replace all print() calls with proper logger (info/warning/error)
- Move env var reads from import-time to handler-time via _get_config()
- Add dedicated REST session reuse in gateway send()
- Update ha_call_service description to reference ha_list_services for
action discovery.
- Update tests for new ha_list_services tool in toolset resolution.
- Add ha_list_entities, ha_get_state, ha_call_service tools via REST API
- Add WebSocket gateway adapter for real-time state_changed event monitoring
- Support domain/entity filtering, cooldown, and auto-reconnect with backoff
- Use REST API for outbound notifications to avoid WS race condition
- Gate tool availability on HASS_TOKEN env var
- Add 82 unit tests covering real logic (filtering, payload building, event pipeline)
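One common way to implement reconnect with backoff is a capped exponential delay schedule, sketched below. The generator name ``backoff_delays`` and the base, factor and cap values are assumptions for illustration; the actual schedule used by the adapter is not stated here:

```python
import itertools


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield reconnect delays in seconds: base, base*factor, ..., capped at cap."""
    delay = base
    while True:
        yield delay
        delay = min(delay * factor, cap)


# First eight delays with the assumed defaults.
first_delays = list(itertools.islice(backoff_delays(), 8))
```

A reconnect loop would sleep for the next delay after each failed connection attempt and start a fresh generator once a connection succeeds.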