fix(x_search): surface degraded results + validate dates

The xAI Responses API for x_search returns 200 OK with a
synthesized fluff answer in two failure modes that callers currently
cannot distinguish from a real, citation-backed result:

1. Any narrowing filter (allowed_x_handles, excluded_x_handles,
   from_date, to_date) was active, but the X index returned no
   matching posts. The model then answers from training data.
2. The date range is malformed, inverted, or pure-future (e.g.
   from_date=2030-01-01). The API call burns quota and Grok
   responds with a generic answer.

Mitigations, both client-side:

* Validate from_date / to_date before the HTTP call:
  - Strict YYYY-MM-DD.
  - from_date <= to_date when both set.
  - from_date <= today UTC (no posts in a window that hasn't
    started). to_date in the future remains allowed so callers
    can request 'from yesterday to tomorrow'.

* Add 'degraded' + 'degraded_reason' to successful responses.
  degraded=True iff any narrowing filter was active AND both the
  top-level 'citations' array and inline 'url_citation'
  annotations came back empty. A broad query with no filters that
  returns no citations is *not* flagged degraded — that case is
  just an unsourced answer, not a filter miss.

Tests cover all four validation paths plus six degraded-flag
scenarios (each filter type, inline vs top-level citation
recovery, broad query baseline). All existing tests continue to
pass; the additions are purely additive on the success-path
response shape.

Discovered while testing the x_search toolset end-to-end:
queries scoped to @Teknium1 returned confident-sounding generic
text about Nous Research with zero citations, and from_date in
2030 produced sassy non-answers. Both are now detectable by the
caller.
This commit is contained in:
kshitijk4poor 2026-05-21 02:38:45 +05:30
parent 31a0100104
commit 2a352f96ee
3 changed files with 412 additions and 0 deletions

View file

@ -78,9 +78,22 @@ The tool returns JSON with:
- `answer` — synthesized text response from Grok
- `citations` — citations returned by the Responses API top-level field
- `inline_citations``url_citation` annotations extracted from the message body (each with `url`, `title`, `start_index`, `end_index`)
- `degraded``true` when any narrowing filter (`allowed_x_handles`, `excluded_x_handles`, `from_date`, `to_date`) was set AND both citation channels came back empty. In that case the `answer` was synthesized from the model's own knowledge rather than the X index, so treat it as unsourced. `false` otherwise (including the "no filters set" case — a broad unsourced answer is just an answer, not a filter miss)
- `degraded_reason` — short string naming which filters were active, or `null` when `degraded` is `false`
- `credential_source``"xai-oauth"` if OAuth resolved, `"xai"` if API key resolved
- `model`, `query`, `provider`, `tool`, `success`
### Date validation
`from_date` / `to_date` are validated client-side before the HTTP call:
- Both, if provided, must parse as `YYYY-MM-DD`.
- When both are set, `from_date` must be on or before `to_date`.
- `from_date` must not be later than today UTC — no posts can exist in a window that hasn't started yet, so the call would be guaranteed to return zero citations.
- `to_date` in the future is allowed (callers may legitimately request "from yesterday to tomorrow" to catch posts as they arrive).
Validation failures surface as a structured `{"error": "..."}` tool result, never as an HTTP call to xAI.
## Example
Talking to the agent:
@ -110,6 +123,16 @@ Two possible causes:
1. **Toolset not enabled.** Run `hermes tools` and confirm `🐦 X (Twitter) Search` is checked.
2. **No xAI credentials.** The check_fn returns False, so the schema stays hidden. Run `hermes auth status` to confirm xai-oauth login state, and check that `XAI_API_KEY` is set (if you're using the API-key path).
### `degraded: true` — answer with no citations
When you used `allowed_x_handles`, `excluded_x_handles`, or a date range and the response comes back with `degraded: true`, xAI's X index returned no matching posts but Grok still produced a synthesized answer from its own training data. The answer is unsourced — do not treat it as a real X result.
Causes worth checking:
- **Typo in the handle.** Strip the `@`, double-check spelling, and confirm the account exists.
- **Date range too narrow** or sliding past today's posts; widen and retry.
- **xAI index gap.** Some active accounts intermittently fail to surface in `x_search` even when they post regularly. Retry after a few minutes, or use the `xurl` skill for direct X API reads when you need an exact handle's timeline.
## See Also
- [xAI Grok OAuth (SuperGrok Subscription)](../../guides/xai-grok-oauth.md) — the OAuth setup guide