hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-24 10:52:21 +00:00

Author	SHA1	Message	Date
kshitijk4poor	d6cb69a7a9	chore: add sweetcornna to AUTHOR_MAP Salvage co-author of the cron ticker-liveness fix.	2026-06-21 13:00:50 +05:30
annguyenNous	07424da76f	fix(cron): keep ticker alive on BaseException + heartbeat-aware status The in-process cron ticker (cron/scheduler_provider.py) caught only `Exception` and logged at DEBUG, so a `SystemExit`/`KeyboardInterrupt` raised from a misbehaving provider SDK or agent retry path killed the ticker thread silently. The gateway PROCESS stayed up, so `hermes cron status` — which only checks `find_gateway_pids()` — kept reporting "✓ jobs will fire automatically" while no jobs ever fired (#32612, #32895). This makes ticker death survivable and detectable: - The ticker loop now catches `BaseException` and logs at ERROR with a traceback, so a single bad tick no longer tears the thread down and the failure is visible in the gateway log. - The loop records a heartbeat (`cron/ticker_heartbeat`, epoch seconds) on startup and after every tick — best-effort, never raised into the loop. Both ticker entry points (the gateway and the desktop fallback in web_server.py) funnel through `InProcessCronScheduler.start`, so one heartbeat site covers both. - `hermes cron status` now reads the heartbeat age: if the gateway is running but the heartbeat is stale (> 200s, i.e. several missed ~60s ticks), it reports the ticker as STALLED and suggests a restart instead of falsely claiming jobs will fire. A missing heartbeat (older build / never ran) is treated as "unknown", not "dead". Adds tests for BaseException survival, per-iteration heartbeat recording, heartbeat round-trip/age, staleness detection, and silent-write-failure. Salvaged from #49660 (BaseException survival on current structure), extended with the heartbeat + honest-status reporting that the earlier (pre-refactor) watchdog PRs #35616 and #33849 proposed. Fixes #32612 Fixes #32895 Co-authored-by: banditburai <promptsiren@gmail.com> Co-authored-by: sweetcornna <96944678+sweetcornna@users.noreply.github.com>	2026-06-21 13:00:50 +05:30
kshitijk4poor	35752fc3a5	chore: add szzhoujiarui-sketch and rayjun to AUTHOR_MAP Salvage co-authors of the cron model.default fix.	2026-06-21 12:37:56 +05:30
konsisumer	73b92264ee	fix(cron): resolve model.default + fail fast on missing model Cron jobs created without an explicit `model` are stored as `model: null`. At fire time `run_job` resolved `model = job.get("model") or os.getenv("HERMES_MODEL") or ""` and then `_model_cfg.get("default", model)`, so when config.yaml had no `model.default` (or `model: {default: null}`) an empty string flowed straight to the provider and surfaced as an opaque HTTP 400 ("Model parameter is required" / "model: String should have at least 1 character"). The operator had to inspect jobs.json to discover the job was stored with a null model. This change makes cron model resolution robust and symmetric with the CLI: - Coerce `model: null`/missing config to `{}` so a falsy default never overwrites an already-resolved env value with `None`. - Only overwrite `model` from `model.default` when the resolved value is truthy; accept a `model.model` alias key, mirroring the sibling resolvers in hermes_cli/oneshot.py, fallback_cmd.py and prompt_size.py. - Resolve AFTER the managed-scope overlay so an administrator-pinned model still wins. - Fail fast with an actionable error (caught by run_job's outer handler and recorded as the job's last_error — the cron ticker is unaffected) instead of letting an empty model reach the API. - The per-job model is re-read every tick, so a `cronjob action=update model=...` after a failed run takes effect on the next tick (no cache). Adds tests/cron/conftest.py pinning a default HERMES_MODEL so existing run_job tests don't trip the new guard, plus regression tests covering env fallback, config.default fallback, string-form config, the model alias key, null-default-no-clobber, corrupt-config graceful degradation, fail-fast, and the no-cache re-read property. Salvaged from #24005, rebased onto current main, with additional test coverage folded in from #45550 and the alias-key behavior from #43952. Fixes #43899 Fixes #23979 Fixes #22761 Co-authored-by: szzhoujiarui-sketch <szzhoujiarui@gmail.com> Co-authored-by: rayjun <rayjun0412@gmail.com>	2026-06-21 12:37:56 +05:30
teknium1	14ef6312b5	fix(compression): decay protect_first_n so early turns don't fossilize (#11996) protect_first_n keeps the first N non-system messages verbatim through compaction so the original task framing survives. But it was applied on EVERY compression pass: the same early user turns were re-copied into each child session and never summarized away, so across a long, repeatedly-compressed session those old messages became immortal and grew the protected head unboundedly (#11996, P1). Decay it: protect_first_n applies on the FIRST compaction only. Once the session has been compressed at least once (compression_count >= 1, or a handoff summary already exists), the early turns are captured in the summary, so _effective_protect_first_n() returns 0 and only the system prompt stays protected. The decay is read at compress_start computation time, before compression_count/_previous_summary are mutated at the end of compress(), so the first pass still protects correctly. Co-authored-by: truenorth-lj <liliangjya@gmail.com> Co-authored-by: davidvv <david.vv@icloud.com>	2026-06-21 00:06:58 -07:00
Teknium	c6bf6bda90	fix(memory): recover from missing old_text on single-op replace/remove (#49997) Single-op replace/remove failed with a dead-end 'old_text is required' error when a structured-output client omitted the optional old_text field (it can't be schema-required without a top-level if/then combinator that OpenAI's Codex backend 400s on). The model couldn't recover. Now a missing old_text returns the current entry inventory plus a retry instruction (mirroring the batch path's _batch_error), so the model can reissue the call with old_text set. Also sharpens the old_text schema description to state it's required for replace/remove. Fixes #49466, #43412.	2026-06-20 23:46:52 -07:00
Teknium	d5f0e737d9	chore(release): add AUTHOR_MAP entry for #49544 salvage	2026-06-20 23:42:47 -07:00
Teknium	c1f11f8c69	fix(telegram): index streamed rich finals via editMessageText too The native echo recovery handles replies to most rich messages, but messages sent before the bot's first rich send have no echo to read. record() was only called on the fresh-send path (_try_send_rich); a streamed final finalized via _try_edit_rich/editMessageText was never indexed, so a reply to it had neither a native echo nor an index entry. Mirror the fresh-send record() into the edit success path to close that gap.	2026-06-20 23:42:47 -07:00
izumi0uu	29e5e127c6	fix(telegram): recover reply text from native rich echo Telegram DOES echo a rich message's content back in reply_to_message.api_kwargs['rich_message']['blocks'] when a user replies to it. Read that native field first in _build_message_event, keeping the local send-time index only as a fallback. Duck-type api_kwargs via .get() since it is a mappingproxy, not a dict. Fixes #49534	2026-06-20 23:42:47 -07:00
teknium	fcdefb4181	chore(release): add AUTHOR_MAP entries for docs PR salvage cluster 2	2026-06-20 23:23:47 -07:00
Tony Simons	2008a96b20	docs: align contributor test checklist with wrapper	2026-06-20 23:23:47 -07:00
BBCrypto-web	72e4cca00e	docs(config): correct MCP docs path in cli-config.yaml.example The MCP section pointed to docs/mcp.md, which does not exist. Point it to website/docs/user-guide/features/mcp.md, matching the existing hooks.md reference convention in the same file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 23:23:47 -07:00
namredips	b1ab5a8ae1	docs(antigravity-cli): add delegation patterns + output/bounding caveats Brings the antigravity-cli skill to parity with the codex / claude-code delegation playbooks. Additive only — auth/sandbox/plugin/settings content is unchanged. - New 'Delegation patterns' section: one-shot, background bounded runs, interactive PTY+tmux, parallel worktree fan-out, and an orchestration boundary note (agy is a worker backend / reviewer, not a coordination primitive). - Documents the two ways agy -p differs from claude-code: plain-text output (no --output-format json / result envelope) and bounding via --print-timeout rather than a nonexistent --max-turns. Mirrored into Pitfalls. - Bumps version 0.1.0 -> 0.2.0.	2026-06-20 23:23:47 -07:00
Sworntech-dev	9f507a0aa3	docs: remove file tools TBD placeholder	2026-06-20 23:23:47 -07:00
BBCrypto-web	225dcf855c	docs(.env.example): add HF_BASE_URL placeholder Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 23:23:47 -07:00
loes5050	85f108ef03	test(cron): document consent-first self-learning suggestions	2026-06-20 23:23:47 -07:00
allo	bc85f6150e	docs: document per-event extra keys in shell-hook wire protocol The shell-hook stdin payload's extra object contains event-specific kwargs, but the docstring only mentioned the field without listing what each event actually puts inside it. Add a reference table covering post_tool_call, pre_tool_call, on_session_start, on_session_end, and subagent_stop — the five hook sites that emit extra keys beyond the top-level payload. Closes #49370	2026-06-20 23:23:47 -07:00
Tortugasaur	c02648c5dd	fix(docs): align slash-command and docker docs	2026-06-20 23:23:47 -07:00
teknium1	98ecd0beeb	docs(mcp): fix stale ~0.75s discovery-wait reference in late-refresh docstring The MCP discovery wait is now bounded by the config-driven mcp_discovery_timeout (default 1.5s), not the old 0.75s flat value. Updates the _schedule_mcp_late_refresh docstring that still cited ~0.75s after #49208 made the bound configurable.	2026-06-20 23:23:47 -07:00
Kevin Anderson	b337afdf6e	docs(cli): fix broken terminal-backend guide link in setup wizard The terminal backend onboarding step pointed at /docs/developer-guide/environments, which no longer exists. Point it at the live docs page /docs/user-guide/configuration#terminal-backend-configuration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-20 23:23:47 -07:00
virtuadex	defeda8c55	docs: sync documentation with current implementation	2026-06-20 23:23:47 -07:00
miha	95d970a752	docs: sharpen software-development skills	2026-06-20 23:23:47 -07:00
aieng-abdullah	74b5cc7ca4	docs(spotify): document 6-month re-auth cycle and add client-level invalid_grant test - Remove the 'you only log in once per machine' claim from spotify.md and document the ~6-month refresh token expiry with re-auth instructions - Add test_client_wraps_invalid_grant_as_spotify_auth_required_error to confirm SpotifyClient wraps AuthError(code=spotify_refresh_invalid_grant) into SpotifyAuthRequiredError with a user-facing message Refs: #28155	2026-06-20 23:23:47 -07:00
EloquentBrush0x	9bd5003d4f	fix(spotify): quarantine dead tokens on terminal refresh failure resolve_spotify_runtime_credentials() called _refresh_spotify_oauth_state() without a try/except, so a terminal failure (HTTP 400/401, invalid_grant, refresh_token_reused) raised AuthError but left the dead refresh_token in auth.json. Every subsequent session re-read and retried the same token over the network, failing identically each time. Fix: wrap the refresh call and, when exc.relogin_required is True and a refresh_token is present, clear the dead OAuth fields (access_token, refresh_token, expires_at, expires_in, obtained_at) and write a last_auth_error quarantine marker to auth.json before re-raising. The next call sees no access_token and fails fast with spotify_access_token_missing — no network retry — and the user is prompted to re-authenticate. Mirrors the quarantine pattern already in place for Nous, xAI-OAuth, Codex-OAuth (#28116, #28118), and MiniMax-OAuth (#28119).	2026-06-20 23:23:47 -07:00
HwangJohn	242962e1f5	docs(providers): clarify vllm qwen reasoning output Signed-off-by: HwangJohn <angelic805@gmail.com> Co-authored-by: OpenAI Codex <codex@openai.com>	2026-06-20 23:23:47 -07:00
X7	fe5c8d2316	fix(docs): document curl, xz-utils, and g++ as Linux prerequisites	2026-06-20 23:23:47 -07:00
Sworntech-dev	fa53e36438	docs(hooks): document manual shell hook allowlisting	2026-06-20 23:23:47 -07:00
DrZM007	f80088f035	docs: add missing Prerequisites/How to Run sections to SKILL.md template The SKILL.md template in CONTRIBUTING.md was missing the Prerequisites and How to Run sections, even though the "modern section order" guidance immediately below it lists both as required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-20 23:23:47 -07:00
brett-bonner_infodesk	eec9c1d84e	docs(agents): clarify background delegation durability	2026-06-20 23:23:47 -07:00
michael.chen	063155e234	docs(hooks): document subagent_start plugin hook	2026-06-20 23:23:47 -07:00
x7peeps	df4015bbc1	docs: session lifecycle documentation	2026-06-20 23:23:47 -07:00
e10552	2609bcccca	feat(i18n): add complete Spanish translation - Complete README.es.md (full Spanish translation of README) - Add CONTRIBUTING.es.md (Spanish contributing guide) - Add SECURITY.es.md (Spanish security policy) - Fix remaining English strings in locales/es.yaml (resume Matrix section) - Add Spanish badge to README.md All 47 i18n tests pass, including catalog key parity and placeholder parity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-20 23:23:47 -07:00
Sworntech-dev	38756f2d55	docs(docker): document gateway tool-loop hard stops	2026-06-20 23:23:47 -07:00
GauravPatil2515	cc30e0b659	docs(config): document auxiliary task fallback_chain	2026-06-20 23:23:47 -07:00
Greg DeYoung	5eb158e317	docs(hermes-agent skill): document project context files and their discovery rules Adds a new 'Project Context Files' section to the hermes-agent skill explaining the priority order and discovery rules for .hermes.md, AGENTS.md, CLAUDE.md, and .cursorrules. Specifically clarifies: - .hermes.md walks parents up to the git root (good for monorepos) - AGENTS.md / agents.md is cwd-only (portable to other agents) - The 20K cap and head+tail truncation strategy - The threat-pattern scanner behavior (blocks content, not file) - What --ignore-rules actually skips (everything) Also fixes an inaccurate docstring in agent/agent_init.py for skip_context_files — the previous text only mentioned SOUL.md, AGENTS.md, and .cursorrules, but the actual behavior (per build_context_files_prompt and the --ignore-rules CLI flag) skips all of them plus .hermes.md and CLAUDE.md. Refs: https://github.com/NousResearch/hermes-agent/issues/46775	2026-06-20 23:23:47 -07:00
Andres Sommerhoff	97563ab821	fix: warn on line-oriented newline search patterns	2026-06-20 23:23:47 -07:00
Andres Sommerhoff	eb9a002284	docs: clarify search_files newline regex behavior	2026-06-20 23:23:47 -07:00
lkz-de	6403ed06b3	docs(session-search): document source-first retrieval limits Clarify that session_search is secondary context and direct source identifiers must be inspected first when accessible. Add regression coverage for the tool description.	2026-06-20 23:23:47 -07:00
BBCrypto-web	1eb2959309	docs(.env.example): add missing ELEVENLABS_API_KEY placeholder	2026-06-20 23:23:47 -07:00
skyc1e	46cc0345ae	docs(skills): add hermes-agent verification rule	2026-06-20 23:23:47 -07:00
teknium1	8ac5e90ec2	fix(gateway): dedup image_generate media across the compression boundary After context compression, the agent re-sent an already-delivered generated image on every subsequent turn (#46627). The auto-append fallback rescans full history when the message list shrinks (compression-safe path), deduping against _history_media_paths — but that set was built by scanning ONLY MEDIA: text tags in tool results. image_generate returns its path in a JSON payload field (host_image/image/agent_visible_image), never a MEDIA: tag, so generated-image paths never entered the dedup set and were re-emitted after the boundary. Extract the history-path collection into _collect_history_media_paths(), which now covers BOTH delivery shapes: MEDIA: text tags AND image_generate JSON-payload paths (mirroring what _collect_auto_append_media_tags extracts). The inline block in _handle_message is replaced with a call to the helper. Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-20 23:20:16 -07:00
teknium1	1f874dfe44	fix(compression): stop fallback summary triplicating the latest user ask When LLM summarization fails, the deterministic fallback summary rendered the latest user ask (active_task = "User asked: '<ask>'") verbatim under THREE headings — Historical Task Snapshot, Historical In-Progress State, and Historical Pending User Asks. Re-presenting an already-handled ask as unresolved in-progress/pending work made the model re-answer it AND treat the resurrected ask as the active turn, burying the genuinely-new post-compaction user message (#49307: answer repetition + new-instruction loss, P1). Keep the latest ask once, under Task Snapshot, as historical context only. The In-Progress and Pending-Asks sections now say 'Unknown / None recoverable from deterministic fallback' (consistent with the Active State / Key Decisions / Resolved Questions sections) and explicitly note the ask is historical, not outstanding. The raw turn text still appears in the verbatim 'Last Dropped Turns' transcript — that's the dropped-turn record, not a re-labeled instruction. Note: the separate role=assistant standalone-summary regurgitation (#33256) is left as-is — that role choice is constrained by strict message alternation (user collides with a user-ending head) and is already mitigated by the summary end-marker; forcing the role would risk the alternation invariant. Co-authored-by: r266-tech <r2668940489@gmail.com> Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-20 23:19:27 -07:00
teknium1	2f3177adf4	fix(compression): protect the summary call from mid-flight interrupts Context compression is atomic, but a gateway interrupt (an incoming user message while the agent is busy) could abort the in-flight summary call. The Codex Responses aux stream polls the thread interrupt flag and raised InterruptedError unconditionally — so compression fell back to a degraded static 'summary unavailable' marker, losing the real handoff (#23975). Add a thread-local interrupt-protection flag (aux_interrupt_protection context manager) in auxiliary_client; the Codex stream's cancellation check honors it. The compressor wraps its summary call_llm in the context manager. Timeouts still fire (a hung call must die) and all other aux tasks (vision, web_extract, title_generation, …) stay interruptible. Re-entrant, so the main-model retry recursion is safe. Co-authored-by: konsisumer <der@konsi.org>	2026-06-20 21:32:30 -07:00
Teknium	4b7f9a4d30	test(matrix): make voice-detection tests hermetic against mention gating (#49946) test_matrix_voice flaked in CI (6/7 failing on some shards, passing on others and on main) depending on leaked MATRIX_REQUIRE_MENTION env state. Root cause: the adapter defaults require_mention=True (falling back to the MATRIX_REQUIRE_MENTION env var). These tests fire a group-room audio event with no @mention, so _resolve_message_context drops it before dispatch ('No event was captured') whenever require_mention resolves True — which happens in a clean shard, but an earlier test in another shard can leave MATRIX_REQUIRE_MENTION=false in os.environ and mask it. The plugin migration (#5600105478 adapter→bundled plugin) shifted shard composition and exposed it. Pin require_mention: False in the test adapter config so these media-TYPE detection tests are no longer gated by the mention requirement, regardless of ambient env. Verified: 7/7 pass with MATRIX_REQUIRE_MENTION=true (the failing condition) AND with the env unset.	2026-06-20 21:22:11 -07:00
teknium1	4c349e85f8	fix(gateway): preserve transcript when hygiene auto-compress can't rotate Gateway Session Hygiene auto-compression destroyed the original transcript when the throwaway hygiene agent couldn't rotate the session (#21301, P1). The _hyg_agent is built WITHOUT a session_db, so _compress_context cannot end-and-fork the session (its rotate block is gated on agent._session_db). The session_id stays unchanged, and the rewrite_transcript() call ran UNCONDITIONALLY — replacing the full original transcript with just the head+summary list. Permanent data loss on every hygiene compaction. Guard the rewrite behind 'rotated OR in-place' exactly like the /compress path already does (#44794/#39704): only overwrite when a new session id was minted or in-place compaction succeeded; otherwise preserve the original transcript and log a warning. The token/count bookkeeping that followed the rewrite is moved inside the guard, with no-change values in the preserve branch. Co-authored-by: SandroHub013 <sandrohub013@gmail.com> Co-authored-by: WuTianyi123 <wtyopenclaw@gmail.com> Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-20 21:07:11 -07:00
teknium1	79f297834a	fix(gateway): widen cron namespace-collision fix to all migrated adapters #49431 corrected parents[2]->parents[3] for discord + raft only. The same bug existed in slack, whatsapp, and telegram adapters (migrated from gateway/platforms/ in `5600105478`): each inserts parents[2] = plugins/ onto sys.path[0], shadowing the real cron/ package with plugins/cron/ so 'import cron.scheduler_provider' raises ModuleNotFoundError on gateway start. Fixes #49410, #49824.	2026-06-20 20:45:12 -07:00
kyssta-exe	4c206b972d	fix(gateway): correct sys.path insertion in plugins to prevent cron namespace collision (#49410)	2026-06-20 20:45:12 -07:00
teknium	e5e173eefd	chore(release): add AUTHOR_MAP entries for docs PR salvage cluster	2026-06-20 20:42:49 -07:00
mintybasil	5d05415292	Expand .gitignore example	2026-06-20 20:42:49 -07:00
mintybasil	094d9cba6c	Update docs to clarify requirement for gitignore	2026-06-20 20:42:49 -07:00
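
Illustration for commit 07424da76f (cron ticker liveness). This is a minimal sketch of the behaviour that commit message describes, not the code in cron/scheduler_provider.py: the loop survives `BaseException`, records a best-effort heartbeat on startup and after every tick, and a status check treats a heartbeat older than 200s as stalled and a missing heartbeat as unknown. The names `safe_beat`, `run_ticks`, `ticker_status`, and `STALE_AFTER_SECONDS` are hypothetical and do not come from the repository; only the 200s threshold and the unknown-versus-stalled distinction are taken from the message.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger("cron.ticker.sketch")

# Staleness threshold quoted in the commit message (several missed ~60s ticks).
STALE_AFTER_SECONDS = 200.0


def safe_beat(record: Callable[[float], None], now: float) -> None:
    """Best-effort heartbeat write: a failing store never propagates into the loop."""
    try:
        record(now)
    except Exception:
        log.debug("heartbeat write failed", exc_info=True)


def run_ticks(
    tick: Callable[[], None],
    record: Callable[[float], None],
    iterations: int,
    clock: Callable[[], float] = time.time,
) -> int:
    """Run tick() `iterations` times and return how many ticks failed.

    Catching BaseException (not just Exception) means a SystemExit or
    KeyboardInterrupt raised inside one tick does not end the loop.
    """
    failures = 0
    safe_beat(record, clock())  # heartbeat on startup
    for _ in range(iterations):
        try:
            tick()
        except BaseException:
            failures += 1
            log.error("cron tick failed", exc_info=True)
        safe_beat(record, clock())  # heartbeat after every tick
    return failures


def ticker_status(
    gateway_running: bool,
    last_beat: Optional[float],
    now: Optional[float] = None,
) -> str:
    """Classify ticker health from the gateway state and the last heartbeat."""
    if not gateway_running:
        return "stopped"
    if last_beat is None:
        return "unknown"  # older build or never ran: not treated as dead
    current = time.time() if now is None else now
    age = max(0.0, current - last_beat)
    return "stalled" if age > STALE_AFTER_SECONDS else "ok"
```

A real ticker would loop until shutdown and sleep between ticks; the bounded `iterations` argument is only there to keep the sketch easy to exercise.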


12339 commits