Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were
hard-capped at a flat 20K chars before head/tail truncation. Among the agent
harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code,
OpenCode, and Cline load them whole. The flat 20K predates large context
windows and silently truncates real-world AGENTS.md files.
B — dynamic cap: when context_file_max_chars is unset (now the shipped
default), the cap scales with the model's context window
(ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at
the historical 20K; a 200K model gets 48K; large models stop truncating real
docs. An explicit context_file_max_chars still wins. Context length is resolved
once per conversation (stable -> prompt cache untouched).
C — when truncation does happen, the marker now names the concrete file path
and tells the agent to read_file it for the full content.
Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config
(0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate
with the path-bearing marker, large windows load whole, and the system prompt
is byte-stable across rebuilds.
* feat(agent): coding-context posture with per-model edit-format tuning
Hermes detects when it's running in a coding context — an interactive
surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or
recognised project root) — and shifts into a coding posture. Outside that
(chat platforms, non-workspaces) nothing changes.
The posture is modelled as a frozen RuntimeMode selected from a small
ContextProfile registry (coding/general). A profile is data: the toolset to
collapse to, the operating brief to inject, and seams for model routing and
memory. Every domain reads the same resolved object instead of re-probing
git/config on its own:
- System prompt — RuntimeMode.system_blocks(): an operating brief (gather
context before editing, edit through tools not chat, verify with terminal,
cap retry loops) plus a live git/workspace snapshot, built once and baked
into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning — the brief nudges each model family toward
the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A
multi-file diffs), Anthropic toward mode='replace' (string replacement).
The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index — non-coding skill categories are pruned from the prompt's
skill index (discovery-only; skills_list/skill_view still reach the full
catalog, with a disclosure note).
- Toolset — only under the opt-in 'focus' mode does the posture collapse to
the coding toolset + enabled MCP servers; the default posture is
prompt-only and never overrides configured toolsets.
Activation via agent.coding_context: auto (default), focus, on, off.
Subagents inherit the posture for free via toolset inheritance + the shared
prompt builder. Detection is not memoized so a long-lived gateway/TUI
process can't pin a stale posture across working directories.
* feat(agent): cover new-file authoring in the coding edit-format nudge
The per-model edit-format guidance only addressed editing existing code
(patch mode='patch' vs 'replace'), but authoring a brand-new file —
write_file, not patch — is a large fraction of real coding work and the
nudge was silent on it. Surfaced when building a single-file artifact where
the dominant operation was write_file and the steering offered no guidance.
Both family lines now lead with "author new files with write_file; for
edits to existing code prefer ...". Tests assert write_file appears in each
family's brief; unknown families still get neutral wording.
* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry
* feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard
Tuning pass on the coding posture from dogfooding it as a harness:
- Workspace snapshot now hands the model its verify loop up front:
detected manifests + package manager (lockfile sniff), the exact
verify commands (package.json scripts, Makefile targets,
scripts/run_tests.sh, pytest config), and which context files
(AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only
(non-git) projects get the snapshot too instead of nothing. The
"verify before claiming done" brief line was the highest-value piece
in evals — this turns it from advice into an executable loop instead
of making the model rediscover the test command every session. Still
stat-cheap, size-guarded reads, built once at prompt time.
- Edit-format steering covers the families Hermes actually serves:
Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM,
Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to
mode='replace' — their RL scaffolds use str_replace-style editors.
Previously only GPT/Codex and Claude families got steering; the
models Hermes users disproportionately run all fell to neutral.
- Operating brief gains four behaviors elite harnesses encode: batch
independent reads/searches in one turn; fix root causes and the bug
class (sibling call paths), not the reported site; no drive-by
refactors/renames/reformatting; never read, print, or commit secrets.
Plus a patch-failure escalation ladder: after the same region fails
twice, rewrite the enclosing function/file with write_file instead of
a third patch attempt.
- $HOME dotfiles guard: a git repo rooted exactly at the home directory
(or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config,
not a code workspace — without the guard, every session anywhere under
a dotfiles-managed home silently flipped to the coding posture. Real
projects under such a home still detect via their own markers/repos;
'on' mode bypasses the guard.