mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-22 05:22:09 +00:00

docs: round 2 audit — messaging, developer-guide, guides, integrations (#22858 )

Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/,
guides/, and integrations/ against the live registries and gateway code.

messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)');
  Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such
  extra); list the actual deps (google-cloud-pubsub, google-api-python-client,
  google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is
  silently ignored by the adapter); QQ_STT_BASE_URL is not read directly —
  baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline
  plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default
  SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to
  per-profile .env, not 'hermes config set' (same pattern fixed in
  api-server.md last round). Also bumped example ports to 8650+ to dodge the
  default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646)
  collision.

developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for
  run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py,
  gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway
  platform tree updated (qqbot is a sub-package, not qqbot.py; added
  yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/
  (always active)' was wrong — it's an empty extension point and
  _register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-
  callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry,
  NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama
  Cloud, LM Studio, Tencent TokenHub. Fallback section described only the
  legacy single-pair model — corrected to the canonical list-form
  fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31
  alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py
  which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that
  'git submodule update --init' is required if cloning without
  --recurse-submodules.

guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is
  positional (not --schedule), the script-only flag is --no-agent (not
  --script-only), and there's no --command flag. Replaced with a real example
  that creates the script under ~/.hermes/scripts/ and uses the actual flags.
  Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work —
  the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST
  rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently
  fails because --region isn't registered on the auth-add argparse spec.
  Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for
  China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-
  table mention with a webhook-subscription pointer; also fixed the dead link
  to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed
  at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right
  knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response
  and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in
  one quoted string, so cmd.exe gets a single arg instead of the multi-token
  command line it needs. Removed the surrounding quotes — argparse nargs='*'
  collects each token correctly.

integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist);
  actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG
  and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 ->
  api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai)
  refreshed. Fallback section rewritten to lead with the canonical
  fallback_providers list form (was leading with the legacy fallback_model
  single dict); supported-providers list extended to include azure-foundry,
  alibaba-coding-plan, lmstudio.

index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with
  integrations/index.md ('19+') and undercounted — bumped to 20+ and added
  Weixin/QQ Bot/Yuanbao/Google Chat to the list.

Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 (same as round-1 post-skill-regen baseline). 24 files, +132/-89.

2026-05-09 15:00:24 -07:00


title: Operate the Teams Meeting Pipeline
description: Runbook, go-live checklist, and operator worksheet for the Microsoft Teams meeting pipeline

Operate the Teams Meeting Pipeline

Use this guide after you have enabled the feature as described in the Teams Meetings guide.

This page covers:

- operator CLI flows
- routine subscription maintenance
- failure triage
- go-live checks
- rollout worksheet

Core Operator Commands

Validate the config snapshot

hermes teams-pipeline validate

Use this first after any config change.

Inspect token health

hermes teams-pipeline token-health
hermes teams-pipeline token-health --force-refresh

Use --force-refresh when you suspect stale auth state.

Inspect subscriptions

hermes teams-pipeline subscriptions

Renew near-expiry subscriptions

hermes teams-pipeline maintain-subscriptions
hermes teams-pipeline maintain-subscriptions --dry-run

Automating subscription renewal (REQUIRED for production)

Microsoft Graph subscriptions expire in at most 72 hours. If nothing renews them, meeting notifications silently stop after 3 days and the pipeline looks "broken." This is the #1 operational failure mode for any Graph-backed integration.
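To make the expiry window concrete, the following minimal sketch counts how many scheduled renewal runs fit inside one subscription lifetime. The function name and the 12-hour interval are illustrative assumptions; only the 72-hour upper bound comes from this guide.

```python
from datetime import timedelta


def renewal_attempts_before_expiry(expiry_window: timedelta,
                                   run_interval: timedelta) -> int:
    """Return how many scheduled runs fit inside one subscription lifetime."""
    if run_interval <= timedelta(0):
        raise ValueError("run_interval must be positive")
    # Floor division of two timedeltas yields an int.
    return expiry_window // run_interval


# Upper bound quoted in this guide; the interval is an example choice.
GRAPH_MAX_LIFETIME = timedelta(hours=72)
EXAMPLE_INTERVAL = timedelta(hours=12)

print(renewal_attempts_before_expiry(GRAPH_MAX_LIFETIME, EXAMPLE_INTERVAL))
```

A higher count means more missed runs can be tolerated before a subscription lapses; an interval close to the full window leaves almost no margin for a failed run.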

You MUST run maintain-subscriptions on a schedule. Pick one of these three options:

Option 1: Hermes cron (recommended if you already run the Hermes gateway)

Hermes ships a built-in cron scheduler. The --no-agent mode runs a script as the job (rather than using an LLM), and --script must point at a file under ~/.hermes/scripts/. First create the script:

mkdir -p ~/.hermes/scripts
cat > ~/.hermes/scripts/maintain-teams-subscriptions.sh <<'EOF'
#!/usr/bin/env bash
exec hermes teams-pipeline maintain-subscriptions
EOF
chmod +x ~/.hermes/scripts/maintain-teams-subscriptions.sh

Then register a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):

hermes cron create "0 */12 * * *" \
  --name "teams-pipeline-maintain-subscriptions" \
  --no-agent \
  --script maintain-teams-subscriptions.sh \
  --deliver local

Verify it was registered and inspect the next run time:

hermes cron list
hermes cron status        # scheduler status

Option 2: systemd timer (recommended for Linux production deployments)

Create /etc/systemd/system/hermes-teams-pipeline-maintain.service:

[Unit]
Description=Hermes Teams pipeline subscription maintenance
After=network-online.target

[Service]
Type=oneshot
User=hermes
EnvironmentFile=/etc/hermes/env
ExecStart=/usr/local/bin/hermes teams-pipeline maintain-subscriptions

And /etc/systemd/system/hermes-teams-pipeline-maintain.timer:

[Unit]
Description=Run Hermes Teams pipeline subscription maintenance every 12 hours

[Timer]
OnBootSec=5min
OnUnitActiveSec=12h
Persistent=true

[Install]
WantedBy=timers.target

Enable:

sudo systemctl daemon-reload
sudo systemctl enable --now hermes-teams-pipeline-maintain.timer
systemctl list-timers hermes-teams-pipeline-maintain.timer

Option 3: Plain crontab

0 */12 * * * /usr/local/bin/hermes teams-pipeline maintain-subscriptions >> /var/log/hermes/teams-pipeline-maintain.log 2>&1

Make sure the cron environment has the MSGRAPH_* credentials. Simplest fix: source ~/.hermes/.env at the top of a wrapper script that crontab calls.
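One way to write such a wrapper is sketched below in Python, as an alternative to a shell script that sources the file. It assumes ~/.hermes/.env contains plain KEY=VALUE lines; `parse_env_text` and `run_maintenance` are hypothetical helper names, not part of Hermes, and the parser deliberately handles only simple cases (comments, an optional `export ` prefix, matching surrounding quotes).

```python
import os
import subprocess
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env


def run_maintenance(env_file: Path) -> int:
    """Run the renewal command with the .env values added to the environment."""
    merged = {**os.environ, **parse_env_text(env_file.read_text())}
    result = subprocess.run(
        ["hermes", "teams-pipeline", "maintain-subscriptions"],
        env=merged,
        check=False,
    )
    return result.returncode


# Example call from the wrapper's entry point (assumed path from this guide):
# raise SystemExit(run_maintenance(Path.home() / ".hermes" / ".env"))
```

Returning the exit code lets cron surface a failed renewal run instead of hiding it.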

Verifying renewal is working

Once the schedule is in place, check renewal activity after the first scheduled run:

hermes teams-pipeline subscriptions   # should show expirationDateTime advanced
hermes teams-pipeline maintain-subscriptions --dry-run   # should show "0 expiring soon" most of the time

If you ever see your Graph webhook mysteriously "stop working" after exactly ~72 hours, this is the first thing to check: did the renewal job actually run?
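The same check can be scripted. The sketch below assumes you already have subscription records as dictionaries with the Graph `id` and `expirationDateTime` fields; how you obtain them is up to you, since the output format of `hermes teams-pipeline subscriptions` is not specified here. The 24-hour threshold is an illustrative assumption, not a Hermes default.

```python
from datetime import datetime, timedelta, timezone


def parse_graph_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2026-05-12T08:00:00Z.

    Fractional seconds are dropped, since whole seconds are enough here.
    """
    main = value.rstrip("Z").split(".", 1)[0]
    parsed = datetime.strptime(main, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def expiring_soon(subscriptions: list[dict], now: datetime,
                  threshold: timedelta = timedelta(hours=24)) -> list[str]:
    """Return ids of subscriptions that expire at or before now + threshold."""
    cutoff = now + threshold
    return [
        sub["id"]
        for sub in subscriptions
        if parse_graph_timestamp(sub["expirationDateTime"]) <= cutoff
    ]


# Hypothetical records for illustration only.
example = [
    {"id": "sub-a", "expirationDateTime": "2026-05-10T12:00:00Z"},
    {"id": "sub-b", "expirationDateTime": "2026-05-12T23:59:59.1234567Z"},
]
now = datetime(2026, 5, 10, 0, 0, tzinfo=timezone.utc)
print(expiring_soon(example, now))
```

A non-empty result shortly after a scheduled run suggests the renewal job did not execute or did not succeed.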

Inspect recent jobs

hermes teams-pipeline list
hermes teams-pipeline list --status failed
hermes teams-pipeline show <job-id>

Replay a stored job

hermes teams-pipeline run <job-id>

Dry-run meeting artifact fetches

hermes teams-pipeline fetch --meeting-id <meeting-id>
hermes teams-pipeline fetch --join-web-url "<join-url>"

Routine Runbook

After first setup

Run these in order:

hermes teams-pipeline validate
hermes teams-pipeline token-health --force-refresh
hermes teams-pipeline subscriptions

Then trigger or wait for a real meeting event and confirm:

hermes teams-pipeline list
hermes teams-pipeline show <job-id>

Daily or periodic checks

- run hermes teams-pipeline maintain-subscriptions --dry-run
- inspect hermes teams-pipeline list --status failed
- verify the Teams delivery target is still the correct chat or channel

Before changing webhook URLs or delivery targets

- update the public notification URL or Teams target config
- run hermes teams-pipeline validate
- renew or recreate affected subscriptions
- confirm new events land in the expected sink

Failure Triage

No jobs are being created

Check:

- msgraph_webhook is enabled
- the public notification URL points to /msgraph/webhook
- the client state in the subscription matches MSGRAPH_WEBHOOK_CLIENT_STATE
- subscriptions still exist remotely and are not expired
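The client-state item in this list refers to a general Graph pattern: each change notification carries the `clientState` value set when the subscription was created, and the receiver drops notifications whose value does not match. The sketch below illustrates that check in isolation. It is not the msgraph_webhook implementation, and `accepted_notifications` is a hypothetical name.

```python
import hmac


def accepted_notifications(payload: dict,
                           expected_client_state: str) -> list[dict]:
    """Keep only notifications whose clientState matches the expected secret."""
    accepted = []
    for item in payload.get("value", []):
        received = item.get("clientState") or ""
        # Constant-time comparison avoids leaking the secret through timing.
        if hmac.compare_digest(received.encode(), expected_client_state.encode()):
            accepted.append(item)
    return accepted


# Hypothetical payload shaped like a Graph change notification batch.
payload = {
    "value": [
        {"subscriptionId": "sub-a", "clientState": "expected-secret"},
        {"subscriptionId": "sub-b", "clientState": "stale-secret"},
    ]
}
kept = accepted_notifications(payload, "expected-secret")
print([item["subscriptionId"] for item in kept])
```

If the secret was rotated in the environment but the subscriptions were not recreated, every notification would be rejected by a check like this, which matches the "no jobs are being created" symptom.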

Jobs stay in retry or fail before summarization

Check:

- transcript permissions and availability
- recording permissions and artifact availability
- ffmpeg availability if recording fallback is enabled
- Graph token health

Summaries are produced but not delivered to Teams

Check:

- platforms.teams.enabled: true
- delivery_mode
- incoming_webhook_url for webhook mode
- chat_id or team_id plus channel_id for Graph mode
- Teams auth config if Graph posting is used
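The checks in this list can be expressed as a small validation function. This is a sketch under assumptions: it treats the Teams settings as one flat dictionary using the key names listed above and the mode names from the Delivery-Mode Decision Guide. The real config nesting may differ, and `delivery_config_problems` is a hypothetical helper, not a Hermes API.

```python
def delivery_config_problems(teams: dict) -> list[str]:
    """Return human-readable problems found in a Teams delivery config."""
    problems = []
    if teams.get("enabled") is not True:
        problems.append("platforms.teams.enabled is not true")

    mode = teams.get("delivery_mode")
    if mode == "incoming_webhook":
        if not teams.get("incoming_webhook_url"):
            problems.append("incoming_webhook_url is required for webhook mode")
    elif mode == "graph":
        has_chat = bool(teams.get("chat_id"))
        has_channel = bool(teams.get("team_id")) and bool(teams.get("channel_id"))
        if not (has_chat or has_channel):
            problems.append("graph mode needs chat_id, or team_id plus channel_id")
    else:
        problems.append(f"unrecognized delivery_mode: {mode!r}")
    return problems


# Hypothetical config: graph mode with a team but no channel.
print(delivery_config_problems(
    {"enabled": True, "delivery_mode": "graph", "team_id": "example-team"}
))
```

An empty list means the target settings are at least structurally complete; it says nothing about whether the Teams auth config is valid.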

Duplicate or unexpected replays

Check:

- whether you manually replayed a job with hermes teams-pipeline run
- whether the sink record already exists for that meeting
- whether you intentionally enabled a resend path in your local config

Go-Live Checklist

- Graph credentials are present and correct
- msgraph_webhook is enabled and reachable from the public internet
- MSGRAPH_WEBHOOK_CLIENT_STATE is set and matches subscriptions
- transcript subscription is created
- recording subscription is created if STT fallback is required
- ffmpeg is installed if recording fallback is enabled
- Teams outbound delivery target is configured and verified
- Notion and Linear sinks are configured only if actually needed
- hermes teams-pipeline validate returns an OK snapshot
- hermes teams-pipeline token-health --force-refresh succeeds
- maintain-subscriptions is scheduled (Hermes cron, systemd timer, or crontab; see Automating subscription renewal). Without this, Graph subscriptions silently expire within 72 hours.
- a real end-to-end meeting event has produced a stored job
- at least one summary has reached the intended delivery sink

Delivery-Mode Decision Guide

| Mode | Use when | Tradeoff |
| --- | --- | --- |
| `incoming_webhook` | you only need simple posting into Teams | simplest setup, less control |
| `graph` | you need channel or chat posting through Graph | more control, more auth and target config |

Operator Worksheet

Fill this out before rollout:

| Item | Value |
| --- | --- |
| Public notification URL | |
| Graph tenant ID | |
| Graph client ID | |
| Webhook client state | |
| Transcript resource subscription | |
| Recording resource subscription | |
| Teams delivery mode | |
| Teams chat ID or team/channel | |
| Notion database ID | |
| Linear team ID | |
| Store path override, if any | |
| Owner for daily checks | |

Change Review Worksheet

Use this before changing the deployment:

| Question | Answer |
| --- | --- |
| Are we changing the public webhook URL? | |
| Are we rotating Graph credentials? | |
| Are we changing Teams delivery mode? | |
| Are we moving to a new Teams chat or channel? | |
| Do subscriptions need to be recreated or renewed? | |
| Do we need a fresh end-to-end verification run? | |
