hermes-agent/website/docs/guides/operate-teams-meeting-pipeline.md
Teknium 242da9db96 docs(teams-pipeline): cron renewal recipe, sidebar wiring, skill rewrite
Fifth and final slice polish on top of @dlkakbs's docs + skill. Three
things ship here:

1. Subscription renewal cron recipe (the #1 operational footgun).

   Microsoft Graph webhook subscriptions expire at 72 hours max and
   don't auto-renew. The shipped operator runbook mentioned
   `maintain-subscriptions --dry-run` as a "daily or periodic check"
   but never told operators how to actually automate it. Without a
   scheduled job, any production deployment silently stops ingesting
   meetings three days after go-live.

   Adds an "Automating subscription renewal (REQUIRED for production)"
   section to website/docs/guides/operate-teams-meeting-pipeline.md
   with three concrete options and copy-pasteable configs:

   - Option 1: Hermes cron (`hermes cron add --schedule "0 */12 * * *"
     --script-only --command "hermes teams-pipeline maintain-subscriptions"`)
   - Option 2: systemd service + timer (12h cadence, Persistent=true
     so missed runs catch up after reboots)
   - Option 3: plain crontab with a wrapper that sources .env for
     credentials

   Go-Live Checklist gains a bolded mandatory item for the schedule
   being in place, with a cross-link to the section.

   website/docs/user-guide/messaging/teams-meetings.md adds a
   `:::warning:::` admonition right after the manual `subscribe`
   examples so anyone who creates a subscription manually is told
   the same day that it will silently expire in 72 hours.

2. Sidebar wiring. Shela's new docs pages (teams-meetings.md and
   operate-teams-meeting-pipeline.md) weren't in website/sidebars.ts,
   so they were orphaned URLs — reachable only if someone knew the
   path. Wired teams-meetings into Messaging Platforms next to the
   existing teams entry, and operate-teams-meeting-pipeline into
   Guides & Tutorials next to microsoft-graph-app-registration from
   PR #21922. Adjacent placement keeps the related pages discoverable
   from each other.

3. SKILL.md rewrite (v1.0.0 → v1.1.0).

   The original skill had five Turkish-only trigger phrases, which
   works in a Turkish-speaking session but doesn't match English
   triggers. Rewrote the skill to:

   - Describe triggers by intent instead of exact phrases, with
     explicit "works in any language" framing and example phrases
     in both English and Turkish.
   - Add a Decision Tree section covering the three most common user
     asks (missing summary, setup verification, re-run request) and
     the specific CLI command sequence for each.
   - Add a dedicated "Critical pitfall: Graph subscriptions expire
     in 72 hours" section that tells the agent exactly what to do
     when a user reports "worked yesterday, nothing today" — the
     most common operational failure mode.
   - Expand the command reference into three labeled groups (Status
     and inspection / Re-running and debugging / Subscription
     management) so the agent can reach for the right command
     without scanning.
   - Add cross-links to all four related docs pages (Azure app
     registration, webhook listener setup, full pipeline setup,
     operator runbook).

Validation:
- npm run build: all new pages route, anchor to
  #automating-subscription-renewal-required-for-production resolves
  from both the runbook TOC and the teams-meetings.md admonition.
- scripts/run_tests.sh on the relevant test suites (607 tests): all
  pass.
2026-05-08 12:41:41 -07:00


title: Operate the Teams Meeting Pipeline
description: Runbook, go-live checklist, and operator worksheet for the Microsoft Teams meeting pipeline

Operate the Teams Meeting Pipeline

Use this guide after you have enabled the feature as described in the Teams Meetings guide.

This page covers:

  • operator CLI flows
  • routine subscription maintenance
  • failure triage
  • go-live checks
  • rollout worksheet

Core Operator Commands

Validate the config snapshot

hermes teams-pipeline validate

Use this first after any config change.

Inspect token health

hermes teams-pipeline token-health
hermes teams-pipeline token-health --force-refresh

Use --force-refresh when you suspect stale auth state.

Inspect subscriptions

hermes teams-pipeline subscriptions

Renew near-expiry subscriptions

hermes teams-pipeline maintain-subscriptions
hermes teams-pipeline maintain-subscriptions --dry-run

Automating subscription renewal (REQUIRED for production)

Microsoft Graph subscriptions expire in at most 72 hours. If nothing renews them, meeting notifications silently stop after 3 days and the pipeline looks "broken." This is the #1 operational failure mode for any Graph-backed integration.

You MUST run maintain-subscriptions on a schedule. Pick one of these three options:

Option 1: Hermes cron

Hermes ships a built-in cron scheduler. Add a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):

hermes cron add \
  --name "teams-pipeline-maintain-subscriptions" \
  --schedule "0 */12 * * *" \
  --script-only \
  --command "hermes teams-pipeline maintain-subscriptions"

Verify it was registered and inspect the next run time:

hermes cron list
hermes cron show teams-pipeline-maintain-subscriptions

Option 2: systemd timer

Create /etc/systemd/system/hermes-teams-pipeline-maintain.service:

[Unit]
Description=Hermes Teams pipeline subscription maintenance
After=network-online.target

[Service]
Type=oneshot
User=hermes
EnvironmentFile=/etc/hermes/env
ExecStart=/usr/local/bin/hermes teams-pipeline maintain-subscriptions

And /etc/systemd/system/hermes-teams-pipeline-maintain.timer:

[Unit]
Description=Run Hermes Teams pipeline subscription maintenance every 12 hours

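[Timer]
# Persistent= only affects OnCalendar= timers, so the 12-hour cadence is
# expressed as a calendar schedule to let missed runs catch up after downtime.
OnBootSec=5min
OnCalendar=*-*-* 00,12:00:00
Persistent=true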

[Install]
WantedBy=timers.target

Enable:

sudo systemctl daemon-reload
sudo systemctl enable --now hermes-teams-pipeline-maintain.timer
systemctl list-timers hermes-teams-pipeline-maintain.timer

Option 3: Plain crontab

0 */12 * * * /usr/local/bin/hermes teams-pipeline maintain-subscriptions >> /var/log/hermes/teams-pipeline-maintain.log 2>&1

Make sure the cron environment has the MSGRAPH_* credentials. Simplest fix: source ~/.hermes/.env at the top of a wrapper script that crontab calls.
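
A wrapper along these lines is one way to do that. This is a minimal sketch, not a shipped script: `run_with_env` is a hypothetical helper name, and it assumes `~/.hermes/.env` contains plain `KEY=value` lines that a POSIX shell can source.

```bash
# Hypothetical wrapper helper: load credentials from an env file, then run a command.
# Assumes the env file holds plain KEY=value lines and is given as a path with a slash.
run_with_env() {
  env_file=$1
  shift
  if [ ! -r "$env_file" ]; then
    echo "env file not readable: $env_file" >&2
    return 1
  fi
  (
    set -a            # export every variable assigned while sourcing
    . "$env_file"
    set +a
    exec "$@"         # replace the subshell with the real command
  )
}

# In the wrapper that crontab calls, the final line would be:
# run_with_env "$HOME/.hermes/.env" /usr/local/bin/hermes teams-pipeline maintain-subscriptions
```

The subshell keeps the credentials out of the calling shell, and the early return makes a missing env file show up in the cron log instead of surfacing later as an authentication error.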

Verifying renewal is working

Once the schedule is in place, wait for the first scheduled run, then check renewal activity:

hermes teams-pipeline subscriptions   # should show expirationDateTime advanced
hermes teams-pipeline maintain-subscriptions --dry-run   # should show "0 expiring soon" most of the time

If your Graph webhook mysteriously "stops working" after roughly 72 hours, this is the first thing to check: did the renewal job actually run?
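
To put a number on the remaining lifetime, the `expirationDateTime` value can be compared with the current time. A minimal sketch, assuming GNU `date` (for `-d`) and an ISO 8601 UTC timestamp. `hours_until` is a hypothetical helper, and extracting the timestamp from the `subscriptions` output is left out because that output format is not specified here.

```bash
# hours_until ISO_TIMESTAMP [NOW_EPOCH]
# Prints whole hours (truncated toward zero) between now and the timestamp.
# A result of 0 or less means expiry is imminent or already past.
# Assumes GNU date; NOW_EPOCH is optional and mainly useful for testing.
hours_until() {
  expires=$(date -u -d "$1" +%s) || return 1
  now=${2:-$(date -u +%s)}
  echo $(( (expires - now) / 3600 ))
}

# Example: warn when fewer than 24 hours remain (the threshold is an arbitrary choice).
# [ "$(hours_until "2026-05-11T12:00:00Z")" -ge 24 ] || echo "renewal overdue" >&2
```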

Inspect recent jobs

hermes teams-pipeline list
hermes teams-pipeline list --status failed
hermes teams-pipeline show <job-id>

Replay a stored job

hermes teams-pipeline run <job-id>

Dry-run meeting artifact fetches

hermes teams-pipeline fetch --meeting-id <meeting-id>
hermes teams-pipeline fetch --join-web-url "<join-url>"

Routine Runbook

After first setup

Run these in order:

hermes teams-pipeline validate
hermes teams-pipeline token-health --force-refresh
hermes teams-pipeline subscriptions

Then trigger or wait for a real meeting event and confirm:

hermes teams-pipeline list
hermes teams-pipeline show <job-id>
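
The three ordered checks at the start of this subsection can be chained so that a failure stops the sequence early. This is a sketch under one assumption that the source does not confirm: that each `hermes` subcommand exits non-zero when its check fails. `first_setup_checks` is a hypothetical name.

```bash
# Hypothetical helper: run the first-setup checks in order, stopping at the first failure.
# Assumes each hermes subcommand exits non-zero when its check fails.
first_setup_checks() {
  hermes teams-pipeline validate &&
    hermes teams-pipeline token-health --force-refresh &&
    hermes teams-pipeline subscriptions
}
```

Stopping early keeps the output focused on the first broken layer: there is little point inspecting subscriptions if the config snapshot or the token is already invalid.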

Daily or periodic checks

  • run hermes teams-pipeline maintain-subscriptions --dry-run
  • inspect hermes teams-pipeline list --status failed
  • verify the Teams delivery target is still the correct chat or channel

Before changing webhook URLs or delivery targets

  • update the public notification URL or Teams target config
  • run hermes teams-pipeline validate
  • renew or recreate affected subscriptions
  • confirm new events land in the expected sink

Failure Triage

No jobs are being created

Check:

  • msgraph_webhook is enabled
  • the public notification URL points to /msgraph/webhook
  • the client state in the subscription matches MSGRAPH_WEBHOOK_CLIENT_STATE
  • subscriptions still exist remotely and are not expired
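
When a subscription is created, Microsoft Graph validates the notification URL by sending a POST with a `validationToken` query parameter and expecting the token echoed back as plain text. The same handshake can serve as a reachability probe for the second check above. A sketch, assuming `curl` is installed and that the listener at `/msgraph/webhook` answers a validation request from any caller. `probe_webhook` and the token value are hypothetical.

```bash
# Hypothetical probe: send a Graph-style validation request and expect the token echoed back.
# $1 is the public notification URL, for example https://example.com/msgraph/webhook
probe_webhook() {
  token="hermes-probe-$$"
  body=$(curl -fsS --max-time 10 -X POST "$1?validationToken=$token") || return 1
  [ "$body" = "$token" ]
}

# Usage (replace with your public notification URL):
# probe_webhook "https://example.com/msgraph/webhook" && echo "webhook reachable"
```

Run it from a machine outside your network. A probe that only succeeds from the host itself says nothing about whether Graph can reach the endpoint.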

Jobs stay in retry or fail before summarization

Check:

  • transcript permissions and availability
  • recording permissions and artifact availability
  • ffmpeg availability if recording fallback is enabled
  • Graph token health
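
The tool-availability part of this list is easy to script. A small sketch using only POSIX shell features; `require_tools` is a hypothetical helper, not part of the Hermes CLI.

```bash
# Hypothetical preflight: report any missing external tools by name.
# Returns non-zero if at least one tool is not on PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: recording fallback needs ffmpeg; token health is checked separately.
# require_tools ffmpeg && hermes teams-pipeline token-health
```

Run the preflight as the same user and with the same PATH as the pipeline process, since a tool installed for an interactive user may not be visible to a service account.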

Summaries are produced but not delivered to Teams

Check:

  • platforms.teams.enabled: true
  • delivery_mode
  • incoming_webhook_url for webhook mode
  • chat_id or team_id plus channel_id for Graph mode
  • Teams auth config if Graph posting is used

Duplicate or unexpected replays

Check:

  • whether you manually replayed a job with hermes teams-pipeline run
  • whether the sink record already exists for that meeting
  • whether you intentionally enabled a resend path in your local config

Go-Live Checklist

  • Graph credentials are present and correct
  • msgraph_webhook is enabled and reachable from the public internet
  • MSGRAPH_WEBHOOK_CLIENT_STATE is set and matches subscriptions
  • transcript subscription is created
  • recording subscription is created if STT fallback is required
  • ffmpeg is installed if recording fallback is enabled
  • Teams outbound delivery target is configured and verified
  • Notion and Linear sinks are configured only if actually needed
  • hermes teams-pipeline validate returns an OK snapshot
  • hermes teams-pipeline token-health --force-refresh succeeds
  • maintain-subscriptions is scheduled (Hermes cron, systemd timer, or crontab — see Automating subscription renewal). Without this, Graph subscriptions silently expire within 72 hours.
  • a real end-to-end meeting event has produced a stored job
  • at least one summary has reached the intended delivery sink

Delivery-Mode Decision Guide

| Mode | Use when | Tradeoff |
|---|---|---|
| incoming_webhook | you only need simple posting into Teams | simplest setup, less control |
| graph | you need channel or chat posting through Graph | more control, more auth and target config |

Operator Worksheet

Fill this out before rollout:

| Item | Value |
|---|---|
| Public notification URL | |
| Graph tenant ID | |
| Graph client ID | |
| Webhook client state | |
| Transcript resource subscription | |
| Recording resource subscription | |
| Teams delivery mode | |
| Teams chat ID or team/channel | |
| Notion database ID | |
| Linear team ID | |
| Store path override, if any | |
| Owner for daily checks | |

Change Review Worksheet

Use this before changing the deployment:

| Question | Answer |
|---|---|
| Are we changing the public webhook URL? | |
| Are we rotating Graph credentials? | |
| Are we changing Teams delivery mode? | |
| Are we moving to a new Teams chat or channel? | |
| Do subscriptions need to be recreated or renewed? | |
| Do we need a fresh end-to-end verification run? | |