docs(teams-pipeline): cron renewal recipe, sidebar wiring, skill rewrite

Fifth and final slice polish on top of @dlkakbs's docs + skill. Three
things ship here:

1. Subscription renewal cron recipe (the #1 operational footgun).

   Microsoft Graph webhook subscriptions expire at 72 hours max and
   don't auto-renew. The shipped operator runbook mentioned
   `maintain-subscriptions --dry-run` as a "daily or periodic check"
   but never told operators how to actually automate it. Without a
   scheduled job, any production deployment silently stops ingesting
   meetings three days after go-live.

   Adds an "Automating subscription renewal (REQUIRED for production)"
   section to website/docs/guides/operate-teams-meeting-pipeline.md
   with three concrete options and copy-pasteable configs:

   - Option 1: Hermes cron (`hermes cron add --schedule "0 */12 * * *"
     --script-only --command "hermes teams-pipeline maintain-subscriptions"`)
   - Option 2: systemd service + timer (12h cadence, Persistent=true
     so missed runs catch up after reboots)
   - Option 3: plain crontab with a wrapper that sources .env for
     credentials

   Go-Live Checklist gains a bolded mandatory item for the schedule
   being in place, with a cross-link to the section.

   website/docs/user-guide/messaging/teams-meetings.md adds a
   `::⚠️::` admonition right after the manual `subscribe`
   examples so anyone who creates a subscription manually is told
   the same day that it will silently expire in 72 hours.

2. Sidebar wiring. Shela's new docs pages (teams-meetings.md and
   operate-teams-meeting-pipeline.md) weren't in website/sidebars.ts,
   so they were orphaned URLs — reachable only if someone knew the
   path. Wired teams-meetings into Messaging Platforms next to the
   existing teams entry, and operate-teams-meeting-pipeline into
   Guides & Tutorials next to microsoft-graph-app-registration from
   PR #21922. Adjacent placement keeps the related pages discoverable
   from each other.

3. SKILL.md rewrite (v1.0.0 → v1.1.0).

   The original skill had five Turkish-only trigger phrases, which
   works in a Turkish-speaking session but doesn't match English
   triggers. Rewrote the skill to:

   - Describe triggers by intent instead of exact phrases, with
     explicit "works in any language" framing and example phrases
     in both English and Turkish.
   - Add a Decision Tree section covering the three most common user
     asks (missing summary, setup verification, re-run request) and
     the specific CLI command sequence for each.
   - Add a dedicated "Critical pitfall: Graph subscriptions expire
     in 72 hours" section that tells the agent exactly what to do
     when a user reports "worked yesterday, nothing today" — the
     most common operational failure mode.
   - Expand the command reference into three labeled groups (Status
     and inspection / Re-running and debugging / Subscription
     management) so the agent can reach for the right command
     without scanning.
   - Add cross-links to all four related docs pages (Azure app
     registration, webhook listener setup, full pipeline setup,
     operator runbook).

Validation:
- npm run build: all new pages route, anchor to
  #automating-subscription-renewal-required-for-production resolves
  from both the runbook TOC and the teams-meetings.md admonition.
- scripts/run_tests.sh on the relevant test suites (607 tests): all
  pass.
This commit is contained in:
Teknium 2026-05-08 12:06:53 -07:00
parent 729a659a3c
commit 242da9db96
4 changed files with 179 additions and 21 deletions

View file

@ -46,6 +46,89 @@ hermes teams-pipeline maintain-subscriptions
hermes teams-pipeline maintain-subscriptions --dry-run
```
### Automating subscription renewal (REQUIRED for production)
**Microsoft Graph subscriptions expire in at most 72 hours.** If nothing renews them, meeting notifications silently stop after 3 days and the pipeline looks "broken." This is the #1 operational failure mode for any Graph-backed integration.
You MUST run `maintain-subscriptions` on a schedule. Pick one of these three options:
#### Option 1: Hermes cron (recommended if you already run the Hermes gateway)
Hermes ships a built-in cron scheduler. Add a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window):
```bash
hermes cron add \
--name "teams-pipeline-maintain-subscriptions" \
--schedule "0 */12 * * *" \
--script-only \
--command "hermes teams-pipeline maintain-subscriptions"
```
Verify it was registered and inspect the next run time:
```bash
hermes cron list
hermes cron show teams-pipeline-maintain-subscriptions
```
#### Option 2: systemd timer (recommended for Linux production deployments)
Create `/etc/systemd/system/hermes-teams-pipeline-maintain.service`:
```ini
[Unit]
Description=Hermes Teams pipeline subscription maintenance
After=network-online.target
[Service]
Type=oneshot
User=hermes
EnvironmentFile=/etc/hermes/env
ExecStart=/usr/local/bin/hermes teams-pipeline maintain-subscriptions
```
And `/etc/systemd/system/hermes-teams-pipeline-maintain.timer`:
```ini
[Unit]
Description=Run Hermes Teams pipeline subscription maintenance every 12 hours
[Timer]
OnBootSec=5min
OnUnitActiveSec=12h
Persistent=true
[Install]
WantedBy=timers.target
```
Enable:
```bash
sudo systemctl daemon-reload
sudo systemctl enable --now hermes-teams-pipeline-maintain.timer
systemctl list-timers hermes-teams-pipeline-maintain.timer
```
#### Option 3: Plain crontab
```cron
0 */12 * * * /usr/local/bin/hermes teams-pipeline maintain-subscriptions >> /var/log/hermes/teams-pipeline-maintain.log 2>&1
```
Make sure the cron environment has the `MSGRAPH_*` credentials. Simplest fix: source `~/.hermes/.env` at the top of a wrapper script that crontab calls.
#### Verifying renewal is working
After you've set up the schedule, check renewal activity after the first scheduled run:
```bash
hermes teams-pipeline subscriptions # should show expirationDateTime advanced
hermes teams-pipeline maintain-subscriptions --dry-run # should show "0 expiring soon" most of the time
```
If you ever see your Graph webhook mysteriously "stop working" after exactly ~72 hours, this is the first thing to check: did the renewal job actually run?
### Inspect recent jobs
```bash
@ -145,6 +228,7 @@ Check:
- [ ] Notion and Linear sinks are configured only if actually needed
- [ ] `hermes teams-pipeline validate` returns an OK snapshot
- [ ] `hermes teams-pipeline token-health --force-refresh` succeeds
- [ ] **`maintain-subscriptions` is scheduled** (Hermes cron, systemd timer, or crontab — see [Automating subscription renewal](#automating-subscription-renewal-required-for-production)). Without this, Graph subscriptions silently expire within 72 hours.
- [ ] a real end-to-end meeting event has produced a stored job
- [ ] at least one summary has reached the intended delivery sink

View file

@ -194,6 +194,12 @@ hermes teams-pipeline subscribe \
--client-state "$MSGRAPH_WEBHOOK_CLIENT_STATE"
```
:::warning Graph subscriptions expire in 72 hours
Microsoft Graph caps webhook subscriptions at 72 hours and will not auto-renew them. You MUST schedule `hermes teams-pipeline maintain-subscriptions` before going live, or notifications will silently stop three days after any manual subscription creation. See [Automating subscription renewal](/docs/guides/operate-teams-meeting-pipeline#automating-subscription-renewal-required-for-production) in the operator runbook — three options (Hermes cron, systemd timer, plain crontab).
:::
For subscription maintenance and day-2 operator flows, continue with the guide: [Operate the Teams Meeting Pipeline](/docs/guides/operate-teams-meeting-pipeline).
## Validation