feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.
This commit is contained in:
teknium1 2026-03-16 05:48:45 -07:00
parent 7d2c786acc
commit c51e7b4af7
6 changed files with 252 additions and 6 deletions

View file

@ -832,6 +832,25 @@ display:
| `all` | Every tool call with a short preview (default) |
| `verbose` | Full args, results, and debug logs |
## Privacy
```yaml
privacy:
redact_pii: false # Strip PII from LLM context (gateway only)
```
When `redact_pii` is `true`, the gateway redacts personally identifiable information from the system prompt before sending it to the LLM:
| Field | Treatment |
|-------|-----------|
| Phone numbers (user ID on WhatsApp/Signal) | Hashed to `user_<12-char-sha256>` |
| User IDs | Hashed to `user_<12-char-sha256>` |
| Chat IDs | Numeric portion hashed, platform prefix preserved (`telegram:<hash>`) |
| Home channel IDs | Numeric portion hashed |
| User names / usernames | **Not affected** (user-chosen, publicly visible) |
Hashes are deterministic — the same user always maps to the same hash, so the model can still distinguish between users in group chats. Routing and delivery use the original values internally.
## Speech-to-Text (STT)
```yaml