Commit graph

2 commits

Author SHA1 Message Date
Teknium
6d2fa03837
fix: UTF-8 config encoding, pairing hint, credential_pool key, header normalization (#7174)
Four small fixes: (1) UTF-8 encoding for config open (@zhangchn #7063), (2) pairing hint placeholders (@konsisumer #7057), (3) missing credential_pool in cheap route (@kuishou68 #7025), (4) case-insensitive rate limit headers (@kuishou68 #7019).
2026-04-10 05:33:48 -07:00
Teknium
8dfc96dbbb
feat: capture provider rate limit headers and show in /usage (#6541)
Parse x-ratelimit-* headers from inference API responses (Nous Portal,
OpenRouter, OpenAI-compatible) and display them in the /usage command.

- New agent/rate_limit_tracker.py: parse 12 rate limit headers (RPM/RPH/
  TPM/TPH limits, remaining, reset timers), format as progress bars (CLI)
  or compact one-liner (gateway)
- Hook into streaming path in run_agent.py: stream.response.headers is
  available on the OpenAI SDK Stream object before chunks are consumed
- CLI /usage: appends rate limit section with progress bars + warnings
  when any bucket exceeds 80%
- Gateway /usage: appends compact rate limit summary
- 24 unit tests covering parsing, formatting, edge cases

Headers captured per response:
  x-ratelimit-{limit,remaining,reset}-{requests,tokens}{,-1h}

Example CLI display:
  Nous Rate Limits (captured just now):
    Requests/min [░░░░░░░░░░░░░░░░░░░░]  0.1%  1/800 used  (799 left, resets in 59s)
    Tokens/hr    [░░░░░░░░░░░░░░░░░░░░]  0.0%  49/336.0M   (336.0M left, resets in 52m)
2026-04-09 03:43:14 -07:00