fix(providers): complete NVIDIA NIM parity with other providers

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-20 15:33:54 +00:00

Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).

Gaps closed:

- hermes_cli/main.py:
  - Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
    selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
    provider flow (previously fell through silently).
  - Add 'nvidia' to `hermes chat --provider` argparse choices so the
    documented test command (`hermes chat --provider nvidia --model ...`)
    parses successfully.

- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
  OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
  auto-added to the subprocess env blocklist.

- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
  `hermes doctor` probes https://integrate.api.nvidia.com/v1/models.

- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
  `hermes dump` credential masking.

- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
  with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.

- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
  so all Nemotron variants get 128K context via substring match (rather
  than falling back to MINIMUM_CONTEXT_LENGTH).

- hermes_cli/models.py: Fix hallucinated model ID
  'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
  (verified against live integrate.api.nvidia.com/v1/models catalog).
  Expand curated list from 5 to 9 agentic models mapping to OpenRouter
  defaults per provider-guide convention: add qwen3.5-397b-a17b,
  deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.

- cli-config.yaml.example: Document 'nvidia' provider option.

- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
  for CI attribution.

E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.

This commit is contained in:

Teknium

2026-04-17 13:09:14 -07:00

• committed by

Teknium

parent 3b569ff576

commit f362083c64

9 changed files with 35 additions and 5 deletions

1

cli-config.yaml.example

View file

 @ -24,6 +24,7 @@ model:
   #   "minimax"      - MiniMax global (requires: MINIMAX_API_KEY)
   #   "minimax-cn"   - MiniMax China (requires: MINIMAX_CN_API_KEY)
   #   "huggingface"  - Hugging Face Inference (requires: HF_TOKEN)
   #   "nvidia"       - NVIDIA NIM / build.nvidia.com (requires: NVIDIA_API_KEY)
   #   "xiaomi"       - Xiaomi MiMo (requires: XIAOMI_API_KEY)
   #   "arcee"        - Arcee AI Trinity models (requires: ARCEEAI_API_KEY)
   #   "ollama-cloud" - Ollama Cloud (requires: OLLAMA_API_KEY — https://ollama.com/settings)

Rows
Columns

fix(providers): complete NVIDIA NIM parity with other providers

1 cli-config.yaml.example Unescape Escape View file

1

cli-config.yaml.example

View file