mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
feat: add NovitaAI as LLM provider
Add NovitaAI as a first-class provider with dedicated model selection flow, live pricing, and authoritative context length resolution. - Register provider in PROVIDER_REGISTRY, HERMES_OVERLAYS, and all alias/label maps (ID: novita, aliases: novita-ai, novitaai) - Add dedicated _model_flow_novita() with 3-tier model list fallback: Novita API → models.dev → static curated list - Fetch live pricing from /v1/models with correct unit conversion (input_token_price_per_m is 0.0001 USD per Mtok) - Add Novita-specific context length resolution (step 4b) in get_model_context_length(), prioritized over models.dev/OpenRouter - Register api.novita.ai in _URL_TO_PROVIDER to prevent early return from the custom-endpoint code path - Add models.dev mapping (novita → novita-ai) - Add default auxiliary model (deepseek/deepseek-v3-0324) - Add NOVITA_API_KEY to test isolation (conftest.py) - Update docs: providers page, env vars reference, CLI reference, .env.example, README, and landing page
This commit is contained in:
parent
55ba02befb
commit
c76e879574
12 changed files with 192 additions and 7 deletions
|
|
@ -14,6 +14,14 @@
|
||||||
# LLM_MODEL is no longer read from .env — this line is kept for reference only.
|
# LLM_MODEL is no longer read from .env — this line is kept for reference only.
|
||||||
# LLM_MODEL=anthropic/claude-opus-4.6
|
# LLM_MODEL=anthropic/claude-opus-4.6
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# LLM PROVIDER (NovitaAI)
|
||||||
|
# =============================================================================
|
||||||
|
# NovitaAI — multi-model aggregator with pay-per-use pricing
|
||||||
|
# Get your key at: https://novita.ai/settings/key-management
|
||||||
|
# NOVITA_API_KEY=
|
||||||
|
# NOVITA_BASE_URL=https://api.novita.ai/openai/v1 # Override default base URL
|
||||||
|
|
||||||
# =============================================================================
|
# =============================================================================
|
||||||
# LLM PROVIDER (Google AI Studio / Gemini)
|
# LLM PROVIDER (Google AI Studio / Gemini)
|
||||||
# =============================================================================
|
# =============================================================================
|
||||||
|
|
|
||||||
|
|
@ -14,7 +14,7 @@
|
||||||
|
|
||||||
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
|
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
|
||||||
|
|
||||||
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
|
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NovitaAI](https://novita.ai) (multi-model, pay-per-use), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
|
||||||
|
|
||||||
<table>
|
<table>
|
||||||
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
|
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
|
||||||
|
|
|
||||||
|
|
@ -47,7 +47,7 @@ def _resolve_requests_verify() -> bool | str:
|
||||||
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
|
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
|
||||||
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
|
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
|
||||||
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-oauth", "minimax-cn", "anthropic", "deepseek",
|
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-oauth", "minimax-cn", "anthropic", "deepseek",
|
||||||
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
|
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba", "novita",
|
||||||
"qwen-oauth",
|
"qwen-oauth",
|
||||||
"xiaomi",
|
"xiaomi",
|
||||||
"arcee",
|
"arcee",
|
||||||
|
|
@ -66,7 +66,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
|
||||||
"gmi-cloud", "gmicloud",
|
"gmi-cloud", "gmicloud",
|
||||||
"xai", "x-ai", "x.ai", "grok",
|
"xai", "x-ai", "x.ai", "grok",
|
||||||
"nvidia", "nim", "nvidia-nim", "nemotron",
|
"nvidia", "nim", "nvidia-nim", "nemotron",
|
||||||
"qwen-portal",
|
"qwen-portal", "novita-ai", "novitaai",
|
||||||
})
|
})
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -104,6 +104,8 @@ def _strip_provider_prefix(model: str) -> str:
|
||||||
|
|
||||||
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
|
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
|
||||||
_model_metadata_cache_time: float = 0
|
_model_metadata_cache_time: float = 0
|
||||||
|
_novita_metadata_cache: Dict[str, Dict[str, Any]] = {}
|
||||||
|
_novita_metadata_cache_time: float = 0
|
||||||
_MODEL_CACHE_TTL = 3600
|
_MODEL_CACHE_TTL = 3600
|
||||||
_endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
|
_endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
|
||||||
_endpoint_model_metadata_cache_time: Dict[str, float] = {}
|
_endpoint_model_metadata_cache_time: Dict[str, float] = {}
|
||||||
|
|
@ -285,6 +287,7 @@ def grok_supports_reasoning_effort(model: str) -> bool:
|
||||||
_CONTEXT_LENGTH_KEYS = (
|
_CONTEXT_LENGTH_KEYS = (
|
||||||
"context_length",
|
"context_length",
|
||||||
"context_window",
|
"context_window",
|
||||||
|
"context_size",
|
||||||
"max_context_length",
|
"max_context_length",
|
||||||
"max_position_embeddings",
|
"max_position_embeddings",
|
||||||
"max_model_len",
|
"max_model_len",
|
||||||
|
|
@ -361,6 +364,7 @@ _URL_TO_PROVIDER: Dict[str, str] = {
|
||||||
"api.xiaomimimo.com": "xiaomi",
|
"api.xiaomimimo.com": "xiaomi",
|
||||||
"xiaomimimo.com": "xiaomi",
|
"xiaomimimo.com": "xiaomi",
|
||||||
"api.gmi-serving.com": "gmi",
|
"api.gmi-serving.com": "gmi",
|
||||||
|
"api.novita.ai": "novita",
|
||||||
"tokenhub.tencentmaas.com": "tencent-tokenhub",
|
"tokenhub.tencentmaas.com": "tencent-tokenhub",
|
||||||
"ollama.com": "ollama-cloud",
|
"ollama.com": "ollama-cloud",
|
||||||
}
|
}
|
||||||
|
|
@ -557,6 +561,16 @@ def _extract_max_completion_tokens(payload: Dict[str, Any]) -> Optional[int]:
|
||||||
|
|
||||||
|
|
||||||
def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
|
def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
|
novita_input = payload.get("input_token_price_per_m")
|
||||||
|
novita_output = payload.get("output_token_price_per_m")
|
||||||
|
if novita_input is not None or novita_output is not None:
|
||||||
|
pricing: Dict[str, Any] = {}
|
||||||
|
if novita_input is not None:
|
||||||
|
pricing["prompt"] = str(float(novita_input) / 10_000 / 1_000_000)
|
||||||
|
if novita_output is not None:
|
||||||
|
pricing["completion"] = str(float(novita_output) / 10_000 / 1_000_000)
|
||||||
|
return pricing
|
||||||
|
|
||||||
alias_map = {
|
alias_map = {
|
||||||
"prompt": ("prompt", "input", "input_cost_per_token", "prompt_token_cost"),
|
"prompt": ("prompt", "input", "input_cost_per_token", "prompt_token_cost"),
|
||||||
"completion": ("completion", "output", "output_cost_per_token", "completion_token_cost"),
|
"completion": ("completion", "output", "output_cost_per_token", "completion_token_cost"),
|
||||||
|
|
@ -1527,6 +1541,13 @@ def get_model_context_length(
|
||||||
except ImportError:
|
except ImportError:
|
||||||
pass # boto3 not installed — fall through to generic resolution
|
pass # boto3 not installed — fall through to generic resolution
|
||||||
|
|
||||||
|
if provider == "novita" or (base_url and base_url_host_matches(base_url, "api.novita.ai")):
|
||||||
|
ctx = _resolve_endpoint_context_length(model, base_url or "https://api.novita.ai/openai/v1", api_key=api_key)
|
||||||
|
if ctx is not None:
|
||||||
|
if base_url:
|
||||||
|
save_context_length(model, base_url, ctx)
|
||||||
|
return ctx
|
||||||
|
|
||||||
# 2. Active endpoint metadata for truly custom/unknown endpoints.
|
# 2. Active endpoint metadata for truly custom/unknown endpoints.
|
||||||
# Known providers (Copilot, OpenAI, Anthropic, etc.) skip this — their
|
# Known providers (Copilot, OpenAI, Anthropic, etc.) skip this — their
|
||||||
# /models endpoint may report a provider-imposed limit (e.g. Copilot
|
# /models endpoint may report a provider-imposed limit (e.g. Copilot
|
||||||
|
|
|
||||||
|
|
@ -141,6 +141,7 @@ class ProviderInfo:
|
||||||
# Hermes provider names → models.dev provider IDs
|
# Hermes provider names → models.dev provider IDs
|
||||||
PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
|
PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
|
||||||
"openrouter": "openrouter",
|
"openrouter": "openrouter",
|
||||||
|
"novita": "novita-ai",
|
||||||
"anthropic": "anthropic",
|
"anthropic": "anthropic",
|
||||||
"openai": "openai",
|
"openai": "openai",
|
||||||
"openai-codex": "openai",
|
"openai-codex": "openai",
|
||||||
|
|
|
||||||
|
|
@ -4970,6 +4970,37 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
|
||||||
)
|
)
|
||||||
if model_list:
|
if model_list:
|
||||||
print(f" Found {len(model_list)} model(s) from Ollama Cloud")
|
print(f" Found {len(model_list)} model(s) from Ollama Cloud")
|
||||||
|
elif provider_id == "novita":
|
||||||
|
from hermes_cli.models import fetch_api_models
|
||||||
|
|
||||||
|
api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
|
||||||
|
curated = _PROVIDER_MODELS.get(provider_id, [])
|
||||||
|
live_models = fetch_api_models(api_key_for_probe, effective_base)
|
||||||
|
if live_models:
|
||||||
|
model_list = live_models
|
||||||
|
print(f" Found {len(model_list)} model(s) from {pconfig.name} API")
|
||||||
|
else:
|
||||||
|
mdev_models: list = []
|
||||||
|
try:
|
||||||
|
from agent.models_dev import list_agentic_models
|
||||||
|
|
||||||
|
mdev_models = list_agentic_models(provider_id)
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
if mdev_models:
|
||||||
|
seen = {m.lower() for m in mdev_models}
|
||||||
|
model_list = list(mdev_models)
|
||||||
|
for m in curated:
|
||||||
|
if m.lower() not in seen:
|
||||||
|
model_list.append(m)
|
||||||
|
seen.add(m.lower())
|
||||||
|
print(f" Found {len(model_list)} model(s) from models.dev registry")
|
||||||
|
else:
|
||||||
|
model_list = curated
|
||||||
|
if model_list:
|
||||||
|
print(
|
||||||
|
f' Showing {len(model_list)} curated models — use "Enter custom model name" for others.'
|
||||||
|
)
|
||||||
else:
|
else:
|
||||||
curated = _PROVIDER_MODELS.get(provider_id, [])
|
curated = _PROVIDER_MODELS.get(provider_id, [])
|
||||||
|
|
||||||
|
|
@ -9269,7 +9300,7 @@ def _build_provider_choices() -> list[str]:
|
||||||
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
|
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
|
||||||
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
|
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
|
||||||
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
|
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
|
||||||
"stepfun", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee",
|
"stepfun", "minimax", "minimax-cn", "kilocode", "novita", "xiaomi", "arcee",
|
||||||
"nvidia", "deepseek", "alibaba", "qwen-oauth", "opencode-zen", "opencode-go",
|
"nvidia", "deepseek", "alibaba", "qwen-oauth", "opencode-zen", "opencode-go",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -445,6 +445,14 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
|
||||||
# Azure Foundry: user-provided endpoint and model.
|
# Azure Foundry: user-provided endpoint and model.
|
||||||
# Empty list because models depend on the endpoint configuration.
|
# Empty list because models depend on the endpoint configuration.
|
||||||
"azure-foundry": [],
|
"azure-foundry": [],
|
||||||
|
"novita": [
|
||||||
|
"moonshotai/kimi-k2.5",
|
||||||
|
"minimax/minimax-m2.7",
|
||||||
|
"zai-org/glm-5",
|
||||||
|
"deepseek/deepseek-v3-0324",
|
||||||
|
"deepseek/deepseek-r1-0528",
|
||||||
|
"qwen/qwen3-235b-a22b-fp8",
|
||||||
|
],
|
||||||
}
|
}
|
||||||
|
|
||||||
# Vercel AI Gateway: derive the bare-model-id catalog from the curated
|
# Vercel AI Gateway: derive the bare-model-id catalog from the curated
|
||||||
|
|
@ -905,6 +913,7 @@ class ProviderEntry(NamedTuple):
|
||||||
CANONICAL_PROVIDERS: list[ProviderEntry] = [
|
CANONICAL_PROVIDERS: list[ProviderEntry] = [
|
||||||
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
|
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
|
||||||
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
|
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
|
||||||
|
ProviderEntry("novita", "NovitaAI", "NovitaAI (90+ models, pay-per-use)"),
|
||||||
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
|
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
|
||||||
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
|
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
|
||||||
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
|
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
|
||||||
|
|
@ -1014,6 +1023,8 @@ _PROVIDER_ALIASES = {
|
||||||
"hf": "huggingface",
|
"hf": "huggingface",
|
||||||
"hugging-face": "huggingface",
|
"hugging-face": "huggingface",
|
||||||
"huggingface-hub": "huggingface",
|
"huggingface-hub": "huggingface",
|
||||||
|
"novita-ai": "novita",
|
||||||
|
"novitaai": "novita",
|
||||||
"mimo": "xiaomi",
|
"mimo": "xiaomi",
|
||||||
"xiaomi-mimo": "xiaomi",
|
"xiaomi-mimo": "xiaomi",
|
||||||
"tencent": "tencent-tokenhub",
|
"tencent": "tencent-tokenhub",
|
||||||
|
|
@ -1494,7 +1505,7 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:
|
||||||
|
|
||||||
|
|
||||||
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
|
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
|
||||||
"""Return live pricing for providers that support it (openrouter, nous, ai-gateway)."""
|
"""Return live pricing for providers that support it (openrouter, nous, ai-gateway, novita)."""
|
||||||
normalized = normalize_provider(provider)
|
normalized = normalize_provider(provider)
|
||||||
if normalized == "openrouter":
|
if normalized == "openrouter":
|
||||||
return fetch_models_with_pricing(
|
return fetch_models_with_pricing(
|
||||||
|
|
@ -1504,6 +1515,8 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
|
||||||
)
|
)
|
||||||
if normalized == "ai-gateway":
|
if normalized == "ai-gateway":
|
||||||
return fetch_ai_gateway_pricing(force_refresh=force_refresh)
|
return fetch_ai_gateway_pricing(force_refresh=force_refresh)
|
||||||
|
if normalized == "novita":
|
||||||
|
return _fetch_novita_pricing()
|
||||||
if normalized == "nous":
|
if normalized == "nous":
|
||||||
api_key, base_url = _resolve_nous_pricing_credentials()
|
api_key, base_url = _resolve_nous_pricing_credentials()
|
||||||
if base_url:
|
if base_url:
|
||||||
|
|
@ -1520,6 +1533,50 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
|
||||||
return {}
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
def _fetch_novita_pricing(timeout: float = 8.0) -> dict[str, dict[str, str]]:
|
||||||
|
"""Fetch pricing from NovitaAI /v1/models.
|
||||||
|
|
||||||
|
NovitaAI returns input/output prices per million tokens in units of
|
||||||
|
0.0001 USD. Convert them to the per-token strings used by the shared
|
||||||
|
pricing formatter.
|
||||||
|
"""
|
||||||
|
api_key = os.getenv("NOVITA_API_KEY", "").strip()
|
||||||
|
if not api_key:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
base_url = os.getenv("NOVITA_BASE_URL", "").strip() or "https://api.novita.ai/openai/v1"
|
||||||
|
url = base_url.rstrip("/") + "/models"
|
||||||
|
headers = {
|
||||||
|
"Authorization": f"Bearer {api_key}",
|
||||||
|
"Accept": "application/json",
|
||||||
|
"User-Agent": _HERMES_USER_AGENT,
|
||||||
|
}
|
||||||
|
|
||||||
|
try:
|
||||||
|
req = urllib.request.Request(url, headers=headers)
|
||||||
|
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||||
|
payload = json.loads(resp.read().decode())
|
||||||
|
except Exception:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
result: dict[str, dict[str, str]] = {}
|
||||||
|
for item in payload.get("data", []):
|
||||||
|
if not isinstance(item, dict):
|
||||||
|
continue
|
||||||
|
mid = item.get("id")
|
||||||
|
if not mid:
|
||||||
|
continue
|
||||||
|
inp = item.get("input_token_price_per_m")
|
||||||
|
out = item.get("output_token_price_per_m")
|
||||||
|
if inp is None and out is None:
|
||||||
|
continue
|
||||||
|
result[str(mid)] = {
|
||||||
|
"prompt": str(float(inp or 0) / 10_000 / 1_000_000),
|
||||||
|
"completion": str(float(out or 0) / 10_000 / 1_000_000),
|
||||||
|
}
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
# All provider IDs and aliases that are valid for the provider:model syntax.
|
# All provider IDs and aliases that are valid for the provider:model syntax.
|
||||||
_KNOWN_PROVIDER_NAMES: set[str] = (
|
_KNOWN_PROVIDER_NAMES: set[str] = (
|
||||||
set(_PROVIDER_LABELS.keys())
|
set(_PROVIDER_LABELS.keys())
|
||||||
|
|
|
||||||
|
|
@ -156,6 +156,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
|
||||||
is_aggregator=True,
|
is_aggregator=True,
|
||||||
base_url_env_var="HF_BASE_URL",
|
base_url_env_var="HF_BASE_URL",
|
||||||
),
|
),
|
||||||
|
"novita": HermesOverlay(
|
||||||
|
transport="openai_chat",
|
||||||
|
is_aggregator=True,
|
||||||
|
base_url_env_var="NOVITA_BASE_URL",
|
||||||
|
),
|
||||||
"xai": HermesOverlay(
|
"xai": HermesOverlay(
|
||||||
transport="codex_responses",
|
transport="codex_responses",
|
||||||
base_url_override="https://api.x.ai/v1",
|
base_url_override="https://api.x.ai/v1",
|
||||||
|
|
@ -309,6 +314,10 @@ ALIASES: Dict[str, str] = {
|
||||||
"hugging-face": "huggingface",
|
"hugging-face": "huggingface",
|
||||||
"huggingface-hub": "huggingface",
|
"huggingface-hub": "huggingface",
|
||||||
|
|
||||||
|
# novita
|
||||||
|
"novita-ai": "novita",
|
||||||
|
"novitaai": "novita",
|
||||||
|
|
||||||
# xiaomi
|
# xiaomi
|
||||||
"mimo": "xiaomi",
|
"mimo": "xiaomi",
|
||||||
"xiaomi-mimo": "xiaomi",
|
"xiaomi-mimo": "xiaomi",
|
||||||
|
|
|
||||||
27
plugins/model-providers/novita/__init__.py
Normal file
27
plugins/model-providers/novita/__init__.py
Normal file
|
|
@ -0,0 +1,27 @@
|
||||||
|
"""NovitaAI provider profile."""
|
||||||
|
|
||||||
|
from providers import register_provider
|
||||||
|
from providers.base import ProviderProfile
|
||||||
|
|
||||||
|
|
||||||
|
novita = ProviderProfile(
|
||||||
|
name="novita",
|
||||||
|
aliases=("novita-ai", "novitaai"),
|
||||||
|
display_name="NovitaAI",
|
||||||
|
description="NovitaAI — multi-model aggregator with pay-per-use pricing",
|
||||||
|
signup_url="https://novita.ai/settings/key-management",
|
||||||
|
env_vars=("NOVITA_API_KEY", "NOVITA_BASE_URL"),
|
||||||
|
base_url="https://api.novita.ai/openai/v1",
|
||||||
|
auth_type="api_key",
|
||||||
|
default_aux_model="deepseek/deepseek-v3-0324",
|
||||||
|
fallback_models=(
|
||||||
|
"moonshotai/kimi-k2.5",
|
||||||
|
"minimax/minimax-m2.7",
|
||||||
|
"zai-org/glm-5",
|
||||||
|
"deepseek/deepseek-v3-0324",
|
||||||
|
"deepseek/deepseek-r1-0528",
|
||||||
|
"qwen/qwen3-235b-a22b-fp8",
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
register_provider(novita)
|
||||||
5
plugins/model-providers/novita/plugin.yaml
Normal file
5
plugins/model-providers/novita/plugin.yaml
Normal file
|
|
@ -0,0 +1,5 @@
|
||||||
|
name: novita-provider
|
||||||
|
kind: model-provider
|
||||||
|
version: 1.0.0
|
||||||
|
description: NovitaAI multi-model aggregator
|
||||||
|
author: Nous Research
|
||||||
|
|
@ -20,6 +20,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
|
||||||
| **GitHub Copilot ACP** | `hermes model` (spawns local `copilot --acp --stdio`) |
|
| **GitHub Copilot ACP** | `hermes model` (spawns local `copilot --acp --stdio`) |
|
||||||
| **Anthropic** | `hermes model` (Claude Max + extra usage credits via OAuth; also supports Anthropic API key or manual setup-token — see note below) |
|
| **Anthropic** | `hermes model` (Claude Max + extra usage credits via OAuth; also supports Anthropic API key or manual setup-token — see note below) |
|
||||||
| **OpenRouter** | `OPENROUTER_API_KEY` in `~/.hermes/.env` |
|
| **OpenRouter** | `OPENROUTER_API_KEY` in `~/.hermes/.env` |
|
||||||
|
| **NovitaAI** | `NOVITA_API_KEY` in `~/.hermes/.env` (provider: `novita`, multi-model aggregator) |
|
||||||
| **AI Gateway** | `AI_GATEWAY_API_KEY` in `~/.hermes/.env` (provider: `ai-gateway`) |
|
| **AI Gateway** | `AI_GATEWAY_API_KEY` in `~/.hermes/.env` (provider: `ai-gateway`) |
|
||||||
| **z.ai / GLM** | `GLM_API_KEY` in `~/.hermes/.env` (provider: `zai`) |
|
| **z.ai / GLM** | `GLM_API_KEY` in `~/.hermes/.env` (provider: `zai`) |
|
||||||
| **Kimi / Moonshot** | `KIMI_API_KEY` in `~/.hermes/.env` (provider: `kimi-coding`) |
|
| **Kimi / Moonshot** | `KIMI_API_KEY` in `~/.hermes/.env` (provider: `kimi-coding`) |
|
||||||
|
|
@ -546,6 +547,29 @@ You can append routing suffixes to model names: `:fastest` (default), `:cheapest
|
||||||
|
|
||||||
The base URL can be overridden with `HF_BASE_URL`.
|
The base URL can be overridden with `HF_BASE_URL`.
|
||||||
|
|
||||||
|
### NovitaAI
|
||||||
|
|
||||||
|
[NovitaAI](https://novita.ai) is a multi-model aggregator with pay-per-use pricing. Access models from DeepSeek, Kimi, MiniMax, GLM, Qwen, and more through a unified OpenAI-compatible API.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Use any available model
|
||||||
|
hermes chat --provider novita --model moonshotai/kimi-k2.5
|
||||||
|
# Requires: NOVITA_API_KEY in ~/.hermes/.env
|
||||||
|
|
||||||
|
# Short alias
|
||||||
|
hermes chat --provider novita-ai --model deepseek/deepseek-v3-0324
|
||||||
|
```
|
||||||
|
|
||||||
|
Or set it permanently in `config.yaml`:
|
||||||
|
```yaml
|
||||||
|
model:
|
||||||
|
provider: "novita"
|
||||||
|
default: "moonshotai/kimi-k2.5"
|
||||||
|
base_url: "https://api.novita.ai/openai/v1"
|
||||||
|
```
|
||||||
|
|
||||||
|
Get your API key at [novita.ai/settings/key-management](https://novita.ai/settings/key-management). The base URL can be overridden with `NOVITA_BASE_URL`.
|
||||||
|
|
||||||
## Custom & Self-Hosted LLM Providers
|
## Custom & Self-Hosted LLM Providers
|
||||||
|
|
||||||
Hermes Agent works with **any OpenAI-compatible API endpoint**. If a server implements `/v1/chat/completions`, you can point Hermes at it. This means you can use local models, GPU inference servers, multi-provider routers, or any third-party API.
|
Hermes Agent works with **any OpenAI-compatible API endpoint**. If a server implements `/v1/chat/completions`, you can point Hermes at it. This means you can use local models, GPU inference servers, multi-provider routers, or any third-party API.
|
||||||
|
|
|
||||||
|
|
@ -91,7 +91,7 @@ Common options:
|
||||||
| `-q`, `--query "..."` | One-shot, non-interactive prompt. |
|
| `-q`, `--query "..."` | One-shot, non-interactive prompt. |
|
||||||
| `-m`, `--model <model>` | Override the model for this run. |
|
| `-m`, `--model <model>` | Override the model for this run. |
|
||||||
| `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
|
| `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
|
||||||
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `gemini`, `google-gemini-cli`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `azure-foundry`, `lmstudio`, `stepfun`, `tencent-tokenhub` (alias `tencent`, `tokenhub`). |
|
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `gemini`, `google-gemini-cli`, `huggingface`, `novita`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `azure-foundry`, `lmstudio`, `stepfun`, `tencent-tokenhub` (alias `tencent`, `tokenhub`). |
|
||||||
| `-s`, `--skills <name>` | Preload one or more skills for the session (can be repeated or comma-separated). |
|
| `-s`, `--skills <name>` | Preload one or more skills for the session (can be repeated or comma-separated). |
|
||||||
| `-v`, `--verbose` | Verbose output. |
|
| `-v`, `--verbose` | Verbose output. |
|
||||||
| `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
|
| `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
|
||||||
|
|
|
||||||
|
|
@ -67,6 +67,8 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
|
||||||
| `DASHSCOPE_BASE_URL` | Custom DashScope base URL (default: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`; use `https://dashscope.aliyuncs.com/compatible-mode/v1` for mainland-China region) |
|
| `DASHSCOPE_BASE_URL` | Custom DashScope base URL (default: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`; use `https://dashscope.aliyuncs.com/compatible-mode/v1` for mainland-China region) |
|
||||||
| `DEEPSEEK_API_KEY` | DeepSeek API key for direct DeepSeek access ([platform.deepseek.com](https://platform.deepseek.com/api_keys)) |
|
| `DEEPSEEK_API_KEY` | DeepSeek API key for direct DeepSeek access ([platform.deepseek.com](https://platform.deepseek.com/api_keys)) |
|
||||||
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
|
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
|
||||||
|
| `NOVITA_API_KEY` | NovitaAI API key — multi-model aggregator ([novita.ai/settings/key-management](https://novita.ai/settings/key-management)) |
|
||||||
|
| `NOVITA_BASE_URL` | Override NovitaAI base URL (default: `https://api.novita.ai/openai/v1`) |
|
||||||
| `NVIDIA_API_KEY` | NVIDIA NIM API key — Nemotron and open models ([build.nvidia.com](https://build.nvidia.com)) |
|
| `NVIDIA_API_KEY` | NVIDIA NIM API key — Nemotron and open models ([build.nvidia.com](https://build.nvidia.com)) |
|
||||||
| `NVIDIA_BASE_URL` | Override NVIDIA base URL (default: `https://integrate.api.nvidia.com/v1`; set to `http://localhost:8000/v1` for a local NIM endpoint) |
|
| `NVIDIA_BASE_URL` | Override NVIDIA base URL (default: `https://integrate.api.nvidia.com/v1`; set to `http://localhost:8000/v1` for a local NIM endpoint) |
|
||||||
| `STEPFUN_API_KEY` | StepFun API key — Step-series models ([platform.stepfun.com](https://platform.stepfun.com)) |
|
| `STEPFUN_API_KEY` | StepFun API key — Step-series models ([platform.stepfun.com](https://platform.stepfun.com)) |
|
||||||
|
|
@ -103,7 +105,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
|
||||||
|
|
||||||
| Variable | Description |
|
| Variable | Description |
|
||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth` (browser OAuth login — no API key required; see [MiniMax OAuth guide](../guides/minimax-oauth.md)), `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `tencent-tokenhub` (default: `auto`) |
|
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `novita`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth` (browser OAuth login — no API key required; see [MiniMax OAuth guide](../guides/minimax-oauth.md)), `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `alibaba-coding-plan` (alias `alibaba_coding`), `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway`, `tencent-tokenhub` (default: `auto`) |
|
||||||
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
|
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
|
||||||
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
|
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
|
||||||
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
|
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue