Update environment configuration and enhance tool definitions

- Modified `.env.example` to set default terminal environment to 'local' and updated Docker, Singularity, and Modal image references to use 'python:3.11-slim'.
- Updated `package.json` to include Node.js engine requirements and modified post-install script for better user guidance.
- Enhanced `pyproject.toml` to reflect new dependencies and optional dependencies for modal and development environments.
- Improved `README.md` with additional setup instructions for Singularity and Node.js dependencies, along with clearer toolset documentation.
- Refactored `model_tools.py` to include new tool definitions and ensure consistency across toolsets.
- Updated architecture documentation to clarify tool structure and registration processes.
Commit 7ea17bb957 (parent f8846f85a1) by teknium, 2026-01-29 22:36:07 +00:00
8 changed files with 535 additions and 257 deletions

# Message Format & Trajectories
Hermes Agent uses two message formats: the **API format** for LLM calls and the **trajectory format** for training data export.
## API Message Format
Standard OpenAI chat format used during execution:
```python
messages = [
    # System prompt
    {"role": "system", "content": "You are a helpful assistant with tools..."},
    # User query
    {"role": "user", "content": "Search for Python tutorials"},
    # Assistant with tool call
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "web_search",
                "arguments": "{\"query\": \"Python tutorials\"}"
            }
        }]
    },
    # Tool result
    {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": "{\"results\": [...]}"
    },
    # Final response
    {"role": "assistant", "content": "Here's what I found..."}
]
```
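The tool-call round trip implied by this format can be sketched as follows. The `TOOLS` registry and the `run_tool_calls` helper are hypothetical illustrations for this document, not the agent's actual dispatch code:

```python
import json

# Hypothetical tool registry; the real agent wires tools up elsewhere.
TOOLS = {"web_search": lambda query: {"results": [f"result for {query}"]}}

def run_tool_calls(assistant_msg):
    """Execute each tool_call and return the tool-role messages to append."""
    tool_msgs = []
    for call in assistant_msg.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        tool_msgs.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must echo the call id back
            "content": json.dumps(result),
        })
    return tool_msgs

assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "web_search",
                     "arguments": "{\"query\": \"Python tutorials\"}"},
    }],
}
tool_msgs = run_tool_calls(assistant)
```

Each returned tool message is appended to `messages` before the next LLM call, so the model sees its own call and the result in sequence.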
## Trajectory Format (ShareGPT)
Exported for training in ShareGPT format:
```json
{
  "conversations": [
    {"from": "system", "value": "You are a helpful assistant..."},
    {"from": "human", "value": "Search for Python tutorials"},
    {"from": "gpt", "value": "<tool_call>\n{\"name\": \"web_search\", \"arguments\": {\"query\": \"Python tutorials\"}}\n</tool_call>"},
    {"from": "tool", "value": "<tool_response>\n{\"results\": [...]}\n</tool_response>"},
    {"from": "gpt", "value": "Here's what I found..."}
  ],
  "tools": "[{\"type\": \"function\", \"function\": {...}}]",
  "source": "hermes-agent"
}
```
## Reasoning Content
For models that output reasoning/chain-of-thought:
**During execution** (API format):
```python
# Stored internally but not sent back to the model in content
assistant_msg = {
    "role": "assistant",
    "content": "Here's what I found...",
    "reasoning": "Let me think about this step by step..."  # Internal only
}
```
**In trajectory export** (reasoning wrapped in tags):
```json
{
  "from": "gpt",
  "value": "<think>\nLet me think about this step by step...\n</think>\nHere's what I found..."
}
```
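A minimal sketch of that wrapping step (the function name `wrap_reasoning` is illustrative, not from the codebase):

```python
def wrap_reasoning(msg):
    """Fold an internal `reasoning` field into <think> tags for export."""
    value = msg["content"]
    if msg.get("reasoning"):
        value = f"<think>\n{msg['reasoning']}\n</think>\n{value}"
    return {"from": "gpt", "value": value}

wrap_reasoning({"role": "assistant",
                "content": "Here's what I found...",
                "reasoning": "Let me think about this step by step..."})
```

Messages without a `reasoning` field pass through with their content unchanged.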
## Conversion Flow
```
API Response        →   Internal Storage   →   Trajectory Export
      ↓                        ↓                       ↓
tool_calls               reasoning field       <tool_call> tags
reasoning_content                              <think> tags
```
The conversion happens in `_convert_to_trajectory_format()` in `run_agent.py`.
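As an illustration only (the real logic in `_convert_to_trajectory_format()` is not reproduced here), the API-to-ShareGPT conversion can be sketched as:

```python
import json

# ShareGPT role names for each API role
ROLE_MAP = {"system": "system", "user": "human", "assistant": "gpt", "tool": "tool"}

def to_sharegpt(messages, tools=None):
    """Illustrative sketch: convert API-format messages to a ShareGPT record."""
    conversations = []
    for msg in messages:
        if msg["role"] == "assistant" and msg.get("tool_calls"):
            # Render tool calls as <tool_call> tags
            calls = [{"name": c["function"]["name"],
                      "arguments": json.loads(c["function"]["arguments"])}
                     for c in msg["tool_calls"]]
            value = "\n".join(f"<tool_call>\n{json.dumps(c)}\n</tool_call>"
                              for c in calls)
        elif msg["role"] == "tool":
            # Render tool results as <tool_response> tags
            value = f"<tool_response>\n{msg['content']}\n</tool_response>"
        else:
            value = msg["content"]
        conversations.append({"from": ROLE_MAP[msg["role"]], "value": value})
    return {"conversations": conversations,
            "tools": json.dumps(tools or []),
            "source": "hermes-agent"}
```

The sketch omits details such as reasoning handling and multiple parallel tool calls per message.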
## Ephemeral System Prompts
Batch processing supports ephemeral system prompts that guide behavior during execution but are NOT saved to trajectories:
```python
# During execution: full system prompt + ephemeral guidance
messages = [
    {"role": "system", "content": SYSTEM_PROMPT + "\n\n" + ephemeral_prompt},
    ...
]

# In saved trajectory: only the base system prompt
trajectory = {
    "conversations": [
        {"from": "system", "value": SYSTEM_PROMPT},  # No ephemeral
        ...
    ]
}
```
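One way to sketch this split, with hypothetical helper names (`build_messages` and `trajectory_system_value` are not from the codebase):

```python
SYSTEM_PROMPT = "You are a helpful assistant with tools..."  # placeholder

def build_messages(user_query, ephemeral_prompt=None):
    """Execution-time messages: base system prompt plus optional ephemeral guidance."""
    system = SYSTEM_PROMPT
    if ephemeral_prompt:
        system = SYSTEM_PROMPT + "\n\n" + ephemeral_prompt
    return [{"role": "system", "content": system},
            {"role": "user", "content": user_query}]

def trajectory_system_value(messages, ephemeral_prompt=None):
    """When saving, strip the ephemeral suffix so only the base prompt persists."""
    system = messages[0]["content"]
    if ephemeral_prompt and system.endswith(ephemeral_prompt):
        system = system[: -len("\n\n" + ephemeral_prompt)]
    return system
```

Keeping the ephemeral text out of saved trajectories means the trained model is not conditioned on per-batch steering instructions.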
## Trajectory Compression
Long trajectories can be compressed for training using `trajectory_compressor.py`:
- Protects first/last N turns
- Summarizes middle turns with LLM
- Targets specific token budget
- See `configs/trajectory_compression.yaml` for settings
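A minimal sketch of the protect-first/last-N idea, with a stub standing in for the LLM summarizer (`compress_trajectory` and its signature are illustrative, not `trajectory_compressor.py`'s real interface):

```python
def compress_trajectory(turns, protect=2,
                        summarize=lambda ts: "[summary of %d turns]" % len(ts)):
    """Keep the first/last `protect` turns; replace the middle with one summary turn.

    `summarize` is a stub for the LLM call the real compressor makes;
    the real tool also targets a token budget, which this sketch ignores.
    """
    if len(turns) <= 2 * protect + 1:
        return turns  # nothing worth compressing
    head, middle, tail = turns[:protect], turns[protect:-protect], turns[-protect:]
    summary_turn = {"from": "system", "value": summarize(middle)}
    return head + [summary_turn] + tail
```

The protected head preserves the system prompt and task setup; the protected tail preserves the turns closest to the final answer.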