mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00


Jai Suphavadeeprasit c2d5a28d15 Modularize frontend		2025-10-13 11:53:13 -04:00
api_endpoint	logging work	2025-10-13 10:44:42 -04:00
ui	Modularize frontend	2025-10-13 11:53:13 -04:00
.gitignore	add better logging when requests fail	2025-09-10 00:51:41 -07:00
image_generation_tool.py	update fal requirements	2025-08-21 08:10:54 -07:00
mixture_of_agents_tool.py	changes	2025-10-11 17:52:23 -04:00
mock_web_tools.py	Modularize frontend	2025-10-13 11:53:13 -04:00
model_tools.py	Update to use toolsets and make them easy to create and configure	2025-09-10 00:43:55 -07:00
output.txt	Modularize frontend	2025-10-13 11:53:13 -04:00
README.md	changes	2025-10-10 18:04:22 -04:00
requirements.txt	changes	2025-10-10 18:04:22 -04:00
run_agent.py	Modularize frontend	2025-10-13 11:53:13 -04:00
terminal_tool.py	changes	2025-10-10 18:04:22 -04:00
test_mock_mode.sh	changes	2025-10-10 18:04:22 -04:00
test_parallel_execution.py	changes	2025-10-10 18:04:22 -04:00
test_run.sh	Update to use toolsets and make them easy to create and configure	2025-09-10 00:43:55 -07:00
test_ui_flow.py	changes	2025-10-10 18:04:22 -04:00
test_web_tools.py	Fix Web Tools, Upgrade MoA to GPT5, Add Trajectory Saving	2025-08-31 03:04:10 -07:00
toolsets.py	Update to use toolsets and make them easy to create and configure	2025-09-10 00:43:55 -07:00
vision_tools.py	changes	2025-10-11 17:52:23 -04:00
web_tools.py	changes	2025-10-11 17:52:23 -04:00

README.md

Hermes Agent

AI Agent with advanced tool calling capabilities, real-time logging, and extensible toolsets.

Features

🤖 Multi-model Support: Works with Claude, GPT-4, and other OpenAI-compatible models
🔧 Rich Tool Library: Web search, content extraction, vision analysis, terminal execution, and more
📊 Real-time Logging: WebSocket-based logging system for monitoring agent execution
🖥️ Desktop UI: Modern PySide6 frontend with real-time event streaming
🎯 Flexible Toolsets: Predefined toolset combinations for different use cases
💾 Trajectory Saving: Save conversation flows for training and analysis
🔄 Auto-retry: Built-in error handling and retry logic

Quick Start

Installation

pip install -r requirements.txt

Basic Usage

python run_agent.py \
  --enabled_toolsets web \
  --query "Search for the latest AI news"

With Real-time Logging

# Terminal 1: Start API endpoint server
python api_endpoint/logging_server.py

# Terminal 2: Run agent
python run_agent.py \
  --enabled_toolsets web \
  --enable_websocket_logging \
  --query "Your question here"

With Desktop UI (Recommended)

The easiest way to use Hermes Agent is through the desktop UI:

# One-command launch (starts server + UI)
cd ui && ./start_hermes_ui.sh

# Or manually:
# Terminal 1: Start server
python api_endpoint/logging_server.py

# Terminal 2: Start UI
python ui/hermes_ui.py

The UI provides:

🖱️ Point-and-click query submission
🎛️ Easy model and tool selection
📊 Real-time event visualization
🔄 Automatic WebSocket connection
📝 Session history

Project Structure

Hermes-Agent/
├── run_agent.py              # Main agent runner
├── model_tools.py            # Tool definitions and handling
├── toolsets.py               # Predefined toolset combinations
├── requirements.txt          # Python dependencies
│
├── ui/                      # Desktop UI ⭐ NEW
│   ├── hermes_ui.py         # PySide6 desktop application
│   ├── start_hermes_ui.sh   # UI launcher script
│   └── test_ui_flow.py      # UI integration tests
│
├── tools/                    # Tool implementations
│   ├── web_tools.py         # Web search, extract, crawl
│   ├── vision_tools.py      # Image analysis
│   ├── terminal_tool.py     # Command execution
│   ├── image_generation_tool.py
│   └── ...
│
├── api_endpoint/            # FastAPI + WebSocket logging endpoint
│   ├── logging_server.py    # WebSocket server + Agent API ⭐ ENHANCED
│   ├── websocket_logger.py  # Client library
│   ├── README.md           # API endpoint docs
│   └── ...
│
├── logs/                    # Log files
│   └── realtime/           # WebSocket session logs
│
└── tests/                   # Test files

Available Toolsets

Basic Toolsets

web: Web search, extract, and crawl
terminal: Command execution
vision: Image analysis
creative: Image generation
reasoning: Mixture of agents

Composite Toolsets

research: Web + vision tools
development: Web + terminal + vision
analysis: Web + vision + reasoning
full_stack: All tools enabled
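
As a rough illustration of how composite toolsets can expand into the basic ones, here is a minimal sketch. The dictionaries and the individual tool names (web_search, vision_analyze, and so on) are hypothetical placeholders; the real definitions live in toolsets.py and may look different.

```python
# Hypothetical mapping for illustration only; see toolsets.py for the real one.
BASIC_TOOLSETS = {
    "web": ["web_search", "web_extract", "web_crawl"],
    "terminal": ["terminal"],
    "vision": ["vision_analyze"],
    "creative": ["image_generate"],
    "reasoning": ["mixture_of_agents"],
}

# Composite toolsets are expressed as lists of basic toolset names.
COMPOSITE_TOOLSETS = {
    "research": ["web", "vision"],
    "development": ["web", "terminal", "vision"],
    "analysis": ["web", "vision", "reasoning"],
    "full_stack": list(BASIC_TOOLSETS),
}


def resolve_toolsets(spec: str) -> list[str]:
    """Expand a comma-separated toolset spec into a de-duplicated tool list."""
    tools: list[str] = []
    for name in (part.strip() for part in spec.split(",") if part.strip()):
        if name in COMPOSITE_TOOLSETS:
            members = COMPOSITE_TOOLSETS[name]
        elif name in BASIC_TOOLSETS:
            members = [name]
        else:
            raise ValueError(f"unknown toolset: {name!r}")
        for member in members:
            for tool in BASIC_TOOLSETS[member]:
                if tool not in tools:  # keep first occurrence, preserve order
                    tools.append(tool)
    return tools
```

Under these assumptions, resolve_toolsets("research") and resolve_toolsets("web,vision") yield the same tool list.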

Usage Examples

# Research with web and vision
python run_agent.py --enabled_toolsets research --query "..."

# Development with terminal access
python run_agent.py --enabled_toolsets development --query "..."

# Combine multiple toolsets
python run_agent.py --enabled_toolsets web,vision --query "..."

Real-time Logging System

Monitor your agent's execution in real time through the FastAPI WebSocket endpoint, which is built on a persistent connection pool architecture.

Architecture

The logging system uses a singleton WebSocket connection that persists across multiple agent runs:

✅ No timeouts - the connection stays open between agent runs
✅ No reconnection overhead - connect once, then reuse the connection for every run
✅ Parallel execution - multiple agents share one connection
✅ Production-ready - graceful shutdown with signal handlers

See api_endpoint/PERSISTENT_CONNECTION_GUIDE.md for technical details.
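
To make the idea of one shared connection concrete, here is a minimal sketch of a lazily opened connection that several concurrent tasks reuse. It is not the project's implementation: FakeConnection and SharedConnection are hypothetical names, and the fake class only records messages instead of talking to a real WebSocket.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a WebSocket client; it only records messages."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    async def send(self, message: str) -> None:
        self.sent.append(message)


class SharedConnection:
    """Opens the connection on first use and returns the same object afterwards."""

    def __init__(self) -> None:
        self._connection = None
        self._lock = asyncio.Lock()
        self.opened = 0  # how many times a connection was actually created

    async def get(self) -> FakeConnection:
        # The lock keeps concurrent callers from opening two connections.
        async with self._lock:
            if self._connection is None:
                await asyncio.sleep(0)  # stands in for the network handshake
                self._connection = FakeConnection()
                self.opened += 1
            return self._connection


async def demo():
    shared = SharedConnection()

    async def agent(name: str) -> FakeConnection:
        conn = await shared.get()
        await conn.send(f"{name}: started")
        return conn

    conns = await asyncio.gather(agent("agent-1"), agent("agent-2"), agent("agent-3"))
    same_object = all(c is conns[0] for c in conns)
    return shared.opened, len(conns[0].sent), same_object
```

Running asyncio.run(demo()) in this sketch opens the connection once even though three tasks request it at the same time.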

Features

Track all API calls and responses
Persistent connection - one WebSocket for all sessions
Monitor tool executions with parameters and timing
Capture errors and completion status
REST API for querying sessions
Real-time WebSocket broadcasting

Documentation

See api_endpoint/README.md for complete documentation.

Quick Start

# Start API endpoint server
python api_endpoint/logging_server.py

# Run agent with logging
python run_agent.py --enable_websocket_logging --query "..."

# View logs
curl http://localhost:8000/sessions
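
The same request can be made from Python with the standard library. This is a sketch that assumes the /sessions endpoint returns a JSON body; the helper names sessions_url and fetch_sessions are made up for this example, and the shape of the response is not specified here.

```python
import json
from urllib.request import urlopen


def sessions_url(base_url: str = "http://localhost:8000") -> str:
    """Build the sessions endpoint URL from the server's base URL."""
    return base_url.rstrip("/") + "/sessions"


def fetch_sessions(base_url: str = "http://localhost:8000", timeout: float = 5.0):
    """GET the sessions endpoint and decode the body as JSON (assumed format)."""
    with urlopen(sessions_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call fetch_sessions() only while logging_server.py is running locally; otherwise urlopen raises a connection error.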

Configuration

Environment Variables

Create a .env file in the project root:

# API Keys
ANTHROPIC_API_KEY=your_key_here
FIRECRAWL_API_KEY=your_key_here
NOUS_API_KEY=your_key_here
FAL_KEY=your_key_here

# Optional
WEB_TOOLS_DEBUG=true  # Enable web tools debug logging
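
To check a .env file before starting the agent, a small standard-library parser is enough. The sketch below is not how the project loads its configuration; parse_env and missing_keys are hypothetical helpers, and treating all four API keys as required is an assumption (which keys you actually need likely depends on the toolsets you enable).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "true  # Enable ..."
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


# Assumption: every key in the "API Keys" group is treated as required here.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "FIRECRAWL_API_KEY", "NOUS_API_KEY", "FAL_KEY")


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, missing_keys(parse_env(open(".env").read())) lists any keys that still need a value.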

Command-Line Options

python run_agent.py --help

Key options:

--query: Your question/task
--model: Model to use (default: claude-sonnet-4-5-20250929)
--enabled_toolsets: Toolsets to enable
--max_turns: Maximum conversation turns
--enable_websocket_logging: Enable real-time logging
--verbose: Verbose debug output
--save_trajectories: Save conversation trajectories
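
For reference, the options above map naturally onto an argument parser. The sketch below uses argparse purely for illustration; run_agent.py may parse its arguments differently, and any default not listed above (for example --max_turns) is left unset rather than guessed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented options (not the real one)."""
    parser = argparse.ArgumentParser(prog="run_agent.py")
    parser.add_argument("--query", required=True, help="Your question/task")
    parser.add_argument("--model", default="claude-sonnet-4-5-20250929",
                        help="Model to use")
    parser.add_argument("--enabled_toolsets", default="",
                        help="Comma-separated toolsets, e.g. web,vision")
    # The real default is not documented here, so None is used as a placeholder.
    parser.add_argument("--max_turns", type=int, default=None,
                        help="Maximum conversation turns")
    parser.add_argument("--enable_websocket_logging", action="store_true",
                        help="Enable real-time logging")
    parser.add_argument("--verbose", action="store_true",
                        help="Verbose debug output")
    parser.add_argument("--save_trajectories", action="store_true",
                        help="Save conversation trajectories")
    return parser
```

Parsing ["--query", "hi", "--enabled_toolsets", "web,vision"] with this parser leaves the model at its documented default and the boolean flags off.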

Parallel Execution

The persistent connection pool enables true parallel agent execution. Multiple agents can run simultaneously, all sharing the same WebSocket connection for logging.

Test Parallel Execution

python test_parallel_execution.py

This script runs three tests:

Sequential - baseline (3 queries one after another)
Parallel - 3 queries simultaneously
High Concurrency - 10 queries simultaneously

Expected Results:

⚡ ~3x speedup for the 3-query parallel test compared with the sequential baseline
✅ All queries logged to same connection
✅ No connection timeouts or errors
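
The speedup comes from overlapping waits: when each query spends most of its time waiting on I/O, three concurrent queries take roughly as long as one. The sketch below demonstrates this with asyncio.sleep standing in for a real query; fake_query and the timing helpers are hypothetical and do not call the agent.

```python
import asyncio
import time


async def fake_query(delay: float) -> str:
    """Stand-in for an I/O-bound agent query; it just waits."""
    await asyncio.sleep(delay)
    return "done"


async def run_sequential(n: int, delay: float) -> float:
    """Run n queries one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_query(delay)
    return time.perf_counter() - start


async def run_parallel(n: int, delay: float) -> float:
    """Run n queries concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_query(delay) for _ in range(n)))
    return time.perf_counter() - start


async def compare(n: int = 3, delay: float = 0.1) -> tuple[float, float]:
    sequential = await run_sequential(n, delay)
    parallel = await run_parallel(n, delay)
    return sequential, parallel
```

With equal-length queries the sequential time is about n times the parallel time. Real queries vary in duration and may be rate-limited, so measured speedups can be lower.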

Custom Parallel Code

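import asyncio
from run_agent import AIAgent

async def main():
    # Both agents enable logging and share the persistent WebSocket connection
    agent1 = AIAgent(enable_websocket_logging=True)
    agent2 = AIAgent(enable_websocket_logging=True)

    # Run both conversations concurrently. asyncio.gather needs awaitables,
    # so this assumes run_conversation is a coroutine method.
    results = await asyncio.gather(
        agent1.run_conversation("Query 1"),
        agent2.run_conversation("Query 2"),
    )
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))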

Examples

Investment Research

python run_agent.py \
  --enabled_toolsets web \
  --query "Find publicly traded companies in renewable energy"

Code Analysis

python run_agent.py \
  --enabled_toolsets development \
  --query "Analyze the codebase and suggest improvements"

Image Analysis

python run_agent.py \
  --enabled_toolsets vision \
  --query "Analyze this chart and explain the trends"

Development

Adding New Tools

Create tool in tools/ directory
Register in model_tools.py
Add to appropriate toolset in toolsets.py
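
The three steps above can be pictured with a small registry. This is a generic sketch, not the registration mechanism used in model_tools.py or toolsets.py; TOOL_REGISTRY, TOOLSETS, register_tool, and the word_count tool are all hypothetical names.

```python
from typing import Callable

# Hypothetical registries; the real project keeps these in model_tools.py
# and toolsets.py, possibly in a different form.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {}
TOOLSETS: dict[str, list[str]] = {}


def register_tool(name: str, toolset: str):
    """Decorator that records a tool function and adds it to a toolset."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name!r}")
        TOOL_REGISTRY[name] = func                        # step 2: register
        TOOLSETS.setdefault(toolset, []).append(name)     # step 3: add to toolset
        return func
    return decorator


# Step 1: implement the tool (a trivial example).
@register_tool("word_count", toolset="text")
def word_count(text: str) -> str:
    """Return the number of whitespace-separated words as a string."""
    return str(len(text.split()))


def call_tool(name: str, **kwargs) -> str:
    """Look up a registered tool by name and call it with keyword arguments."""
    return TOOL_REGISTRY[name](**kwargs)
```

Keeping the implementation, its registration, and its toolset membership as separate steps lets a tool be exposed in some toolsets and not others.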

Running Tests

# Test web tools
python tests/test_web_tools.py

# Test API endpoint / logging
cd api_endpoint
./test_websocket_logging.sh

License

MIT License - see LICENSE file for details

Contributing

Contributions welcome! Please open an issue or PR.

Support

For questions or issues:

Check documentation in api_endpoint/
Review example usage in this README
Open a GitHub issue

Built with ❤️ for advanced AI agent workflows