Hermes Agent

An AI agent with advanced tool-calling capabilities, real-time logging, and extensible toolsets.

Features

  • 🤖 Multi-model Support: Works with Claude, GPT-4, and other OpenAI-compatible models
  • 🔧 Rich Tool Library: Web search, content extraction, vision analysis, terminal execution, and more
  • 📊 Real-time Logging: WebSocket-based logging system for monitoring agent execution
  • 🖥️ Desktop UI: Modern PySide6 frontend with real-time event streaming
  • 🎯 Flexible Toolsets: Predefined toolset combinations for different use cases
  • 💾 Trajectory Saving: Save conversation flows for training and analysis
  • 🔄 Auto-retry: Built-in error handling and retry logic

Quick Start

Installation

pip install -r requirements.txt

Basic Usage

python run_agent.py \
  --enabled_toolsets web \
  --query "Search for the latest AI news"

With Real-time Logging

# Terminal 1: Start API endpoint server
python api_endpoint/logging_server.py

# Terminal 2: Run agent
python run_agent.py \
  --enabled_toolsets web \
  --enable_websocket_logging \
  --query "Your question here"

Desktop UI

The easiest way to use Hermes Agent is through the desktop UI:

# One-command launch (starts server + UI)
cd ui && ./start_hermes_ui.sh

# Or manually:
# Terminal 1: Start server
python api_endpoint/logging_server.py

# Terminal 2: Start UI
python ui/hermes_ui.py

The UI provides:

  • 🖱️ Point-and-click query submission
  • 🎛️ Easy model and tool selection
  • 📊 Real-time event visualization
  • 🔄 Automatic WebSocket connection
  • 📝 Session history

Project Structure

Hermes-Agent/
├── run_agent.py              # Main agent runner
├── model_tools.py            # Tool definitions and handling
├── toolsets.py               # Predefined toolset combinations
├── requirements.txt          # Python dependencies
│
├── ui/                      # Desktop UI ⭐ NEW
│   ├── hermes_ui.py         # PySide6 desktop application
│   ├── start_hermes_ui.sh   # UI launcher script
│   └── test_ui_flow.py      # UI integration tests
│
├── tools/                    # Tool implementations
│   ├── web_tools.py         # Web search, extract, crawl
│   ├── vision_tools.py      # Image analysis
│   ├── terminal_tool.py     # Command execution
│   ├── image_generation_tool.py
│   └── ...
│
├── api_endpoint/            # FastAPI + WebSocket logging endpoint
│   ├── logging_server.py    # WebSocket server + Agent API ⭐ ENHANCED
│   ├── websocket_logger.py  # Client library
│   ├── README.md            # API endpoint docs
│   └── ...
│
├── logs/                    # Log files
│   └── realtime/            # WebSocket session logs
│
└── tests/                   # Test files

Available Toolsets

Basic Toolsets

  • web: Web search, extract, and crawl
  • terminal: Command execution
  • vision: Image analysis
  • creative: Image generation
  • reasoning: Mixture of agents

Composite Toolsets

  • research: Web + vision tools
  • development: Web + terminal + vision
  • analysis: Web + vision + reasoning
  • full_stack: All tools enabled
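
One way to picture how the composite toolsets expand into the basic ones listed above is a small lookup table. The sketch below is illustrative only: the dictionary names and the `resolve_toolsets` helper are hypothetical and are not taken from toolsets.py, and it assumes that "all tools" for full_stack means every basic toolset.

```python
# Illustrative sketch only; these names are hypothetical, not the real toolsets.py API.
BASIC_TOOLSETS = {"web", "terminal", "vision", "creative", "reasoning"}

COMPOSITE_TOOLSETS = {
    "research": ["web", "vision"],
    "development": ["web", "terminal", "vision"],
    "analysis": ["web", "vision", "reasoning"],
    # Assumption: "all tools enabled" means every basic toolset.
    "full_stack": sorted(BASIC_TOOLSETS),
}


def resolve_toolsets(spec: str) -> list[str]:
    """Expand a comma-separated toolset string into sorted basic toolset names."""
    resolved = set()
    for name in (part.strip() for part in spec.split(",")):
        if not name:
            continue  # tolerate stray commas
        if name in BASIC_TOOLSETS:
            resolved.add(name)
        elif name in COMPOSITE_TOOLSETS:
            resolved.update(COMPOSITE_TOOLSETS[name])
        else:
            raise ValueError(f"Unknown toolset: {name!r}")
    return sorted(resolved)


print(resolve_toolsets("research"))      # ['vision', 'web']
print(resolve_toolsets("web,terminal"))  # ['terminal', 'web']
```

Using a set for the intermediate result means overlapping toolsets (for example `research,web`) do not produce duplicates.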

Usage Examples

# Research with web and vision
python run_agent.py --enabled_toolsets research --query "..."

# Development with terminal access
python run_agent.py --enabled_toolsets development --query "..."

# Combine multiple toolsets
python run_agent.py --enabled_toolsets web,vision --query "..."

Real-time Logging System

Monitor your agent's execution in real time through the FastAPI WebSocket endpoint, which is built around a persistent, shared connection.

Architecture

The logging system uses a singleton WebSocket connection that persists across multiple agent runs:

  • No timeouts - connection stays alive indefinitely
  • No reconnection overhead - connect once, reuse forever
  • Parallel execution - multiple agents share one connection
  • Production-ready - graceful shutdown with signal handlers

See api_endpoint/PERSISTENT_CONNECTION_GUIDE.md for technical details.
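
To make the "connect once, reuse" idea concrete, here is a minimal sketch of a lazily created, shared connection. It is not the code in api_endpoint/: `FakeConnection` and `ConnectionManager` are hypothetical stand-ins, and no real WebSocket is opened.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a WebSocket connection; it only records messages."""

    def __init__(self) -> None:
        self.sent = []

    async def send(self, message: str) -> None:
        self.sent.append(message)


class ConnectionManager:
    """Creates the connection on first use and hands the same one to every caller."""

    def __init__(self) -> None:
        self._connection = None
        self._lock = None
        self.connect_count = 0

    async def get(self) -> FakeConnection:
        if self._lock is None:
            # Create the lock inside the running event loop.
            self._lock = asyncio.Lock()
        async with self._lock:
            if self._connection is None:
                await asyncio.sleep(0)  # placeholder for a real handshake
                self._connection = FakeConnection()
                self.connect_count += 1
        return self._connection


async def demo():
    manager = ConnectionManager()

    async def log(agent_id: int) -> FakeConnection:
        connection = await manager.get()
        await connection.send(f"agent {agent_id}: started")
        return connection

    # Three "agents" log concurrently through the manager.
    connections = await asyncio.gather(*(log(i) for i in range(3)))
    unique = len({id(c) for c in connections})
    return manager.connect_count, unique, len(connections[0].sent)


print(asyncio.run(demo()))  # (1, 1, 3): one connect, one shared object, three messages
```

The lock matters because several coroutines may ask for the connection before the first handshake finishes; without it, each could open its own.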

Features

  • Track all API calls and responses
  • Persistent connection - one WebSocket for all sessions
  • Monitor tool executions with parameters and timing
  • Capture errors and completion status
  • REST API for querying sessions
  • Real-time WebSocket broadcasting

Documentation

See api_endpoint/README.md for complete documentation.

Quick Start

# Start API endpoint server
python api_endpoint/logging_server.py

# Run agent with logging
python run_agent.py --enable_websocket_logging --query "..."

# View logs
curl http://localhost:8000/sessions
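
The same query can be made from Python with only the standard library. This is a sketch that mirrors the curl call above; the `fetch_sessions` helper is hypothetical, and because the shape of the JSON response is not documented here, the result is printed as returned rather than interpreted.

```python
import json
import urllib.request


def fetch_sessions(base_url="http://localhost:8000", opener=urllib.request.urlopen):
    """GET <base_url>/sessions and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/sessions"
    with opener(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print(json.dumps(fetch_sessions(), indent=2))
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers a non-JSON body.
        print(f"Could not read sessions from the logging server: {exc}")
```

The `opener` parameter exists only so the function can be exercised without a running server.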

Configuration

Environment Variables

Create a .env file in the project root:

# API Keys
ANTHROPIC_API_KEY=your_key_here
FIRECRAWL_API_KEY=your_key_here
NOUS_API_KEY=your_key_here
FAL_KEY=your_key_here

# Optional
WEB_TOOLS_DEBUG=true  # Enable web tools debug logging
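
Before a run, it can help to check which keys are actually set. The sketch below parses text in the .env format shown above and reports empty or missing entries. The helper names are hypothetical, the project may load its environment differently, and which keys are truly needed likely depends on the toolsets you enable.

```python
# Keys taken from the example .env above; whether each is required is an assumption.
EXPECTED_KEYS = ("ANTHROPIC_API_KEY", "FIRECRAWL_API_KEY", "NOUS_API_KEY", "FAL_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comment lines, and inline comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0].strip()  # drop a trailing inline comment
        values[key.strip()] = value
    return values


def missing_keys(values: dict[str, str], expected=EXPECTED_KEYS) -> list[str]:
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not values.get(key)]


SAMPLE = """\
# API Keys
ANTHROPIC_API_KEY=your_key_here
FAL_KEY=

# Optional
WEB_TOOLS_DEBUG=true  # Enable web tools debug logging
"""

values = parse_env(SAMPLE)
print(values["WEB_TOOLS_DEBUG"])  # true
print(missing_keys(values))       # ['FIRECRAWL_API_KEY', 'NOUS_API_KEY', 'FAL_KEY']
```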

Command-Line Options

python run_agent.py --help

Key options:

  • --query: Your question/task
  • --model: Model to use (default: claude-sonnet-4-5-20250929)
  • --enabled_toolsets: Toolsets to enable
  • --max_turns: Maximum conversation turns
  • --enable_websocket_logging: Enable real-time logging
  • --verbose: Verbose debug output
  • --save_trajectories: Save conversation trajectories
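
If you launch the agent from another script, the options above can be assembled programmatically. This is a sketch: `build_command` is a hypothetical helper, it uses only the flags listed here, and it assumes the boolean options are bare switches, as `--enable_websocket_logging` is used elsewhere in this README.

```python
import shlex
import sys


def build_command(query, toolsets=None, model=None, max_turns=None,
                  websocket_logging=False, verbose=False, save_trajectories=False):
    """Return an argv list for run_agent.py built from the documented options."""
    cmd = [sys.executable, "run_agent.py", "--query", query]
    if toolsets:
        cmd += ["--enabled_toolsets", ",".join(toolsets)]
    if model:
        cmd += ["--model", model]
    if max_turns is not None:
        cmd += ["--max_turns", str(max_turns)]
    # Assumption: these three options are flags that take no value.
    if websocket_logging:
        cmd.append("--enable_websocket_logging")
    if verbose:
        cmd.append("--verbose")
    if save_trajectories:
        cmd.append("--save_trajectories")
    return cmd


cmd = build_command("Search for the latest AI news", toolsets=["web"], max_turns=5)
print(shlex.join(cmd[1:]))
# run_agent.py --query 'Search for the latest AI news' --enabled_toolsets web --max_turns 5
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the project root.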

Parallel Execution

Because the WebSocket connection is persistent and shared, multiple agents can run simultaneously and log through the same connection.

Test Parallel Execution

python test_parallel_execution.py

This script runs three tests:

  1. Sequential - baseline (3 queries one after another)
  2. Parallel - 3 queries simultaneously
  3. High Concurrency - 10 queries simultaneously

Expected results:

  • Roughly 3x speedup for the 3-query parallel test compared with the sequential baseline
  • All queries logged over the same connection
  • No connection timeouts or errors
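
The speedup comes from overlapping the time each query spends waiting. The sketch below reproduces the sequential-versus-parallel comparison with a stand-in coroutine that sleeps instead of calling a model. It is not test_parallel_execution.py, and `fake_query` is hypothetical.

```python
import asyncio
import time


async def fake_query(delay: float = 0.05) -> str:
    """Stand-in for one agent run; it sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return "done"


async def run_sequential(n: int) -> float:
    """Run n queries one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_query()
    return time.perf_counter() - start


async def run_parallel(n: int) -> float:
    """Run n queries concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_query() for _ in range(n)))
    return time.perf_counter() - start


async def main():
    sequential = await run_sequential(3)
    parallel = await run_parallel(3)
    return sequential, parallel


sequential, parallel = asyncio.run(main())
print(f"sequential={sequential:.2f}s parallel={parallel:.2f}s "
      f"speedup={sequential / parallel:.1f}x")
```

With real agents the ratio depends on how much of each run is spent waiting on network calls, so treat the 3x figure as an approximate target rather than a guarantee.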

Custom Parallel Code

import asyncio
from run_agent import AIAgent

async def main():
    # Each agent opts in to WebSocket logging; both log over the shared connection.
    agent1 = AIAgent(enable_websocket_logging=True)
    agent2 = AIAgent(enable_websocket_logging=True)

    # Run both conversations concurrently; results come back in argument order.
    results = await asyncio.gather(
        agent1.run_conversation("Query 1"),
        agent2.run_conversation("Query 2"),
    )
    for result in results:
        print(result)

asyncio.run(main())

Examples

Investment Research

python run_agent.py \
  --enabled_toolsets web \
  --query "Find publicly traded companies in renewable energy"

Code Analysis

python run_agent.py \
  --enabled_toolsets development \
  --query "Analyze the codebase and suggest improvements"

Image Analysis

python run_agent.py \
  --enabled_toolsets vision \
  --query "Analyze this chart and explain the trends"

Development

Adding New Tools

  1. Create tool in tools/ directory
  2. Register in model_tools.py
  3. Add to appropriate toolset in toolsets.py
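
The three steps above can be sketched in one file. Everything here is hypothetical: the `word_count` tool, the registry names, and the `text_utils` toolset are invented for illustration. The schema follows the OpenAI-style function-calling format, which is an assumption about how model_tools.py describes tools.

```python
import json


# Step 1: the tool implementation (this would live in the tools/ directory).
def word_count(text: str) -> str:
    """Hypothetical example tool: count whitespace-separated words."""
    return json.dumps({"words": len(text.split())})


# Step 2: a schema the model can see, plus a dispatch table for handling calls.
WORD_COUNT_SCHEMA = {
    "type": "function",
    "function": {
        "name": "word_count",
        "description": "Count the words in a piece of text.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to count"},
            },
            "required": ["text"],
        },
    },
}
TOOL_HANDLERS = {"word_count": word_count}

# Step 3: expose the tool through a toolset.
TOOLSETS = {"text_utils": ["word_count"]}


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a model-issued tool call to its handler and return a JSON string."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    return handler(**json.loads(arguments_json))


print(handle_tool_call("word_count", '{"text": "hello agent world"}'))  # {"words": 3}
```

Returning an error payload for unknown tools, instead of raising, lets the model see the failure and try something else.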

Running Tests

# Test web tools
python tests/test_web_tools.py

# Test API endpoint / logging
cd api_endpoint
./test_websocket_logging.sh

License

MIT License - see LICENSE file for details

Contributing

Contributions welcome! Please open an issue or PR.

Support

For questions or issues:

  1. Check documentation in api_endpoint/
  2. Review example usage in this README
  3. Open a GitHub issue

Built with ❤️ for advanced AI agent workflows