hermes-agent/spec.md
alt-glitch 2f2eeffb96 Add container detection utility to hermes_constants
Extract `is_container()` detection logic from scattered locations
(`config.py`, `voice_mode.py`) into a centralized, cached function in
`hermes_constants.py`. This follows the same pattern as `is_wsl()` and
`is_termux()` — checking `/.dockerenv`, `/run/.containerenv`, and cgroup
markers.

Update gateway status detection (`status.py`, `dump.py`) to use the new
utility and handle Docker/Podman differently from systemd-based systems.
Update setup guidance (`setup.py`) to show Docker restart instructions
when running in a container.

Add Dockerfile.test for CI integration testing and spec.md as a Python
module taste guide for contributors.
2026-04-12 13:08:15 -07:00

13 KiB

Python Module Taste Guide

Opinionated notes on structuring Python projects where many people (and agents) contribute. Not a style guide — a taste document.


1. The file is the unit of understanding

Every .py file should be explainable in one sentence. If you can't say "this file handles X" without using the word "and", split it.

# Good: one sentence each
backends/docker.py      → "Docker execution backend"
backends/ssh.py         → "SSH execution backend"
agent/planner.py        → "Step planning and decomposition"
agent/executor.py       → "Tool dispatch and result collection"

# Bad: needs "and"
agent/core.py           → "Planning and execution and tool dispatch and error handling"
utils/helpers.py        → "Retry logic and string formatting and path resolution"

A 400-line file with one clear purpose is better than four 100-line files with fuzzy purposes.


2. Directory = namespace = concept boundary

A directory exists to group files that share a concept AND need to import each other. If the files don't need each other, they don't need a directory — they can be siblings.

# This directory earns its existence:
backends/
├── __init__.py     # re-exports Backend, DockerBackend, etc.
├── base.py         # Protocol/ABC
├── docker.py       # imports base
├── ssh.py          # imports base
└── local.py        # imports base

# This directory shouldn't exist:
utils/
├── __init__.py
├── retry.py        # used by backends
├── formatting.py   # used by cli
└── paths.py        # used by config
# These have nothing to do with each other. Just put them where they're used.

Test: if you delete the __init__.py and the directory, would each file work as a top-level module? If yes, the directory is probably just cosmetic grouping, not a real namespace.


3. Flat until it hurts (the two-level rule)

Start with at most two levels of nesting under src/. Add a third level only when a directory has 7+ files AND they cluster into obvious sub-groups.

# Good: two levels
src/hermes_agent/
├── backends/
├── agent/
├── tools/
├── config/
└── cli/

# Premature: three levels when you only have 2 files
src/hermes_agent/
└── backends/
    └── docker/
        ├── __init__.py
        ├── container.py    # only 80 lines
        └── image.py        # only 60 lines
        # Just keep this as backends/docker.py until it's 300+ lines

Depth costs cognitive overhead. Every nested directory is a question: "do I look in docker/ or docker/container/?" Flat trees answer questions faster.


4. __init__.py is your public API

Treat __init__.py as the only file external consumers should import from. Everything else is internal.

# backends/__init__.py
from .base import Backend, ExecResult
from .docker import DockerBackend
from .local import LocalBackend
from .ssh import SSHBackend

__all__ = ["Backend", "ExecResult", "DockerBackend", "LocalBackend", "SSHBackend"]

This gives you freedom to refactor internals. You can split docker.py into docker_container.py + docker_network.py without changing any external imports — because everyone imports from backends, not from backends.docker.

Rule: if you see from hermes_agent.backends.docker import DockerBackend in the agent code, that's a smell. It should be from hermes_agent.backends import DockerBackend.


5. Dependency arrows flow one way

Draw the import graph. It should be a DAG with clear layers:

cli
 ↓
agent
 ↓  ↘
tools  backends
 ↓       ↓
config  config

Hard rules:

  • config imports nothing from the project (it's the leaf)
  • backends never imports from agent
  • tools never imports from agent
  • agent imports from tools and backends
  • cli imports from agent (and maybe config)

If you're tempted to create a circular import, you're missing an interface. Extract the shared type into config or a types.py at the appropriate level.

# Bad: circular
# agent/executor.py imports backends.docker
# backends/docker.py imports agent.state  ← circular!

# Fix: extract the shared type
# types.py (or config/types.py)
@dataclass
class AgentState:
    ...

# Now both agent and backends can import from types

6. One file owns each type

Every important class/dataclass/Protocol should live in exactly one file, and that file should be obvious from the type name.

BackendConfig      → config/backends.py   (or config.py if config is flat)
DockerBackend      → backends/docker.py
AgentLoop          → agent/loop.py
ToolRegistry       → tools/registry.py

Anti-pattern: putting BackendConfig in backends/base.py because "it's related to backends." No — config lives in config. Backends use config, they don't define config. This keeps the dependency arrows clean.


7. Protocols over ABCs for external contracts

Use typing.Protocol when you want to define "what shape does this thing have" without forcing inheritance. Use ABCs when you want to share implementation.

# Protocol: structural subtyping, no inheritance needed
# Good for: interfaces consumed by other packages
@runtime_checkable
class Backend(Protocol):
    async def execute(self, cmd: str) -> ExecResult: ...
    async def upload(self, local: Path, remote: str) -> None: ...

# ABC: nominal subtyping, shared implementation
# Good for: when backends share 50+ lines of common logic
class BaseBackend(ABC):
    def __init__(self, config: BackendConfig):
        self.config = config
        self._setup_logging()  # shared

    @abstractmethod
    async def execute(self, cmd: str) -> ExecResult: ...

    def _setup_logging(self):  # shared implementation
        ...

Default to Protocol. Reach for ABC only when you have real shared code, not just shared signatures.


8. Config is typed, loaded once, passed explicitly

# config.py
from pydantic import BaseModel

class BackendConfig(BaseModel):
    type: str = "local"
    docker_image: str | None = None
    ssh_host: str | None = None
    timeout: float = 30.0

class AgentConfig(BaseModel):
    model: str = "gpt-4o"
    max_steps: int = 50
    backend: BackendConfig = BackendConfig()

# Loading happens once, at the edge
def load_config(path: Path) -> AgentConfig:
    raw = yaml.safe_load(path.read_text())
    return AgentConfig(**raw)

Rules:

  • Config classes import nothing from the project
  • Config is loaded in cli/ or main(), never inside library code
  • No module-level globals like CONFIG = load_config(). Pass it through constructors.
  • No os.environ.get() scattered through library code. Read env vars in config loading only.

9. The "where does new code go?" test

Before committing to a structure, simulate these scenarios:

Scenario Should be obvious where to add it
New execution backend (e.g., Kubernetes) backends/kubernetes.py + register in __init__.py
New CLI subcommand cli/new_command.py or a function in existing cli file
New tool for the agent tools/new_tool.py + register in tool registry
New config option Add field to existing config model in config.py
Bug fix in SSH execution backends/ssh.py, nowhere else
New eval benchmark eval/new_benchmark.py

If any of these require touching 5+ files or the answer is "I'm not sure," the structure needs work.


10. Files that earn their existence

Every file in the project should pass one of these tests:

  1. It's the single home for a concept (e.g., docker.py owns DockerBackend)
  2. It's a boundary (e.g., __init__.py defines the public API)
  3. It's an entrypoint (e.g., __main__.py, CLI commands)
  4. It's config/constants (e.g., config.py, defaults.py)

Files that don't pass: helpers.py, misc.py, common.py, base.py (when it has no ABC/Protocol), types.py (when it has 2 types that belong in their respective modules).


11. Naming that communicates

Files: noun or noun_phrase, lowercase_snake. The name should tell you what's in the file, not what it does.

# Good: tells you what's inside
registry.py         # contains ToolRegistry
docker.py           # contains DockerBackend
planner.py          # contains Planner, PlanStep

# Bad: tells you what it does (vague)
run.py              # run what?
process.py          # process what?
handle.py           # handle what?

Directories: plural nouns for collections, singular for a single concern.

backends/           # plural: collection of backend implementations
config/             # singular: one concern
tools/              # plural: collection of tools
agent/              # singular: one agent system

12. Tests: organize by confidence, not by source

tests/
├── unit/              # fast, isolated, mock everything external
│   ├── test_planner.py
│   └── test_config.py
├── integration/       # real backends, real I/O, but controlled
│   ├── test_docker_backend.py
│   └── test_ssh_backend.py
├── e2e/               # full agent runs, slow, CI-only
│   ├── test_ctf_solve.py
│   └── test_migration.py
└── fixtures/          # shared test data
    ├── sample_config.yaml
    └── mock_responses/

Don't mirror src/ 1:1. Test files group by what you're verifying, not which source file they exercise. test_agent_can_recover_from_backend_failure.py might touch agent/, backends/, and config/ — and that's fine.


13. The import order tells a story

Within a file, imports should read top-down as: stdlib → third-party → project internals, with project internals going from "far away" to "nearby."

# stdlib
import asyncio
from pathlib import Path

# third-party
from pydantic import BaseModel

# project: far away (config is a leaf, used everywhere)
from hermes_agent.config import BackendConfig

# project: nearby (same package)
from .base import Backend, ExecResult

This isn't just aesthetics — it makes dependency direction visible at a glance.


14. When to split a file

Split when ANY of these are true:

  • File exceeds ~400 lines AND has 2+ distinct responsibilities
  • Two people frequently have merge conflicts in the same file
  • You find yourself adding # --- Section: X --- comments to navigate
  • The file has internal classes/functions that another module wants to import

Do NOT split just because:

  • The file is "long" (a 600-line file with one clear purpose is fine)
  • You "might need to" someday
  • A linter told you to

15. Module-level code is a liability

Every line that runs at import time is a line that can break import hermes_agent.

# Bad: runs at import time
import docker
client = docker.from_env()  # crashes if Docker isn't running

# Good: lazy, runs when needed
def get_docker_client():
    import docker
    return docker.from_env()

# Also good: runs in __init__, not at module level
class DockerBackend:
    def __init__(self, config: BackendConfig):
        import docker
        self._client = docker.from_env()

Rule: module-level code should only be: imports, type definitions, constants, and function/class definitions. Never side effects.


Reading list

These are worth reading not for rules but for calibrating your taste:

  • Hynek Schlawack — Testing & Packaging: Best single article on src layout and why it matters.
  • Python Packaging Guide — src layout vs flat layout: The official take.
  • Brandon Rhodes — The Clean Architecture in Python (PyCon talk): Good for understanding dependency direction without going full enterprise.
  • Cosmicpython — Architecture Patterns with Python: Free online book. Chapters 1-4 on repository pattern and dependency inversion are relevant; skip the CQRS/event-sourcing stuff unless you need it.
  • Hatch documentation: Modern Python project management. Reading how Hatch structures things will passively teach you good layout conventions.
  • Any well-structured open source project: httpx, pydantic, ruff (Rust but the Python wrapper layout is instructive), textual. Read their src/ trees and __init__.py files.

The goal is not elegance. The goal is that a new contributor — human or agent — can go from "I need to change X" to "I know which file to open" in under 10 seconds.