hermes-agent/website/docs/user-guide/skills/optional/devops/devops-docker-management.md
Teknium 252d68fd45
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)
* docs: deep audit — fix stale config keys, missing commands, and registry drift

Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:

  hermes_cli/commands.py    COMMAND_REGISTRY (slash commands)
  hermes_cli/auth.py        PROVIDER_REGISTRY (providers)
  hermes_cli/config.py      DEFAULT_CONFIG (config keys)
  toolsets.py               TOOLSETS (toolsets)
  tools/registry.py         get_all_tool_names() (tools)
  python -m hermes_cli.main <subcmd> --help (CLI args)

reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
  add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
  lists to match --help output (status/logout/spotify, login, archive/prune/
  list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
  correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
  'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
  add missing 'kanban' and 'video' toolset sections, fix MCP example to use
  the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
  row, add missing 'kanban' and 'video' toolset rows, drop the stale
  '38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
  fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
  one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
  via OpenRouter), refresh the OpenAI model list.

getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
  fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
  back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
  'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
  point at the snapshot/quick-snapshot flow; correct config key
  'updates.pre_update_backup' (was 'update.backup').

user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
  is the real key (not display.runtime_metadata_footer); checkpoints defaults
  enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
  exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
  kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
  not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
  OpenAI-compatible API server runs inside hermes gateway.

user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
  ./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
  model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
  not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
  modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
  that 'hermes honcho' subcommand is gated on memory.provider=honcho;
  reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.

Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).

* docs: round 2 audit fixes + regenerate skill catalogs

Follow-up to the previous commit on this branch:

Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
  voice-mode and ACP install commands rewritten — bare 'pip install ...'
  doesn't work for curl-installed setups (no pip on PATH, not in repo
  dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
  ".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
  'google/gemini-3-flash-preview' (the doc's own claimed default);
  actual default is empty (= use main model). Reworded as 'leave empty
  (default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
  that was missing from the table.

Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
  pages and both reference catalogs (skills-catalog.md,
  optional-skills-catalog.md). This adds the entries that were genuinely
  missing — productivity/teams-meeting-pipeline (bundled),
  optional/finance/* (entire category — 7 skills:
  3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
  merger-model, pptx-author), creative/hyperframes,
  creative/kanban-video-orchestrator, devops/watchers,
  productivity/shop-app, research/searxng-search,
  apple/macos-computer-use — and rewrites every other per-skill page from
  the current SKILL.md. Most diffs are tiny (one line of refreshed
  metadata).

Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
  shells that lag every newly-added skill page (pre-existing pattern).
  No regressions on any en/ page.
2026-05-09 13:19:51 -07:00

11 KiB

title sidebar_label description
Docker Management Docker Management Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization

{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}

Docker Management

Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization.

Skill metadata

Source Optional — install with hermes skills install official/devops/docker-management
Path optional-skills/devops/docker-management
Version 1.0.0
Author sprmn24
License MIT
Platforms linux, macos, windows
Tags docker, containers, devops, infrastructure, compose, images, volumes, networks, debugging

Reference: full SKILL.md

:::info The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. :::

Docker Management

Manage Docker containers, images, volumes, networks, and Compose stacks using standard Docker CLI commands. No additional dependencies beyond Docker itself.

When to Use

  • Run, stop, restart, remove, or inspect containers
  • Build, pull, push, tag, or clean up Docker images
  • Work with Docker Compose (multi-service stacks)
  • Manage volumes or networks
  • Debug a crashing container or analyze logs
  • Check Docker disk usage or free up space
  • Review or optimize a Dockerfile

Prerequisites

  • Docker Engine installed and running
  • User added to the docker group (or use sudo)
  • Docker Compose v2 (included with modern Docker installations)

Quick check:

docker --version && docker compose version

Quick Reference

Task Command
Run container (background) docker run -d --name NAME IMAGE
Stop + remove docker stop NAME && docker rm NAME
View logs (follow) docker logs --tail 50 -f NAME
Shell into container docker exec -it NAME /bin/sh
List all containers docker ps -a
Build image docker build -t TAG .
Compose up docker compose up -d
Compose down docker compose down
Disk usage docker system df
Cleanup dangling docker image prune && docker container prune

Procedure

1. Identify the domain

Figure out which area the request falls into:

  • Container lifecycle → run, stop, start, restart, rm, pause/unpause
  • Container interaction → exec, cp, logs, inspect, stats
  • Image management → build, pull, push, tag, rmi, save/load
  • Docker Compose → up, down, ps, logs, exec, build, config
  • Volumes & networks → create, inspect, rm, prune, connect
  • Troubleshooting → log analysis, exit codes, resource issues

2. Container operations

Run a new container:

# Detached service with port mapping
docker run -d --name web -p 8080:80 nginx

# With environment variables
docker run -d -e POSTGRES_PASSWORD=secret -e POSTGRES_DB=mydb --name db postgres:16

# With persistent data (named volume)
docker run -d -v pgdata:/var/lib/postgresql/data --name db postgres:16

# For development (bind mount source code)
docker run -d -v $(pwd)/src:/app/src -p 3000:3000 --name dev my-app

# Interactive debugging (auto-remove on exit)
docker run -it --rm ubuntu:22.04 /bin/bash

# With resource limits and restart policy
docker run -d --memory=512m --cpus=1.5 --restart=unless-stopped --name app my-app

Key flags: -d detached, -it interactive+tty, --rm auto-remove, -p port (host:container), -e env var, -v volume, --name name, --restart restart policy.

Manage running containers:

docker ps                        # running containers
docker ps -a                     # all (including stopped)
docker stop NAME                 # graceful stop
docker start NAME                # start stopped container
docker restart NAME              # stop + start
docker rm NAME                   # remove stopped container
docker rm -f NAME                # force remove running container
docker container prune           # remove ALL stopped containers

Interact with containers:

docker exec -it NAME /bin/sh          # shell access (use /bin/bash if available)
docker exec NAME env                   # view environment variables
docker exec -u root NAME apt update    # run as specific user
docker logs --tail 100 -f NAME         # follow last 100 lines
docker logs --since 2h NAME            # logs from last 2 hours
docker cp NAME:/path/file ./local      # copy file from container
docker cp ./file NAME:/path/           # copy file to container
docker inspect NAME                    # full container details (JSON)
docker stats --no-stream               # resource usage snapshot
docker top NAME                        # running processes

3. Image management

# Build
docker build -t my-app:latest .
docker build -t my-app:prod -f Dockerfile.prod .
docker build --no-cache -t my-app .              # clean rebuild
DOCKER_BUILDKIT=1 docker build -t my-app .       # faster with BuildKit

# Pull and push
docker pull node:20-alpine
docker login ghcr.io
docker tag my-app:latest registry/my-app:v1.0
docker push registry/my-app:v1.0

# Inspect
docker images                          # list local images
docker history IMAGE                   # see layers
docker inspect IMAGE                   # full details

# Cleanup
docker image prune                     # remove dangling (untagged) images
docker image prune -a                  # remove ALL unused images (careful!)
docker image prune -a --filter "until=168h"   # unused images older than 7 days

4. Docker Compose

# Start/stop
docker compose up -d                   # start all services detached
docker compose up -d --build           # rebuild images before starting
docker compose down                    # stop and remove containers
docker compose down -v                 # also remove volumes (DESTROYS DATA)

# Monitoring
docker compose ps                      # list services
docker compose logs -f api             # follow logs for specific service
docker compose logs --tail 50          # last 50 lines all services

# Interaction
docker compose exec api /bin/sh        # shell into running service
docker compose run --rm api npm test   # one-off command (new container)
docker compose restart api             # restart specific service

# Validation
docker compose config                  # validate and view resolved config

Minimal compose.yml example:

services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  pgdata:

5. Volumes and networks

# Volumes
docker volume ls                       # list volumes
docker volume create mydata            # create named volume
docker volume inspect mydata           # details (mount point, etc.)
docker volume rm mydata                # remove (fails if in use)
docker volume prune                    # remove unused volumes

# Networks
docker network ls                      # list networks
docker network create mynet            # create bridge network
docker network inspect mynet           # details (connected containers)
docker network connect mynet NAME      # attach container to network
docker network disconnect mynet NAME   # detach container
docker network rm mynet                # remove network
docker network prune                   # remove unused networks

6. Disk usage and cleanup

Always start with a diagnostic before cleaning:

# Check what's using space
docker system df                       # summary
docker system df -v                    # detailed breakdown

# Targeted cleanup (safe)
docker container prune                 # stopped containers
docker image prune                     # dangling images
docker volume prune                    # unused volumes
docker network prune                   # unused networks

# Aggressive cleanup (confirm with user first!)
docker system prune                    # containers + images + networks
docker system prune -a                 # also unused images
docker system prune -a --volumes       # EVERYTHING — named volumes too

Warning: Never run docker system prune -a --volumes without confirming with the user. This removes named volumes with potentially important data.

Pitfalls

Problem Cause Fix
Container exits immediately Main process finished or crashed Check docker logs NAME, try docker run -it --entrypoint /bin/sh IMAGE
"port is already allocated" Another process using that port docker ps or lsof -i :PORT to find it
"no space left on device" Docker disk full docker system df then targeted prune
Can't connect to container App binds to 127.0.0.1 inside container App must bind to 0.0.0.0, check -p mapping
Permission denied on volume UID/GID mismatch host vs container Use --user $(id -u):$(id -g) or fix permissions
Compose services can't reach each other Wrong network or service name Services use service name as hostname, check docker compose config
Build cache not working Layer order wrong in Dockerfile Put rarely-changing layers first (deps before source code)
Image too large No multi-stage build, no .dockerignore Use multi-stage builds, add .dockerignore

Verification

After any Docker operation, verify the result:

  • Container started?docker ps (check status is "Up")
  • Logs clean?docker logs --tail 20 NAME (no errors)
  • Port accessible?curl -s http://localhost:PORT or docker port NAME
  • Image built?docker images | grep TAG
  • Compose stack healthy?docker compose ps (all services "running" or "healthy")
  • Disk freed?docker system df (compare before/after)

Dockerfile Optimization Tips

When reviewing or creating a Dockerfile, suggest these improvements:

  1. Multi-stage builds — separate build environment from runtime to reduce final image size
  2. Layer ordering — put dependencies before source code so changes don't invalidate cached layers
  3. Combine RUN commands — fewer layers, smaller image
  4. Use .dockerignore — exclude node_modules, .git, __pycache__, etc.
  5. Pin base image versionsnode:20-alpine not node:latest
  6. Run as non-root — add USER instruction for security
  7. Use slim/alpine basespython:3.12-slim not python:3.12