mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-30 06:41:51 +00:00


kshitijk4poor 66827f8947 chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00

Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only.

- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change

Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited module still resolves
_fake_worker.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
conftest.py	feat(kanban): durable multi-profile collaboration board (#17805)	2026-04-30 13:36:47 -07:00
README.md	feat(kanban): durable multi-profile collaboration board (#17805)	2026-04-30 13:36:47 -07:00
test_atypical_scenarios.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_benchmarks.py	feat(kanban): durable multi-profile collaboration board (#17805)	2026-04-30 13:36:47 -07:00
test_concurrency.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_concurrency_mixed.py	feat(kanban): durable multi-profile collaboration board (#17805)	2026-04-30 13:36:47 -07:00
test_concurrency_parent_gate.py	fix(kanban): gate claim + unblock on parent completion	2026-05-09 11:07:37 -07:00
test_concurrency_reclaim_race.py	feat(kanban): durable multi-profile collaboration board (#17805)	2026-04-30 13:36:47 -07:00
test_property_fuzzing.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_subprocess_e2e.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00

README.md

Stress / battle-test suite

Long-running tests that exercise the Kanban kernel under adversarial conditions. They are excluded from scripts/run_tests.sh because each test can take 30+ seconds and some spawn real subprocesses.

Run manually:

./venv/bin/python -m pytest tests/stress/ -v -s
# or individual files:
./venv/bin/python tests/stress/test_concurrency.py
./venv/bin/python tests/stress/test_subprocess_e2e.py
./venv/bin/python tests/stress/test_property_fuzzing.py
./venv/bin/python tests/stress/test_benchmarks.py

What's covered

test_concurrency.py — 5 workers, 100 tasks, race-for-claim. Asserts no double-claims, no orphan runs, no SQLite errors escape retry.
test_concurrency_mixed.py — 10 workers + 1 reclaimer, 500 tasks, random ops (claim/complete/block/unblock/archive). Same invariants under adversarial scheduling.
test_concurrency_reclaim_race.py — TTL < work duration so the reclaimer intentionally yanks tasks mid-work; verifies the worker's late-complete is refused cleanly (CAS guard works).
test_subprocess_e2e.py — dispatcher spawns real Python subprocess workers that heartbeat + complete via the CLI; crash detection against a real dead PID.
test_property_fuzzing.py — 500 random operation sequences, ~40k operations total, 9 invariant checks after each step.
test_atypical_scenarios.py — 28 scenarios covering atypical user inputs: unicode/emoji/RTL, 1 MB strings, SQL injection attempts, cycles, self-parents, wide fan-in/out, clock skew, HERMES_HOME with spaces/unicode/symlinks, 1000 runs on one task, idempotency-key race across processes, terminal-state resurrection attempts, dashboard REST with weird JSON.
test_benchmarks.py — latency at 100/1k/10k tasks for dispatch, recompute_ready, list_tasks, build_worker_context, etc. Results saved to JSON for regression diffing.
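
The "no double-claims" and "late-complete is refused" checks above both rest on a compare-and-swap (CAS) style guard. The following is a minimal sketch of that idea, not the kernel's actual code: the `tasks` table, its columns, and the `claim`, `reclaim`, and `complete` helpers are all hypothetical names chosen for illustration. Each state change is a single `UPDATE` whose `WHERE` clause encodes the expected current state, and the affected row count says whether the caller won.

```python
import sqlite3

# Hypothetical schema: the real Kanban tables are not shown in this README.
SCHEMA = """
CREATE TABLE tasks (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'ready',
    run_id TEXT
)
"""


def open_board() -> sqlite3.Connection:
    """Create an in-memory board for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_task(conn: sqlite3.Connection, task_id: int) -> None:
    conn.execute("INSERT INTO tasks (id) VALUES (?)", (task_id,))
    conn.commit()


def _guarded_update(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Run one conditional UPDATE; True only if exactly one row changed."""
    cur = conn.execute(sql, params)
    conn.commit()
    return cur.rowcount == 1


def claim(conn: sqlite3.Connection, task_id: int, run_id: str) -> bool:
    """Claim a task only if it is still 'ready' (guards against double-claims)."""
    return _guarded_update(
        conn,
        "UPDATE tasks SET status = 'running', run_id = ? "
        "WHERE id = ? AND status = 'ready'",
        (run_id, task_id),
    )


def reclaim(conn: sqlite3.Connection, task_id: int) -> bool:
    """Return a running task to 'ready', as a reclaimer would after a TTL expiry."""
    return _guarded_update(
        conn,
        "UPDATE tasks SET status = 'ready', run_id = NULL "
        "WHERE id = ? AND status = 'running'",
        (task_id,),
    )


def complete(conn: sqlite3.Connection, task_id: int, run_id: str) -> bool:
    """Complete a task only if this run still owns it (refuses late completes)."""
    return _guarded_update(
        conn,
        "UPDATE tasks SET status = 'done' "
        "WHERE id = ? AND status = 'running' AND run_id = ?",
        (task_id, run_id),
    )


if __name__ == "__main__":
    board = open_board()
    add_task(board, 1)
    print("worker A claims:", claim(board, 1, "run-a"))
    print("reclaimer yanks:", reclaim(board, 1))
    print("worker B claims:", claim(board, 1, "run-b"))
    print("worker A late complete:", complete(board, 1, "run-a"))
```

Because the guard lives in the `WHERE` clause, a worker whose task was reclaimed simply sees zero rows updated and can treat that as a clean refusal rather than an error.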
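
The property-fuzzing entry above describes random operation sequences with invariant checks after each step. A toy version of that loop might look like the sketch below. It is an assumption-laden illustration: `ToyBoard`, its five operations, and the two invariants are simplified stand-ins, not the nine invariants or the kernel API used by test_property_fuzzing.py.

```python
import random

TERMINAL = {"done", "archived"}
OPS = ("claim", "complete", "block", "unblock", "archive")


class ToyBoard:
    """Hypothetical in-memory stand-in for the Kanban kernel."""

    def __init__(self, n_tasks: int) -> None:
        self.status = {t: "ready" for t in range(n_tasks)}
        self.owner = {t: None for t in range(n_tasks)}

    def apply(self, op: str, task: int, worker: str) -> None:
        """Apply an operation if it is legal in the current state, else ignore it."""
        s = self.status[task]
        if op == "claim" and s == "ready":
            self.status[task], self.owner[task] = "running", worker
        elif op == "complete" and s == "running" and self.owner[task] == worker:
            self.status[task], self.owner[task] = "done", None
        elif op == "block" and s in ("ready", "running"):
            self.status[task], self.owner[task] = "blocked", None
        elif op == "unblock" and s == "blocked":
            self.status[task] = "ready"
        elif op == "archive" and s != "archived":
            self.status[task], self.owner[task] = "archived", None


def check_invariants(board: ToyBoard, seen_terminal: set) -> None:
    """Raise AssertionError if any invariant is violated."""
    for task, status in board.status.items():
        # Invariant 1: a task has an owner exactly when it is running.
        assert (status == "running") == (board.owner[task] is not None), task
        # Invariant 2: once terminal, a task never leaves the terminal states.
        if task in seen_terminal:
            assert status in TERMINAL, task
        if status in TERMINAL:
            seen_terminal.add(task)


def fuzz(seed: int, n_sequences: int, steps: int, n_tasks: int = 5) -> int:
    """Run seeded random sequences, checking invariants after every step."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sequences):
        board, seen_terminal = ToyBoard(n_tasks), set()
        for _ in range(steps):
            op = rng.choice(OPS)
            task = rng.randrange(n_tasks)
            worker = rng.choice(("w1", "w2", "w3"))
            board.apply(op, task, worker)
            check_invariants(board, seen_terminal)
            total += 1
    return total


if __name__ == "__main__":
    print("operations checked:", fuzz(seed=0, n_sequences=20, steps=50))
```

Seeding the generator keeps a failing sequence reproducible: rerunning with the same seed replays the same operations up to the step that tripped an invariant.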
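
The benchmark entry above mentions saving results to JSON for regression diffing. One way such a diff could work is sketched here. The flat name-to-seconds JSON layout, the `measure`, `save_results`, `load_results`, and `find_regressions` helpers, and the 1.5x tolerance are all assumptions for illustration; the format actually written by test_benchmarks.py is not described in this README.

```python
import json
import statistics
import time
from pathlib import Path


def measure(fn, repeats: int = 5) -> float:
    """Return the median wall-clock seconds of calling fn() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def save_results(path, results: dict) -> None:
    """Write a {benchmark name: seconds} mapping as sorted, indented JSON."""
    Path(path).write_text(json.dumps(results, indent=2, sort_keys=True))


def load_results(path) -> dict:
    return json.loads(Path(path).read_text())


def find_regressions(baseline: dict, current: dict, tolerance: float = 1.5) -> list:
    """List benchmarks present in both runs that got slower than tolerance x baseline."""
    return sorted(
        name
        for name, seconds in current.items()
        if name in baseline and seconds > baseline[name] * tolerance
    )


if __name__ == "__main__":
    # Hypothetical timings; only the benchmark names come from the README.
    baseline = {"dispatch": 0.010, "list_tasks": 0.020}
    current = {"dispatch": 0.011, "list_tasks": 0.050}
    print("regressions:", find_regressions(baseline, current))
```

Comparing against a multiplicative tolerance rather than exact values avoids flagging ordinary timing noise, at the cost of missing small slowdowns.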