Multi-Agent Orchestration Patterns with Hermes Agent
Why Multi-Agent?
A single AI agent is powerful. Multiple agents working together are transformative. Multi-agent orchestration lets you:
- Divide complex tasks: each agent focuses on what it does best
- Parallel processing: multiple agents work simultaneously
- Quality control: agents review each other's work
- Specialization: dedicated agents for code, writing, research, etc.
Think of it like a team of specialists vs. one generalist. You wouldn't ask a single person to be a designer, developer, and QA tester all at once.
Core Concepts
Agents
An agent is a Hermes instance with a specific role, model, and skills:
# Define an agent
agents:
  coder:
    model: "anthropic/claude-3.5-sonnet"
    system_prompt: "You are an expert software engineer..."
    skills: ["code-runner", "github-integration"]
  reviewer:
    model: "openai/gpt-4o"
    system_prompt: "You are a senior code reviewer..."
    skills: ["github-integration"]
  writer:
    model: "openai/gpt-4o-mini"
    system_prompt: "You are a technical writer..."
    skills: ["file-manager"]
Messages
Agents communicate through messages:
interface AgentMessage {
  from: string;      // Agent name
  to: string;        // Target agent
  content: string;   // Message content
  metadata?: {
    task_id: string;
    iteration: number;
    attachments?: string[];
  };
}
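To make the message flow concrete, here is a minimal in-memory router for `AgentMessage` objects. The `AgentMessage` shape comes from the interface above; the `MessageRouter` class itself is an illustrative sketch, not part of the Hermes API.

```typescript
// Illustrative sketch: deliver AgentMessage objects to registered agents.
interface AgentMessage {
  from: string;
  to: string;
  content: string;
  metadata?: {
    task_id: string;
    iteration: number;
    attachments?: string[];
  };
}

type Handler = (msg: AgentMessage) => void;

class MessageRouter {
  private handlers = new Map<string, Handler>();

  // Each agent registers a handler for its incoming messages.
  register(agent: string, handler: Handler): void {
    this.handlers.set(agent, handler);
  }

  // Deliver a message to the target agent's handler.
  send(msg: AgentMessage): void {
    const handler = this.handlers.get(msg.to);
    if (!handler) throw new Error(`Unknown agent: ${msg.to}`);
    handler(msg);
  }
}

// Usage: the reviewer receives whatever the coder sends it.
const inbox: AgentMessage[] = [];
const router = new MessageRouter();
router.register("reviewer", (msg) => inbox.push(msg));
router.send({
  from: "coder",
  to: "reviewer",
  content: "Please review src/routes/todos.ts",
  metadata: { task_id: "t-1", iteration: 1 },
});
```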
Orchestrators
An orchestrator manages the workflow between agents:
hermes orchestrate --config workflow.yaml
Pattern 1: Supervisor
One agent delegates tasks and reviews results.
            ┌─────────────┐
            │  Supervisor │
            │   (GPT-4o)  │
            └──────┬──────┘
       ┌───────────┼───────────┐
       ▼           ▼           ▼
 ┌──────────┐ ┌──────────┐ ┌──────────┐
 │  Coder   │ │  Writer  │ │  Tester  │
 │ (Claude) │ │(GPT-Mini)│ │ (Claude) │
 └──────────┘ └──────────┘ └──────────┘
# supervisor-workflow.yaml
orchestration:
  pattern: supervisor
  supervisor:
    model: "openai/gpt-4o"
    system_prompt: |
      You are a project manager. Break tasks into subtasks,
      assign them to the right agent, and verify the results.
    max_iterations: 10
  workers:
    coder:
      model: "anthropic/claude-3.5-sonnet"
      system_prompt: "You write production-quality code."
      skills: ["code-runner", "file-manager"]
    writer:
      model: "openai/gpt-4o-mini"
      system_prompt: "You write clear documentation."
      skills: ["file-manager"]
    tester:
      model: "anthropic/claude-3.5-sonnet"
      system_prompt: "You write and run comprehensive tests."
      skills: ["code-runner"]
Use when: Building complete features that need coding, testing, and docs.
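The control flow behind this pattern can be sketched in a few lines. This is not the Hermes internals: `callAgent` is a stand-in for a real model call, and the round-robin worker choice is a simplification (a real supervisor picks workers based on the subtask).

```typescript
// Sketch of a supervisor loop: delegate, review, retry until approved
// or max_iterations is reached. `CallAgent` is an assumed stand-in.
type CallAgent = (agent: string, prompt: string) => Promise<string>;

interface SupervisorOptions {
  workers: string[];
  maxIterations: number;
}

async function runSupervisor(
  task: string,
  call: CallAgent,
  opts: SupervisorOptions
): Promise<string> {
  let result = "";
  for (let i = 0; i < opts.maxIterations; i++) {
    // Delegate to a worker (round-robin here, for simplicity).
    const worker = opts.workers[i % opts.workers.length];
    result = await call(worker, `${task}\nPrevious result:\n${result}`);

    // The supervisor reviews; an "APPROVED" verdict ends the loop,
    // otherwise the result is sent back around with feedback.
    const verdict = await call("supervisor", `Review this result:\n${result}`);
    if (verdict.includes("APPROVED")) return result;
  }
  return result; // best effort after max_iterations
}
```

The key property is the feedback edge: work is never final until the supervisor signs off, which is exactly what the "Sending back to Coder" step in the transcript below shows.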
Running the Supervisor Pattern
hermes orchestrate --config supervisor-workflow.yaml \
  "Build a REST API for a todo list with CRUD operations, tests, and docs"
🎯 Supervisor: Breaking down the task...

📋 Plan:
  1. [Coder] Build the Express API with CRUD endpoints
  2. [Tester] Write unit tests for all endpoints
  3. [Writer] Create API documentation
  4. [Supervisor] Review all deliverables

🔧 [Coder] Building API...
   ✅ Created: src/app.ts, src/routes/todos.ts, src/models/todo.ts

🧪 [Tester] Writing tests...
   ✅ Created: tests/todos.test.ts (12 tests)
   ✅ All tests passing

📝 [Writer] Writing documentation...
   ✅ Created: docs/API.md

🔍 [Supervisor] Reviewing...
   ⚠️ Missing input validation on POST /todos
   ↩ Sending back to Coder...

🔧 [Coder] Adding input validation...
   ✅ Updated: src/routes/todos.ts

🔍 [Supervisor] Final review...
   ✅ All deliverables approved!

📦 Output:
  src/app.ts
  src/routes/todos.ts
  src/models/todo.ts
  tests/todos.test.ts
  docs/API.md
Pattern 2: Pipeline
Agents process data in sequence, each adding value.
[Input] → [Research] → [Draft] → [Edit] → [Format] → [Output]
# pipeline-workflow.yaml
orchestration:
  pattern: pipeline
  stages:
    - name: research
      model: "openai/gpt-4o"
      system_prompt: "Research the topic thoroughly."
      skills: ["web-search"]
    - name: draft
      model: "anthropic/claude-3.5-sonnet"
      system_prompt: "Write a detailed technical article."
    - name: edit
      model: "openai/gpt-4o"
      system_prompt: "Edit for clarity, accuracy, and flow."
    - name: format
      model: "openai/gpt-4o-mini"
      system_prompt: "Format as markdown with proper headings."
Use when: Content creation pipelines, data processing workflows.
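Under the hood, a pipeline is just function composition: each stage's output becomes the next stage's input. A minimal sketch, where `Stage` stands in for one configured agent call:

```typescript
// Sketch of the pipeline pattern: run stages in order, threading the
// output of each stage into the next.
type Stage = (input: string) => Promise<string>;

async function runPipeline(stages: Stage[], input: string): Promise<string> {
  let current = input;
  for (const stage of stages) {
    current = await stage(current); // each stage adds value
  }
  return current;
}

// Usage with stub stages mirroring research → draft → edit → format:
const pipelineStages: Stage[] = [
  async (s) => `researched(${s})`,
  async (s) => `drafted(${s})`,
  async (s) => `edited(${s})`,
  async (s) => `formatted(${s})`,
];
// runPipeline(pipelineStages, "topic") resolves to
// "formatted(edited(drafted(researched(topic))))"
```

Because each stage only sees its predecessor's output, pipelines are easy to reason about but cannot backtrack; if a late stage needs a do-over, reach for the supervisor pattern instead.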
Pattern 3: Debate
Two agents argue opposing positions, producing a balanced result.
       ┌──────────┐
       │  Judge   │
       │ (GPT-4o) │
       └────┬─────┘
      ┌─────┴─────┐
      ▼           ▼
 ┌────────┐  ┌────────┐
 │  Pro   │  │  Con   │
 │(Claude)│  │(Claude)│
 └────────┘  └────────┘
# debate-workflow.yaml
orchestration:
  pattern: debate
  rounds: 3
  agents:
    proponent:
      model: "anthropic/claude-3.5-sonnet"
      system_prompt: "Argue IN FAVOR of the proposition."
    opponent:
      model: "anthropic/claude-3.5-sonnet"
      system_prompt: "Argue AGAINST the proposition."
    judge:
      model: "openai/gpt-4o"
      system_prompt: |
        Evaluate both arguments fairly. Synthesize the best
        points from each side into a balanced conclusion.
Use when: Decision making, evaluating tradeoffs, exploring complex topics.
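The debate loop alternates pro and con for the configured number of rounds, each side seeing the transcript so far, and the judge rules once at the end. A sketch of that flow, with `call` as an assumed stand-in for a model call:

```typescript
// Sketch of the debate pattern: N rounds of pro/con, then one judge pass.
type Call = (agent: string, prompt: string) => Promise<string>;

async function runDebate(
  proposition: string,
  rounds: number,
  call: Call
): Promise<string> {
  const transcript: string[] = [];
  for (let r = 1; r <= rounds; r++) {
    const context = transcript.join("\n");
    // Each side sees the transcript so far and responds to it.
    const pro = await call(
      "proponent",
      `Round ${r}. Argue FOR: ${proposition}\n${context}`
    );
    const con = await call(
      "opponent",
      `Round ${r}. Argue AGAINST: ${proposition}\n${context}`
    );
    transcript.push(`PRO: ${pro}`, `CON: ${con}`);
  }
  // The judge reads the full transcript and synthesizes a conclusion.
  return call("judge", `Proposition: ${proposition}\n${transcript.join("\n")}`);
}
```

Note the cost shape: a debate with R rounds makes 2R + 1 model calls, so the `rounds` setting is the main budget knob for this pattern.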
Example: Technology Decision
hermes orchestrate --config debate-workflow.yaml \
  "Should we use MongoDB or PostgreSQL for our e-commerce platform?"
Pattern 4: Map-Reduce
Parallelize work across multiple agents, then combine results.
             [Task]
                │
   ┌────────────┼────────────┐
   ▼            ▼            ▼
[Worker 1]  [Worker 2]  [Worker 3]
   │            │            │
   └────────────┼────────────┘
                ▼
            [Reducer]
# map-reduce-workflow.yaml
orchestration:
  pattern: map-reduce
  mapper:
    model: "openai/gpt-4o-mini"
    system_prompt: "Analyze the assigned code file."
    parallel: true
    max_workers: 5
  reducer:
    model: "anthropic/claude-3.5-sonnet"
    system_prompt: |
      Combine all analysis results into a single report.
      Prioritize critical issues first.
Use when: Code review across many files, bulk data analysis, parallel research.
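The fan-out/fan-in structure is a natural fit for `Promise.all`. A minimal sketch (bounding concurrency to `max_workers` is omitted for brevity, and `Mapper`/`Reducer` stand in for configured agent calls):

```typescript
// Sketch of map-reduce: mappers run concurrently over the work items,
// then a single reducer combines their outputs.
type Mapper = (item: string) => Promise<string>;
type Reducer = (results: string[]) => Promise<string>;

async function mapReduce(
  items: string[],
  map: Mapper,
  reduce: Reducer
): Promise<string> {
  const results = await Promise.all(items.map(map)); // fan out
  return reduce(results);                            // fan in
}

// Usage: analyze three files in parallel, then merge into one report.
// mapReduce(
//   ["a.ts", "b.ts", "c.ts"],
//   async (f) => `analysis of ${f}`,
//   async (rs) => rs.join("; ")
// ) resolves to "analysis of a.ts; analysis of b.ts; analysis of c.ts"
```

`Promise.all` preserves input order in its result array, so the reducer always sees mapper outputs in the same order as the work items, regardless of which mapper finished first.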
Configuring Multi-Agent Workflows
CLI Approach
# Run with a config file
hermes orchestrate --config workflow.yaml "Your task here"

# Quick two-agent setup
hermes orchestrate --pattern supervisor \
  --worker coder:claude-3.5-sonnet \
  --worker tester:claude-3.5-sonnet \
  "Build and test a binary search function"
Programmatic Approach
import { Orchestrator, Agent } from 'hermes-agent/orchestration';

const coder = new Agent({
  name: 'coder',
  model: 'anthropic/claude-3.5-sonnet',
  systemPrompt: 'You write clean, tested code.',
});

const reviewer = new Agent({
  name: 'reviewer',
  model: 'openai/gpt-4o',
  systemPrompt: 'You review code for bugs and improvements.',
});

const orchestrator = new Orchestrator({
  pattern: 'supervisor',
  supervisor: reviewer,
  workers: [coder],
});

const result = await orchestrator.run(
  'Implement a rate limiter using the token bucket algorithm'
);
Cost Management
Multi-agent workflows multiply API costs. Here's how to manage them:
| Strategy | How |
|----------|-----|
| Use cheap models for simple agents | Writer/formatter → GPT-4o Mini |
| Limit iterations | max_iterations: 5 |
| Set budgets | max_cost: 0.50 (per workflow run) |
| Cache intermediate results | cache: true |
| Use local models for testing | model: ollama/llama3.1:8b |
orchestration:
  cost_control:
    max_cost_per_run: 1.00                # USD
    warn_at: 0.50                         # Warn at 50%
    model_fallback: "openai/gpt-4o-mini"  # Fall back when budget is tight
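The budget logic this config describes can be sketched as a small guard: track cumulative spend, switch to the fallback model once past the warn threshold, and stop the run at the cap. The `BudgetGuard` class is illustrative, not Hermes API.

```typescript
// Illustrative sketch of cost_control: spend tracking, a warn
// threshold that triggers model fallback, and a hard cap.
class BudgetGuard {
  private spent = 0;

  constructor(
    private maxCost: number,       // max_cost_per_run
    private warnAt: number,        // warn_at
    private fallbackModel: string  // model_fallback
  ) {}

  // Record the cost of one agent call; abort past the hard cap.
  record(cost: number): void {
    this.spent += cost;
    if (this.spent > this.maxCost) throw new Error("Budget exceeded");
  }

  // Use the fallback model once the warn threshold is crossed.
  pickModel(preferred: string): string {
    return this.spent >= this.warnAt ? this.fallbackModel : preferred;
  }
}

const guard = new BudgetGuard(1.0, 0.5, "openai/gpt-4o-mini");
guard.record(0.3);
// guard.pickModel("openai/gpt-4o") → "openai/gpt-4o" (under threshold)
guard.record(0.3); // total ~0.6, past warn_at
// guard.pickModel("openai/gpt-4o") → "openai/gpt-4o-mini" (fallback)
```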
Monitoring Multi-Agent Runs
# Watch a running orchestration
hermes orchestrate status
# View agent communication log
hermes orchestrate log --last-run
# Performance metrics
hermes orchestrate metrics
Orchestration Metrics (last run):
  Pattern: supervisor
  Duration: 4m 23s
  Total tokens: 45,230
  Total cost: $0.34

  Agent Performance:
    supervisor: 3 decisions, $0.08
    coder: 2 tasks, $0.18
    tester: 1 task, $0.06
    writer: 1 task, $0.02

  Iterations: 4 (max: 10)
  Result: ✅ Success
Next Steps
- MCP Integration: connect agents to external tools
- Memory System Deep Dive: how agents share context
- Cron scheduling: run multi-agent workflows on a schedule
Last updated: April 18, 2026 · Hermes Agent v0.8