Hermes Agent Learning Guide

Concise guide to Hermes Agent, the key difference from OpenClaw, and the Git-first doc publishing flow.

Key difference

Core Functional Differences

Learning & Skill Creation

  • Hermes Agent features a unique closed learning loop. It autonomously writes, refines, and patches its own skills based on task outcomes, effectively building its own instruction manual over time.
  • OpenClaw relies on static, human-authored skills (over 5,700 available via ClawHub) that users must manually install and manage.

Illustration: how Hermes learns from repetitive work

User: Please publish the project docs again.
Hermes: I will sync Markdown, build the DocUs site, deploy to Vercel, and verify the live URL.

User, later: Publish the docs again.
Hermes: I recognize this as the same publishing workflow, so I will follow the same steps as last time.

Hermes, after completing the task: This workflow has now been repeated and has several exact steps. I saved it as a Git-first doc publishing skill so future sessions can run it consistently without asking you to restate the process.

Future session:
User: Publish the guide.
Hermes: I load the publishing skill, update the Markdown, build the site, deploy, and verify the public URL.

The important point is that Hermes does more than complete the task. When a workflow becomes repeatable, it can turn that workflow into reusable operating knowledge.
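
The loop illustrated above can be sketched as a toy model. This is not how Hermes is implemented internally; `SkillStore`, `record`, `recall`, and the repeat threshold of 2 are all hypothetical names and values chosen only to show the idea of promoting a repeated workflow to a saved skill.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SkillStore:
    """Toy model of a closed learning loop (hypothetical, not Hermes internals)."""

    repeat_threshold: int = 2  # assumption: save a workflow once it has run twice
    seen: Counter = field(default_factory=Counter)
    skills: dict = field(default_factory=dict)

    def record(self, name: str, steps: list) -> bool:
        """Record a completed workflow; return True if it was just saved as a skill."""
        if name in self.skills:
            return False  # already a skill, nothing new to save
        self.seen[name] += 1
        if self.seen[name] >= self.repeat_threshold:
            self.skills[name] = list(steps)
            return True
        return False

    def recall(self, name: str) -> Optional[list]:
        """Return the saved steps for a skill, or None if it was never saved."""
        return self.skills.get(name)
```

In this sketch, the first "publish the docs" run is only counted, the second run saves the steps, and any later session can call `recall("publish-docs")` instead of asking the user to restate the process.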

Presence & Orchestration

  • OpenClaw excels at gateway functions, acting as a single hub to route tasks across 50+ channels including Telegram, WhatsApp, Discord, and iMessage.
  • Hermes Agent focuses on deep individual productivity, often running as a 24/7 background service with a strong emphasis on CLI and sandboxed tool integration for research and debugging.

Infrastructure & Security

  • Hermes is built for single-operator privacy, using a SQLite database for memory and offering built-in prompt injection scanning.
  • OpenClaw is designed for multi-user/team environments with enterprise-grade access controls and governance.

Both OpenClaw and Hermes do agent work. In day-to-day operation Hermes is the more self-learning of the two: it remembers, turns workflows that worked into skills, and reuses them so the user has to re-explain less.

Major function setup

Three main skills are set up or in progress:

  • Git-first doc publishing model
  • STT and TTS with Cantonese
  • Linear project management (in progress)

Git-first doc publishing model

The goal is to keep important documents in Markdown, store them in Git, and use Docusaurus / DocUs to publish readable sites for project and personal knowledge.

  • Git is the source of truth
  • Markdown/MDX is the authoring format
  • DocUs / Docusaurus is the reading layer
  • live docs stay current
  • frozen releases preserve reviewed snapshots

Directory structure

~/workspace/projects/pNN-slug/
  docs/       # project source documents
  src/        # project implementation files
  scripts/    # project automation scripts
  tests/      # project tests

~/workspace/sites/project/
  docs/       # generated Docusaurus input
  build/      # generated project site output
  scripts/    # sync and build scripts

~/workspace/personal/
  blogs/      # personal blog Markdown
  travels/    # travel Markdown
  learning/   # learning-guide Markdown

~/workspace/sites/personal/
  dist/       # generated personal site output
  scripts/    # personal site build scripts

System flow

  1. Write docs in the Git repo.
  2. Sync project Markdown into the DocUs site.
  3. Build the site.
  4. Publish the generated output.
  5. Keep both live docs and frozen snapshots available.

This gives us a clean path from source docs to published docs without treating the website as the source of truth.
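
Step 2 of the flow (syncing project Markdown into the site) could look roughly like the sketch below. The function name `sync_project_docs` and the choice to place each project's files under a subfolder named after the project are assumptions for illustration; the actual sync script in `~/workspace/sites/project/scripts/` may work differently.

```python
import shutil
from pathlib import Path


def sync_project_docs(projects_root: Path, site_docs: Path) -> list:
    """Copy Markdown/MDX from <projects_root>/*/docs into <site_docs>/<project>/.

    Assumption: each project gets its own subfolder in the site input,
    and only .md and .mdx files are copied.
    """
    copied = []
    for docs_dir in sorted(projects_root.glob("*/docs")):
        project = docs_dir.parent.name  # e.g. "pNN-slug"
        for src in sorted(docs_dir.rglob("*")):
            if src.is_file() and src.suffix in {".md", ".mdx"}:
                dest = site_docs / project / src.relative_to(docs_dir)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)  # Git repo stays the source; this is a one-way copy
                copied.append(dest)
    return copied
```

With the directory structure above, a call might be `sync_project_docs(Path.home() / "workspace/projects", Path.home() / "workspace/sites/project/docs")`. The copy is one-way on purpose, so the site input is always regenerated from Git rather than edited in place.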

Why this matters

  • docs stay versioned in Git
  • publishing is repeatable
  • leadership can read stable snapshots
  • the team can keep working without re-authoring the same guidance

Setup prompt for a fresh Hermes Agent

You can give a fresh Hermes Agent a prompt like this:

Set up a Git-first documentation publishing workflow for my workspace.

Use this operating model:
1. Git is the source of truth.
2. Markdown/MDX is the authoring format.
3. Docusaurus is the reading layer.
4. Keep live docs for current work and frozen snapshots for reviewed releases.
5. Commit meaningful documentation, site, build-script, and deploy changes.

Create or follow this directory structure:
- ~/workspace/projects/pNN-slug/docs for project documents
- ~/workspace/sites/project for the project DocUs site
- ~/workspace/personal/blogs, travels, and learning for personal Markdown
- ~/workspace/sites/personal for the personal publishing site

Build the project publishing flow:
- scan project docs from ~/workspace/projects/*/docs
- generate the project site input under ~/workspace/sites/project/docs
- build the site
- publish it to Vercel
- verify the public URL returns HTTP 200

Build the personal publishing flow:
- scan Markdown from ~/workspace/personal/blogs, travels, and learning
- generate the personal site output under ~/workspace/sites/personal/dist
- publish it to Vercel
- verify the public URL returns HTTP 200

After the workflow works, save the process as a reusable Hermes skill so future sessions can publish without me restating every step.
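
The last check in both publishing flows ("verify the public URL returns HTTP 200") can be sketched with the Python standard library. The function name `returns_http_200` and the injectable `opener` parameter are illustrative assumptions, not part of Hermes or Vercel; the parameter exists only so the check can be exercised without a network connection.

```python
import urllib.request


def returns_http_200(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True only if fetching `url` ends with HTTP status 200.

    urlopen follows redirects, so the status checked is the final one.
    HTTP errors (4xx/5xx), DNS failures, and timeouts all count as "not live".
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # urllib.error.URLError and HTTPError are OSError subclasses
        return False
```

A publish script could call this on the deployed URL after the deploy step and fail the run when it returns `False`.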

STT and TTS with Cantonese

Voice support lets Hermes accept spoken input and generate spoken output when useful.

Current Hermes setup:

  • STT means speech-to-text. It turns user audio into text that Hermes can process.
  • Current STT service: local transcription.
  • Current STT model: base.
  • Practical meaning: Hermes is configured to transcribe voice locally through a Whisper-style local STT path rather than sending every voice message to a cloud STT provider.
  • TTS means text-to-speech. It turns Hermes text output into audio.
  • Current active TTS service: Microsoft Edge TTS.
  • Current active TTS voice: en-US-AriaNeural.

Why this matters:

  • local STT keeps the transcription path more private and less dependent on cloud services

Linear Project Management (in progress)

The objective is to make project execution visible and controllable outside chat. Linear acts as the project-management layer where work can be assigned, reviewed, blocked, and completed.

Why it is important:

  • chat is not enough for durable project tracking
  • Linear gives each task an owner, state, and history
  • human review can happen through clear workflow states
  • agents can pick up structured work instead of relying on long chat context

Very brief workflow:

  1. Create or update a Linear issue for the work.
  2. Assign agent-owned work to the agent when supported.
  3. Move active work to In Progress.
  4. Add progress comments from the assistant with the 🤖 Ryo: prefix.
  5. Move completed work to Review for human approval.
  6. Move blocked work to Blocked with a clear reason.
  7. Keep project docs and Linear state aligned.
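
The workflow above can be modeled as a small state machine. This sketch does not call the Linear API; it is a local model only. The states In Progress, Review, and Blocked and the 🤖 Ryo: prefix come from the steps above, while "Todo", "Done", the exact transition table, and the `Issue` class are assumptions added for illustration.

```python
from dataclasses import dataclass, field

COMMENT_PREFIX = "🤖 Ryo:"  # prefix for assistant progress comments

# Assumed transition table; "Todo" and "Done" are hypothetical extra states.
ALLOWED = {
    "Todo": {"In Progress"},
    "In Progress": {"Review", "Blocked"},
    "Blocked": {"In Progress"},
    "Review": {"Done", "In Progress"},  # human approves or sends back
    "Done": set(),
}


@dataclass
class Issue:
    title: str
    state: str = "Todo"
    comments: list = field(default_factory=list)

    def comment(self, text: str) -> None:
        """Add a progress comment carrying the assistant prefix."""
        self.comments.append(f"{COMMENT_PREFIX} {text}")

    def move(self, new_state: str, reason: str = "") -> None:
        """Move to a new state, enforcing the table and a reason for Blocked."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot move from {self.state!r} to {new_state!r}")
        if new_state == "Blocked" and not reason.strip():
            raise ValueError("a blocked issue needs a clear reason")
        self.state = new_state
        if reason:
            self.comment(f"{new_state}: {reason}")
```

Making the agent hand work to Review rather than straight to a finished state is what keeps the human approval step in the loop.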

This is still a work in progress. The next step is to test which multi-agent orchestration model works best: Linear assignment, Telegram group handoff, or ACP-based agent-to-agent coordination.

Reflection from Alger

  • Tried several ways to set up mission control in OpenClaw so that different agents could work with little human intervention, and to introduce human-in-the-loop review in place of continuous chat with the agent. None of them worked well enough.
  • Linear is Jira-like and has out-of-the-box integration.
  • Multi-agent orchestration is still work in progress. A few directions still need testing:
      - use Linear ticket assignment to enable agent-to-agent information flow
      - put two agents in a Telegram chat group and let them pick up messages
      - enable ACP so different agents can share information; the first use case is one agent helping set up another agent with specific skills