AetherClaw: The Distributed Lightweight AI Agent Runtime

Roadmap

Build the runtime first. Expand without losing the operating model.

This page tracks the actual roadmap published in the core repository. The sequence is deliberate: ship a strong runtime center, then extend it without losing deployability or operator clarity.

Roadmap intent

Each phase expands scope only after the previous layer has a coherent operator story. The target is not feature count. The target is a deployable AI agent runtime that remains understandable under load, across hardware profiles, and across teams.

  • Protect a minimal, single-binary runtime core.
  • Keep distributed features optional until the local path is solid.
  • Define performance, observability, and extensibility as product requirements.

  1. Phase 1

    Foundation

    Finish multimodal input, add MCP server and client support, and land a short list of low-effort, high-leverage runtime fixes.

  2. Phase 2

    Differentiation

    Use Go's strengths for distributed mesh networking, ACP interoperability, stronger memory, and safer execution.

  3. Phase 3

    Surpass

    Turn AetherClaw into a bidirectional MCP hub with production-grade observability and polyglot extension paths.

  4. Phase 4

    Ecosystem

    Expand into browser automation, voice, WebAssembly plugins, more channels, and embeddable application use.

Current state

What exists today.

The roadmap starts from a real runtime that already ships as a single Go binary.

27 tools

File operations, shell, web search and fetch, messaging, image generation, TTS, memory, cron scheduling, approvals, and more.

14 messaging channels

Telegram, Discord, Slack, WhatsApp, Feishu, DingTalk, LINE, QQ, OneBot, WeCom, MaixCam, Pico WebSocket, and others.

20+ LLM providers

Anthropic, Gemini, Groq, DeepSeek, Ollama, OpenRouter, Mistral, Qwen, and additional HTTP-compatible provider integrations.

Multi-agent runtime

Agent registry, seven-level priority routing, spawn and subagent delegation, cross-agent sessions, and model fallback chains.

Operational infrastructure

Cron service, heartbeat, device event monitoring, skills marketplace integration, and prompt-context caching.

Design principles

The non-negotiables behind the roadmap.

These are the constraints the roadmap is preserving while the system gets more capable.

  1. Principle 1

    Single binary, zero dependencies

    Copy the binary with `scp AetherClaw server:` and it is ready to run, with no runtime or package install step.

  2. Principle 2

    Runs anywhere

    Targets x86, ARM, and RISC-V across cloud VMs, local machines, and embedded hardware.

  3. Principle 3

    MCP-native

    Model Context Protocol support is first-class, not layered on later.

  4. Principle 4

    Extend in any language

    MCP servers replace language-locked plugin SDKs.

  5. Principle 5

    Minimal resource footprint

    Designed around roughly 20 MB idle RAM and sub-100 ms startup.

  6. Principle 6

    Embeddable

    The runtime should work as a Go library inside other applications.

Phases

The main repository plan, expanded.

Each phase from the source roadmap is broken into concrete initiatives and deliverables.

Phase 1

Foundation

Core capabilities that unlock real-world usage.

Inbound Vision / Multimodal

  • Wire inbound media files into multimodal user content parts.
  • Add base64 and MIME-aware encoding for HTTP providers.
  • Support Anthropic-native image blocks.
  • Fix Telegram photo and document lifecycle handling.
  • Provide CLI fallback by saving images and referencing paths in prompts.
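
As a rough illustration of the MIME-aware base64 step above, the sketch below sniffs the media type from the leading bytes and encodes the payload. The `ImagePart` type and the `EncodeImage` function are hypothetical names chosen for this example, not AetherClaw's actual API, and the list of accepted image formats is an assumption.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
)

// ImagePart is a hypothetical multimodal content part; AetherClaw's real
// type is not shown in the roadmap and may differ.
type ImagePart struct {
	MIMEType string // sniffed media type, e.g. "image/png"
	Data     string // base64-encoded bytes without a "data:" prefix
}

// EncodeImage sniffs the media type from the leading bytes and
// base64-encodes the payload. Non-image payloads are rejected.
func EncodeImage(raw []byte) (ImagePart, error) {
	if len(raw) == 0 {
		return ImagePart{}, fmt.Errorf("empty media payload")
	}
	mime := http.DetectContentType(raw) // inspects at most the first 512 bytes
	switch mime {
	case "image/png", "image/jpeg", "image/gif", "image/webp":
		// Formats assumed to be accepted by the target provider.
	default:
		return ImagePart{}, fmt.Errorf("unsupported media type %q", mime)
	}
	return ImagePart{
		MIMEType: mime,
		Data:     base64.StdEncoding.EncodeToString(raw),
	}, nil
}

// DataURL renders the part as a data URL for providers that expect one.
func (p ImagePart) DataURL() string {
	return "data:" + p.MIMEType + ";base64," + p.Data
}

// pngSignature is the 8-byte PNG file signature, which is enough for sniffing.
var pngSignature = []byte("\x89PNG\r\n\x1a\n")

func main() {
	part, err := EncodeImage(pngSignature)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(part.DataURL())
}
```

Keeping the media type and the encoded data in separate fields lets one encoder feed both providers that take a data URL and providers that take the two values as distinct fields.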

MCP Server

  • Ship stdio transport for IDE integrations such as VS Code, Cursor, and Claude Code.
  • Add HTTP + SSE transport for network clients.
  • Expose registered tools as MCP tools.
  • Expose agent sessions as MCP resources.
  • Publish prompt templates for common workflows.

MCP Client

  • Launch local MCP server processes over stdio.
  • Connect to remote MCP servers over HTTP + SSE.
  • Discover and register tools dynamically.
  • Support per-agent MCP server configuration.
  • Manage server lifecycle with start, stop, and restart paths.
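
MCP messages are JSON-RPC 2.0, and the stdio transport sends each message as one line of JSON. The sketch below only shows how a client might frame `tools/list` and `tools/call` requests; the helper names and the `read_file` tool are hypothetical, and a real client would first complete the `initialize` handshake and match responses to request IDs.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string      `json:"jsonrpc"`
	ID      int         `json:"id"`
	Method  string      `json:"method"`
	Params  interface{} `json:"params,omitempty"`
}

// callParams carries the tool name and arguments of a "tools/call" request.
type callParams struct {
	Name      string                 `json:"name"`
	Arguments map[string]interface{} `json:"arguments,omitempty"`
}

// listTools builds the request used to discover a server's tools.
func listTools(id int) rpcRequest {
	return rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"}
}

// callTool builds the request that invokes one tool by name.
func callTool(id int, name string, args map[string]interface{}) rpcRequest {
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: name, Arguments: args},
	}
}

// frame encodes a request as a single newline-terminated line of JSON,
// ready to be written to the server process's stdin.
func frame(req rpcRequest) (string, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(req); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// "read_file" is a made-up tool name used only for illustration.
	for _, req := range []rpcRequest{
		listTools(1),
		callTool(2, "read_file", map[string]interface{}{"path": "README.md"}),
	} {
		line, err := frame(req)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Print(line)
	}
}
```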

Quick Wins

  • Wire EnrichMessageWithLinks into the message pipeline.
  • Register BlackboardTool and HandoffTool from the multiagent package.
  • Add auth rotation into the provider factory with cooldown-aware round robin.
  • Ship Edge TTS as a free provider option.
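
One way to read "cooldown-aware round robin" is a rotator that cycles through credentials and skips any that were recently rate limited. The sketch below is a minimal version under that assumption; `KeyRotator` and its methods are hypothetical names, and the clock is injected so the behavior can be exercised without sleeping.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
	"time"
)

// ErrAllCooling is returned when every key is still in its cooldown window.
var ErrAllCooling = errors.New("all keys are cooling down")

// KeyRotator hands out API keys round robin and skips keys in cooldown.
// The type is a hypothetical sketch, not AetherClaw's provider factory.
type KeyRotator struct {
	mu       sync.Mutex
	keys     []string
	coolTill []time.Time // per-key time before which the key is skipped
	next     int         // index to try first on the next call
	now      func() time.Time
}

// NewKeyRotator builds a rotator; now is the clock, usually time.Now.
func NewKeyRotator(keys []string, now func() time.Time) *KeyRotator {
	return &KeyRotator{keys: keys, coolTill: make([]time.Time, len(keys)), now: now}
}

// Next returns the next key that is not cooling down.
func (r *KeyRotator) Next() (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	now := r.now()
	for i := 0; i < len(r.keys); i++ {
		idx := (r.next + i) % len(r.keys)
		if !now.Before(r.coolTill[idx]) {
			r.next = (idx + 1) % len(r.keys)
			return r.keys[idx], nil
		}
	}
	return "", ErrAllCooling
}

// Cooldown marks a key as unusable for d, for example after an HTTP 429.
func (r *KeyRotator) Cooldown(key string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for i, k := range r.keys {
		if k == key {
			r.coolTill[i] = r.now().Add(d)
		}
	}
}

// demoRotation walks through a short rotation with a fake clock.
func demoRotation() string {
	clock := time.Unix(0, 0)
	r := NewKeyRotator([]string{"a", "b", "c"}, func() time.Time { return clock })
	var got []string
	take := func() {
		k, err := r.Next()
		if err != nil {
			k = "none"
		}
		got = append(got, k)
	}
	take()                             // first key in order
	r.Cooldown("b", time.Minute)       // pretend "b" was rate limited
	take()                             // "b" is skipped
	take()                             // rotation wraps around
	clock = clock.Add(2 * time.Minute) // cooldown for "b" has expired
	take()
	return strings.Join(got, ",")
}

func main() {
	fmt.Println(demoRotation())
}
```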

Phase 2

Differentiation

Features that set AetherClaw apart through Go's strengths.

Multi-Node Mesh

  • Use mDNS for local node auto-discovery.
  • Support WireGuard tunnels for remote nodes.
  • Advertise device capabilities such as camera, screen, location, and notifications.
  • Define a companion protocol for mobile apps.
  • Delegate tasks across nodes based on capabilities.

ACP

  • Implement the Agent Control Protocol server.
  • Spawn ACP sessions as subagents.
  • Persist ACP session state across runtime restarts.

Memory Upgrade

  • Move from JSON files to SQLite using modernc.org/sqlite.
  • Add sqlite-vec for vector similarity search.
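  • Use SQLite FTS5, which ships a built-in bm25() ranking function, instead of the custom BM25 implementation.
  • Add maximal marginal relevance (MMR) for result diversity, plus temporal decay scoring.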
  • Index session transcripts incrementally.

Subagent Management

  • Create a subagents tool to list, kill, and steer running subagents.
  • Enforce depth limits to prevent infinite recursion.
  • Bind subagent replies to specific channel threads.
  • Define timeout and cleanup policies.
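
A depth limit can be enforced by carrying the current nesting level in the `context.Context` that each subagent inherits. The sketch below assumes a limit of three levels and uses hypothetical `Spawn` and `Depth` helpers; it does not show AetherClaw's actual spawn path.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// MaxDepth is an assumed limit; a real runtime would likely make it configurable.
const MaxDepth = 3

// ErrTooDeep is returned when a spawn would exceed MaxDepth.
var ErrTooDeep = errors.New("subagent depth limit exceeded")

// depthKey is an unexported context key so other packages cannot collide with it.
type depthKey struct{}

// Depth reports how many subagent levels deep ctx is (0 for a top-level agent).
func Depth(ctx context.Context) int {
	d, _ := ctx.Value(depthKey{}).(int)
	return d
}

// Spawn returns a child context one level deeper, or ErrTooDeep if the
// child would exceed MaxDepth. On error the parent context is returned.
func Spawn(ctx context.Context) (context.Context, error) {
	d := Depth(ctx) + 1
	if d > MaxDepth {
		return ctx, ErrTooDeep
	}
	return context.WithValue(ctx, depthKey{}, d), nil
}

// spawnChain tries to nest n subagents and reports the depth it reached.
func spawnChain(n int) (int, error) {
	ctx := context.Background()
	for i := 0; i < n; i++ {
		child, err := Spawn(ctx)
		if err != nil {
			return Depth(ctx), err
		}
		ctx = child
	}
	return Depth(ctx), nil
}

func main() {
	depth, err := spawnChain(5)
	fmt.Println(depth, err)
}
```

Because the depth travels with the context, the same value can also carry a deadline via `context.WithTimeout`, which gives timeout and cleanup policies a natural place to hook in.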

Security Hardening

  • Protect web_fetch against SSRF and internal IP access.
  • Offer an optional Docker sandbox for exec.
  • Add audit logging for tool execution.
  • Enforce configurable workspace-only file access guards.

Phase 3

Surpass

Capabilities that go beyond conventional AI agent platforms.

MCP Hub

  • Run server and client MCP modes simultaneously.
  • Manage namespaces across MCP tool sources.
  • Monitor MCP health and auto-reconnect on failures.
  • Declare MCP servers through configuration.

Plugin Architecture via MCP

  • Support Python MCP servers for data workflows.
  • Support JavaScript and TypeScript MCP servers for web tooling.
  • Support Rust MCP servers for performance-critical tasks.
  • Add plugin marketplace discovery and installation.

Observability

  • Expose Prometheus metrics at /metrics.
  • Trace tool executions and LLM calls with OpenTelemetry.
  • Finish structured JSON logging work already underway.
  • Add health endpoints with dependency status.
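
A health endpoint with dependency status can be as small as a handler that runs named checks and reports each result. The sketch below uses only the standard library; the `Health` type, the check names, the `/healthz` path, and the JSON shape are assumptions for illustration.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Check probes one dependency and returns nil when it is healthy.
type Check func(ctx context.Context) error

// Health maps dependency names to their checks and serves a JSON report.
type Health map[string]Check

// ServeHTTP runs every check and answers 200 if all pass, 503 otherwise.
func (h Health) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	status := http.StatusOK
	deps := make(map[string]string, len(h))
	for name, check := range h {
		if err := check(r.Context()); err != nil {
			deps[name] = err.Error()
			status = http.StatusServiceUnavailable
		} else {
			deps[name] = "ok"
		}
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(map[string]interface{}{
		"healthy":      status == http.StatusOK,
		"dependencies": deps,
	})
}

// demoHealth wires two made-up dependencies, one of them failing.
func demoHealth() Health {
	return Health{
		"llm":    func(context.Context) error { return nil },
		"memory": func(context.Context) error { return errors.New("database locked") },
	}
}

// probe calls the handler in-process and returns the status code and body.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(demoHealth())
	fmt.Println(code)
	fmt.Print(body)
}
```

In a running service the handler would be mounted with something like `http.Handle("/healthz", health)`, and each check would typically get its own short timeout.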

Phase 4

Ecosystem

Build the surrounding platform without bloating the core runtime.

Browser Automation (CDP Native)

  • Implement a pure-Go CDP WebSocket client.
  • Support navigate, click, type, screenshot, and JavaScript evaluation.
  • Generate AI-oriented snapshots via accessibility tree extraction.
  • Manage tabs and profiles.
  • Optionally relay the user's live tabs through a browser extension.

Speech-to-Text

  • Add Whisper-compatible API support.
  • Add Groq Whisper for fast free-tier transcription.
  • Add Deepgram.
  • Reuse provider fallback chains for STT.
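
Reusing fallback chains for STT could look like the sketch below: try each provider in order and return the first successful transcript. The `Transcriber` interface and the stub providers are hypothetical; the provider names are labels only, and no real API is called.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Transcriber is a hypothetical speech-to-text provider interface.
type Transcriber interface {
	Name() string
	Transcribe(ctx context.Context, audio []byte) (string, error)
}

// TranscribeWithFallback tries each provider in order and returns the first
// successful transcript along with the name of the provider that produced it.
func TranscribeWithFallback(ctx context.Context, chain []Transcriber, audio []byte) (text, provider string, err error) {
	if len(chain) == 0 {
		return "", "", errors.New("stt: empty provider chain")
	}
	var errs []error
	for _, t := range chain {
		if ctxErr := ctx.Err(); ctxErr != nil {
			return "", "", ctxErr // stop early if the caller gave up
		}
		out, tErr := t.Transcribe(ctx, audio)
		if tErr == nil {
			return out, t.Name(), nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", t.Name(), tErr))
	}
	return "", "", errors.Join(errs...) // every provider failed
}

// stubSTT is a canned provider used only to exercise the chain.
type stubSTT struct {
	name string
	text string
	err  error
}

func (s stubSTT) Name() string { return s.name }

func (s stubSTT) Transcribe(context.Context, []byte) (string, error) {
	return s.text, s.err
}

// errRateLimited simulates a provider that rejects the request.
var errRateLimited = errors.New("rate limited")

// run executes a chain against a dummy audio buffer.
func run(chain []Transcriber) (string, string, error) {
	return TranscribeWithFallback(context.Background(), chain, []byte("audio"))
}

func main() {
	text, provider, err := run([]Transcriber{
		stubSTT{name: "groq-whisper", err: errRateLimited},
		stubSTT{name: "deepgram", text: "hello"},
	})
	fmt.Println(text, provider, err)
}
```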

WebAssembly Plugin Runtime

  • Integrate wazero as a pure-Go Wasm runtime.
  • Add WASI support for filesystem and network access.
  • Enable hot-loading tools without restarting agents.
  • Enforce per-plugin memory and CPU limits.

Streaming Voice Pipeline

  • Build a goroutine-based STT to LLM to TTS pipeline.
  • Support WebSocket audio streaming for web clients.
  • Add voice activity detection for natural turn-taking.
  • Integrate telephony via SIP or WebRTC.
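
The goroutine-based pipeline above maps naturally onto channel stages, where each stage runs in its own goroutine and passes results downstream. The sketch below uses string placeholders instead of audio and stub functions in place of real STT, LLM, and TTS calls; all names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// source emits the given items on a channel until done or cancelled.
func source(ctx context.Context, items []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, it := range items {
			select {
			case out <- it:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// stage applies f to every value from in, in its own goroutine.
func stage(ctx context.Context, in <-chan string, f func(string) string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for v := range in {
			select {
			case out <- f(v):
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// Stub stages standing in for real speech-to-text, model, and speech synthesis calls.
func stubSTT(chunk string) string { return "text(" + chunk + ")" }
func stubLLM(text string) string  { return "reply(" + text + ")" }
func stubTTS(reply string) string { return "audio(" + reply + ")" }

// RunPipeline streams chunks through STT, LLM, and TTS and collects the output.
func RunPipeline(chunks []string) []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // releases any stage still blocked if we return early

	text := stage(ctx, source(ctx, chunks), stubSTT)
	reply := stage(ctx, text, stubLLM)
	audio := stage(ctx, reply, stubTTS)

	var out []string
	for a := range audio {
		out = append(out, a)
	}
	return out
}

func main() {
	for _, a := range RunPipeline([]string{"chunk1", "chunk2"}) {
		fmt.Println(a)
	}
}
```

Because the channels are unbuffered, a slow stage applies backpressure upstream, and later chunks can be transcribed while earlier ones are still being synthesized.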

Web Dashboard

  • Use templ and htmx for a server-rendered admin panel.
  • Expose agent configuration and status monitoring.
  • Add a session browser and message history.
  • Show tool execution logs and metrics.
  • Manage channel connection state.

Terminal UI

  • Build a bubbletea-based TUI.
  • Visualize live tool calls.
  • Support multi-agent session switching.
  • Render inline images in compatible terminals.

Additional Channels

  • Add Matrix, Microsoft Teams, Google Chat, Nostr, IRC, Mattermost, and Twitch.
  • Provide a generic webhook channel for arbitrary HTTP POST inbound messages.

Embedded Mode

  • Expose a clean public API surface such as AetherClaw.New(), agent.Chat(), and tools.Register().
  • Remove global state and rely on injectable dependencies.
  • Ship embedding examples for Go web servers and CLI tools.
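
As a purely illustrative sketch of "no global state, injectable dependencies", the code below builds two independent runtimes in one process, each with its own model and tool registry. Every name here (`Runtime`, `New`, `WithLLM`, `RegisterTool`, `Chat`) is hypothetical and only loosely modeled on the API surface mentioned above; the eventual library may look quite different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// LLM is the model dependency injected into a runtime.
type LLM interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// Tool is a callable the runtime can expose to an agent.
type Tool func(ctx context.Context, input string) (string, error)

// Runtime holds all state per instance; nothing lives in package globals.
type Runtime struct {
	llm   LLM
	tools map[string]Tool
}

// Option configures a Runtime at construction time.
type Option func(*Runtime)

// WithLLM injects the model implementation.
func WithLLM(l LLM) Option { return func(r *Runtime) { r.llm = l } }

// New builds a runtime from options and fails if no model was injected.
func New(opts ...Option) (*Runtime, error) {
	r := &Runtime{tools: map[string]Tool{}}
	for _, opt := range opts {
		opt(r)
	}
	if r.llm == nil {
		return nil, errors.New("no LLM configured")
	}
	return r, nil
}

// RegisterTool adds a tool to this instance's registry only.
func (r *Runtime) RegisterTool(name string, t Tool) error {
	if _, exists := r.tools[name]; exists {
		return fmt.Errorf("tool %q already registered", name)
	}
	r.tools[name] = t
	return nil
}

// Chat forwards the message to the injected model.
func (r *Runtime) Chat(ctx context.Context, msg string) (string, error) {
	return r.llm.Complete(ctx, msg)
}

// echoLLM is a stand-in model that prefixes the prompt.
type echoLLM struct{ prefix string }

func (e echoLLM) Complete(_ context.Context, prompt string) (string, error) {
	return e.prefix + prompt, nil
}

// demoEmbed shows two runtimes coexisting without sharing state.
func demoEmbed() (string, error) {
	a, err := New(WithLLM(echoLLM{prefix: "A:"}))
	if err != nil {
		return "", err
	}
	b, err := New(WithLLM(echoLLM{prefix: "B:"}))
	if err != nil {
		return "", err
	}
	ping := func(context.Context, string) (string, error) { return "pong", nil }
	if err := a.RegisterTool("ping", ping); err != nil {
		return "", err
	}
	ctx := context.Background()
	ra, _ := a.Chat(ctx, "hi")
	rb, _ := b.Chat(ctx, "hi")
	return fmt.Sprintf("%s|%s|%d|%d", ra, rb, len(a.tools), len(b.tools)), nil
}

func main() {
	fmt.Println(demoEmbed())
}
```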

Why AetherClaw

What the project is trying to preserve as it grows.

The roadmap closes with a concrete claim: the runtime should stay materially lighter and easier to deploy than typical agent stacks.

| Dimension | Typical AI Agent (Node.js) | AetherClaw |
| --- | --- | --- |
| Binary | node_modules, 500 MB+ | Single file, ~30 MB |
| RAM (idle) | ~150 MB+ | ~20 MB |
| Startup | 3-5 s | <100 ms |
| Deploy | npm install, configure, PM2 | scp + run |
| Platforms | Linux, macOS (x86/ARM) | Linux, macOS, Windows, ARM, RISC-V |
| Edge / Embedded | Not practical | Raspberry Pi Zero, routers, NAS |
| Extensions | npm packages (JS/TS only) | MCP in any language |
| Embedding | Not embeddable | Import as a Go library |