
Building ReachyArchi: A Voice-Driven Robotic AWS Solutions Architect

How we combined a Reachy humanoid robot with Amazon Bedrock Nova Sonic to create an AI-powered Solutions Architect for AWS Summits

Alexandre Agius

AWS Solutions Architect


The Problem

At AWS Summits, 10,000+ attendees compete for time with only 12 “Ask an Architect” booths. Wait times of 30+ minutes are common, and many attendees leave without getting architecture guidance.

What if we could scale the Solutions Architect experience using AI and robotics?

The Solution

ReachyArchi is an AI-powered robotic Solutions Architect that combines:

  • A Reachy Mini humanoid robot for physical presence, gestures, and audio I/O
  • Amazon Bedrock Nova Sonic for bidirectional speech-to-speech conversation
  • Amazon Bedrock AgentCore Runtime hosting the agent and its tools
  • A React companion app updated in real time via AWS AppSync

The result: instant, personalized AWS architecture consultations with an engaging robotic presence.

Architecture Overview

Why Single BidiAgent?

After evaluating multiple agentic patterns, we chose Single BidiAgent:

| Pattern | Verdict | Rationale |
| --- | --- | --- |
| Single BidiAgent | Chosen | Best voice performance, no handoff latency |
| Graph | Rejected | Overkill for a mostly linear conversation flow |
| Swarm | Rejected | No parallel independent agents needed |
| Hierarchy | Rejected | A single agent handles all phases efficiently |

The bidirectional streaming model is essential for voice interactions: it allows continuous audio input and output while the agent reasons and calls tools concurrently.
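The concurrency this enables can be sketched with plain asyncio. Everything below is a stand-in for illustration; the real BidiAgent manages the Nova Sonic stream internally:

```python
import asyncio

async def pump_microphone(audio_in: asyncio.Queue) -> None:
    # Stand-in for microphone capture: stream chunks toward the agent
    for chunk in (b"bonjour", b"reachy"):
        await audio_in.put(chunk)
    await audio_in.put(None)  # end of utterance

async def agent_loop(audio_in: asyncio.Queue, audio_out: asyncio.Queue) -> None:
    # Stand-in for the agent: consume audio while emitting responses
    while (chunk := await audio_in.get()) is not None:
        await audio_out.put(b"ack:" + chunk)
    await audio_out.put(None)  # end of response

async def main() -> list[bytes]:
    audio_in: asyncio.Queue = asyncio.Queue()
    audio_out: asyncio.Queue = asyncio.Queue()
    results: list[bytes] = []

    async def drain() -> None:
        # Playback side: runs concurrently with capture and reasoning
        while (out := await audio_out.get()) is not None:
            results.append(out)

    await asyncio.gather(pump_microphone(audio_in),
                         agent_loop(audio_in, audio_out),
                         drain())
    return results

print(asyncio.run(main()))  # → [b'ack:bonjour', b'ack:reachy']
```

The key property is that no coroutine blocks the others, which is what lets the robot listen, think, and move at the same time.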

The Integration Challenge

Tools run on Amazon Bedrock AgentCore Runtime, but the Reachy SDK runs locally on the robot. The solution: WebSocket Command Events.

┌─────────────────────┐     WebSocket      ┌─────────────────────┐
│  AgentCore (AWS)    │◄──────────────────►│  Reachy Mini        │
│  - BidiAgent        │   robot_command    │  - SDK Control      │
│  - Robot Tools      │   {action, params} │  - Audio I/O        │
│  - Arch Tools       │                    │  - Motor Execution  │
└─────────────────────┘                    └─────────────────────┘

Tools send JSON command events; the client translates them to SDK calls.
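A minimal sketch of the client-side dispatcher, with hypothetical handlers standing in for the actual Reachy SDK calls:

```python
import json
from typing import Callable

# Registry mapping command actions to local handlers.
# In production each handler wraps a Reachy SDK call.
HANDLERS: dict[str, Callable[[dict], str]] = {
    "animation": lambda p: f"play animation {p['name']}",
    "look": lambda p: f"look {p['direction']}",
}

def handle_event(raw: str) -> str:
    """Decode one WebSocket message and dispatch it to the matching handler."""
    event = json.loads(raw)
    if event.get("type") != "robot_command":
        return "ignored"
    action = event["action"]
    if action not in HANDLERS:
        return f"unknown action: {action}"
    return HANDLERS[action](event.get("params", {}))

msg = '{"type": "robot_command", "action": "animation", "params": {"name": "nod_yes"}}'
print(handle_event(msg))  # → play animation nod_yes
```

Unknown actions degrade gracefully instead of crashing the client, which matters when cloud and robot code are deployed independently.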

Implementation Deep Dive

Robot Tools: Cloud-to-Hardware Bridge

Each tool is an async function that sends commands via WebSocket - never importing the Reachy SDK directly:

```python
@tool
async def nod_yes() -> str:
    """Nod to show agreement or understanding."""
    await _send({
        "type": "robot_command",
        "action": "animation",
        "params": {"name": "nod_yes"}
    })
    return "Nodded yes"
```
    return "Nodded yes"

This pattern cleanly separates cloud reasoning from local hardware control.

System Prompt Engineering

The system prompt ensures ReachyArchi is expressive, not just a voice assistant:

```
CRITICAL: TOOL CALLING
ALWAYS call robot tools - NEVER just say their names. Every response needs movement!
- WRONG: Saying "wave_hello" or "I'm nodding"
- CORRECT: Actually invoking wave_hello() as a function call

MOVEMENT RULES - EVERY RESPONSE!
Call at least one robot tool per response to feel alive.
```

Barge-In Handling

Voice UX requires handling interruptions gracefully:

```python
model = BidiNovaSonicModel(
    provider_config={
        "turn_detection": {
            "endpointingSensitivity": "HIGH"  # Fast barge-in for booth demos
        }
    }
)
```

When a user speaks mid-response, the client clears the audio buffer immediately and processes new input.

6-Phase Conversation Flow

ReachyArchi follows a state machine for structured interactions:

```
IDLE → GREETING → INCEPTION → DESIGN → ITERATION → DELIVERY → FAREWELL
                     ↑           │
                     └───────────┘ (needs_more_info)
```

  • GREETING: Wave hello, introduce self in French
  • INCEPTION: Ask 1-2 targeted questions (tilt head with look_curious())
  • DESIGN: Generate PNG + JSON diagrams in parallel
  • ITERATION: Refine based on feedback
  • DELIVERY: Generate QR code for companion app
  • FAREWELL: Wave goodbye, reset session
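The phases above can be encoded as a small transition table with a single backward edge. The phase names come from the diagram; the guard mechanism is a sketch:

```python
# Forward edge for each phase; FAREWELL loops back to IDLE on session reset
TRANSITIONS = {
    "IDLE": "GREETING",
    "GREETING": "INCEPTION",
    "INCEPTION": "DESIGN",
    "DESIGN": "ITERATION",
    "ITERATION": "DELIVERY",
    "DELIVERY": "FAREWELL",
    "FAREWELL": "IDLE",
}

def advance(phase: str, needs_more_info: bool = False) -> str:
    """Move to the next phase; loop DESIGN back to INCEPTION when guarded."""
    if phase == "DESIGN" and needs_more_info:
        return "INCEPTION"
    return TRANSITIONS[phase]

print(advance("DESIGN"))                        # → ITERATION
print(advance("DESIGN", needs_more_info=True))  # → INCEPTION
```

A table-driven machine keeps the conversation flow auditable: the prompt describes the phases, and the code enforces them.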

Demo in Action

Watch ReachyArchi in action: Demo Video on YouTube

A 90-second interaction showcases the full experience:

[User] "Bonjour Reachy!"
[Reachy] *waves and responds in French*

[User] "I want to build a mobile app with a REST API and a database"
[Reachy] "What type of workload? Serverless or containers?"

[User] "Serverless, high traffic"
[Reachy] "I recommend [API Gateway](https://aws.amazon.com/api-gateway/), [Lambda](https://aws.amazon.com/lambda/), and [DynamoDB](https://aws.amazon.com/dynamodb/). Generate the diagram?"

[User] "Oui!"
[Reachy] *generates architecture diagram - React frontend updates live*

[User] "Reachy, tu connais Werner Vogels?"
[Reachy] *dances* "Everything fails, all the time!"

The companion app updates in real-time via AWS AppSync as diagrams are generated.

Key Takeaways

  1. WebSocket command events: Cleanly separate cloud AI from local hardware. Tools send JSON, clients execute - no SDK imports in cloud code.

  2. Explicit tool invocation prompts: LLMs may “describe” tool calls instead of executing them. Be explicit: “ALWAYS call tools, NEVER just say their names.”

  3. HIGH barge-in sensitivity: Essential for natural booth conversations. Users will interrupt - handle it gracefully.

Try It Yourself

The project is open source! Check out the code and try it yourself:

Tech stack:

See the Strands Agents documentation to build your own voice-driven agent.

What’s Next

  • Multi-language support (French/English/German)
  • Human SA escalation notification system
  • Load testing for AWS Summit booth capacity

Target Metrics:

| Metric | Target |
| --- | --- |
| Interactions per summit | 500+ |
| Wait time reduction | 30 min → <2 min |
| Customer satisfaction | 4.0+/5.0 |

Have questions? Connect with me on LinkedIn or check out more posts on agiusalexandre.com.

Alexandre Agius

AWS Solutions Architect

Passionate about AI & Security. Building scalable cloud solutions and helping organizations leverage AWS services to innovate faster. Specialized in Generative AI, serverless architectures, and security best practices.