AWS Architecture

Patterns, services, and architectural decisions for building on AWS — including Bedrock, SageMaker, Transfer Family, and more.

49 articles in this series
AI/ML

SCOT vs Chronos: Two Philosophies of Forecasting at Amazon

Amazon built two radically different approaches to predicting the future — a proprietary supply chain optimization pipeline (SCOT) and an open-source time series foundation model (Chronos). This post compares their architectures, trade-offs, and when each philosophy applies.

·7 MIN READ
AI Engineering

AWS DevOps Agent: Build vs Buy for Enterprise AIOps

AWS DevOps Agent is GA and included with Support plans. But it doesn't replace your custom agents -- it complements them. Here's the hybrid pattern: what to buy, what to build, and how MCP bridges the gap.

·13 MIN READ
AI Engineering

The Agent Memory Problem: Why 5+ Solutions Exist and None Won

Mem0, Letta, Zep, graph-RAG, Neptune Memory, HiveMemory, Obsidian steering files -- the agent memory space is fragmenting faster than it's converging. Here's a landscape analysis of why no single solution wins, the four types of memory agents actually need, and a decision framework for choosing your architecture.

·10 MIN READ
AI Engineering

MCP Gateway as Policy Enforcement Point: RBAC for Your Agent's Tool Access

Your AI agent has access to tools that perform real actions -- approving expenses, querying databases, modifying infrastructure. Prompt-based guardrails don't survive adversarial inputs. Here's how AgentCore Gateway + Cedar policies create a deterministic enforcement layer that operates independently of the agent's reasoning.

·9 MIN READ
AI Engineering

AWS Agent Toolkit GA: How I Gave an Agent 15,000 AWS APIs Without Losing Sleep

AWS released the Agent Toolkit for AWS on May 6, 2026 -- a managed MCP server exposing the full AWS API surface to autonomous agents. I shipped an infrastructure agent the same week. Here's the two-phase safety pattern that lets you hand an agent the keys to your account without waking up to a $10K bill.

·9 MIN READ
AI Engineering

When Your AI Agent Runs Away: 204 PRs, $900 Wasted, and the 3-Layer Fix

I woke up to 204 pull requests from a single autonomous agent running overnight. 12 hours, ~$900 in Bedrock tokens, 509 failed builds, zero features shipped. Prompt-only safeguards all failed. Here's the 3-layer fix — hard kill switch, atomic circuit breakers, drift observability — that now prevents runaway agents.

·13 MIN READ
AI

Vector Search vs Semantic Search: They're Not the Same Thing

Vector search, semantic search, keyword search, hybrid search — these terms get used interchangeably but they mean different things. This post breaks down what each actually does, when each matters, and why hybrid search wins for RAG.

·12 MIN READ
Cloud

When Your Keys Get Locked In: Navigating AWS KMS Import Limitations

AWS KMS doesn't allow key material export by design. When an external PKI partner generates keys but doesn't retain them, you're stuck. Here are the four AWS alternatives — CloudHSM, XKS, Private CA, and fixing the process — with a decision framework to pick the right one.

·14 MIN READ
AI

LLM Architecture Explained Simply: 10 Questions From Prompt to Token

A beginner-friendly walkthrough of how an LLM actually works end-to-end: from typing a prompt to receiving a response — covering tokenization, embeddings, Transformer layers, KV cache, the training loop, embeddings for search, and why decoder-only models won.

·17 MIN READ
AI

Transformer Anatomy: Attention + FFN Demystified

A deep dive into the Transformer architecture — how attention connects tokens and why the Feed-Forward Network is the real brain of the model. Plus the key to understanding Mixture of Experts (MoE).

·15 MIN READ
AI

RAG on AWS: Which Vector Store Is Right for You?

AWS now offers 9 different ways to store and search vectors for RAG workloads. This guide compares every option through the Well-Architected Framework to help you pick the right one.

·22 MIN READ
Development

AWS Backup Cost Analysis

EBS snapshot costs were growing month-over-month with no clear explanation or optimization strategy.

·4 MIN READ
