AI Engineering

The Agent Memory Stack: Shipping Parallel Projects with Kiro CLI and Obsidian

AI agents forget everything between sessions. Your calendar lies. Your brain doesn't scale past a handful of active projects. Here's the three-layer memory stack I built with Kiro CLI + Obsidian to ship a dozen-plus projects in parallel without cross-contamination.

Alexandre Agius

AWS Solutions Architect

8 min read

I’m an AWS engineer running 15+ projects in parallel. Not sequentially. Not “context-switching between two things.” A quick count of my Obsidian vault right now: over a dozen active project folders. Different customers, different codebases, different deadlines, different states of done. The real bottleneck stopped being “can the AI write code” months ago. The bottleneck is context. Chat windows forget everything the moment you close them. Your calendar says you worked on Project X last Thursday but tells you nothing about where you left off. Your brain doesn’t scale to that many active working sets. I needed something durable — a system where the agent could pick up any project cold and be productive in under a minute.

Agents Are Amnesiac by Design

Every AI agent session starts at zero. Blank slate. No memory of yesterday. You had a brilliant 3-hour session on Monday — architecture decisions, tricky debugging, a plan for the next sprint — and on Tuesday morning the agent greets you like a stranger. This is not a bug in the product. It’s the architecture. LLMs are stateless functions. Session state is ephemeral by design.

Here’s what makes this worse than working with a human junior engineer: a junior at least remembers last week. They remember the decision you made together about the database schema. They remember that the customer changed their mind about the API contract. They carry implicit context in their head, imperfect but present. An agent carries nothing. Every session is a cold start, and cold starts are expensive — not in tokens, but in the 10-15 minutes you spend re-explaining what the project is, what you decided, what’s blocked, and what “done” looks like.

I stopped accepting that tax. If the agent can’t remember, I’ll give it a memory system it can read on boot.

Obsidian Is the Durable Layer

Obsidian is a local markdown editor. That’s it. No cloud sync required, no proprietary format, no database. Flat files on disk. Grep-friendly. Git-friendly. And critically: agent-readable. An AI agent can ingest a markdown file in milliseconds and extract structured context from it with zero ambiguity.

I keep one folder per project under a standard scaffold:

5 - Projects/<slug>/
├── README.md
├── TODO.md
├── PLAN.md
├── LOG.md
├── progress.md
├── session-context.md
├── decisions/
├── deliverables/
└── specs/
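
A minimal sketch of how a scaffold like this could be generated, assuming Python and a vault path passed in by the caller. The function name and the one-line starter heading written into each file are hypothetical placeholders; only the file and folder names come from the tree above.

```python
from pathlib import Path

# File and folder names are taken from the scaffold tree above.
SCAFFOLD_FILES = [
    "README.md", "TODO.md", "PLAN.md", "LOG.md",
    "progress.md", "session-context.md",
]
SCAFFOLD_DIRS = ["decisions", "deliverables", "specs"]


def create_scaffold(vault: Path, slug: str) -> Path:
    """Create the per-project folder without overwriting existing notes."""
    project = vault / "5 - Projects" / slug
    project.mkdir(parents=True, exist_ok=True)
    for name in SCAFFOLD_DIRS:
        (project / name).mkdir(exist_ok=True)
    for name in SCAFFOLD_FILES:
        path = project / name
        if not path.exists():
            # Hypothetical starter content: a heading naming the file's role.
            path.write_text(f"# {slug}: {path.stem}\n", encoding="utf-8")
    return project
```

Calling `create_scaffold(Path("/path/to/vault"), "my-project")` a second time leaves existing notes untouched, so it is safe to re-run on a project that already has content.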

The scaffold IS the memory. README.md is the project’s identity — what it is, who it’s for, current status. TODO.md is the current action list. PLAN.md is phases and milestones. LOG.md is a chronological event stream. And session-context.md is the critical one — it captures non-obvious decisions, traps, correctly-spelled names, things that would bite a new reader:

## Non-obvious facts
- The budget figure in old drafts is outdated — current number is smaller
- The approval process is lighter than assumed — verified
- A key API endpoint has a non-obvious URL format — verified
- Stakeholder X prefers async updates over meetings

When the agent opens a project, it reads session-context.md first. Cold start drops from ~15 minutes of re-onboarding to under a minute of loading files.
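
The cold-start read can be sketched as a small loader, assuming Python. Only "session-context.md first" comes from the text above; the order of the remaining files, the section separator, and the function name are assumptions for illustration, and how the resulting string reaches the agent depends on the tool.

```python
from pathlib import Path

# session-context.md comes first, as described above; the order of the
# remaining files is an assumption for illustration.
READ_ORDER = ["session-context.md", "README.md", "TODO.md", "PLAN.md", "LOG.md"]


def load_project_context(project_dir: Path) -> str:
    """Concatenate the project's memory files into one context string."""
    sections = []
    for name in READ_ORDER:
        path = project_dir / name
        if not path.is_file():
            continue  # a missing file is skipped rather than treated as an error
        text = path.read_text(encoding="utf-8")
        sections.append(f"===== {name} =====\n{text}")
    return "\n\n".join(sections)
```

Labelling each section with its filename keeps the provenance of every fact visible, which helps when a stale note needs to be traced back and fixed.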

The Three-Layer Architecture

The system composes three layers, and the analogy to a Unix system is deliberate.

Layer 1 — Steering (~/.kiro/steering/*.md). These files load into every agent session regardless of which project I’m working on. They contain cross-project conventions: how to structure Obsidian notes, production safety rules for AWS credentials, accumulated session learnings (patterns that worked, anti-patterns to avoid), design system references. This is the /etc/ of the agent — global configuration that shapes behavior everywhere.

Layer 2 — Per-project memory (Obsidian vault). Each project’s 5 - Projects/<slug>/ folder is its durable state. README, TODO, PLAN, LOG, session-context, decisions, deliverables. The agent reads these at session start to reconstruct the full project context. This is the agent’s /var/lib/: persistent application state that survives reboots. If I delete the session, the project memory remains intact.

Layer 3 — Session state (~/.omkc/sessions/). Hooks write a JSON record of what the agent did in each session — which project, what was accomplished, where it stopped. This is the write-ahead log. If a session gets interrupted or the context window fills up and compacts, the agent can read the last session record and resume without asking me “where were we?” It enables crash recovery.

The three layers are independent. I can blow away Layer 3 without losing Layer 2. I can work on a new machine with only Layer 1 and bootstrap Layer 2 from a git clone. Each layer has a different durability guarantee, just like a real storage system.
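
A rough sketch of what a Layer 3 record and its recovery read could look like, assuming Python. The JSON field names, the filename scheme, and both function names are hypothetical; the text above only says the record covers which project, what was accomplished, and where the session stopped.

```python
import json
import time
from pathlib import Path
from typing import Optional


def write_session_record(sessions_dir: Path, project: str,
                         accomplished: list[str], stopped_at: str) -> Path:
    """Write one JSON record per session (field names are hypothetical)."""
    sessions_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "project": project,
        "accomplished": accomplished,
        "stopped_at": stopped_at,
    }
    # A nanosecond timestamp as the filename keeps records sortable by write order.
    path = sessions_dir / f"{time.time_ns()}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def latest_session_record(sessions_dir: Path, project: str) -> Optional[dict]:
    """Return the most recent record for a project, or None if there is none."""
    for path in sorted(sessions_dir.glob("*.json"), reverse=True):
        record = json.loads(path.read_text(encoding="utf-8"))
        if record.get("project") == project:
            return record
    return None
```

Because each record is its own file, deleting the whole directory loses only resume points, which matches the claim that Layer 3 can be blown away without touching Layer 2.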

One Day, Three Contexts

A typical day: morning, I’m deep in a large industrial customer’s proof-of-concept. Multi-service integration, a tight deadline, opinionated stakeholders. The agent loads that project’s session-context.md, sees the current blockers in TODO.md, reads the latest decision record, and we’re productive immediately. No “remind me what this project is about.”

Midday, I switch to a bootstrapped product I’m launching on the side. Completely different domain — SaaS, Stripe integration, landing page copy, user onboarding flows. The agent loads a different session-context.md. Zero bleed from the morning’s enterprise architecture concerns. The product’s TODO says “wire up webhook handler,” and we start there.

Evening, I’m drafting this very blog post. Different project folder, different context, different voice. The agent knows this is a personal blog, knows the tone I write in (because the steering layer has style conventions), and produces a first draft that sounds like me rather than a corporate press release.

Three contexts. One agent. No cross-contamination. The mechanism is simple: each project has its own state directory, and the agent reads the right one. It’s the same principle as process isolation — separate address spaces prevent corruption.

Parallel Subagents Are Not Optional

A single-threaded agent is fine for writing a function. It’s inadequate for real project work where you need research, drafting, and review happening concurrently. Kiro CLI’s orchestrator fans out work to specialist subagents: an explorer for codebase investigation, a librarian for documentation lookup, a reviewer for red-teaming drafts, a document-writer for producing artifacts.

When I need three engineering specs drafted, they go out in parallel, not sequentially. When I need a draft reviewed for factual accuracy AND stylistic consistency, two reviewers run simultaneously. In my experience, that cuts wall-clock time by roughly 3-5x on a typical research-draft-review cycle.
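
The fan-out idea, independent of any particular tool, can be sketched with Python's standard thread pool. This is not Kiro CLI's orchestrator API: `draft_spec` is a hypothetical stand-in for a subagent call, and the sleep only simulates the time a real call would spend waiting on a model.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def draft_spec(topic: str) -> str:
    """Hypothetical stand-in for a subagent call that blocks on model I/O."""
    time.sleep(0.2)  # simulated latency
    return f"spec for {topic}"


def fan_out(topics: list[str]) -> list[str]:
    """Run one task per topic concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=len(topics) or 1) as pool:
        return list(pool.map(draft_spec, topics))
```

With three independent tasks of similar length, the batch finishes in about the time of one task. Real gains depend on how many steps can actually overlap, since a review cannot start before the draft it reviews exists.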

This post itself was produced that way. The orchestrator delegated exploration of my existing notes, then writing, then review — overlapping where dependencies allowed. I’m not manually copy-pasting between chat windows. The system handles fan-out and collation.

If you’re using an AI agent and doing everything sequentially in one thread, you’re leaving most of the throughput on the table. Parallelism isn’t a luxury. It’s how you make agent-assisted work actually faster than doing it yourself.

What Actually Bites

Brutal honesty about what breaks:

session-context.md rots. If you don’t update it at the end of a session, the next session starts with stale context. I enforce an end-of-session audit checklist: are TODO and PLAN aligned with the latest decisions? Are there commitments that exist only in conversation memory? Is session-context current? Skip this and you pay double next time.
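
Part of that audit can be automated with a crude proxy, sketched here in Python: flag the audited files that were not modified during the session. The function name and the choice of modification time as the signal are assumptions, and a touched file is not proof that its content is aligned, so this only narrows down what to check by hand.

```python
from pathlib import Path

# Files the end-of-session checklist above asks about.
AUDITED_FILES = ["TODO.md", "PLAN.md", "session-context.md"]


def untouched_since(project_dir: Path, session_start: float) -> list[str]:
    """Return audited files that are missing or unmodified since session_start.

    session_start is a Unix timestamp, e.g. time.time() captured at boot.
    """
    stale = []
    for name in AUDITED_FILES:
        path = project_dir / name
        if not path.is_file() or path.stat().st_mtime < session_start:
            stale.append(name)
    return stale
```

An empty list means every audited file was written to at least once during the session; anything returned is a prompt to review that file before closing.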

Subagents hallucinate. Names, version numbers, internal processes — they’ll assert confidently and be wrong. I learned this the hard way when a subagent claimed a publication process required VP approval. It didn’t. Now the rule is: any factual claim about processes, people, or versions gets verified by a librarian subagent before I act on it. Trust but verify, emphasis on verify.

Red-teamers can themselves be wrong. I run a reviewer subagent that critiques drafts. It once flagged a correctly-spelled name as a typo. If I’d auto-applied the fix, I’d have introduced the error. Rule: factual findings from reviewers get independently verified. Style findings can be applied directly.
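
That rule amounts to a two-way split of reviewer output, sketched below in Python. The `Finding` type and its "style" / "factual" labels are hypothetical; how a reviewer tags its findings depends on the prompt and the tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    kind: str  # "style" or "factual" (hypothetical labels)
    note: str


def triage(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (apply_directly, needs_verification)."""
    apply_directly = [f for f in findings if f.kind == "style"]
    # Anything not explicitly tagged as style is held for verification.
    needs_verification = [f for f in findings if f.kind != "style"]
    return apply_directly, needs_verification
```

Defaulting unknown kinds to the verification pile is deliberate in this sketch: a mislabelled finding costs one extra check instead of a silently introduced error.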

TODO.md and PLAN.md rot during pivots. A project changes direction — new scope, new audience, new timeline — and the planning artifacts still reflect the old strategy. Rule: update them in the same session as the pivot. Never leave stale plans for your future self to trip over.

The Honest Recommendation

If you’re solo-managing 5+ parallel projects and living entirely in chat windows, you’re running a distributed system with no persistent storage. Every session is a cache miss. Every Monday morning is a full rebuild from memory.

One afternoon of scaffolding fixes this. Create the folder structure. Write the steering files. Set up the session hooks. The pattern — call it an “agent memory stack” — pays back within a week. By my rough estimate, I went from spending a meaningful chunk of each session on context reconstruction to spending almost none.

The tool name is interchangeable. Replace Kiro CLI with Claude Code, Cursor, Cline, or whatever ships next quarter. The pattern holds because it’s not about the agent — it’s about giving any stateless system a durable memory layer. Same reason databases need a WAL regardless of which query engine sits on top.

Fork the scaffold. Adapt the steering rules to your conventions. The agent doesn’t care about your folder names. It cares that the information exists, is structured, and is readable on cold start. Give it that, and it stops being amnesiac.

Alexandre Agius


Passionate about AI & Security. Building scalable cloud solutions and helping organizations leverage AWS services to innovate faster. Specialized in Generative AI, serverless architectures, and security best practices.
