The RISEN Framework: Writing System Prompts That Actually Work for AI Agents
A 5-component framework for writing effective system prompts for any AI agent — Bedrock Agents, Claude Code, LangChain, Strands, or custom builds. With a practical Claude Code implementation.
Most system prompts for AI agents are written as unstructured wish lists. RISEN is a 5-component framework that fixes that — for any agent platform, from Bedrock Agents to Claude Code to custom builds.
The Problem
Every AI agent needs a system prompt. Whether you’re building a Bedrock Agent, configuring Claude Code’s CLAUDE.md, writing a Strands agent system prompt, or setting up a LangChain chain — somewhere, there’s a block of text telling the AI who it is and what to do.
The problem is that most system prompts are written like stream-of-consciousness notes. A mix of preferences, scattered rules, half-explained workflows, and no clear boundaries. They read like this:
You are a helpful assistant. Be concise. Use Python best practices.
Don't break things. Follow the coding standards. Make sure tests pass.
If you're unsure, ask. Be careful with infrastructure changes.
Vague. Unstructured. Incomplete. The AI follows some rules, ignores others, and fills the gaps with assumptions. You end up correcting it constantly, which defeats the purpose of having a system prompt in the first place.
What’s missing is a structure — a checklist that ensures every critical dimension is covered regardless of which agent framework you’re using.
The Solution
RISEN is a prompt engineering framework for structuring AI agent system prompts. It breaks down into 5 components, each addressing a different failure mode:
| Letter | Component | Question it answers | Failure when missing |
|---|---|---|---|
| R | Role | Who is the AI in this context? | Generic, unfocused behavior |
| I | Instructions | What rules must it follow? | Inconsistent outputs |
| S | Steps | What are the workflows? | Improvised, unreliable processes |
| E | End Goal | What does “done” look like? | Vague, incomplete deliverables |
| N | Narrowing | What must it NOT do? | Scope creep, dangerous actions |
The framework is platform-agnostic. It works for:
- Bedrock Agents — the system prompt field in agent configuration
- Claude Code — the CLAUDE.md steering document
- Strands Agents — the system_prompt parameter
- LangChain / LangGraph — the system message in chain definitions
- OpenAI Assistants — the instructions field
- Custom agents — any system prompt, anywhere
The principle is the same everywhere: if any of the 5 components is missing from your system prompt, the agent has a blind spot.
How It Works
R — Role: Define the AI’s Identity
The Role tells the agent who it is. Without it, you get a generic assistant that tries to be everything. With it, you get a specialist that makes domain-appropriate decisions.
Bedrock Agent:
You are a customer support agent for AcmeCorp. You specialize in
order tracking, returns, and product troubleshooting for industrial
sensors. You have access to the order database and shipping API.
Claude Code (CLAUDE.md):
# Role
You are a Python backend engineer working on a FastAPI microservice
that processes IoT sensor data from manufacturing equipment.
You understand DynamoDB single-table design and EventBridge patterns.
Strands Agent:
agent = Agent(
    system_prompt="""You are a security compliance auditor for AWS
    environments. You review IAM policies, security group rules,
    and encryption configurations against CIS benchmarks."""
)
Be specific about the domain, the tech stack, and the expertise level. “Python engineer” is too broad. “Python backend engineer working on IoT data pipelines with DynamoDB” gives the AI the right mental model for every decision it makes.
I — Instructions: Set the Rules
Instructions define the behavioral constraints — coding conventions, tool preferences, communication style, and hard rules that should never be violated.
# Instructions
- Write Python 3.12+ with type hints on all functions
- Use pydantic v2 for all data models
- Follow the existing project structure in src/
- All API endpoints must have OpenAPI docstrings
- Tests use pytest with moto for AWS mocking
- Format with black, lint with ruff
- Respond in French when the user writes in French
The key is specificity. “Follow best practices” is useless — it means different things to different models at different temperatures. “Use pydantic v2 for all data models” is verifiable.
For Bedrock Agents or Strands agents processing customer requests, Instructions might look different:
Instructions:
- Always verify the customer's order ID before looking up details
- Never disclose internal pricing or margin information
- Escalate to a human agent if the customer mentions legal action
- Use formal language, no slang or abbreviations
- Include the ticket reference number in every response
Same principle: concrete, verifiable rules that leave no room for interpretation.
S — Steps: Define the Workflows
Steps describe the multi-step processes the agent should follow. This is where most system prompts fail — they tell the AI what to do but not how to sequence it.
For a coding agent (Claude Code, Codex, etc.):
# Steps
## New feature
1. Read the relevant existing code in src/ first
2. Check if similar patterns exist in the codebase
3. Write the implementation following existing conventions
4. Add unit tests in tests/ mirroring the src/ structure
5. Run `make test` to validate
6. Update CHANGELOG.md if user-facing
For a customer support agent (Bedrock Agent, etc.):
Steps for handling a return request:
1. Verify the order ID and confirm the customer's identity
2. Check if the item is within the 30-day return window
3. Look up the product category for return eligibility
4. If eligible: generate a return shipping label via the Returns API
5. If not eligible: explain the policy and offer alternatives
6. Log the interaction in the CRM with the resolution code
For a data analysis agent (Strands, LangGraph, etc.):
Steps for generating a report:
1. Query the data warehouse for the requested time range
2. Validate that the dataset is complete (no missing dates)
3. Calculate the requested metrics with proper aggregation
4. Generate visualizations using the standard chart templates
5. Write a summary highlighting anomalies and trends
6. Format the output as a markdown report with embedded charts
Steps prevent the AI from skipping critical checks. A customer support agent without Steps might approve a return without checking eligibility. A coding agent without Steps might jump straight to implementation without reading existing code first.
E — End Goal: Define Success
The End Goal specifies what a good output looks like. Without it, the AI optimizes for completion rather than quality.
For a coding agent:
# End Goal
- All functions have type hints and docstrings
- No new dependencies without explicit approval
- API changes are backwards-compatible
- Response times under 200ms for DynamoDB queries
- Test coverage does not decrease
For a support agent:
End Goal:
- Customer's issue is resolved or properly escalated
- Resolution time under 5 minutes for standard requests
- Customer satisfaction prompt sent after resolution
- All required fields populated in the CRM record
For an analysis agent:
End Goal:
- Report covers the full requested time range with no gaps
- All metrics include year-over-year comparison
- Anomalies are flagged with confidence intervals
- Output format matches the executive summary template
The distinction between Instructions and End Goal is intent. Instructions say “how to work.” End Goal says “what the result should look like.” You can follow every instruction perfectly and still produce a poor result if the success criteria aren’t defined.
N — Narrowing: Draw the Boundaries
Narrowing is the most important section and the one people skip most often. It defines what the agent must NOT do. This is where you prevent the most expensive mistakes.
For a coding agent:
# Narrowing
- DO NOT modify infrastructure code in cdk/ without asking
- DO NOT add new pip dependencies without approval
- DO NOT touch the database schema — it is frozen
- Only target Python 3.12 — no 3.13 features
- Never use boto3 directly — use the wrapper in src/aws/clients.py
- Stay within eu-west-1 region constraints
For a support agent:
Narrowing:
- NEVER process refunds above $500 without human approval
- NEVER share customer data with third parties
- DO NOT make promises about delivery dates for out-of-stock items
- DO NOT access customer payment details — route to PCI-compliant flow
- Only support products from the current catalog — no discontinued items
For a data agent:
Narrowing:
- DO NOT query production databases directly — use the read replica
- DO NOT include PII in reports — aggregate or anonymize
- Limit queries to the last 24 months of data
- DO NOT extrapolate trends beyond the available data
Every “DO NOT” prevents a class of failures. Without Narrowing, a coding agent might refactor your infrastructure while fixing a bug. A support agent might promise a delivery date it can’t keep. A data agent might query production and degrade performance.
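Narrowing rules can also be enforced outside the prompt entirely. A minimal sketch of a code-level guardrail mirroring the $500 refund rule above — the function and limit are hypothetical, not a real support platform's API:

```python
REFUND_LIMIT = 500.00  # mirrors "NEVER process refunds above $500 without human approval"

def process_refund(amount: float, human_approved: bool = False) -> str:
    """Block refunds over the limit unless a human has approved."""
    if amount > REFUND_LIMIT and not human_approved:
        return "escalated_to_human"
    return "refund_processed"
```

Prompt-level Narrowing tells the model the rule; a check like this guarantees it even on the day the model ignores the prompt.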
Putting It All Together
A complete RISEN system prompt for a Bedrock Agent:
Role: You are a procurement assistant for AcmeCorp's manufacturing
division. You help engineers find, compare, and order industrial
components from approved suppliers.
Instructions:
- Always check the approved supplier list before quoting prices
- Use metric units (mm, kg, celsius) unless the user specifies imperial
- Include lead time in every quote
- Flag any component that requires export compliance review
Steps for processing a component request:
1. Identify the component type and specifications
2. Search the approved supplier catalog
3. Compare at least 3 options on price, lead time, and availability
4. Present a comparison table with recommendation
5. If the user approves, generate a purchase requisition
End Goal:
- Component recommendation includes datasheet links
- Price comparison is within 24 hours of current pricing
- Purchase requisition follows the standard template
- Total cost includes shipping and handling
Narrowing:
- DO NOT recommend suppliers not on the approved list
- DO NOT process orders above €10,000 without manager approval
- DO NOT share supplier pricing with external parties
- Only source components certified for EU industrial standards
And the same framework applied to a Claude Code CLAUDE.md:
# Role
Python data engineer working on an ETL pipeline using
AWS Glue and Step Functions.
# Instructions
- Python 3.12, black formatter, ruff linter
- All Glue jobs must use Glue 4.0 runtime
- Use aws-cdk-lib v2 for infrastructure
- Structured logging with aws-lambda-powertools
# Steps
## Adding a new ETL job
1. Create job script in src/glue_jobs/
2. Add CDK construct in infra/constructs/
3. Add integration test in tests/integration/
4. Update the Step Functions state machine
5. Run `cdk diff` and paste output for review
# End Goal
- All Glue jobs have retry logic and DLQ
- Step Functions have error handling on every state
- Tests mock AWS with moto, no real AWS calls in unit tests
- CDK synth produces valid CloudFormation
# Narrowing
- DO NOT modify shared VPC config in infra/network.py
- DO NOT add new IAM policies without asking
- Stay within Glue 4.0 capabilities
- Budget: G.1X workers, max 10 DPUs per job
- Only eu-west-1 — no cross-region resources
Same framework, same 5 sections, different platforms. The structure is universal.
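Because the five components are just labeled text sections, the prompt can also be assembled programmatically for any platform. A minimal sketch — the RisenPrompt class and its rendering layout are illustrative, not part of any framework:

```python
from dataclasses import dataclass

@dataclass
class RisenPrompt:
    """Illustrative container for the five RISEN components."""
    role: str
    instructions: list[str]
    steps: list[str]
    end_goal: list[str]
    narrowing: list[str]

    def render(self) -> str:
        """Render the components as a single system prompt string."""
        return "\n\n".join([
            f"Role: {self.role}",
            "Instructions:\n" + "\n".join(f"- {i}" for i in self.instructions),
            "Steps:\n" + "\n".join(f"{n}. {s}" for n, s in enumerate(self.steps, 1)),
            "End Goal:\n" + "\n".join(f"- {g}" for g in self.end_goal),
            "Narrowing:\n" + "\n".join(f"- {r}" for r in self.narrowing),
        ])

prompt = RisenPrompt(
    role="You are a customer support agent for AcmeCorp.",
    instructions=["Always verify the customer's order ID before looking up details"],
    steps=["Verify identity", "Check return window"],
    end_goal=["Customer's issue is resolved or properly escalated"],
    narrowing=["NEVER process refunds above $500 without human approval"],
).render()
```

The rendered string drops into a Bedrock Agent's system prompt field, a Strands system_prompt parameter, or a LangChain system message unchanged.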
Automating RISEN for Claude Code: The /risen-init Skill
For Claude Code specifically, writing a RISEN config from scratch for every project gets tedious. The /risen-init skill automates it with a 4-phase workflow:
Phase 1 — Detect: Scans the project directory for package.json, pyproject.toml, Cargo.toml, Makefile, .github/workflows/, cdk.json, and other configuration files. Identifies language, framework, test runner, linter, build tool, and AWS context automatically.
Phase 2 — Ask: Walks through each RISEN component with interactive prompts. Pre-fills suggestions based on what was detected. You accept, modify, or override each section.
Phase 3 — Generate: Writes a .claude/CLAUDE.md file using the collected answers, formatted in RISEN structure.
Phase 4 — Confirm: Shows the generated file, asks for adjustments, and saves.
Navigate to any project and run /risen-init. In under two minutes, you have a complete steering document that Claude Code loads automatically on every session.
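The Detect phase boils down to mapping marker files to tooling. A rough sketch of the idea in Python — the file-to-tool mapping is illustrative, not the skill's actual implementation:

```python
from pathlib import Path

# Marker files and what they imply about the project (illustrative mapping)
MARKERS = {
    "pyproject.toml": "python",
    "package.json": "node",
    "Cargo.toml": "rust",
    "cdk.json": "aws-cdk",
    "Makefile": "make",
}

def detect_stack(project_dir: str) -> list[str]:
    """Return the tools detected from marker files in the project root."""
    root = Path(project_dir)
    return [tool for marker, tool in MARKERS.items() if (root / marker).exists()]
```

Everything detected this way pre-fills the Instructions section; only the domain-specific parts still need a human answer.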
A meta-instruction in the global ~/.claude/CLAUDE.md ensures RISEN validation happens automatically:
## RISEN Framework for Project Configuration
When creating or updating any project .claude/CLAUDE.md,
validate against the RISEN framework:
- R — Role: Is the AI's identity clearly defined?
- I — Instructions: Are coding conventions specified?
- S — Steps: Are workflows for common tasks documented?
- E — End Goal: Are success criteria defined?
- N — Narrowing: Are constraints and forbidden actions explicit?
If any component is missing, flag it and suggest additions.
Even when you manually edit a CLAUDE.md, Claude Code checks it against RISEN and flags gaps.
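The same validation can be done programmatically. A minimal sketch that flags missing RISEN sections in a CLAUDE.md — it assumes the heading names match the examples in this post:

```python
import re

RISEN_SECTIONS = ["Role", "Instructions", "Steps", "End Goal", "Narrowing"]

def audit_risen(claude_md: str) -> list[str]:
    """Return the RISEN sections missing from a CLAUDE.md document."""
    return [
        section for section in RISEN_SECTIONS
        if not re.search(rf"^#+\s*{re.escape(section)}\b", claude_md, re.MULTILINE)
    ]

doc = "# Role\n...\n# Instructions\n...\n# Steps\n..."
audit_risen(doc)  # ['End Goal', 'Narrowing']
```

A check like this is the core of what a /risen-audit skill would run against any steering document.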
What I Learned
- Structure beats length in system prompts. A 50-line RISEN-structured prompt outperforms a 200-line unstructured one. The AI needs clear categories, not walls of text. Each section answers a specific question, which makes instructions easier to follow and harder to misinterpret. This holds true whether you’re configuring a Bedrock Agent or a Claude Code project.
- Narrowing prevents the most expensive mistakes. Role and Instructions improve quality. Narrowing prevents disasters. A support agent without Narrowing might process a $50,000 refund autonomously. A coding agent might break your infrastructure during a bug fix. Every “DO NOT” you write is a class of failures you’ll never have to debug.
- The framework is platform-agnostic but the implementation varies. RISEN applies to Bedrock Agents (system prompt field), Claude Code (CLAUDE.md files), Strands (system_prompt parameter), LangChain (system messages), and custom agents alike. The 5 components are always the same. What changes is the syntax and where the prompt lives.
- Steps are the most underrated component. Most people write Role and Instructions but skip Steps entirely. The result is an AI that knows what it should be and what rules to follow, but improvises how to get there. Explicit workflows produce dramatically more consistent results — especially for multi-turn agent interactions.
- Auto-detection makes adoption practical for coding agents. For Claude Code specifically, the /risen-init skill detects 80% of the configuration from existing project files (linters, formatters, build tools, CI config). You only need to fill in the domain-specific 20%. This same principle applies to any agent: extract what you can from context, ask for the rest.
What’s Next
- Build RISEN templates for common agent archetypes (customer support, data analysis, coding assistant, DevOps automation)
- Create a /risen-audit skill that scores any system prompt against the 5 components
- Experiment with RISEN for Bedrock Agent configurations — test whether structured prompts measurably reduce hallucination rates
- Apply RISEN to MCP server tool descriptions — can the same structure improve tool-use accuracy?
- Build a library of Narrowing patterns for common domains (finance, healthcare, infrastructure) where guardrails matter most