Agentic Data Governance — Building a Multi-Platform Access Broker with Bedrock Agents
Your data lives in AWS, Databricks, and Microsoft Fabric. Your business glossary is in Collibra. Users just want to find data and get access. Here is an agentic architecture that makes governance the default instead of the blocker.
Every large enterprise with a serious analytics estate eventually ends up with the same three problems:
- Catalog sprawl — Business metadata in Collibra, technical metadata in each platform’s catalog (Glue, Unity Catalog, Purview), lineage in a fourth tool
- Access-request drag — Finding data takes days, requesting access takes weeks, getting it provisioned takes longer
- Policy drift — Permissions live inside each platform, so the same business rule is re-implemented (and re-broken) in three places
The question is no longer “how do we centralize?” Centralizing a multi-platform estate into one tool has been tried and failed for twenty years. The question is: how do we make governance invisible to the user and consistent across platforms?
Agents are the right abstraction. This post walks through the architecture.
The Non-Negotiable Design Principle
The permissions source of truth must live outside the catalog and outside the data platforms.
This is the single most important design decision. If permissions live in Collibra, you cannot enforce them on Databricks. If they live in Lake Formation, you cannot enforce them on Fabric. Every platform has its own policy engine — and none of them can be the authority.
The answer is Policy-as-Code (PaC) — either Cedar (AWS-native, integrates with Verified Permissions) or OPA (CNCF, platform-agnostic). Both let you express “user X from business unit Y can read dataset Z if classification ≤ Confidential” as code, version-control it, test it, and distribute it to every platform as the binding decision.
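To make the idea concrete, here is a minimal Python sketch of the rule "user X from business unit Y can read dataset Z if classification ≤ Confidential." This is illustrative only — in production the rule would be a version-controlled Cedar or Rego policy evaluated by Verified Permissions or OPA; the `LEVELS` ordering and function name are assumptions of this sketch.

```python
# Illustrative stand-in for a Cedar/OPA policy; not a real policy engine.
# Classification levels ordered least to most sensitive (an assumed scheme).
LEVELS = {"Public": 0, "Internal": 1, "Confidential": 2, "Restricted": 3}

def is_allowed(user_bu: str, dataset_bu: str, action: str,
               classification: str, max_level: str = "Confidential") -> bool:
    """'User from BU Y can READ dataset Z if classification <= Confidential'."""
    return (
        action == "READ"
        and user_bu == dataset_bu
        and LEVELS[classification] <= LEVELS[max_level]
    )
```

The point is not the five lines of logic — it is that the rule lives in one repo, gets tested in CI, and every platform receives the same binding decision.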
The Three Masters Pattern
Three systems of record, each owning exactly one thing:
| System | Owns | Example |
|---|---|---|
| Business catalog (Collibra) | Meaning, ownership, glossary, stewardship | "Customer" means X, owned by BU Y |
| Platform catalog (Glue / Unity / Purview) | Technical schema, lineage, partitions | 47 columns, refreshed daily, joined from 3 sources |
| Policy-as-Code (Cedar / OPA) | Who can do what, when, why | Role R → dataset D → action READ if classification ≤ X |
Everyone violates this at some point — usually by putting permissions in the catalog because “it’s easier.” That always rots. Keep them apart.
Agent Architecture
Five specialized agents, one orchestrator.
User: "I need Q1 sales data for the Germany BU for my revenue model"
↓
Orchestrator Agent (Plan-and-Execute)
├── Catalog Agent → queries Collibra + platform catalogs
├── Discovery Agent → crawls platforms for undocumented datasets
├── Access Request Agent → drafts the request with justification
├── Provisioning Agent → applies PaC decision to each platform
└── Audit Agent → logs the decision + traceable chain
Orchestrator Agent
A Plan-and-Execute agent (Bedrock Agent or Strands) that decomposes the user’s intent into a plan, dispatches specialized agents, and aggregates results. This is the expensive one — every user interaction costs 5–15k tokens of orchestration overhead. Budget accordingly.
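Stripped of the LLM, the Plan-and-Execute loop is simple: decompose, dispatch, aggregate. In the sketch below both the plan and the agent registry are hard-coded stand-ins — in the real system the plan comes from the Bedrock Agent or Strands model and the agents are actual services.

```python
# Hypothetical sketch of the Plan-and-Execute loop. The plan and the agent
# implementations are assumptions; in production the LLM produces the plan.
from typing import Callable

AGENTS: dict[str, Callable[[str], str]] = {
    "catalog": lambda q: f"catalog results for: {q}",
    "access_request": lambda q: f"draft request for: {q}",
}

def orchestrate(user_intent: str, plan: list[str]) -> list[str]:
    """Execute each planned step with its specialized agent, collect results."""
    results = []
    for step in plan:
        agent = AGENTS[step]  # dispatch to the specialized agent for this step
        results.append(agent(user_intent))
    return results
```

Every loop iteration is another model call in practice — which is exactly where the 5–15k-token overhead comes from.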
Catalog Agent
Queries both Collibra (business metadata) and each platform’s technical catalog. Critically, it joins them on asset identifiers, not on names. Names collide across business units; identifiers don’t.
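A sketch of the identifier join, assuming both catalogs expose a shared `asset_id` field (the field names here are assumptions, not Collibra's or Glue's actual schema):

```python
# Join business and technical metadata on a shared asset ID, never on
# display names, which collide across business units.
def join_catalogs(collibra: list[dict], platform: list[dict]) -> list[dict]:
    tech_by_id = {a["asset_id"]: a for a in platform}
    joined = []
    for biz in collibra:
        tech = tech_by_id.get(biz["asset_id"])
        if tech:
            joined.append({**tech, **biz})  # business metadata wins on key conflicts
    return joined
```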
Discovery Agent
Crawls platforms on a schedule looking for datasets that exist but aren’t in the business catalog. When it finds one, it drafts a Collibra entry (name, description, owner inferred from IAM, initial classification) and queues it for steward approval. This is the biggest ROI lever — manual cataloging is why catalogs rot.
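The draft-entry step might look like this sketch, assuming the crawler surfaces the dataset name and the IAM principal that created it (field names and the `Internal` default classification are assumptions):

```python
from datetime import date

def draft_catalog_entry(dataset: dict) -> dict:
    """Turn a crawled-but-uncataloged dataset into a steward-review draft."""
    return {
        "name": dataset["name"],
        "description": f"Auto-discovered on {date.today().isoformat()}; pending steward review",
        "owner": dataset.get("iam_creator", "UNKNOWN"),  # inferred from IAM
        "classification": "Internal",                    # conservative default
        "status": "PENDING_STEWARD_APPROVAL",
    }
```

The key design choice: the agent never publishes directly — everything lands in a steward queue, so the human stays the authority on meaning.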
Access Request Agent
Turns “I need Q1 sales data” into a structured request: dataset ID, justification, business context, requested permissions, business owner for approval. It uses the glossary to map the user’s words to the actual asset.
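A minimal sketch of that structuring step. The glossary lookup is a plain dict here; in reality it is a semantic match against Collibra glossary terms, and the asset ID and field names are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class AccessRequest:
    dataset_id: str
    requester: str
    business_unit: str
    justification: str
    permissions: list[str] = field(default_factory=lambda: ["READ"])
    approver: str = ""  # resolved later from the business catalog's owner field

# Glossary term -> asset ID (assumed mapping for illustration)
GLOSSARY = {"q1 sales data": "ds-4711"}

def to_request(utterance: str, user: str, bu: str) -> AccessRequest:
    dataset_id = GLOSSARY[utterance.lower()]  # map the user's words to an asset
    return AccessRequest(dataset_id, user, bu, justification=utterance)
```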
Provisioning Agent
This is where Policy-as-Code pays off. The agent evaluates the PaC policy, and if the decision is “allow,” it provisions access on the correct platform:
- AWS — Lake Formation grant, or IAM role mapping, or S3 Access Grants
- Databricks — Unity Catalog grant via REST API
- Fabric — Purview + Fabric role assignment
Same policy, three implementations. The agent handles the translation.
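A dispatcher sketch of that translation. The three grant functions are stubs returning strings; real implementations would call Lake Formation, the Unity Catalog REST API, and the Fabric/Purview APIs respectively:

```python
# One PaC decision, three platform-specific grants. Grant functions are
# stubs standing in for the real platform API calls.
def grant_aws(principal: str, dataset: str) -> str:
    return f"lakeformation:grant {principal} {dataset}"

def grant_databricks(principal: str, dataset: str) -> str:
    return f"unity:GRANT SELECT ON {dataset} TO {principal}"

def grant_fabric(principal: str, dataset: str) -> str:
    return f"fabric:role-assignment {principal} {dataset}"

PROVISIONERS = {"aws": grant_aws, "databricks": grant_databricks, "fabric": grant_fabric}

def provision(decision: str, platform: str, principal: str, dataset: str) -> str:
    if decision != "allow":  # only act on an explicit PaC allow
        return "denied: no grant issued"
    return PROVISIONERS[platform](principal, dataset)
```

Note the shape: the policy decision happens once, upstream; the platform branch only happens at the last step.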
Audit Agent
Writes the full decision chain (who asked, what the PaC said, who approved, what was provisioned, when) to an immutable log. S3 + Object Lock is fine. Your auditors will love this — every access has a one-click traceable justification.
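One way to make the chain tamper-evident on top of Object Lock is to hash-link each record to its predecessor — a sketch, not a prescription:

```python
import hashlib
import json

def append_audit(log: list[dict], event: dict) -> list[dict]:
    """Append an event, chaining each record to its predecessor's hash so
    tampering is detectable (Object Lock handles immutability at rest)."""
    prev = log[-1]["hash"] if log else "genesis"
    payload = json.dumps({**event, "prev": prev}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    log.append({**event, "prev": prev, "hash": digest})
    return log
```

Rewriting any historical record breaks every hash after it, which is exactly the property auditors want to see.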
MVP Scope (What to Build First)
Do not build all five agents on day one. MVP is two:
- Discovery Agent → pre-fills business catalog. Immediate ROI — your catalog coverage jumps from 20% to 80% in weeks.
- End-to-end “search → request → provision → access” flow — one platform only (pick whichever is biggest), one agent chain. Prove the pattern works before spreading it.
Everything else (Cedar/OPA, multi-platform provisioning, lineage integration) is Phase 2. Build for the demo, not for the perfect architecture.
Risks and Gotchas
- Token cost of Plan-and-Execute orchestration — can easily hit $0.50–$2 per user interaction. Cache aggressively, and consider a cheaper router agent that only escalates to the Plan-and-Execute when the query is genuinely ambiguous.
- Collibra module overlap — Collibra Data Intelligence Platform (DIP) already includes a “request access” workflow. Make sure your agent either integrates with it or clearly replaces it. Running both is worst-of-both.
- Schema drift detection — if the Discovery Agent finds a schema change, is that a new dataset or an evolution? Cheap heuristic: more than 30% column overlap = evolution, 30% or less = new dataset.
- Who owns PaC? — Policy is code, code needs an owner. If it’s the data team, security hates it. If it’s security, the data team hates it. Make it a joint repo with clear CODEOWNERS and a fast PR process.
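The drift heuristic from the list above, sketched in Python. The post doesn't define "overlap," so Jaccard similarity of the column-name sets is an assumption here — any set-overlap metric with the same 30% threshold would do:

```python
def column_overlap(old: set[str], new: set[str]) -> float:
    """Jaccard overlap of two column sets (the metric itself is a choice)."""
    union = old | new
    return len(old & new) / len(union) if union else 1.0

def classify_drift(old: set[str], new: set[str], threshold: float = 0.3) -> str:
    """>30% overlap = schema evolution, otherwise treat it as a new dataset."""
    return "evolution" if column_overlap(old, new) > threshold else "new_dataset"
```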
Key Takeaways
- Permissions belong in Policy-as-Code, not in the catalog and not in the platforms.
- Three masters (business catalog, platform catalog, PaC) — keep them separate.
- The highest-ROI agent is Discovery — it fixes the catalog coverage problem that blocks every downstream use case.
- MVP is two agents and one platform. Resist the temptation to boil the ocean.
- Plan-and-Execute orchestration is expensive. Measure token cost per interaction from day one.