
LLM-Written Access Policies Are Creating Silent Security Gaps

Kyanite Blue Labs, Threat Intelligence · 31 March 2026

The Access Control Problem Nobody Is Talking About

Developers are under pressure to ship fast. Large language models (LLMs) like GitHub Copilot and ChatGPT have become standard tools in that workflow — capable of generating complex access control policies in seconds. Hand an LLM a natural-language description of a permission requirement and it will produce syntactically valid Rego or Cedar code before you have finished your coffee. The problem is not the speed. The problem is what gets missed in that code.

LLM-generated access control policies carry a specific and underappreciated risk: hallucinated attributes, missing conditions, and logically incomplete rules that pass code review and enter production without triggering a single error. There is no runtime exception. There is no alert. The policy simply fails to enforce the boundary it was written to create — and access drifts outward, silently, until something goes wrong. This is the access control risk that security teams need to understand right now.

What Is 'Policy Drift' and Why Does It Matter?

Policy drift describes the gradual erosion of intended access controls over time. Traditionally, drift happened through manual configuration errors, stale permissions left after role changes, or inherited access that nobody remembered to revoke. These are well-documented problems with well-established detection methods.

LLM-introduced drift is different. It enters the codebase during development, not during operational changes. It looks like intentional policy code because the syntax is correct. And it often passes automated testing because the tests themselves were written by the same LLM that introduced the error.

Here is a concrete example of how this occurs. A developer asks an LLM to write a Rego policy enforcing that only users with the attribute `role == "admin"` and `department == "finance"` can access payroll records. The LLM generates a policy that checks the role correctly but references the department attribute using a field name that does not match the actual data schema — say, `dept` instead of `department`. In OPA (Open Policy Agent), an expression that references a missing field evaluates to undefined rather than false. When that reference sits inside a negation or a deny-style rule, the department check is silently skipped instead of failing closed. The policy compiles. It deploys. Finance records are now accessible to any admin, regardless of department. That is not a hypothetical. It follows directly from how Rego handles undefined references — and it is the kind of logical error LLMs produce with regularity when generating policy code against schemas they have not been grounded in.
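The failure mode just described can be sketched in a few lines of Rego. This is an illustrative policy, not production code; the package name, input shape, and field names are assumptions for the example:

```rego
package payroll

import rego.v1

default allow := false

# Intended rule: admins may read payroll records only if they are in finance.
allow if {
    input.user.role == "admin"
    not outside_finance
}

# BUG: the real schema field is "department", but the generated code
# references "dept". input.user.dept is undefined on every request, so
# this rule never evaluates to true, "not outside_finance" always
# succeeds, and any admin is allowed regardless of department.
outside_finance if {
    input.user.dept != "finance"
}

# Corrected helper, grounded in the actual schema:
# outside_finance if {
#     input.user.department != "finance"
# }
```

Both versions compile cleanly under `opa check`; only the corrected helper enforces the department boundary.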

Why LLMs Are Particularly Dangerous for Policy-as-Code

LLMs are trained to produce plausible output, not correct output. In most software development contexts, plausible and correct overlap enough that the productivity gains outweigh the risk. Access control policy is different for three reasons.

First, access control errors are not self-evidencing. A broken function in application code tends to produce an observable failure — a page that does not load, an API call that returns an error. A broken access policy tends to produce a permission that is broader than intended. The application keeps working. Users who should not have access get it anyway, and nobody notices.

Second, policy languages like Rego (used with Open Policy Agent) and Cedar (used in Amazon Verified Permissions and AWS services) are niche enough that LLM training data is sparse compared to mainstream languages like Python or JavaScript. Sparse training data correlates with higher hallucination rates on technical specifics — including attribute names, built-in function behaviour, and the precise semantics of how undefined values are handled.

Third, policy code is rarely subject to the same adversarial testing as application code. Security teams test for what attackers might do from the outside. They are less likely to audit whether an internally generated policy is logically complete. The gap between what a policy says and what it actually enforces can persist for months.

Combined, these factors mean that LLM-assisted policy development introduces a class of access control error that is systematic, hard to detect, and aligned with the exact boundaries organisations care most about protecting.

How This Connects to Wider Attack Surface Exposure

From a threat intelligence perspective, this matters beyond the immediate development risk. Access control policies are the enforcement layer beneath identity. When those policies have unintended gaps, the attack surface expands in ways that do not appear in conventional vulnerability scans. An attacker who compromises a low-privilege account does not need to exploit a zero-day if a quietly misconfigured policy grants that account read access to sensitive data. They need only to query what they now have access to. In a cloud-native environment where fine-grained attribute-based access control (ABAC) policies govern access to dozens of services, a single hallucinated attribute in a Cedar policy can mean the difference between minimal blast radius and a significant data exposure.

This is precisely the kind of exposure that continuous attack surface management is designed to surface. Hadrian, for instance, continuously maps and tests an organisation's external and internal attack surface — identifying access misconfigurations and over-permissive controls before an attacker does. When policy-as-code is part of the development workflow, integrating that policy verification into continuous security testing is no longer optional.

Meanwhile, organisations using platforms like Coro — which provides unified visibility across endpoint, email, and cloud environments — gain contextual telemetry that can surface anomalous access patterns indicative of policy gaps. An account accessing resources it should not be able to reach based on its stated role is a signal worth catching early.

What a Secure LLM-Assisted Policy Workflow Looks Like

Banning LLMs from policy development is not a practical response. The productivity gain is real, and developers will continue to use these tools regardless of formal guidance. The question is how to structure the workflow so that LLM-generated policy code cannot enter production with undetected logical errors. The following controls address the core risk.

Schema grounding before generation. LLMs should be provided with the exact attribute schema — field names, types, and valid values — as part of the prompt context before any policy code is generated. This does not eliminate hallucination, but it significantly reduces the likelihood that a generated policy will reference attributes that do not exist in the actual data.

Separate policy testing from policy generation. If the same LLM that writes a policy also writes the tests for it, the tests will reflect the same assumptions and errors. Policy tests should be written independently — either by a different team member or using a deterministic test harness built against real schema fixtures.

Formal verification tooling. For Rego, OPA's `opa test` and `opa check` commands provide some validation, but they do not catch logical incompleteness. Tools like Styra DAS and commercial policy analysis platforms provide more thorough coverage. Cedar has formal semantics that enable mathematical verification of policy behaviour — this capability should be exercised, not treated as optional.

Least-privilege review gates. Before any policy change deploys to production, a human reviewer with access control expertise should evaluate whether the policy achieves the minimum necessary permissions, not just whether it is syntactically valid.

Continuous permission monitoring. Access control is not a deploy-and-forget concern. Monitoring for access patterns that deviate from role expectations provides a detection layer when policy gaps exist.

  • Ground LLM prompts in the actual attribute schema before generating policy code
  • Write policy tests independently of the LLM that generated the policy
  • Use formal verification tooling — OPA's built-in checks are a starting point, not an endpoint
  • Mandate human review for access control policy changes before production deployment
  • Monitor live access patterns for deviations from expected role behaviour
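A deterministic test harness of the kind described above can be sketched with OPA's built-in test runner. The fixtures below are built from the real schema — a `department` field — independently of whatever the generated policy assumed; the package names and input shape are assumptions for the example:

```rego
package payroll_test

import rego.v1

# Fixtures written against the actual attribute schema.
finance_admin := {"user": {"role": "admin", "department": "finance"}}
other_admin := {"user": {"role": "admin", "department": "engineering"}}

test_finance_admin_allowed if {
    data.payroll.allow with input as finance_admin
}

# This test fails against a policy that hallucinated a "dept" field,
# because the undefined reference lets every admin through. Independent
# fixtures catch exactly the errors that LLM-written tests would inherit.
test_non_finance_admin_denied if {
    not data.payroll.allow with input as other_admin
}
```

Running `opa test ./policies/` in CI turns the schema itself into a regression gate: a generated policy that references non-existent attributes cannot pass.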

The Supply Chain Dimension

There is a third-party risk angle here that is easy to miss. Many organisations depend on SaaS vendors, cloud service providers, and managed platforms that themselves use policy-as-code frameworks to control access to customer data. If those vendors are using LLM tooling in their own development workflows without adequate policy verification controls, then the access control errors exist inside your supply chain — not just in your own codebase.

You have no direct visibility into whether your payroll provider's Cedar policies correctly restrict access to your employee records. You cannot audit your cloud storage vendor's Rego rules. What you can do is ask the right questions during vendor assessment and require evidence of policy verification practices as part of your supplier due diligence process. Platforms like Panorays provide continuous third-party risk monitoring, including assessment of vendor security practices. As LLM-assisted development becomes the industry norm, the adequacy of a vendor's policy code review process is a legitimate and material security question — one that belongs in the questionnaire alongside patch cadence and encryption standards.

For businesses in both the UK and New Zealand markets, supply chain security obligations are increasingly codified. The UK Cyber Essentials Plus scheme and the New Zealand Information Security Manual both address access control as a foundational control. A vendor whose access policies drift due to LLM-introduced errors is a vendor that may pull your compliance posture down alongside their own.

What Security Teams Should Do Right Now

The shift to LLM-assisted development is not a future trend — it is current practice across the majority of development teams. Security functions that treat this as a speculative risk will find themselves responding to access control incidents that were preventable. Three actions are worth prioritising immediately.

First, find out whether your development teams are using LLMs to write access control policies. This is not about prohibition — it is about understanding where your policy code is coming from and whether adequate verification controls exist around it.

Second, request a sample of recently deployed Rego or Cedar policies and subject them to manual review against the actual attribute schemas they reference. The goal is to identify undefined attribute references, missing conditions, and overly permissive default behaviours before an attacker does. If your organisation uses Hadrian for attack surface management, this internal policy verification can be complemented by continuous external testing to confirm whether policy gaps translate into reachable exposures.

Third, raise access control policy verification as a question in your next vendor review cycle. Whether you use Panorays to manage that process or a manual framework, the question is the same: how does your organisation verify that LLM-generated policy code correctly enforces least-privilege access before it reaches production?

Access control is the boundary between authorised and unauthorised. When LLMs quietly introduce errors into that boundary, the result is not a noisy incident — it is silent drift. Catching it requires deliberate effort, and the time to build that capability is before the exposure.

Frequently Asked Questions

Can LLMs be trusted to write access control policies like Rego or Cedar?

LLMs can generate syntactically correct Rego and Cedar code quickly, but they frequently hallucinate attribute names and produce logically incomplete conditions. These errors do not cause application failures — they create unintended permission gaps that expand access beyond what the policy intended. LLM-generated access control policies require independent testing and human review before production deployment.

What is policy drift in access control and how does it happen?

Policy drift is the gradual erosion of intended access controls. In LLM-assisted development, it occurs when generated policy code references incorrect attributes or omits conditions — errors that pass code review because the syntax is valid. The policy deploys and enforces less than intended, granting access that should be denied, without producing any observable error or alert.

How does LLM-introduced access control risk affect third-party and supply chain security?

Vendors and SaaS providers that use LLM tooling in their own development workflows may have access control errors in the policies governing your data. You cannot audit their internal policies directly, but you can require evidence of policy verification practices during supplier assessments. Tools like Panorays support continuous third-party risk monitoring that surfaces these security practice gaps.

access control · LLM security · AI risk · least privilege · identity security

Want to discuss this with our team?

Book a free 20-minute call with David or Max.
