Key takeaways

  • An FP Certified assessment produces the risk management, technical documentation, and human oversight records that Articles 9-17 require from providers and that Article 26 requires deployers to receive and act on.
  • Each of the seven dimensions maps to one or more specific articles. The mapping is deliberate, not incidental: the dimensions were designed around what operators actually need to document and control.
  • Insurance underwriters, including Munich Re aiSure and Armilla, evaluate AI governance evidence beyond what the Act requires at its minimum. FP Certified scores Autonomy Envelope and Product Maturity at a higher standard than the statutory floor for this reason.
  • The assessment report is structured as insurance-readable evidence, not a compliance checklist. Each dimension produces scored, substantiated findings against named criteria.
  • ISO/IEC 42001:2023 governs the organisational management system. FP Certified operates at the agent and deployment level. The two are complementary, not interchangeable.

Why This Mapping Matters

The EU AI Act does not require every enterprise deploying an AI agent to undergo third-party certification. What it does require, for high-risk systems under Annex III, is that providers and deployers produce and maintain a defined body of documentation: risk management records, technical documentation, logging systems, quality management procedures, transparency information, and human oversight measures. Article 26 requires deployers to use the system in accordance with the provider's instructions for use, to assign human oversight to competent persons, to monitor the system's operation, and to report risks and serious incidents.

The practical problem is that most organisations beginning an AI governance programme do not know what constitutes adequate evidence for these obligations. They know they need a "risk register" and a "quality management system," but the Act does not tell them what depth of evidence is sufficient. The result is frequently documentation that names the right categories without providing the content that regulators, auditors, or insurers would accept as demonstrating actual control.

FP Certified addresses this by structuring the assessment around what operators need to produce, rather than what they need to name. The seven dimensions are not headings on a checklist. They are evaluation frameworks, each with scoring criteria that reflect the depth of control required to satisfy the corresponding statutory obligation at a standard beyond the minimum. An organisation that completes an FP Certified assessment will have systematically documented its position against the specific controls that Articles 9 through 17 and Article 26 examine.

This matters beyond compliance. Insurance underwriters evaluating AI liability exposure use the same categories, because the risk factors that create regulatory exposure are also the risk factors that create insurable loss events. The AIUC-1 framework (Artificial Intelligence Underwriting Company, 2025) structures underwriting submissions around governance quality, oversight mechanisms, and incident response capacity. Munich Re aiSure and Armilla, both active in the European AI insurance market, similarly require evidence of an organised AI governance framework before offering meaningful coverage terms. An FP Certified assessment report is structured to be readable as an underwriting submission, not merely as a regulatory file.

Dimension 1: Trust and Safety (Weight 18)

What FP Certified assesses: Trust and Safety carries the highest weight in the methodology at 18 points. The dimension evaluates whether an operator has implemented measurable controls to prevent unsafe outputs, and whether detection and remediation processes are in place when unsafe outputs occur. Assessment criteria include the existence of defined output safety thresholds, the operational status of monitoring systems, the documented process for identifying harmful or erroneous outputs in production, and the speed and completeness of remediation when a safety event is detected. Scoring distinguishes between controls that exist on paper and controls that demonstrably operate in production.

EU AI Act mapping: Trust and Safety maps primarily to Article 15, which requires high-risk AI systems to achieve appropriate levels of accuracy, robustness, and cybersecurity. Article 15(5) specifically addresses resilience to attempts by unauthorised third parties to alter the system's use, outputs, or performance, a concern directly relevant to output safety monitoring. The dimension also maps to Article 9(3) and Article 9(4), which require providers to identify risks associated with the AI system's purpose and implement risk evaluation measures, including residual risk assessments. An operator's Trust and Safety documentation constitutes its Article 9 risk evaluation record and its Article 15 conformity evidence for accuracy and robustness controls.

The weight of 18 reflects the Act's own emphasis: accuracy, robustness, and safety controls are among the most substantive technical requirements in Chapter III, and they are the controls most directly connected to the harms the Act is designed to prevent. An agent that scores poorly on Trust and Safety will not reach certification level regardless of its performance on other dimensions.
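The gating behaviour described above can be illustrated with a minimal sketch. The seven weights are the ones stated in this article; the per-dimension score scale (0 to 1), the Trust and Safety floor, and the certification threshold are hypothetical values chosen for illustration and are not taken from the FP Certified methodology.

```python
# Dimension weights as stated in this article; they sum to 100.
WEIGHTS = {
    "trust_and_safety": 18,
    "context_integrity": 14,
    "distribution_control": 12,
    "product_maturity": 14,
    "governance": 16,
    "ai_integration": 12,
    "autonomy_envelope": 14,
}

# Hypothetical thresholds, for illustration only.
TRUST_AND_SAFETY_FLOOR = 0.6    # minimum fraction of the dimension's points
CERTIFICATION_THRESHOLD = 70.0  # minimum weighted total out of 100


def weighted_total(scores: dict) -> float:
    """Combine per-dimension scores (each in [0, 1]) into a total out of 100."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name} must be between 0 and 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def reaches_certification(scores: dict) -> bool:
    """Apply the Trust and Safety gate before looking at the weighted total."""
    total = weighted_total(scores)  # also validates the input
    if scores["trust_and_safety"] < TRUST_AND_SAFETY_FLOOR:
        return False  # a strong total cannot compensate for a weak gate dimension
    return total >= CERTIFICATION_THRESHOLD
```

Under these assumed thresholds, an agent with full marks on six dimensions but half marks on Trust and Safety would total 91 out of 100 and still fail the gate.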

Dimension 2: Context Integrity (Weight 14)

What FP Certified assesses: Context Integrity evaluates the quality of information over which the agent reasons. The dimension covers data governance, the currency of the agent's knowledge base, documented procedures for identifying and addressing data staleness, and controls against data poisoning risks. Assessment criteria examine whether the operator has a defined data provenance policy, whether retrieval sources are documented and monitored, whether there are mechanisms to detect when the agent is reasoning over stale or corrupted information, and whether the data governance practices meet a standard commensurate with the stakes of the agent's decisions.

EU AI Act mapping: Context Integrity maps directly to Article 10, which imposes detailed data governance and management obligations on high-risk AI systems. Article 10(2) requires that training, validation, and testing data meet quality criteria, and that providers examine data for possible biases. Article 10(3) requires that data be relevant, representative, and free of errors. While Article 10 is framed around training data, its underlying principle, that the information an AI system uses to reason must be of sufficient quality to produce reliable outputs, extends to the runtime context that autonomous agents use. An agent reasoning over a corrupted or outdated knowledge base presents the same category of risk as one trained on biased data. FP Certified's Context Integrity dimension captures this runtime dimension of Article 10's data quality obligation.
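As a rough illustration of what a staleness and corruption check on runtime context might look like, the sketch below tracks a freshness limit and a content digest per retrieval source. The `RetrievalSource` type, its field names, and the freshness policy are assumptions made for this example; they are not part of the Act or of the FP Certified criteria.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetrievalSource:
    name: str
    last_verified: datetime  # timezone-aware time of the last content review
    max_age: timedelta       # hypothetical freshness limit set by policy
    expected_sha256: str     # digest recorded at the last content review


def is_stale(source: RetrievalSource, now: datetime) -> bool:
    """True when the source has not been verified within its freshness limit."""
    return now - source.last_verified > source.max_age


def is_altered(source: RetrievalSource, content: bytes) -> bool:
    """True when the content no longer matches the digest recorded at review."""
    return hashlib.sha256(content).hexdigest() != source.expected_sha256


def utc_now() -> datetime:
    """Timezone-aware current time, suitable for passing to is_stale."""
    return datetime.now(timezone.utc)
```

A digest mismatch only shows that content changed since the last review; deciding whether the change is legitimate or a poisoning attempt still needs a documented review step.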

Dimension 3: Distribution Control (Weight 12)

What FP Certified assesses: Distribution Control evaluates who can invoke the agent, under what authority, and how downstream actions are bounded. The dimension covers identity verification and authorisation mechanisms, the scope of actions the agent is permitted to take on behalf of different classes of user, and the blast radius of the agent's actions if authorisation boundaries fail. Assessment criteria examine whether invocation rights are explicitly defined, whether those definitions are enforced technically rather than through policy alone, and whether there is a documented analysis of what could go wrong if an unauthorised or over-privileged actor invoked the agent at scale.

EU AI Act mapping: Distribution Control maps to Article 26(1), which requires deployers to use the AI system within the terms of the provider's instructions for use and not to deploy the system for purposes other than those for which it was intended. Deployment within authorisation boundaries is precisely what Distribution Control measures. The dimension also maps to Article 9(2), which requires providers to identify foreseeable risks from the interaction of the AI system with other systems and from its use by reasonably foreseeable users. The blast radius analysis that a Distribution Control assessment produces is the Article 9(2) foreseeable risk identification for interaction and misuse scenarios.
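The difference between invocation rights enforced technically and rights stated only in policy can be sketched as follows. The roles, actions, and hourly limits are hypothetical, and the blast radius figure here is a deliberately simple upper bound (actions per hour), not a full analysis of the kind the assessment would examine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationGrant:
    allowed_actions: frozenset
    max_actions_per_hour: int


# Hypothetical roles and limits, for illustration only.
GRANTS = {
    "support_agent": InvocationGrant(frozenset({"read_ticket", "draft_reply"}), 200),
    "finance_admin": InvocationGrant(frozenset({"read_ticket", "issue_refund"}), 20),
}


class NotAuthorised(PermissionError):
    """Raised when an invocation falls outside the caller's grant."""


def authorise(role: str, action: str, actions_this_hour: int) -> None:
    """Raise NotAuthorised unless the role may perform the action right now."""
    grant = GRANTS.get(role)
    if grant is None:
        raise NotAuthorised(f"unknown role: {role}")
    if action not in grant.allowed_actions:
        raise NotAuthorised(f"{role} may not perform {action}")
    if actions_this_hour >= grant.max_actions_per_hour:
        raise NotAuthorised(f"{role} has reached its hourly limit")


def blast_radius(role: str, action: str) -> int:
    """Upper bound on hourly uses of an action if this role is misused."""
    grant = GRANTS.get(role)
    if grant is None or action not in grant.allowed_actions:
        return 0
    return grant.max_actions_per_hour
```

Because the check raises rather than logs, an over-privileged call fails before the agent acts, which is the property that distinguishes technical enforcement from policy alone.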

Dimension 4: Product Maturity (Weight 14)

What FP Certified assesses: Product Maturity evaluates whether the agent is production-grade in a substantive sense. The dimension examines whether prompts are versioned under change control, whether regression evaluation procedures exist and are followed before updates are deployed, whether observability extends to the reasoning trace level (not just input/output logging), and whether the operator has documented procedures for managing the system's development and change lifecycle. Scoring reflects the Act's implicit expectation that high-risk AI systems are managed with the rigour appropriate to their potential consequences.

EU AI Act mapping: Product Maturity maps to Articles 11 and 12 on technical documentation and automatic logging, and to Article 17 on quality management systems. Article 11 requires providers to draw up technical documentation before placing a high-risk system on the market, covering design specifications, development process, and validation procedures. Article 12 requires automatic logging sufficient to allow the identification of situations throughout the system's lifetime that may present a risk. A production system without reasoning-trace observability cannot produce the Article 12 logs that incident investigation requires. Article 17(1)(f) requires quality management systems to include systems and procedures for data management; Article 17(1)(a) requires procedures for managing modifications to the high-risk AI system. Product Maturity assessment evidence directly corresponds to these Article 17 sub-requirements.

FP Certified assesses Product Maturity at a higher standard than the Act's documentation minimums, because minimum documentation can coexist with genuinely fragile production systems. An assessment that only confirmed the existence of a change log would not distinguish between an operator with disciplined versioning practices and one that created a log after the fact. Scoring criteria require evidence of actual process adherence.
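One way to make a prompt change log harder to reconstruct after the fact is to chain its entries by hash, so that editing an earlier entry invalidates every later one. The sketch below is an illustrative assumption about how an operator might do this, not a description of what FP Certified requires; the class and field names are hypothetical. A hash chain makes later edits detectable, but it does not by itself prove when an entry was written.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(record: dict) -> str:
    """Stable SHA-256 digest of a record's fields."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class PromptChangeLog:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries = []

    def release(self, prompt_id: str, prompt_text: str, regression_passed: bool) -> dict:
        """Record a prompt release; refuse it if regression evaluation failed."""
        if not regression_passed:
            raise ValueError("regression evaluation must pass before release")
        record = {
            "prev": self.entries[-1]["hash"] if self.entries else GENESIS,
            "prompt_id": prompt_id,
            "prompt_sha256": hashlib.sha256(prompt_text.encode("utf-8")).hexdigest(),
            "regression_passed": True,
        }
        record["hash"] = _digest(record)  # digest covers every field except "hash"
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain and report whether any entry was altered."""
        prev = GENESIS
        for record in self.entries:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev"] != prev or _digest(body) != record["hash"]:
                return False
            prev = record["hash"]
        return True
```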

Dimension 5: Governance (Weight 16)

What FP Certified assesses: Governance evaluates the organisational infrastructure around the agent. The dimension requires a named senior owner with documented accountability for the system's risk posture, a current AI risk policy, an active risk register for the system, an audit trail that demonstrates the risk register is reviewed and updated, and evidence of vendor and model supplier due diligence. Assessment criteria examine not only whether these elements exist but whether they operate: whether the risk register reflects recent events, whether the senior owner can demonstrate awareness of the system's current risk profile, and whether supplier due diligence extends to the model layers on which the agent depends.

EU AI Act mapping: Governance maps to Article 17, the quality management system obligation, which requires senior accountability, comprehensive documentation, and staff awareness. Article 17(1)(a) requires a strategy for regulatory compliance; Article 17(1)(m) requires an accountability framework setting out the responsibilities of management and other staff. Governance assessment evidence constitutes the Article 17 quality management record. The dimension also maps to Article 26(2), which requires deployers to assign human oversight to natural persons who have the necessary competence, training, and authority. The named senior owner identified in the Governance assessment is the accountable role through which that assignment is made and evidenced. Article 26(5) requires deployers to inform providers when they identify risks or serious incidents; the audit trail and risk register in the Governance assessment provide the operational infrastructure through which that reporting obligation is fulfilled.

Vendor and model supplier due diligence is an area where the Act's text does not fully anticipate the architectural reality of modern AI agents, which typically depend on foundation model providers, tool providers, and orchestration infrastructure that the deployer does not control. FP Certified's Governance dimension requires that this dependency chain be documented and that due diligence on each material supplier be on record. This goes beyond the Act's minimum, but it reflects what auditors and insurers will ask for when a loss event occurs at a dependency layer the operator claimed not to have evaluated.
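A documented dependency chain can be as simple as a structured list that is checked for gaps. The following sketch assumes a hypothetical `Supplier` record and a one-year review interval; both are illustrative choices, not requirements drawn from the Act or the methodology.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Supplier:
    name: str
    layer: str                         # e.g. "foundation_model", "tool", "orchestration"
    material: bool                     # whether a failure here could cause a loss event
    due_diligence_on: Optional[date]   # None when no review is on record


def due_diligence_gaps(chain, today: date, max_age: timedelta = timedelta(days=365)):
    """Names of material suppliers with no review, or a review older than max_age."""
    gaps = []
    for supplier in chain:
        if not supplier.material:
            continue
        reviewed = supplier.due_diligence_on
        if reviewed is None or today - reviewed > max_age:
            gaps.append(supplier.name)
    return gaps
```

Running a check like this on a schedule, and recording the result in the risk register, is one way to show that supplier due diligence operates rather than merely exists.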

Dimension 6: AI Integration (Weight 12)

What FP Certified assesses: AI Integration evaluates how the agent sits inside an organisation's existing systems of record, identity infrastructure, approval workflows, and escalation paths. The dimension examines whether the agent's actions are attributable to specific users or processes in downstream audit logs, whether there are defined escalation routes when the agent encounters a scenario outside its designed parameters, and whether the integration architecture preserves the organisation's ability to monitor the agent's behaviour without relying solely on the agent's own self-reporting. Assessment criteria also examine whether integration points have been documented as part of the system's change management process, so that changes to connected systems trigger a review of the agent's behaviour.

EU AI Act mapping: AI Integration maps to Articles 13 and 14 on transparency and human oversight, and to Article 26(5) on deployer monitoring obligations. Article 13 requires that high-risk AI systems be designed to allow deployers to understand what the system is doing and why, at a level sufficient for operational oversight. Article 14 requires that deployers implement measures enabling human supervisors to monitor the system's operation and intervene when necessary. An agent that is not properly integrated into the organisation's audit and monitoring infrastructure is not Article 14-compliant in practice, even if its technical documentation nominally includes an oversight section. Article 26(5) requires deployers to monitor the system's operation under their control throughout its operational lifetime. AI Integration assessment evidence constitutes the Article 26(5) monitoring architecture record.
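The idea of monitoring that does not rely on the agent's self-reporting can be sketched as a wrapper that sits between the agent and downstream systems, writes its own attributable audit record for every request, and escalates anything outside the agent's designed parameters. All names below (`AuditedExecutor`, `on_behalf_of`, the outcome labels) are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class AuditRecord:
    timestamp: datetime
    agent_id: str
    on_behalf_of: str  # the user or process the action is attributed to
    action: str
    outcome: str       # "executed", "failed", or "escalated"


class AuditedExecutor:
    """Runs agent actions and records them independently of the agent."""

    def __init__(self, agent_id: str, designed_actions) -> None:
        self.agent_id = agent_id
        self.designed_actions = frozenset(designed_actions)
        self.audit_log = []
        self.escalations = []

    def run(self, on_behalf_of: str, action: str, handler: Callable[[], None]) -> str:
        outcome = "escalated"
        try:
            if action in self.designed_actions:
                outcome = "failed"  # replaced below if the handler succeeds
                handler()
                outcome = "executed"
            else:
                # Outside designed parameters: do not act, route to a human.
                self.escalations.append((on_behalf_of, action))
        finally:
            # The record is written even when the handler raises.
            self.audit_log.append(
                AuditRecord(datetime.now(timezone.utc), self.agent_id,
                            on_behalf_of, action, outcome)
            )
        return outcome
```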

Dimension 7: Autonomy Envelope (Weight 14)

What FP Certified assesses: The Autonomy Envelope is the most operationally specific of the seven dimensions. It evaluates whether the operator has defined an explicit boundary between what the agent may do without human confirmation and what requires a human decision or approval. Assessment criteria examine the precision of this definition, whether it is technically enforced or only policy-stated, whether the boundary is proportionate to the risk profile of the agent's actions, and whether the conditions under which the boundary can be extended or contracted are documented and subject to change control. Scoring also examines what happens at the boundary: whether escalation paths are functional, whether the agent behaves predictably when it reaches an action requiring confirmation, and whether the boundary has been tested under realistic conditions.

EU AI Act mapping: The Autonomy Envelope maps most directly to Article 14, the human oversight provision. Article 14(4) requires that high-risk AI systems be designed to allow natural persons to whom oversight is assigned to understand the system's capabilities and limitations, monitor its operation, interpret its outputs, and decide not to use the system, override it, or halt it in specific situations. An imprecise or unenforced autonomy boundary makes all of these Article 14(4) oversight rights difficult to exercise in practice. The dimension also maps to Article 26(2), which requires deployers to ensure that human oversight is operationally implemented as required, and to Article 9(4)(b), which requires that risk mitigation measures include human intervention design where the residual risk cannot be eliminated through technical controls alone.

FP Certified assesses the Autonomy Envelope at a materially higher standard than the Act requires, for two reasons. First, Article 14 describes outcomes (the ability to override, the ability to halt) without specifying how precise the autonomy boundary must be to make those outcomes achievable. An agent with a vaguely defined autonomy boundary may appear to meet Article 14 while in practice making human oversight difficult or slow to exercise. Second, an imprecise autonomy envelope is among the most significant indicators of uninsurable AI risk. Underwriters evaluating AI liability exposure treat the question of what an agent can do without asking permission as a primary exposure indicator. An operator that cannot answer this question with specificity is presenting a risk profile that is difficult to price and therefore unlikely to receive favourable terms.
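To make "what an agent can do without asking permission" concrete, the sketch below expresses an autonomy envelope as data and enforces it in code. The actions and monetary limits are hypothetical examples; the design choice worth noting is that any action not listed in the envelope is denied rather than allowed, so the boundary fails closed.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                  # agent may proceed without confirmation
    REQUIRE_HUMAN = "require_human"  # a person must approve before execution
    DENY = "deny"                    # outside the envelope entirely


@dataclass(frozen=True)
class EnvelopeRule:
    autonomous_limit: float  # agent may act alone up to and including this amount
    hard_limit: float        # above this amount the action is refused outright


# Hypothetical envelope, for illustration only.
ENVELOPE = {
    "issue_refund": EnvelopeRule(autonomous_limit=50.0, hard_limit=500.0),
    "apply_discount": EnvelopeRule(autonomous_limit=10.0, hard_limit=100.0),
}


def decide(action: str, amount: float) -> Decision:
    """Classify a proposed action against the envelope."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    rule = ENVELOPE.get(action)
    if rule is None or amount > rule.hard_limit:
        return Decision.DENY
    if amount > rule.autonomous_limit:
        return Decision.REQUIRE_HUMAN
    return Decision.ALLOW
```

Keeping the envelope as a single versioned data structure also gives change control something specific to review when the boundary is extended or contracted.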

Minimum Compliance Versus an Insurable Risk Profile

The EU AI Act is designed as a floor for market participation, not as a quality standard for risk management. An organisation that meets the Act's documentation requirements for a high-risk AI system has demonstrated that it can name the right categories. It has not necessarily demonstrated that its controls operate with the depth and rigour that would allow a loss event to be bounded, attributed, and remediated efficiently.

Insurance underwriters make this distinction systematically. Munich Re aiSure, which underwrites AI-related liability as part of its broader technology insurance products, evaluates the quality of an operator's AI governance framework as a material underwriting factor. Armilla, which provides AI reliability and performance insurance to enterprise customers, structures its underwriting process around evidence of deployment controls, not compliance documentation. The AIUC-1 standard (Artificial Intelligence Underwriting Company, 2025) explicitly distinguishes between compliance-level documentation and the risk management evidence required to support an AI liability policy with meaningful coverage limits.

The FP Certified methodology was designed with this distinction in mind. Dimensions 4 (Product Maturity) and 7 (Autonomy Envelope) are assessed at a higher standard than the Act's minimums because they are the two dimensions where the gap between minimum compliance and an insurable risk profile is largest. A system that meets Article 11's technical documentation requirement and Article 14's human oversight requirement at the minimum threshold may still present an uninsurable risk profile because its versioning practices are informal and its autonomy boundary is undefined. FP Certified scoring identifies this gap explicitly, so that operators understand not just whether they are compliant but whether they are insurable.

The assessment report format reflects this dual audience. Each dimension's findings are presented as scored evidence against named criteria, with a narrative explanation of what the evidence demonstrates and where gaps exist. This format is readable by a legal or compliance function evaluating regulatory conformity, and it is readable by an underwriter evaluating whether to extend AI liability coverage and on what terms. An operator holding a current FP Certified assessment report has, in most cases, produced the primary documentation its broker will need to initiate an AI underwriting submission with a major carrier.

For a detailed examination of the methodology's design and the full scoring framework, see the FP Certified methodology documentation and the companion article on the seven dimensions of AI agent certification. For the distinction between regulatory compliance and certification as a risk management standard, see compliance versus certification under the EU AI Act. For how certification evidence feeds the underwriting submission process, see the article on preparing an AI agent underwriting submission in Europe at agentinsured.eu. The full Article 26 deployer obligations are covered in the master guide at agentliability.eu.

Frequently asked questions

Does completing an FP Certified assessment satisfy EU AI Act Article 9 requirements?

An FP Certified assessment produces the risk identification, evaluation, and residual risk documentation that Article 9 requires providers of high-risk systems to maintain. The assessment report is structured as a risk management record, not a checklist. It does not replace a legal conformity assessment for Annex III high-risk systems where a notified body is required, but it generates the underlying evidence that any such assessment would examine.

How does the Autonomy Envelope dimension relate to Article 14 human oversight requirements?

Article 14 of Regulation (EU) 2024/1689 requires high-risk AI systems to be designed so that natural persons can effectively oversee them, and Article 26(2) requires deployers to assign that oversight to competent persons. The Autonomy Envelope dimension (weight 14) specifically evaluates whether an explicit boundary exists between actions the agent may take without human confirmation and those that require human review. Assessment scoring examines the precision and enforceability of this boundary, not merely its existence in a policy document.

Why does FP Certified go beyond EU AI Act minimum requirements in some dimensions?

Minimum regulatory compliance is a floor, not a risk management standard. Insurance underwriters including Munich Re aiSure and Armilla evaluate the quality of an operator's AI governance framework, not just its existence. Dimensions such as Autonomy Envelope and Product Maturity are assessed at a higher standard than the Act's documentation minimums because the difference between a compliant system and an insurable system lies in the depth and rigour of the controls, not their presence on a form.

Can an FP Certified report be submitted to an insurance underwriter as part of an AI liability underwriting submission?

Yes. The FP Certified assessment report is structured with insurance readability in mind. Each dimension produces scored evidence against named criteria that an underwriter can evaluate as part of a submission. This is distinct from a compliance checklist, which typically confirms the presence of a document without evaluating its operational depth. The report format aligns with the evidence categories that emerging AI underwriting frameworks, including AIUC-1 (Artificial Intelligence Underwriting Company, 2025), use to assess exposure.

What is the relationship between FP Certified and ISO/IEC 42001:2023?

ISO/IEC 42001:2023 sets requirements for an AI management system at the organisational level. FP Certified is agent-level and deployment-level. The two are complementary: an ISO 42001-certified organisation will have organisational governance infrastructure in place, but each individual agent deployment still requires the specific technical and operational controls that the seven FP Certified dimensions assess. An organisation can have ISO 42001 without having adequate Autonomy Envelope controls on a specific deployed agent.

References

  1. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L 2024/1689. Articles 9-17, Article 26.
  2. ISO/IEC 42001:2023. Information technology. Artificial intelligence. Management system. International Organization for Standardization and International Electrotechnical Commission.
  3. Artificial Intelligence Underwriting Company. AIUC-1: Framework for AI Liability Underwriting (2025). Version 1.0.
  4. Munich Re. aiSure: AI model performance insurance product documentation. Munich Reinsurance Company, 2025.
  5. Armilla AI. Underwriting framework for AI system reliability and performance. Armilla Inc., 2025.
  6. National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST AI 100-1. U.S. Department of Commerce, January 2023.
  7. Future Proof Intelligence. FP Certified Methodology Documentation, v1.0. May 2026. Available at: https://agentcertified.eu/methodology.html
  8. European Commission. Questions and Answers on the AI Act: guidance on high-risk AI system classification under Annex III. 2025.