- An Agent Certified tier is valid for 12 months. It is not a permanent label. The annual review confirms that the agent's configuration, governance, and documentation remain consistent with the certified tier. A missed review results in lapsed status.
- Four categories of events trigger a mandatory reassessment before the annual review date: model changes, scope changes, incident events, and governance changes. Any one of them restarts the clock for the affected dimensions.
- EU AI Act Article 72 post-market monitoring, required for providers of high-risk AI systems, generates the primary evidence base for ongoing certification maintenance. An Article 72 event that reveals a new failure mode is a reassessment trigger under the certification framework.
- Seven categories of documentation must be kept current throughout the certification period: authorised scope definition, technical documentation, audit log, incident register, risk register, vendor due diligence record, and board mandate. Gaps in any of these categories will result in a downgraded tier at the annual review.
- AI insurance underwriters price against current certification status, not original certification status. Operators with annual certification reviews should sequence them to complete before the insurance renewal submission.
The assessment process for AI agent certification is intensive. Operators compile evidence across seven dimensions, answer assessor questions, identify gaps they did not know existed, and receive a dated tier classification. The most common misreading of what follows is that the work is now done. The tier has been assigned. The badge is on the website.
That reading is incorrect. A tier is a snapshot of the agent's risk profile at a point in time. The agent's configuration changes. The model underlying it is updated. The regulatory context it operates in evolves. New incidents occur. Governance structures shift. Each of those events can change the agent's actual risk profile without changing the tier, creating a gap between what the certificate says and what is true. The maintenance programme exists to close that gap.
The 12-month validity window
An initial certification assessment produces a tier classification that is valid for 12 months from the date of issue. During that period, the operator may represent the agent as holding the certified tier to insurers, counterparties, procurement teams, and regulators.
At the 12-month mark, the operator must complete an annual review. The review is a lighter process than the initial assessment. It does not require a full reassessment of all seven dimensions. It requires the operator to confirm, with supporting documentation, that each dimension's evidence base remains current and that no undisclosed trigger events have occurred during the period.
Where the annual review reveals that the evidence base for one or more dimensions has deteriorated, the relevant dimensions are rescored and the tier is adjusted accordingly. Where the review reveals an undisclosed trigger event, a full reassessment of the affected dimensions is required before the tier is renewed.
Operators are advised to conduct a mid-cycle internal audit at the six-month mark. The purpose is to identify changes that may affect the certification status before the formal annual review date, allowing time to remediate gaps rather than discovering them under time pressure.
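The key dates in a certification cycle follow from the issue date. The sketch below is a minimal illustration in Python, assuming that both the validity window and the mid-cycle audit are counted in calendar months from the date of issue. The function names `add_months` and `certification_dates` are hypothetical and do not correspond to any assessment body's tooling.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clipped to the last day of the target month,
    so 31 August plus six months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def certification_dates(issued: date) -> dict[str, date]:
    """Key dates for one certification cycle (assumed to be month-based)."""
    return {
        "issued": issued,
        "mid_cycle_audit": add_months(issued, 6),     # advisory internal audit
        "annual_review_due": add_months(issued, 12),  # end of validity window
    }


dates = certification_dates(date(2025, 8, 31))
print(dates["mid_cycle_audit"])    # 2026-02-28
print(dates["annual_review_due"])  # 2026-08-31
```

Clipping to the end of the month is a design choice for this sketch; an assessment body may define the window differently, for example as 365 days.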
The four reassessment triggers
The following events trigger a mandatory reassessment before the scheduled annual review. Operators who experience a trigger event must notify the assessment body within 30 days and initiate the reassessment process within 60 days. Failure to respond within the 60-day window results in the certification being placed on review hold, during which the operator may not represent the agent as currently certified.
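The two windows can be tracked per trigger event. The following Python sketch is illustrative only. It assumes that both the 30-day and the 60-day window run from the date of the trigger event, which the text does not specify, and the `TriggerEvent` class and its status labels are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

NOTIFY_WINDOW = timedelta(days=30)    # notify the assessment body
INITIATE_WINDOW = timedelta(days=60)  # initiate the reassessment


@dataclass
class TriggerEvent:
    category: str                     # "model", "scope", "incident" or "governance"
    occurred: date
    notified: Optional[date] = None   # date the assessment body was notified
    initiated: Optional[date] = None  # date the reassessment was initiated

    @property
    def notify_by(self) -> date:
        return self.occurred + NOTIFY_WINDOW

    @property
    def initiate_by(self) -> date:
        return self.occurred + INITIATE_WINDOW

    def status(self, today: date) -> str:
        """Classify the event against the two windows as of `today`."""
        if self.initiated is not None and self.initiated <= self.initiate_by:
            return "reassessment initiated"
        if today > self.initiate_by:
            return "review hold"
        if self.notified is None and today > self.notify_by:
            return "notification overdue"
        return "within window"


event = TriggerEvent(category="model", occurred=date(2025, 3, 1))
print(event.notify_by, event.initiate_by)  # 2025-03-31 2025-04-30
print(event.status(date(2025, 5, 15)))     # review hold
```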
Model changes
A model change occurs when the operator replaces the underlying AI model, fine-tunes an existing model on substantially different training data, or integrates a new model into a multi-model pipeline in a way that changes the agent's output behaviour. Minor version updates released by the model provider that do not change the model's behaviour in the agent's operational domain do not count as a model change for this purpose.
The rationale for this trigger is direct. The initial assessment scored the agent's Trust and Safety, Context Integrity, and Product Maturity dimensions against the behaviour of the model in production at the time. A different model may exhibit different guardrail behaviours, different hallucination patterns, and different performance characteristics. An assumption that behaviour carries over after a model change is not evidence, and assessors will not accept it as such.
Scope changes
A scope change occurs when the operator adds a new authorised action category to the agent's mandate, expands the agent's operational domain to a new sector or user population, or removes a human oversight checkpoint that was present at the time of the assessment. Scope changes are particularly relevant to the Autonomy Envelope dimension and to the Distribution Control dimension.
EU AI Act Article 25 is relevant here. Where a deployer makes a substantial modification to a high-risk AI system, the deployer may assume provider obligations under that Article. A scope change that constitutes a substantial modification for EU AI Act purposes will almost certainly also constitute a reassessment trigger for certification purposes. Operators with high-risk AI systems should treat the Article 25 substantial modification analysis and the certification reassessment trigger analysis as parallel exercises.[1]
Incident events
An incident event occurs when a safety failure, a regulatory inquiry, a third-party claim, or a documented customer harm arises from the agent's operation, regardless of whether the incident was resolved cleanly. The trigger does not require that the incident resulted in litigation or formal regulatory action. It requires only that the incident fell within the agent's operational scope and that the agent's output or action was a contributing factor.
The incident trigger exists because an incident is direct evidence of a gap in the agent's risk profile. Where an assessment had scored a dimension at a level implying that the failure mode was controlled, an incident demonstrates that the control was less effective than the evidence at the time of assessment indicated. The reassessment focuses on the dimensions most relevant to the incident's root cause.
Governance changes
A governance change occurs when the named senior owner identified in the assessment is replaced, when the AI risk policy referenced in the assessment changes materially, or when the board alters the agent's mandate in a way that changes the scope or the risk tolerance boundary. Governance changes affect the Governance dimension directly and may affect the Autonomy Envelope dimension where the board mandate is the authority for the agent's action boundaries.
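The link between trigger categories and dimensions described in this section can be captured as a lookup. The Python sketch below is an illustration, not part of the certification framework. It uses only the dimension names mentioned in the text, and the function name and its parameters are hypothetical. Incident events have no fixed set of dimensions, so the caller supplies those identified by the root cause analysis.

```python
from typing import Iterable

# Dimensions the text associates with each trigger category.
# Incident events have no fixed set: the reassessment follows the root cause.
TRIGGER_DIMENSIONS: dict[str, frozenset[str]] = {
    "model": frozenset({"Trust and Safety", "Context Integrity", "Product Maturity"}),
    "scope": frozenset({"Autonomy Envelope", "Distribution Control"}),
    "governance": frozenset({"Governance"}),
    "incident": frozenset(),
}


def dimensions_to_reassess(
    category: str,
    root_cause_dimensions: Iterable[str] = (),
    mandate_sets_action_boundaries: bool = False,
) -> set[str]:
    """Return the dimensions affected by a trigger event of `category`."""
    if category not in TRIGGER_DIMENSIONS:
        raise ValueError(f"unknown trigger category: {category!r}")
    dimensions = set(TRIGGER_DIMENSIONS[category])
    if category == "incident":
        dimensions.update(root_cause_dimensions)
    if category == "governance" and mandate_sets_action_boundaries:
        # The board mandate is the authority for the agent's action boundaries.
        dimensions.add("Autonomy Envelope")
    return dimensions


print(sorted(dimensions_to_reassess("governance", mandate_sets_action_boundaries=True)))
# ['Autonomy Envelope', 'Governance']
```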
EU AI Act Article 72 and post-market monitoring
Article 72 of Regulation (EU) 2024/1689 requires providers of high-risk AI systems to establish and document a post-market monitoring system that actively collects and analyses data about the system's performance in production following its placement on the market.[2] The monitoring plan must be part of the technical documentation filed with the conformity assessment. The data collected must cover the system's accuracy, robustness, and the emergence of risks that were not identified in the initial conformity assessment.
For operators who are providers of high-risk AI systems and hold an Agent Certified tier, the Article 72 monitoring output and the certification maintenance programme operate in parallel and should be managed together where possible.
The Article 72 monitoring log is the most technically detailed record of the agent's performance in production. It is exactly the evidence base that the certification annual review needs to confirm the Product Maturity and Trust and Safety dimensions. Organisations that build a robust Article 72 monitoring system do not need to build a separate evidence collection process for certification maintenance. The monitoring data serves both purposes.
Critically, an Article 72 event that reveals performance degradation, a new failure mode, or a risk not captured in the initial conformity assessment constitutes an incident event for certification purposes and triggers the reassessment process. Operators should configure their Article 72 monitoring so that events meeting the certification trigger criteria are flagged automatically and routed to the person responsible for managing the certification programme.
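One way to implement the automatic flagging described above is a small filter between the monitoring system and the certification owner. The Python sketch below is a minimal illustration under stated assumptions: the event kinds, the `MonitoringEvent` structure, and the `route_events` function are hypothetical, and a real monitoring system will have its own event schema.

```python
from dataclasses import dataclass
from typing import Callable

# Event kinds the text treats as incident events for certification purposes.
CERTIFICATION_TRIGGER_KINDS = {
    "performance_degradation",
    "new_failure_mode",
    "unassessed_risk",  # risk not captured in the initial conformity assessment
}


@dataclass(frozen=True)
class MonitoringEvent:
    event_id: str
    kind: str
    summary: str


def route_events(
    events: list[MonitoringEvent],
    notify_owner: Callable[[MonitoringEvent], None],
) -> list[MonitoringEvent]:
    """Send trigger-grade events to the certification owner and return them."""
    flagged = [e for e in events if e.kind in CERTIFICATION_TRIGGER_KINDS]
    for event in flagged:
        notify_owner(event)
    return flagged


inbox: list[MonitoringEvent] = []  # stands in for the owner's notification channel
events = [
    MonitoringEvent("ev-1", "routine_metric", "weekly accuracy within tolerance"),
    MonitoringEvent("ev-2", "new_failure_mode", "unexpected refusal pattern observed"),
]
flagged = route_events(events, inbox.append)
print([e.event_id for e in flagged])  # ['ev-2']
```

Passing the notification channel as a callable keeps the filter independent of how the owner is actually contacted, whether by ticket, email, or a message queue.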
Documentation to keep current
Seven categories of documentation must be maintained and kept current throughout the certification period. These are the same categories assessors review at the annual review and at any triggered reassessment.
Authorised scope definition. The written specification of what the agent is permitted to do, with whom, under what authority, and with what human oversight thresholds. This document should carry a version date and be updated whenever the scope changes. The current version must be accessible without modification at any point during the certification period.
Technical documentation. A description of the AI system's architecture, the model or models in use, their training data, their known limitations, and the guardrails and safety controls in place. The technical documentation should be updated when the model changes, when guardrails are modified, and when new limitations are identified. For operators subject to the EU AI Act, the technical documentation required under Article 11 and Annex IV is a superset of what certification maintenance requires and should be maintained under the same version control discipline.
Audit log. The tamper-evident record of the agent's inputs, processing steps, outputs, and any agent-initiated actions. The audit log is the primary evidence base for incident investigation and for demonstrating compliance with the scope definition. Logs must be retained for a minimum period consistent with the insurance policy schedule and relevant national law. Most AI insurance policies specify 12 to 24 months.
Incident register. A record of every incident, near-miss, or deviation from expected behaviour, with documented response, root cause analysis, and resolution. The incident register demonstrates that the operator's safety controls are operating as described. An empty incident register at the annual review is not, on its own, evidence of a clean record; it may equally indicate that incidents are not being recorded. Assessors will test which of these two explanations applies.
Risk register entry. A current risk register entry for the AI agent, with the active risk rating, the listed controls and mitigations, and the most recent review date. The risk register demonstrates that the agent is within the organisation's active risk management process, not treated as a separate technical artefact outside governance.
Vendor and model supplier due diligence record. Documentation of the due diligence carried out on the model provider and any third-party components in the agent's pipeline. This should be updated when the model supplier is changed, when a vendor's AI governance practices change materially, or when a new third-party component is added. For operators subject to the EU AI Act, a current record of third-party components and their compliance status also supports the Article 13 transparency obligations for providers and the Article 26 obligations for deployers.
Board mandate and ownership record. The document or meeting record confirming the named senior owner, the board's understanding of the agent's purpose and risk, and the current mandate. Where the mandate includes an explicit risk tolerance boundary, that boundary should be stated and dated. This is the governance backbone that assessors will reference first when evaluating the Governance dimension at the annual review.
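A simple currency check across the seven categories can be run before the mid-cycle audit or the annual review. The Python sketch below is illustrative. It assumes that a document counts as current if it has been reviewed on or after the start of the certification period, which is a simplification of what an assessor would examine, and the function name `documentation_gaps` is hypothetical.

```python
from datetime import date
from typing import Optional

DOCUMENT_CATEGORIES = (
    "authorised scope definition",
    "technical documentation",
    "audit log",
    "incident register",
    "risk register entry",
    "vendor due diligence record",
    "board mandate and ownership record",
)


def documentation_gaps(
    last_reviewed: dict[str, Optional[date]],
    period_start: date,
) -> dict[str, str]:
    """Return the categories that are missing or not reviewed this period."""
    gaps: dict[str, str] = {}
    for category in DOCUMENT_CATEGORIES:
        reviewed = last_reviewed.get(category)
        if reviewed is None:
            gaps[category] = "missing"
        elif reviewed < period_start:
            gaps[category] = "not reviewed in this period"
    return gaps


# Hypothetical review dates: everything reviewed in June 2025 except two items.
records: dict[str, Optional[date]] = {name: date(2025, 6, 1) for name in DOCUMENT_CATEGORIES}
records["risk register entry"] = date(2024, 11, 15)
records["incident register"] = None
print(documentation_gaps(records, period_start=date(2025, 1, 1)))
# {'incident register': 'missing', 'risk register entry': 'not reviewed in this period'}
```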
The relationship between certification and insurance renewal
AI insurance underwriters who price against certification evidence do not price against the original assessment. They price against the current certification status at the time of the renewal submission. Where the policy was placed on the basis of an Agent Certified tier, the renewal underwriting review will typically ask for one or more of the following: the most recent annual review report, the most recent reassessment outcome, confirmation that no undisclosed trigger events occurred during the policy period, and the current status of the agent in the certification registry.
A lapsed certification, a downgraded tier, or a trigger event that was not resolved and notified to the insurer during the policy period will affect renewal pricing. In some cases it will affect the insurer's willingness to continue cover on the same terms. Certain policy forms treat a reassessment trigger event as a material fact and its non-disclosure as a failure to disclose in good faith, with consequences under the proportionate remedies framework of the UK Insurance Act 2015 or the equivalent national law provisions in member states.[3]
Operators with annual certification reviews scheduled within 90 days of their insurance renewal date are advised to sequence the review to complete before the renewal submission. The annual review report is the primary document demonstrating current certification status to the underwriter and reduces the possibility of disputes about whether the certification remains current at the time of renewal.
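The sequencing advice reduces to a date comparison. The Python sketch below is a minimal illustration. It assumes a single renewal submission date stands in for the renewal date, and the function name `review_plan` and its return strings are hypothetical.

```python
from datetime import date, timedelta

SEQUENCING_WINDOW = timedelta(days=90)


def review_plan(review_due: date, renewal_submission: date) -> str:
    """Advise on sequencing the annual review against the renewal submission."""
    gap = abs(review_due - renewal_submission)
    if gap > SEQUENCING_WINDOW:
        return "no sequencing needed"
    if review_due < renewal_submission:
        return "on track: review completes before renewal submission"
    return "bring review forward to complete before renewal submission"


print(review_plan(date(2026, 5, 1), date(2026, 4, 1)))
# bring review forward to complete before renewal submission
```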
For operators who are approaching their first AI insurance renewal, the guide to AI insurance claims mechanics on agentinsured.eu covers what underwriters need to see and the relationship between policy conditions and certification maintenance obligations.
When the tier changes mid-cycle
A tier can change between annual reviews in two directions. It can increase, where the operator's governance and technical capabilities have matured substantially and the operator chooses to submit an early reassessment to claim the higher tier. It can decrease, where a reassessment trigger event reveals that the current tier is no longer supported by the evidence.
A tier decrease does not necessarily mean the agent should stop operating. It means the certification tier in the registry is updated to reflect the current evidence, and the operator has 60 days to remediate the gap and request reassessment of the affected dimensions. During the remediation window, the operator should update any external representations of the certification status to reflect the current registry entry.
The decertification scenario, where the tier drops to lapsed rather than to a lower active tier, occurs in three circumstances: failure to respond to a reassessment trigger event within the 60-day window; evidence at reassessment that the agent was operating materially outside the certified scope at the time of assessment, invalidating the original certification evidence; or voluntary withdrawal by the operator. Decertification removes the agent from the active registry. A new assessment is required to reinstate an active tier.
Pre-reassessment checklist
The following checklist is suitable for use as a self-assessment before the annual review or before initiating a triggered reassessment. It covers the evidence categories and common gaps that assessors find most frequently.
Scope: Is the authorised scope document current, version-dated, and consistent with how the agent is actually operating in production? Have any undocumented scope expansions occurred in practice?
Model: Is the model version documented? Has any fine-tuning, prompt modification, or pipeline change occurred since the last assessment? Were those changes captured in the technical documentation?
Incidents: Is the incident register complete? Does it include near-misses and customer complaints that implicated the agent, not only formal incidents? Has each incident been root-cause analysed and closed?
Governance: Is the named senior owner current? Has the board reviewed the agent's operation within the last 12 months? Is the risk register entry current?
Telemetry: Are the audit logs intact for the full certification period? Is the log architecture consistent with the policy schedule's requirements?
Insurance: Is the insurer aware of all reassessment trigger events that occurred during the policy period? Has any change in the certification tier been disclosed?
For the full methodology and scoring rubric used at assessment and at annual review, see the methodology page. For the tier thresholds and minimum dimension scores, see the certification levels page. To initiate a reassessment or an annual review, use the assessment request page.