AI is in your credentialing programme. It might be in your scoring pipeline, your proctoring tools, your item generation workflow, or buried inside features your platform vendor added in the last release. Wherever it sits, the question your board, your auditors, and your candidates will eventually ask is the same. Can you explain how it works, prove it is fair, and defend the decisions it influences?
A policy memo will not answer that question. A governance system will. Two ISO standards, ISO/IEC 42001 and ISO/IEC 23894, give credentialing bodies a practical operating model for doing this well. This article translates them into what to actually build, in what order, and what to expect from your suppliers.
Why a governance system, not a policy memo
Credentialing organisations are familiar with management system thinking. ISO 9001 quality management, ISO 27001 information security, and ISO 17024 personnel certification all share the same logic. You define what you do, you assign responsibility, you run it, you check it, and you improve it. The same discipline applies to AI, and ISO 42001 codifies it.
The problem with treating AI governance as a one-off policy is that AI changes constantly. Models update. Vendors retrain. Features get added to tools you already use. A static document cannot keep pace. A management system, with documented controls, change processes, and review cycles, can.
If your organisation already runs ISO-aligned quality, security, or certification processes, ISO 42001 should feel like a familiar shape. If it does not, that is a separate issue worth addressing, because the EU AI Act, the AERA/APA/NCME Standards for Educational and Psychological Testing, and the expectations of most regulators all assume this kind of operating discipline exists.
ISO/IEC 42001: the management system that gives you structure
ISO/IEC 42001 is the world’s first AI management system standard. It is designed for any organisation that develops, provides, or uses AI-based products or services. Credentialing bodies sit inside that scope, whether they build AI in-house or consume it through vendors.
The core of 42001 is Plan, Do, Check, Act applied to AI. You plan by establishing policies, accountable roles, and risk-based controls. You do by operating those controls across the AI lifecycle. You check through monitoring, audits, and reviews. You act by feeding findings back into improvements. None of this is novel. What is new is that the standard explicitly addresses AI-specific concerns including model lifecycle management, supplier governance, transparency, and oversight obligations that go beyond generic IT controls.
For credentialing leaders, the value of 42001 is that it provides a defensible structure that other people recognise. When a buyer, regulator, or accreditor asks how you manage AI, the answer “we operate to ISO 42001” is a more substantial response than “we have a policy.”
ISO/IEC 23894: the risk method that gives you process
ISO/IEC 23894 sits alongside 42001 and answers a different question. It is not a management system. It is guidance on how to identify, assess, treat, and monitor AI-specific risks, and how to integrate that work into your existing risk management practice.
The standard maps to ISO 31000-style risk logic. You establish context, you identify risks, you analyse and evaluate them, you treat them, and you monitor and review. Where 23894 adds value for credentialing is in its treatment of AI-specific risk categories. Validity threats, fairness gaps, transparency limitations, security and leakage risks, and harm scenarios for affected groups all need explicit consideration.
Critically, 23894 is designed to be customised to your context. A formative diagnostic and a licensure exam have completely different risk profiles. The standard does not impose a single template. It gives you a method that scales.
The defensibility test that matters
For credentialing bodies, the central risk is not bias in the abstract. It is defensibility. Three questions tell you whether your governance is fit for purpose.
The three-question defensibility test
- Can you explain how an AI-influenced score or decision was generated, in language a candidate or appeals panel can understand?
- Can you defend the decision when challenged, with documented evidence rather than assertions?
- Can you satisfy a regulator, employer, or accrediting body that asks how you know the decision is sound?
If the honest answer to any of these is no, you have a defensibility gap that 42001 and 23894 are designed to close. The standards do not make AI safer in a technical sense. They make your governance demonstrable.
ISO/IEC 42006 and the certification market
The certification ecosystem around AI management systems is forming quickly. ISO has published 42006, which sets requirements for the bodies that audit and certify against ISO 42001. Accredited 42001 certifications are now appearing in the market. This matters because procurement and accreditation will follow.
Expect the questions in vendor RFPs and partnership due diligence to shift from “tell me about your AI ethics policy” to “show me your AI management system and your latest audit.” Credentialing organisations that get ahead of this shift will find themselves selling certification readiness as a competitive advantage. Those that lag will find themselves answering questions they did not prepare for.
Scale governance to stakes
The principle is the same as in any well run quality system. Match the weight of your controls to the consequences of failure.
Low-stakes uses, including practice tests and formative feedback, need transparency, data protection, and logging. They are fast to deploy and cheap to govern, but they still need to appear in your AI register and have a documented rationale for their classification.
Medium-stakes uses, including micro-credentials and modular components, need a documented risk assessment, structured human review, and a clear change control process when models or vendors update.
High-stakes uses, including licensure, chartered status, and regulated practice gating, need the full discipline. A risk file per use case, real human oversight with authority to override or stop, robust monitoring, and an evidence pack that can survive an audit.
The trap to avoid is letting low-stakes classifications become a way to skip governance work. Even a five-minute risk note is better than no record at all.
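To make the tiering concrete, here is a minimal sketch, assuming a simple Python record per register entry. The tier names, control labels, and inheritance rule are illustrative choices, not terms taken from either standard.

```python
# Hypothetical sketch only: tier names, control labels, and the
# inheritance rule are illustrative, not taken from ISO/IEC 42001.

REQUIRED_CONTROLS = {
    "low": ["transparency_notice", "data_protection", "logging",
            "classification_rationale"],
    "medium": ["risk_assessment", "structured_human_review", "change_control"],
    "high": ["risk_file", "oversight_with_override_authority",
             "robust_monitoring", "evidence_pack"],
}

TIER_ORDER = ["low", "medium", "high"]

def missing_controls(entry: dict) -> list[str]:
    """Return the controls this AI register entry still lacks for its tier.
    Higher tiers inherit every obligation of the tiers below them."""
    tiers = TIER_ORDER[: TIER_ORDER.index(entry["stakes_tier"]) + 1]
    required = [c for t in tiers for c in REQUIRED_CONTROLS[t]]
    return [c for c in required if c not in entry.get("controls_in_place", [])]

entry = {
    "use_case": "AI-assisted essay scoring",   # hypothetical example
    "stakes_tier": "high",
    "controls_in_place": ["transparency_notice", "logging", "risk_assessment"],
}
print(missing_controls(entry))
```

The design point is that higher tiers inherit every lower-tier obligation, so reclassifying a use case upward automatically expands what the register demands of it.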
The 90-day operating model
A practical sequence for standing up 42001- and 23894-aligned governance in three months looks like this.
In the first 30 days, build the AI register across the full assessment lifecycle. Stand up a small governance group with clear decision rights covering legal, psychometrics, assessment operations, and product. Approve an interim AI policy and freeze adoption of unknown AI features pending review.
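As a sketch of what one register entry might capture, assuming a plain dataclass record; the field names are illustrative, not prescribed by ISO/IEC 42001.

```python
# Hypothetical AI register entry; fields are illustrative only.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AIRegisterEntry:
    use_case: str                # what the AI does, in plain language
    lifecycle_stage: str         # e.g. item generation, scoring, proctoring
    supplier: str                # "in-house" or the vendor's name
    stakes_tier: str             # low / medium / high
    owner: str                   # the accountable role, not a team name
    classification_rationale: str
    last_reviewed: date
    known_model_changes: list[str] = field(default_factory=list)

entry = AIRegisterEntry(
    use_case="Automated flagging of suspected proctoring anomalies",
    lifecycle_stage="test administration",
    supplier="ExampleProctor Ltd",   # hypothetical vendor
    stakes_tier="high",
    owner="Head of Assessment Operations",
    classification_rationale="Flags can trigger score invalidation and appeals.",
    last_reviewed=date(2025, 1, 15),
)
```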
In days 31 to 60, run 23894-style risk assessments for every medium- and high-stakes use. Define human oversight points and the authority needed to act on them. Document evidence requirements for AI scoring and proctoring support, including subgroup monitoring, drift review, and audit logs.
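The subgroup monitoring evidence can start very simply. A hedged sketch, assuming each candidate result carries a subgroup label and a pass outcome, with an arbitrary review threshold rather than anything specified by the standards:

```python
# Illustrative subgroup monitoring: compare pass rates across reported
# subgroups and flag gaps that exceed a review threshold. The threshold
# and field names are assumptions, not values from ISO/IEC 23894.
from collections import defaultdict

def subgroup_pass_rates(results: list[dict]) -> dict[str, float]:
    counts = defaultdict(lambda: [0, 0])  # group -> [passes, total]
    for r in results:
        if r["passed"]:
            counts[r["group"]][0] += 1
        counts[r["group"]][1] += 1
    return {g: p / t for g, (p, t) in counts.items()}

def flag_gaps(rates: dict[str, float], threshold: float = 0.05) -> list[str]:
    """Flag subgroups whose pass rate trails the best-performing group."""
    best = max(rates.values())
    return [g for g, r in rates.items() if best - r > threshold]

results = [
    {"group": "A", "passed": True}, {"group": "A", "passed": True},
    {"group": "B", "passed": True}, {"group": "B", "passed": False},
]
rates = subgroup_pass_rates(results)
print(rates, flag_gaps(rates))
```

A flagged gap is a trigger for psychometric review, not an automatic verdict; the point is that the review is prompted by a documented rule rather than by whoever happens to be looking.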
In days 61 to 90, assemble the evidence pack itself: construct statements, validity evidence updates, monitoring plans, change logs, override logs, and incident response procedures. Update vendor contracts to require change notification, log access, audit support, bias monitoring evidence, and incident notification. Train your teams so human oversight is operational rather than ceremonial.
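The override log in that evidence pack does not need to be elaborate. One possible shape, assuming an append-only JSON Lines file with an illustrative schema:

```python
# Minimal sketch of an override log: append-only JSON Lines, so the
# evidence pack can show oversight was exercised, by whom, and why.
# The schema and file format are assumptions, not requirements.
import json
from datetime import datetime, timezone

def log_override(path: str, case_id: str, ai_decision: str,
                 human_decision: str, reviewer: str, reason: str) -> None:
    """Append one human-override record to the log file."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_id": case_id,
        "ai_decision": ai_decision,
        "human_decision": human_decision,
        "reviewer": reviewer,
        "reason": reason,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_override("override_log.jsonl", "CASE-0042",   # hypothetical case
             ai_decision="flag: suspected collusion",
             human_decision="cleared after video review",
             reviewer="proctoring-lead@example.org",
             reason="Flagged movement was candidate adjusting equipment.")
```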
This is achievable for an organisation with existing quality discipline. It is harder, but not impossible, for one starting from scratch. In the harder case, the priority is the AI register and the governance group. Everything else builds from those.
What to demand from your vendors
Most credentialing bodies do not build their own AI. They consume it through platform vendors, proctoring partners, scoring services, and item generation tools. Vendor governance is therefore the largest single area of exposure.
The contractual minimums to insist on are:
- written commitment to notify you of material AI model or configuration changes before deployment
- access to audit logs covering AI outputs, human overrides, and decisions
- documented evidence of bias and fairness monitoring with subgroup breakdowns where relevant
- incident notification within defined timelines, with clear severity definitions
- right to audit, including third party audits where the use case is high-stakes
- explicit confirmation that prohibited practices, including emotion recognition in educational contexts, are not present in any feature you use
If a supplier cannot meet these terms, that is a procurement decision, not a negotiation. The market is moving in this direction, and the vendors who understand that will adapt quickly.
From “tell me you’re ethical” to “show me your evidence”
The shift underway in AI governance is the same shift that happened in information security a decade ago. Buyers stopped accepting vague assurances and started asking for evidence. Today, you cannot sell into a regulated industry without a SOC 2 or ISO 27001 report on hand. AI is on the same trajectory.
ISO 42001 and 23894 are the structure and the process that let credentialing bodies meet that bar. They do not solve AI governance on their own. What they do is give your organisation a recognised, defensible operating model that other people can verify. In a market where trust is the product, that is the asset worth building.
Ready to stand up an AI management system aligned to ISO 42001 and 23894?
Talk to our team about how Globebyte can help you build governance that survives an audit.