Designing Assessments Where Using AI Is the Competency

Most of the conversation about AI in credentialing assumes the question is whether to permit it or prohibit it. That framing misses the most important opportunity in front of credential owners right now. When a profession has already adopted AI tools as part of normal practice, the credential that pretends those tools do not exist is failing the candidates and failing the employers who rely on the credential. The credential that adapts to certify responsible AI use is the one that retains its market value into the next decade.

This article is a practical guide to designing assessments where using AI well is the competency being measured. It covers what such an assessment certifies, how the rubric needs to change, what evidence the candidate must produce, how it differs from a traditional take-home, and what the validity argument now looks like. It is written for credential owners who have made or are about to make the construct decision in favour of AI-supported practice and who need a workable approach to assessing it.

What this kind of credential certifies

The shift from “with AI permitted” to “AI is the competency” is more than a policy change. It changes what the credential is telling employers, regulators, and the public about the holder.

A traditional credential certifies that the holder can perform a task. A credential with AI as the competency certifies that the holder can use AI tools to perform a task, verify the output, take accountability for the result, and recognise when the tools are not appropriate for the situation. The assessment is no longer about whether the candidate can produce the right answer. It is about whether the candidate can produce a sound answer, defend the reasoning, and demonstrate professional judgement in choosing and verifying the tools they used.

This is a richer construct than the traditional one. It is also closer to what most professional practice looks like now. A modern software engineer uses AI assistants. A modern legal researcher uses AI search and synthesis tools. A modern data analyst uses AI for code generation and data exploration. A modern marketer uses AI for content drafting and audience analysis. In every one of these professions, the credential that ignores AI is certifying competence in a version of the job that no longer exists.

“The credential that ignores AI is certifying competence in a version of the job that no longer exists.”

The rubric is the design problem

The traditional assessment rubric grades the output. The AI-as-competency rubric grades the process. This sounds like a small change. It is the central design problem of the entire assessment, and it determines whether the resulting credential is defensible.

“The traditional rubric grades the output. The AI-as-competency rubric grades the process.”

A rubric that grades only the output cannot distinguish between a candidate who used AI well and a candidate who copied an AI output without thinking. The two submissions can look identical on the surface: the same final document, the same conclusions, the same recommendations. The competence the credential is supposed to certify lives in the steps that were not visible.

A rubric that grades the process pulls those invisible steps into view. It rewards the candidate who chose the right tool for the task, prompted it effectively, recognised when the output was wrong, verified the parts that mattered, and took responsibility for the final recommendation. It penalises the candidate who accepted whatever the tool produced without examination, missed obvious errors, or failed to disclose the use.

In practice, an AI-as-competency rubric usually has four major dimensions:

The four rubric dimensions

  • Tool selection — Did the candidate choose appropriate tools for the task, with reasoning that shows awareness of what each tool can and cannot do well?
  • Interaction quality — Did the candidate prompt and steer the tools effectively, refine outputs through iteration, and make use of the tools’ actual capabilities rather than treating them as black boxes?
  • Verification and judgement — Did the candidate critically evaluate the outputs, identify errors and limitations, cross-check key claims, and reject material that did not stand up?
  • Professional accountability — Did the candidate take responsibility for the final result, document the use appropriately, identify any ethical or compliance considerations, and communicate the AI involvement in a way that downstream users can rely on?

These dimensions are not theoretical. They are the same criteria a senior professional would use to evaluate a junior colleague’s AI-supported work. The rubric is operationalising the judgement that already happens informally in practice.

The evidence the candidate has to produce

A traditional take-home assessment asks for a final submission. An AI-as-competency assessment asks for the submission plus the evidence trail that shows how it was produced. The evidence trail is what makes the rubric possible to apply.

The minimum evidence package for an AI-as-competency assessment includes:

  • the final submission;
  • a structured AI disclosure listing the tools used and the purpose of each;
  • the prompt history or interaction log for any substantive AI use;
  • any AI outputs that were rejected or revised, with brief notes on why;
  • the verification steps taken on any AI-generated content;
  • a short reflection from the candidate on the appropriateness of the AI use and any limitations they encountered.
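One way to make the minimum package operational is a simple completeness check at submission time. The sketch below is illustrative only: the artefact names are hypothetical labels for the items above, not a published standard.

```python
# Hypothetical labels for the minimum evidence package described above.
REQUIRED_ARTEFACTS = {
    "final_submission",
    "ai_disclosure",
    "interaction_log",
    "rejected_outputs_with_notes",
    "verification_steps",
    "candidate_reflection",
}

def missing_artefacts(pack: set[str]) -> set[str]:
    """Return the required artefacts absent from a submitted evidence pack."""
    return REQUIRED_ARTEFACTS - pack

# Example: a pack submitted without its reflection or interaction log
print(sorted(missing_artefacts({
    "final_submission", "ai_disclosure",
    "rejected_outputs_with_notes", "verification_steps",
})))  # ['candidate_reflection', 'interaction_log']
```

A check like this belongs at intake, before marking begins, so that incomplete packs are returned to the candidate rather than scored against an evidence trail that is not there.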

This sounds like a lot of paperwork. It is less than it sounds, because most of the artefacts already exist as a natural by-product of using the tools. Modern AI assistants log conversations. Modern collaborative documents track changes. The discipline the candidate is being asked to demonstrate is to organise these artefacts into a coherent package, not to manufacture them from scratch.

The structured AI disclosure is the artefact that connects the evidence pack to the rubric. A workable disclosure format has three parts. It lists each tool used with the version and the purpose of use. It describes for each substantive use what the AI produced, how the candidate evaluated it, and what verification steps were applied. It includes a short declaration that the submitted work represents the candidate’s own competence and judgement, with any specific contributions clearly attributed.
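The three-part disclosure can be captured as structured data rather than free text, which makes it easier to check for completeness and to map onto the rubric. A minimal sketch, with illustrative field names and a made-up tool name rather than any standard schema:

```python
# Illustrative three-part AI disclosure; all field names and the tool
# name "ExampleAssistant" are hypothetical, not a published format.
disclosure = {
    "tools": [
        {"name": "ExampleAssistant", "version": "2025.1",
         "purpose": "first-pass synthesis of source material"},
    ],
    "substantive_uses": [
        {
            "what_the_ai_produced": "draft summary of twelve sources",
            "how_it_was_evaluated": "checked each claim against the cited source",
            "verification_steps": ["spot-checked quotations",
                                   "re-derived the key figures by hand"],
        },
    ],
    "declaration": (
        "The submitted work represents my own competence and judgement; "
        "specific AI contributions are attributed above."
    ),
}

# The three parts map directly onto the rubric's evidence requirements
assert set(disclosure) == {"tools", "substantive_uses", "declaration"}
```

Keeping the disclosure structured also gives markers a consistent place to look when scoring the verification and accountability dimensions.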

For high-stakes components, the evidence pack supports a verification interview or oral defence in addition to the rubric scoring. The interview asks the candidate to explain specific decisions in their AI use, defend their verification approach, and respond to questions about content the AI generated. This is the equivalent of the viva that supports any traditional take-home assessment, adapted to focus on the AI-supported reasoning rather than only the final conclusions.

How it differs from a traditional take-home assessment

The most common mistake in designing this kind of assessment is to start with a traditional take-home and add AI permissions. The result is an assessment that is neither one thing nor the other. It permits AI without measuring whether AI was used well, and it grades the output without distinguishing between thoughtful use and lazy use.

A genuine AI-as-competency assessment differs from a traditional take-home in five ways.

The first is the task itself. A traditional task can be answered by writing a good response. An AI-as-competency task is designed so that AI use is necessary or strongly advantageous for completing it well, so the absence of AI use is itself a finding. This typically means tasks that require synthesis across more sources than a candidate could feasibly handle unaided in the time available, or tasks that require the kind of generation and iteration that AI tools accelerate.

The second is the time budget. AI-as-competency assessments are usually shorter than the traditional equivalent, because the tools handle much of the mechanical work. The time saved is reallocated to the judgement and verification work that the rubric is grading.

The third is the scoring weight. In the rubric breakdown, the “process” dimensions (tool selection, interaction, verification, accountability) usually carry more weight than the surface quality of the final output. A submission with a well-presented but flawed conclusion can still score acceptably if the process evidence shows that the candidate followed sound judgement and the final flaw was a defensible error. A submission with an immaculate conclusion can fail if the process evidence shows that the candidate did not engage with the verification work the rubric requires.
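The weighting idea can be sketched as a simple combination rule. The dimension weights below are hypothetical placeholders; a credential owner would set them through rubric development and piloting, not copy them from here.

```python
# Hypothetical weights: process dimensions carry the majority of the mark.
# Actual values are a design decision for the credential owner.
RUBRIC_WEIGHTS = {
    "tool_selection": 0.20,
    "interaction_quality": 0.20,
    "verification_and_judgement": 0.35,
    "professional_accountability": 0.25,
}

def weighted_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (0-100) into an overall mark."""
    assert set(dimension_scores) == set(RUBRIC_WEIGHTS), "score every dimension"
    return sum(RUBRIC_WEIGHTS[d] * s for d, s in dimension_scores.items())

# Example: strong verification evidence outweighs a weaker interaction score
print(round(weighted_score({
    "tool_selection": 80,
    "interaction_quality": 60,
    "verification_and_judgement": 90,
    "professional_accountability": 85,
}), 2))  # 80.75
```

The point of making the weights explicit is exactly the one in the paragraph above: a candidate cannot reach a passing mark on surface polish alone when the process dimensions dominate the total.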

The fourth is the disclosure requirement. Where a traditional take-home asks “did you do this yourself”, an AI-as-competency assessment asks “what did you do, and what did the tools do, and how did you verify the result”. The disclosure is structured, not optional, and it is part of what is graded.

The fifth is the appeals position. When a candidate disputes the result of an AI-as-competency assessment, the conversation is grounded in the evidence pack and the rubric. The credential owner can show specifically which dimension scored low and why. The candidate can defend their reasoning against documented criteria. This is a more defensible position than the traditional “your work was not good enough” verdict that take-home assessments often produce.

What the validity argument looks like

The validity argument for an AI-as-competency assessment has to demonstrate three things. First, that the tasks measure the construct that the credential certifies, which is competent AI-supported professional practice in the domain. Second, that the rubric can reliably distinguish stronger candidates from weaker ones on the dimensions that matter. Third, that the assessment produces decisions that hold up across the candidate population without subgroup bias.

The first of these requires evidence that the tasks reflect actual professional practice. This is similar to the job analysis or practice analysis that supports any credentialing assessment, with the additional requirement that the practice being analysed includes AI tool use as it currently happens. Practitioners describe their workflows, the tools they use, the decisions they make, and the verification steps they take. The assessment tasks are designed to reflect that reality.

The second requires evidence that the rubric is reliable across raters and consistent across applications. Inter-rater agreement statistics on AI-as-competency rubrics should follow the same standards as any other rubric, and the methods covered in our companion article on inter-rater agreement and AI scoring apply directly. Where AI scoring is used to support the rubric, the same evidence requirements apply.
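As one concrete example of the agreement statistics involved, Cohen's kappa for two markers assigning categorical rubric bands can be computed directly. This is a standard formula, sketched here with made-up band labels and data for illustration:

```python
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same band at random
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two markers banding the same six submissions (illustrative data)
marker_1 = ["pass", "pass", "fail", "merit", "pass", "fail"]
marker_2 = ["pass", "merit", "fail", "merit", "pass", "fail"]
print(round(cohens_kappa(marker_1, marker_2), 2))  # 0.75
```

Kappa is only one of the statistics a reliability study would report, but it illustrates the general requirement: raw percentage agreement overstates reliability, and the evidence in the validity argument needs to correct for chance.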

The third requires subgroup analysis on rubric outcomes, the same way it would for any other assessment. The novel concern with AI-as-competency assessments is that candidates from different backgrounds may have very different access to AI tools in their study environment, and the assessment needs to either neutralise this gap by providing equal access at the assessment itself or address it through the construct definition. Both approaches are defensible. The decision needs to be documented in the validity argument.

What changes for the candidate experience

For candidates, AI-as-competency assessments feel different from traditional take-homes in ways that benefit fairness and clarity. The rules are explicit. The expectations are visible in the rubric. The disclosure requirement removes the ambiguity that creates appeals under traditional models. The tasks reflect the work the candidate is actually preparing to do in their profession.

The discipline the candidate has to develop is the discipline of working with AI thoughtfully rather than reflexively. Most candidates approaching one of these assessments for the first time discover that they have been using AI tools in a less structured way than the assessment requires, and that the discipline of disclosing, verifying, and reflecting actually improves their work. The assessment becomes the moment they level up their practice, which is exactly what good credentialing assessments should do.

Candidate-facing materials should make all of this explicit. The rules, the rubric, the evidence requirements, and worked examples of strong and weak submissions all belong in the candidate handbook before the assessment opens. Surprises at the rubric stage are unfair and create avoidable appeals.

What the credential owner needs to be ready for

Designing this kind of assessment well takes more upfront work than designing a traditional take-home. The construct statement has to be explicit. The rubric has to be developed, piloted, and refined. The disclosure format has to be designed. The marker training has to cover both the domain content and the AI-specific judgement criteria. The candidate-facing materials have to be written from scratch.

In return, the credential owner gets an assessment that certifies a competence the market actually values, that produces defensible decisions under appeal, and that ages better than traditional alternatives because it adapts as the tools change. The tools will keep changing. A rubric grounded in tool selection, interaction quality, verification, and accountability remains valid even when the specific AI products shift, because the underlying judgement is what is being measured.

“A rubric grounded in tool selection, interaction quality, verification, and accountability remains valid even when the specific AI products shift.”

For credential owners considering whether to invest in this design, the test is straightforward. Does the profession your credential serves currently use AI tools in normal practice? If yes, the assessment that ignores AI is the one with the validity problem, not the one that embraces it. The work to redesign is real, but the work to defend a credential that is increasingly out of step with practice is greater, and gets harder every cycle.

A practical starting point

Three steps move a credential from “AI permitted” to “AI is the competency”.

From “AI permitted” to “AI is the competency”

  • Pilot the approach on one component, ideally a take-home or portfolio component where AI use is already common in candidate practice. Build the construct statement, the rubric, and the disclosure format. Run it with a small cohort and compare the results to the traditional assessment they would have sat. Look for cases where the new approach produces a clearly better signal about candidate readiness.
  • Build the supporting infrastructure. Marker training, candidate guidance, sample submissions, and the appeals process all need to be in place before the approach scales. The pilot is the moment to refine each of these.
  • Decide whether and how to roll the approach out across the credential. Some components may stay traditional because the construct is genuinely unaided competence. Others may move to AI-as-competency because the construct has changed. The decision is per component, anchored to the construct decision discussed in our companion article.

The credential that emerges from this work is a stronger product than the one it replaces, because it is honest about the practice it certifies and rigorous about the judgement it measures. That combination is what credentialing has always been about. AI is the latest opportunity to demonstrate it.

Ready to redesign credentialing assessments where AI use is part of the competency?

Talk to our team about how Globebyte can help you build the construct, the rubric, and the supporting infrastructure.

Explore our services
