Quality only becomes operational when “good” is defined in terms of observable conversation behaviors, tied to customer outcomes, and measured with evidence. The goal is not a perfect scorecard; it is a stable, shared standard that supervisors can coach to, QA can apply consistently, and leaders can trust at scale.
Many teams believe they have defined quality because they have a scorecard. In practice, the scorecard is often a mix of reasonable intentions and unclear measurements. It produces numbers, but it does not consistently produce agreement, coaching, or improvement.
Defining “good” is not a documentation exercise. It is an operating decision. If the definition is vague, quality becomes subjective. If it is too rigid, teams optimize for compliance with the form rather than outcomes in the conversation. If it cannot be explained with evidence, it will not hold up under scrutiny.
The goal is a definition of quality that teams can run every day.
A score is an output. A standard is what the organization agrees to measure and improve. Without a shared standard, quality programs drift into personal preference and local norms.
A workable standard has three properties:
- It is observable: it names behaviors that can be seen in the conversation itself.
- It is tied to outcomes: it measures what actually affects the customer, not just the form.
- It is evidenced: every rating can be backed by something in the transcript.
If any of these are missing, teams will argue about interpretation, agents will distrust feedback, and coaching will turn into opinion.
A quality program that scales is a program that creates alignment.
When teams struggle to define quality, it is often because they mix different kinds of “good” in the same category. The result is confusion: agents receive feedback they cannot act on, and managers cannot separate skill issues from process issues.
A practical definition of “good” includes three layers.
Outcomes: Did the customer get what they needed, clearly and correctly?
This is the highest-level measure, but it is not enough on its own. Outcomes often depend on factors outside the agent’s control. Still, outcomes matter because they anchor the purpose of the work.
Behaviors: What did the agent do in the conversation that increased or reduced the chance of a good outcome?
Behaviors are where coaching lives. Behaviors are also where consistency is possible because they can be observed and evidenced.
Adherence: Did the agent follow required steps and say required disclosures?
This layer matters, but it should not dominate the definition of quality. Many programs overweight adherence because it is easy to score. When that happens, teams optimize for checklists rather than effective communication.
A stable quality model keeps these layers separate so the organization can see what is really happening.
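As a concrete illustration, the separation can live in the data model itself. The sketch below is one minimal way to express it in Python; the class and field names are hypothetical, not a prescribed schema.

```python
# A minimal sketch of the three-layer model. All names are illustrative;
# the point is that outcome, behavior, and adherence results are stored
# separately instead of being blended into one number.
from dataclasses import dataclass, field


@dataclass
class Outcome:
    resolved: bool       # did the customer get what they needed?
    notes: str = ""


@dataclass
class Behavior:
    name: str            # e.g. "restates the customer's question"
    observed: bool
    evidence: str = ""   # quote or transcript location supporting the rating


@dataclass
class AdherenceCheck:
    name: str            # required step or disclosure; pass/fail by design
    passed: bool


@dataclass
class Evaluation:
    conversation_id: str
    outcome: Outcome
    behaviors: list[Behavior] = field(default_factory=list)
    adherence: list[AdherenceCheck] = field(default_factory=list)

    def summary(self) -> dict:
        # Report the layers side by side, so a weak outcome with strong
        # behaviors reads as a process signal, not an agent failure.
        return {
            "outcome_resolved": self.outcome.resolved,
            "behaviors_observed": sum(b.observed for b in self.behaviors),
            "behaviors_total": len(self.behaviors),
            "adherence_passed": all(c.passed for c in self.adherence),
        }
```

Because the layers never collapse into one score, a dip in outcomes can be traced to behaviors, to adherence, or to neither.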
The quickest way to ruin a quality definition is to use words that sound right but cannot be scored consistently.
Common examples include “clear communication” and “empathy.”
These are not wrong. They are incomplete. They describe a feeling, not an observable behavior.
To make quality operational, definitions need observable evidence. The simplest method is to convert abstract concepts into measurable behaviors.
Instead of “clear communication,” define “clear” as behaviors like:
- Restates the customer’s question before answering.
- Uses plain language instead of internal jargon.
- Closes with a summary of next steps.
Instead of “empathy,” define behaviors like:
- Acknowledges the customer’s specific situation in their own words.
- Responds to stated frustration before pushing the process forward.
The point is not to reduce conversations to scripts. The point is to give supervisors and agents a shared language for what “good” looks like.
If “good” cannot be pointed to in the transcript, it will not scale.
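One way to make that shared language enforceable is to keep the rubric as data: an abstract term is usable only if it maps to observable behaviors. A minimal sketch, with hypothetical behavior lists:

```python
# An illustrative rubric that converts abstract labels into observable,
# transcript-checkable behaviors. The entries are examples, not a canonical
# set; each team writes its own in the same shape.
RUBRIC = {
    "clear": [
        "restates the customer's question before answering",
        "uses plain language instead of internal jargon",
        "closes with a summary of next steps",
    ],
    "empathetic": [
        "acknowledges the customer's specific situation in their own words",
        "responds to stated frustration before pushing the process forward",
    ],
}


def is_scoreable(term: str) -> bool:
    """A term is operational only if it maps to at least one behavior."""
    return bool(RUBRIC.get(term))


print(is_scoreable("clear"))         # True: defined as behaviors
print(is_scoreable("professional"))  # False: still a feeling, not a rubric entry
```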
Quality measures should exist primarily to drive coaching and improvement. If a measure cannot produce a clear coaching conversation, it is usually the wrong measure.
A coaching-ready measure has these characteristics:
- It describes a behavior the agent actually controls.
- It can be observed directly in the conversation.
- It can be explained with one or two short pieces of evidence.
This is also how you avoid “vanity quality.” Vanity measures make teams feel controlled while producing little improvement.
A simple test works well.
If a supervisor cannot explain the score using one or two short pieces of evidence, the measure is not ready.
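That test can even be encoded as a validation rule. A sketch, assuming a simple rating record that carries verbatim quotes or transcript references as evidence:

```python
# Hypothetical rating record: a score is not coaching-ready unless it cites
# one or two short pieces of evidence from the conversation.
from dataclasses import dataclass


@dataclass
class Rating:
    measure: str
    score: int           # e.g. 0/1, or a small scale
    evidence: list[str]  # verbatim quotes or transcript line references


def coaching_ready(rating: Rating) -> bool:
    # One or two pieces: enough to explain the score, not so much that the
    # review becomes a full transcript re-read.
    return 1 <= len(rating.evidence) <= 2


r = Rating(
    measure="closes with a summary of next steps",
    score=0,
    evidence=['Call ends at 04:12 with no recap: "Okay, bye now."'],
)
print(coaching_ready(r))  # True: the score can be explained with evidence
```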
Many scorecards fail because they combine incompatible kinds of evaluation in the same category. A common example is a single “Communication” category that includes:
- Tone and clarity (an agent skill)
- Following the prescribed workflow (a process question)
- Giving accurate information (a knowledge question)
When these are mixed, teams cannot tell whether an agent needs training, whether the workflow is broken, or whether the knowledge base is wrong.
Good measurement design separates:
- Agent skill
- Process adherence
- Knowledge accuracy
- Compliance
This separation matters operationally. Skill coaching looks different from process repair. Knowledge fixes look different from compliance remediation. When categories are blended, everything becomes “agent performance,” and the organization misses systemic causes.
If quality is meant to improve the operation, it must be able to distinguish human performance from system design.
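In practice, that distinction only emerges if every finding is tagged with the kind of fix it implies. A sketch, with hypothetical categories and sample data:

```python
# Tag each finding with its implied fix so aggregation can separate skill
# gaps from broken processes, stale knowledge, or compliance risk.
from collections import Counter
from enum import Enum


class Category(Enum):
    SKILL = "agent skill"             # fix: coaching
    PROCESS = "process adherence"     # fix: workflow repair
    KNOWLEDGE = "knowledge accuracy"  # fix: update the knowledge base
    COMPLIANCE = "compliance"         # fix: remediation

# Illustrative findings: (agent, category) pairs from recent evaluations.
findings = [
    ("agent_17", Category.KNOWLEDGE),
    ("agent_03", Category.SKILL),
    ("agent_41", Category.KNOWLEDGE),
    ("agent_09", Category.KNOWLEDGE),
]

# When one category dominates across many agents, the cause is systemic,
# not individual performance.
by_category = Counter(category for _, category in findings)
print(by_category.most_common(1))  # knowledge accuracy leads: fix the system
```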
A quality model does not need to capture everything. It needs to capture what matters most and do so consistently.
At scale, consistency beats completeness.
A practical approach is to define:
- A small, stable core of measures that applies to every conversation.
- A limited set of additional measures that can vary by team, queue, or channel.
This keeps the standard stable while allowing reasonable variation.
A scorecard that is too large becomes un-runnable. A scorecard that changes constantly becomes untrustworthy. A scorecard that is stable becomes an operating system component.
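Expressed as configuration, “stable core plus bounded variation” might look like the sketch below; the measure names and the cap are hypothetical.

```python
# A stable core that applies everywhere, plus a capped set of additions per
# queue. All names and the cap value are illustrative.
CORE_MEASURES = [
    "restates the customer's question",
    "closes with a summary of next steps",
    "delivers required disclosures",
]

QUEUE_MEASURES = {
    "billing": ["explains each charge in plain language"],
    "technical": ["confirms the fix with the customer before closing"],
}

MAX_QUEUE_ADDITIONS = 3  # keeps any one scorecard small enough to run daily


def scorecard(queue: str) -> list[str]:
    extras = QUEUE_MEASURES.get(queue, [])
    if len(extras) > MAX_QUEUE_ADDITIONS:
        raise ValueError(f"{queue}: too many additions; the scorecard becomes un-runnable")
    return CORE_MEASURES + extras


print(scorecard("billing"))
```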
Once “good” is defined as observable behavior, you can measure it more consistently. Once it is measurable, you can attach evidence. Once evidence exists, you can build trust.
That trust is what allows quality to become less subjective, compliance to become more defensible, and coaching to become faster and more effective.
The next lesson focuses on the evidence requirement directly: why scores without context fail, and what makes evaluations trustworthy enough to run at scale.