What are evaluations?
Evaluations help you track how well Cal is responding to users. The information gathered through evaluations allows you to spot trends over time and determine whether any configuration or knowledge-based updates are needed.
How can I evaluate Cal?
Accessing the Quality Review page
- In the left sidebar, click on Quality and select QA Review
- Every interaction with Cal is tracked on this page. Use the tabs and filters at the top to view tickets by interaction type, date, evaluation status, channel, category, and more.
Evaluating interactions
To start evaluating cases:
- Click on any case in the list to open the detailed evaluation view
- Review the full conversation history and Cal's response
- Score Cal's performance using the evaluation scorecard on the right-hand side
- Add any additional comments in the provided text fields
- Click "Submit evaluation" when finished
Evaluation criteria
The default evaluation criteria vary by interaction type. For custom evaluation criteria, please see below.
AI Agent Reply Accuracy (All interaction types)
- Correct: The answer provided is accurate
- Somewhat incorrect: The answer is partially accurate but contains some inaccuracies
- Incorrect: The answer provided is inaccurate
Knowledge Retrieval (Co-pilot only)
- Correct source: Cal referenced the correct knowledge base articles
- Incorrect source: Cal did not use the appropriate knowledge sources
AI Agent Reply Style (Co-pilot, Email)
- Excellent: The reply follows your Style Guide well
- Fair: The reply partially follows your Style Guide
- Poor: The reply does not follow your Style Guide
Path correctness (Email only)
- Correct: The relevant workflow was accurately chosen
- Incorrect: The wrong workflow was selected
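If your QA team mirrors these defaults outside of Cal, for example in an internal QA tracker, the mapping from interaction type to applicable criteria can be modelled along the lines below. This is an illustrative sketch only; the type and field names are assumptions, not Cal's internal schema.

```typescript
// Illustrative sketch only: models which default criteria apply to which
// interaction types, per the lists above. Names here are hypothetical.
type Rating = string;

interface Criterion {
  name: string;
  appliesTo: Array<"Co-pilot" | "Email" | "Voice" | "Chat">;
  ratings: Rating[];
}

const defaultCriteria: Criterion[] = [
  {
    name: "AI Agent Reply Accuracy",
    appliesTo: ["Co-pilot", "Email", "Voice", "Chat"], // all interaction types
    ratings: ["Correct", "Somewhat incorrect", "Incorrect"],
  },
  {
    name: "Knowledge Retrieval",
    appliesTo: ["Co-pilot"],
    ratings: ["Correct source", "Incorrect source"],
  },
  {
    name: "AI Agent Reply Style",
    appliesTo: ["Co-pilot", "Email"],
    ratings: ["Excellent", "Fair", "Poor"],
  },
  {
    name: "Path correctness",
    appliesTo: ["Email"],
    ratings: ["Correct", "Incorrect"],
  },
];
```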
Custom Evaluation Criteria
Custom evaluation criteria can be added through the Scoring page under Quality.
Click 'Add score' to begin adding a custom score:
- Channel: the conversation/interaction types for which this score will be available on the evaluations page
- Ratings: add the options you would like to appear as rating buttons under the score on the evaluations page, e.g. 'Correct', 'Somewhat incorrect', 'Incorrect'
- Reasoning Tags (optional): add tags you'd like to be able to select from for a particular score
Once saved, the custom score will appear for applicable conversation types on the QA Review 'Evaluate cases' page.
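Putting the three fields together, a custom score can be thought of as a small record like the one sketched below. The score name and reasoning tag values are hypothetical examples; this is not an actual Cal API or export format.

```typescript
// Illustrative sketch only: the fields that make up a custom score,
// reusing the example ratings from above. Values marked as hypothetical.
const customScore = {
  name: "Tone of voice",            // hypothetical score name
  channels: ["Co-pilot", "Email"],  // where the score appears on the evaluations page
  ratings: ["Correct", "Somewhat incorrect", "Incorrect"],
  reasoningTags: ["Missing context", "Outdated article"], // optional, hypothetical tags
};
```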
Quality reporting
Once evaluations are completed, you can view performance data in the Quality report. Note: The Quality report is currently available for Copilot only, with reporting for Email, Voice, and Chat agents coming soon.
Accessing the Quality report
- In the left sidebar, click on Reports and select Quality
- The report has two main tabs:
- Drafted replies: Includes performance data for all Co-pilot interaction types
- Agent guidance: Focuses on ‘Question answer’ and ‘Summarize’ interaction types
- Use filters to view data by interaction type, date, or source
Report metrics
The Quality report includes metrics based on two inputs:
- Evaluations completed by your QA team
- Agent feedback submitted via the thumbs-up (helpful) and thumbs-down (not helpful) buttons in the Copilot interface
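Because the report combines those two inputs, the sketch below shows one simple way the underlying numbers could be aggregated. The functions and field names are illustrative assumptions, not the exact calculations the Quality report performs.

```typescript
// Illustrative sketch only: aggregating QA evaluation scores and agent
// thumbs feedback into simple quality metrics.
interface EvaluatedCase {
  accuracy?: "Correct" | "Somewhat incorrect" | "Incorrect"; // from QA evaluations
  agentFeedback?: "helpful" | "not helpful";                 // thumbs-up / thumbs-down
}

// Share of evaluated cases scored "Correct" by the QA team.
function accuracyRate(cases: EvaluatedCase[]): number {
  const scored = cases.filter((c) => c.accuracy !== undefined);
  const correct = scored.filter((c) => c.accuracy === "Correct").length;
  return scored.length ? correct / scored.length : 0;
}

// Share of agent-rated cases marked helpful.
function helpfulRate(cases: EvaluatedCase[]): number {
  const rated = cases.filter((c) => c.agentFeedback !== undefined);
  const helpful = rated.filter((c) => c.agentFeedback === "helpful").length;
  return rated.length ? helpful / rated.length : 0;
}
```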
Tips for effective quality reporting
The Quality report only shows data for cases that have been evaluated by your QA team or received agent feedback. To obtain meaningful insights:
- Have your QA team regularly evaluate interactions in the QA review page
- Encourage agents to provide feedback directly in the Cal interface
- Ensure evaluations cover all relevant Copilot support types
- Maintain consistent evaluation criteria