What are evaluations?
Evaluations help you track how well Cal is responding to users. The information gathered through evaluations allows you to spot trends over time and determine whether any configuration or knowledge-based updates are needed.
How can I evaluate Cal?
Accessing the Quality Review page
- In the left sidebar, click on Quality and select QA Review
- Every interaction with Cal is tracked on this page. Use the tabs and filters at the top to view tickets by interaction type, date, evaluation status, channel, category, and more.
Evaluating interactions
To start evaluating cases:
- Click on any case in the list to open the detailed evaluation view
- Review the full conversation history and Cal's response
- Score Cal's performance using the evaluation scorecard on the right-hand side
- Add any additional comments in the provided text fields
- Click "Submit evaluation" when finished
Evaluation criteria
The default evaluation criteria vary by interaction type. For custom evaluation criteria, please see below.
AI Agent Reply Accuracy (All interaction types)
- Correct: The answer provided is accurate
- Somewhat incorrect: The answer is partially accurate but contains some inaccuracies
- Incorrect: The answer provided is inaccurate
Knowledge Retrieval (Co-pilot only)
- Correct source: Cal referenced the correct knowledge base articles
- Incorrect source: Cal did not use the appropriate knowledge sources
AI Agent Reply Style (Co-pilot, Email)
- Excellent: The reply follows your Style Guide well
- Fair: The reply partially follows your Style Guide
- Poor: The reply does not follow your Style Guide
Path correctness (Email only)
- Correct: The relevant workflow was accurately chosen
- Incorrect: The wrong workflow was selected
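If your QA team mirrors these defaults outside of Cal, for example in an internal QA tracker, the mapping from interaction type to applicable criteria can be modelled along the lines below. This is an illustrative sketch only; the type and field names are assumptions, not Cal's internal schema.

```typescript
// Illustrative sketch only: models which default criteria apply to which
// interaction types, per the lists above. Names here are hypothetical.
type Rating = string;

interface Criterion {
  name: string;
  appliesTo: Array<"Co-pilot" | "Email" | "Voice" | "Chat">;
  ratings: Rating[];
}

const defaultCriteria: Criterion[] = [
  {
    name: "AI Agent Reply Accuracy",
    appliesTo: ["Co-pilot", "Email", "Voice", "Chat"], // all interaction types
    ratings: ["Correct", "Somewhat incorrect", "Incorrect"],
  },
  {
    name: "Knowledge Retrieval",
    appliesTo: ["Co-pilot"],
    ratings: ["Correct source", "Incorrect source"],
  },
  {
    name: "AI Agent Reply Style",
    appliesTo: ["Co-pilot", "Email"],
    ratings: ["Excellent", "Fair", "Poor"],
  },
  {
    name: "Path correctness",
    appliesTo: ["Email"],
    ratings: ["Correct", "Incorrect"],
  },
];
```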
Custom Evaluation Criteria
Custom evaluation criteria can be added through the Scoring page under Quality.
Click 'Add score' to begin adding a custom score:
- Channel: the conversation/interaction types for which this score will be available on the evaluations page
- Ratings: add the options you would like to appear as rating buttons under the score on the evaluations page, e.g. 'Correct', 'Somewhat incorrect', 'Incorrect'
- Reasoning Tags (optional): add tags you'd like to be able to select from for a particular score
Once saved, the custom score will appear for applicable conversation types on the QA Review 'Evaluate cases' page.
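Putting the three fields together, a custom score can be thought of as a small record like the one sketched below. The score name and reasoning tag values are hypothetical examples; this is not an actual Cal API or export format.

```typescript
// Illustrative sketch only: the fields that make up a custom score,
// reusing the example ratings from above. Values marked as hypothetical.
const customScore = {
  name: "Tone of voice",            // hypothetical score name
  channels: ["Co-pilot", "Email"],  // where the score appears on the evaluations page
  ratings: ["Correct", "Somewhat incorrect", "Incorrect"],
  reasoningTags: ["Missing context", "Outdated article"], // optional, hypothetical tags
};
```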
Quality reporting
Once evaluations are completed, you can view performance data in the Quality report. Note: The Quality report is currently available for Copilot only, with reporting for Email, Voice, and Chat agents coming soon.
Accessing the Quality report
- In the left sidebar, click on Reports and select Quality
- The report has two main tabs:
- Drafted replies: Includes performance data for all Co-pilot interaction types
- Agent guidance: Focuses on ‘Question answer’ and ‘Summarize’ interaction types
- Use filters to view data by interaction type, date, or source
Report metrics
The Quality report includes metrics based on two inputs:
- Evaluations completed by your QA team
- Agent feedback submitted via the thumbs-up (helpful) and thumbs-down (not helpful) buttons in the Copilot interface
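Because the report combines those two inputs, the sketch below shows one simple way the underlying numbers could be aggregated. The functions and field names are illustrative assumptions, not the exact calculations the Quality report performs.

```typescript
// Illustrative sketch only: aggregating QA evaluation scores and agent
// thumbs feedback into simple quality metrics.
interface EvaluatedCase {
  accuracy?: "Correct" | "Somewhat incorrect" | "Incorrect"; // from QA evaluations
  agentFeedback?: "helpful" | "not helpful";                 // thumbs-up / thumbs-down
}

// Share of evaluated cases scored "Correct" by the QA team.
function accuracyRate(cases: EvaluatedCase[]): number {
  const scored = cases.filter((c) => c.accuracy !== undefined);
  const correct = scored.filter((c) => c.accuracy === "Correct").length;
  return scored.length ? correct / scored.length : 0;
}

// Share of agent-rated cases marked helpful.
function helpfulRate(cases: EvaluatedCase[]): number {
  const rated = cases.filter((c) => c.agentFeedback !== undefined);
  const helpful = rated.filter((c) => c.agentFeedback === "helpful").length;
  return rated.length ? helpful / rated.length : 0;
}
```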
Tips for effective quality reporting
The Quality report only shows data for cases that have been evaluated by your QA team or received agent feedback. To obtain meaningful insights:
- Have your QA team regularly evaluate interactions in the QA review page
- Encourage agents to provide feedback directly in the Cal interface
- Ensure evaluations cover all relevant Copilot support types
- Maintain consistent evaluation criteria