Track and Analyze Your Copilot's Performance

Like all bots, those equipped with generative AI need to be tracked and frequently adjusted to continue improving their performance. Here are some tips on the key metrics to monitor closely, which you can find in the Copilot report.


1. How to Track Your Copilot's Performance

1.1. Your Objectives: A Crucial Key to Interpretation

First and foremost, we encourage you to look at all indicators (not just those related to Copilot) in light of the goals set by your company and your conversational strategy.


Strategies vary greatly in their goals, the resources deployed, and the volume of conversations, so an identical score can be good for one strategy and mediocre for another. Keep your objectives in mind when reviewing the data, as they are the key lens through which to interpret your indicators!


1.2. Monitoring Frequency

We recommend reviewing these metrics at least once a week, or even more frequently if your bot handles a high volume of conversations.


If you need to modify your bot or its knowledge base to improve it, we advise making all the changes at once rather than spreading them across multiple versions of your bot, so that its results stay consistent and easy to read.

 


Three perspectives, matching the sections below, can guide your reading of the Copilot report’s indicators: efficiency, response quality, and improvement.


2. Efficiency

2.1. AI’s Share of Conversations

This figure shows the proportion of all your conversations handled by Copilot. It should be analyzed in relation to the productivity goals you set at the start of your project: do you want the bot to handle most conversations, or just a specific part of your interactions with visitors?



If you opted for a strategy involving multiple bots (some without Copilot), it may be useful to compare the AI’s share of conversations with the automated conversation share available in the automation report. That figure covers the portion handled by all bots (including AI), so the comparison helps you separate the contribution of Copilot, the only bot using generative AI, from that of your other bots.
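As an illustration, the comparison can be reproduced by hand from exported conversation counts. The sketch below assumes both shares are simple ratios over the total number of conversations; the report's exact calculation is not documented here, and all counts and names are hypothetical.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage (0.0 when there are no conversations)."""
    return 100.0 * part / total if total else 0.0


# Hypothetical counts for one reporting period (not real data).
total_conversations = 2000
copilot_conversations = 500     # handled by Copilot (generative AI)
other_bot_conversations = 300   # handled by bots without Copilot

# Assumed definitions: each share is a plain ratio over all conversations.
ai_share = share(copilot_conversations, total_conversations)
automated_share = share(
    copilot_conversations + other_bot_conversations, total_conversations
)

print(f"AI share of conversations: {ai_share:.1f}%")
print(f"Automated conversation share: {automated_share:.1f}%")
print(f"Share handled by non-AI bots: {automated_share - ai_share:.1f}%")
```

The gap between the two figures is the part of your automation that does not rely on generative AI.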


2.2. Transfer Rate

This figure shows the percentage of conversations initiated with AI that were then transferred to a human agent. It, too, should be reviewed in light of your overall response strategy: are you aiming for a fully autonomous bot, or do you want to facilitate escalation to human agents?
In the second case, a high transfer rate indicates the bot is fulfilling its role. But if you’re aiming for a fully autonomous bot, you should target a very low or zero transfer rate. If you’ve set up your bot with customized controls to redirect to an agent in edge cases but find that transfers are too frequent, we recommend reviewing your quality indicators.
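A minimal sketch of this reading, assuming the transfer rate is the number of transferred conversations divided by the number of conversations initiated with AI. The function names, the counts, and the target value are hypothetical; the target should come from your own strategy.

```python
def transfer_rate(ai_conversations: int, transferred: int) -> float:
    """Percentage of AI-initiated conversations later transferred to an agent."""
    if ai_conversations == 0:
        return 0.0
    return 100.0 * transferred / ai_conversations


def exceeds_target(rate: float, max_expected_rate: float) -> bool:
    """True when transfers are more frequent than your strategy expects."""
    return rate > max_expected_rate


# Hypothetical figures: 500 conversations started with AI, 60 transferred.
rate = transfer_rate(ai_conversations=500, transferred=60)

# Hypothetical target for a bot meant to be mostly autonomous.
too_many_transfers = exceeds_target(rate, max_expected_rate=5.0)

print(f"Transfer rate: {rate:.1f}%")
print("Review quality indicators" if too_many_transfers else "Within target")
```

For a bot designed to escalate to human agents, the same rate would be read the other way around: a high value is the expected outcome.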

 



3. Response Quality

3.1. CSAT

This classic indicator gives you an immediate view of the overall customer experience with conversations involving Copilot, as it is based on a rating form provided at the end of the conversation.
The Copilot report distinguishes the CSAT of fully AI-managed conversations from that of conversations partially managed by AI (i.e., conversations that were transferred to an agent, which typically receive higher ratings). You can also compare it with the CSAT of all your conversations, available in the Customer Experience report.
A low CSAT (below 60% for a Copilot) should raise a red flag and encourage you to examine other quality indicators more closely.
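The sketch below shows one way to check the two CSAT figures against the 60% warning level. It assumes a common convention that is not confirmed by this article: ratings on a 1 to 5 scale, with 4 and 5 counted as satisfied. The ratings and names are hypothetical.

```python
from typing import Optional, Sequence

LOW_CSAT_THRESHOLD = 60.0  # warning level mentioned above for a Copilot


def csat(ratings: Sequence[int], satisfied_from: int = 4) -> Optional[float]:
    """Percentage of ratings at or above `satisfied_from` (assumed 1-5 scale).

    Returns None when no rating was submitted.
    """
    if not ratings:
        return None
    satisfied = sum(1 for r in ratings if r >= satisfied_from)
    return 100.0 * satisfied / len(ratings)


# Hypothetical ratings collected at the end of conversations.
fully_ai_ratings = [5, 4, 2, 5, 3, 1, 4, 2, 5, 3]
partially_ai_ratings = [5, 4, 4, 3, 5]

for label, ratings in [
    ("Fully AI-managed", fully_ai_ratings),
    ("Partially AI-managed", partially_ai_ratings),
]:
    score = csat(ratings)
    flag = score is not None and score < LOW_CSAT_THRESHOLD
    print(f"{label}: {score:.1f}%{' (red flag)' if flag else ''}")
```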

 



3.2. Customer Feedback on AI Responses

In the chatbox on your website, visitors can provide feedback on each Copilot response. Visitors do not rate every response, so the share of responses that receive feedback stays far below 100% (our teams observe a maximum of 10%).
A link below negative feedback allows you to view the responses that received unfavorable feedback. We recommend reviewing these carefully and correlating them with the content in your Knowledge to improve it.
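If you export responses for review, the following sketch shows how the feedback coverage and the list of negatively rated responses could be derived. The record structure (`text`, `feedback`) is a hypothetical format chosen for illustration, not the report's export format.

```python
from typing import Dict, List, Optional, Tuple

Response = Dict[str, Optional[str]]


def feedback_summary(responses: List[Response]) -> Tuple[float, List[Response]]:
    """Return (share of responses that received feedback, negative responses)."""
    rated = [r for r in responses if r["feedback"] is not None]
    negative = [r for r in rated if r["feedback"] == "negative"]
    coverage = 100.0 * len(rated) / len(responses) if responses else 0.0
    return coverage, negative


# Hypothetical sample: 20 responses, only 2 of which were rated by visitors.
responses: List[Response] = [
    {"text": f"Answer {i}", "feedback": None} for i in range(20)
]
responses[3]["feedback"] = "positive"
responses[7]["feedback"] = "negative"

coverage, to_review = feedback_summary(responses)
print(f"Feedback coverage: {coverage:.1f}%")
for response in to_review:
    print("Review against your Knowledge:", response["text"])
```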



3.3. AI Answer Rate

This is the percentage of conversations where Copilot's AI successfully provided an answer to visitors. Even with a solid Knowledge and prompts, Copilot doesn’t always have the answer:

  • The information in the Knowledge you provided may be incomplete,
  • Visitors may ask questions that are too individual or contextual (for example, including a name or email), and Copilot can’t answer with its general knowledge,
  • Visitors may ask questions outside the expected use case (for example, asking about order tracking when Copilot is set up for product discovery),
  • Copilot may self-censor due to internal or customized settings.



A low rate (60% is considered the minimum to aim for) indicates that Copilot is not fully delivering as expected and may negatively impact the CSAT: visitors are rarely pleased to hear that there’s no answer available.
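As a rough check, the answer rate can be compared with the 60% minimum as follows. This sketch assumes the rate is the number of conversations in which Copilot provided an answer divided by the number of conversations it handled; the counts and names are hypothetical.

```python
MIN_ANSWER_RATE = 60.0  # minimum to aim for, as stated above


def answer_rate(ai_conversations: int, answered: int) -> float:
    """Percentage of AI conversations in which Copilot provided an answer."""
    if ai_conversations == 0:
        return 0.0
    return 100.0 * answered / ai_conversations


# Hypothetical figures: Copilot answered in 280 of 500 conversations.
rate = answer_rate(ai_conversations=500, answered=280)
needs_attention = rate < MIN_ANSWER_RATE

print(f"AI answer rate: {rate:.1f}%")
if needs_attention:
    print("Below the 60% minimum: review Knowledge coverage and use cases.")
```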

Adjustments are almost always necessary during a project involving generative AI, especially early on.

To increase your AI answer rate, see the tips in the next section.


4. How to Improve Your Copilot

4.1. Recurring Topics

The Copilot report doesn’t just provide AI performance metrics but also gives you tools to improve them, directly from the recurring topics list, where all conversations handled by AI are grouped by theme.

By default, topics are sorted by number of occurrences, so the topics most often raised with Copilot appear first. But you can also adjust your filters to display those with the lowest CSAT, highest number of transfers, or lowest response capacity: an excellent way to prioritize your work.
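The same prioritization can be sketched outside the report on exported topic data. The `Topic` structure and the figures below are hypothetical, and treating "response capacity" as the topic's answer rate is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Topic:
    name: str
    occurrences: int
    csat: float          # percent
    transfers: int
    answer_rate: float   # percent; assumed meaning of "response capacity"


# Hypothetical recurring topics (not real data).
topics: List[Topic] = [
    Topic("Delivery times", occurrences=120, csat=72.0, transfers=10, answer_rate=85.0),
    Topic("Returns", occurrences=80, csat=48.0, transfers=25, answer_rate=55.0),
    Topic("Product sizing", occurrences=200, csat=81.0, transfers=5, answer_rate=90.0),
]

# Default view: most requested topics first.
by_occurrences = sorted(topics, key=lambda t: t.occurrences, reverse=True)

# Alternative views for prioritizing improvement work.
by_lowest_csat = sorted(topics, key=lambda t: t.csat)
by_most_transfers = sorted(topics, key=lambda t: t.transfers, reverse=True)
by_lowest_answer_rate = sorted(topics, key=lambda t: t.answer_rate)

print("Most requested:", by_occurrences[0].name)
print("Lowest CSAT:", by_lowest_csat[0].name)
print("Most transfers:", by_most_transfers[0].name)
print("Lowest answer rate:", by_lowest_answer_rate[0].name)
```

A topic that ranks first in several of the alternative views, like "Returns" in this sample, is a natural candidate to improve first.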

 



Clicking the “Improve” button gives you access to a list of the five questions the AI most recommends answering to supplement its information on this recurring theme. This list is derived from a synthesis of the conversations that took place between Copilot and visitors on the topic.

 


Clicking “Add Content” allows you to directly add a new record to the Knowledge on which Copilot relies to answer visitor questions. You can reformulate the topic if necessary and add as much detail as needed in the “Content” field.



4.2. In-Depth Conversation Review

After addressing the five main suggestions, you can also view the list of all conversations associated with this theme and read them to better understand the context and analyze Copilot’s responses in more detail.

For example:

  • If the bot seems to lack information or mixes things up, feel free to complete or rephrase your Knowledge to make it clearer and more explicit, or even add a source.
  • If visitors ask questions that are too individual or contextual (e.g., including an email address or order number), don’t hesitate to rephrase Copilot’s introduction card in its scenario to prompt visitors to ask more general questions.
  • If visitors' needs exceed the expected use case, you can add a new Knowledge and adjust your strategy to cover multiple use cases (e.g., product discovery and FAQs are often complementary). If visitors are looking to track their orders, you can also adjust your bot scenario for that specific use case.

 

 

Can Copilot Be Used for Multiple Needs at Once?


Thanks to generative AI, Copilot is a flexible solution that can meet a variety of needs.


You may find that your website visitors don't necessarily ask the type of questions you anticipated; for example, you thought they would need help choosing products, but instead, they’re asking customer support-related questions.


Copilot is likely capable of handling all types of questions, provided you supply the necessary information, either by enriching its existing Knowledge or adding a different type of Knowledge – more information is available in this article.


It's also very common for customers to ask about order tracking – Copilot can also handle this specific case, as explained here.