To demonstrate this concept, let’s say we want to test whether the chatbot understands the concept of machine insurance. To confirm that the chatbot recognizes language about machine insurance without confusing it with other language entered as learning data, we need to write tests (in the form of phrases) that contain features typical of that language, and define reports with appropriate measures for assessing the chatbot’s precision. Quality measures for the chatbot can be defined in different ways, but overall you must answer one very important question: what do we mean when we say the chatbot learns to improve its classification of phrases?

The answer is not so simple. Let’s assume we have the following categories defined for the chatbot:

  • Machine insurance
  • Machine technology
  • Type of the machine
  • Cost of credit

If the user types the sentence ‘I want to pay insurance for my new machine’, the chatbot will not necessarily classify it into only one category. Ideally the classifier assigns the sentence a very high value for one category, but the same sentence can also be assigned to other categories with smaller values, e.g.:

Classification score    Category name
91%                     Machine insurance
41%                     Machine technology
30%                     Type of the machine
17%                     Cost of credit

The phrase ‘I want to pay insurance for my new machine’ has been classified as Machine insurance with a value of 91%, while the next line shows the same sentence matching Machine technology with a value of 41%. The other two categories match the entered sentence with even smaller values. Let us assume that the value of assigning a phrase to a category lies in the interval (0; 1), expressed here as a percentage.

Looking at the results above, we can conclude that the chatbot is confident in classifying this phrase, because the difference between the highest classification value (91%) and the second highest (41%) is 50 percentage points.
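The margin between the two highest values can be computed directly from the classification result. The snippet below is a minimal sketch: the `scores` dictionary simply restates the example values as fractions, and the function name `top_two_margin` is a hypothetical helper, not part of any particular chatbot platform.

```python
# Example values from the article, written as fractions instead of percentages.
scores = {
    "Machine insurance": 0.91,
    "Machine technology": 0.41,
    "Type of the machine": 0.30,
    "Cost of credit": 0.17,
}


def top_two_margin(scores):
    """Return the best category and the gap between the two highest values."""
    # Sort (category, value) pairs from the highest value to the lowest.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_name, best_value), (_, second_value) = ranked[0], ranked[1]
    return best_name, best_value - second_value


best, margin = top_two_margin(scores)
print(f"{best}: margin = {margin:.2f}")  # Machine insurance: margin = 0.50
```

How large a margin counts as "confident" is a choice each team has to make for its own chatbot and data.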

Other issues that can cause problems during the classification of a phrase include:

  1. too small a difference between the values of the first two categories assigned
  2. the value for the correct category being too low
  3. a near-uniform distribution of values across the categories, indicating the chatbot is unsure how to classify a phrase
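The three issues listed above can each be turned into an automated check on a test phrase. The following is only a sketch under stated assumptions: the function name `diagnose`, the threshold values, and the use of normalized entropy to detect a near-uniform distribution are all illustrative choices, not something prescribed by a specific chatbot tool.

```python
import math


def diagnose(scores, expected, min_margin=0.2, min_score=0.6, max_entropy_ratio=0.95):
    """Return a list of warnings for one test phrase.

    scores:   dict mapping category name to a value between 0 and 1
    expected: the category the test author considers correct
    The thresholds are illustrative assumptions, not recommended values.
    """
    problems = []
    ranked = sorted(scores.values(), reverse=True)

    # 1. Too small a difference between the two highest values.
    if len(ranked) > 1 and ranked[0] - ranked[1] < min_margin:
        problems.append("small margin")

    # 2. The value for the correct category is too low.
    if scores.get(expected, 0.0) < min_score:
        problems.append("low score for expected category")

    # 3. Near-uniform values: normalize them into a distribution and compare
    #    its entropy with the maximum possible entropy, log(number of categories).
    total = sum(ranked)
    if total > 0 and len(ranked) > 1:
        probs = [value / total for value in ranked if value > 0]
        entropy = -sum(p * math.log(p) for p in probs)
        if entropy / math.log(len(ranked)) > max_entropy_ratio:
            problems.append("near-uniform scores")

    return problems
```

A report for a whole test suite could then be built by running such a check on every test phrase and counting how many phrases trigger each warning.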

Testing a chatbot does more than train it and improve its comprehension. It also establishes a systematic approach to handling new language, which leads to a chatbot that understands and communicates at a more advanced level.

Checking the accuracy of the chatbot’s phrase classification is a crucial part of developing its proficiency and, just as in teaching children, it enables the chatbot to learn on its own and build on its knowledge base.



Ailleron Marketing team includes digital marketers and content creators who provide insights and expertise from across the organization, including #AilleronExperts. For media queries, please get in touch with us via our contact form.
