AI Learning & A/B Testing

Let your AI get smarter over time with sentiment tracking, empathy learning, and procedure experiments.

What is AI Learning?

AI Learning brings together two optimization features: sentiment tracking (how well your AI calms upset callers) and A/B testing (comparing different versions of a procedure to see which performs better). Together, they help your AI continuously improve.

Sentiment Tracking

When your AI detects that a caller is frustrated, it can automatically add empathetic language to its responses. The AI Learning page shows you how effective these interventions are.

Key metrics

Metric | What It Means
Total Interventions | How many times the AI added empathy to a response because the caller was upset.
Recovery Rate | Percentage of upset callers who calmed down after the empathetic response. Higher is better.
By Sentiment Level | Breakdown between "Upset" (mild frustration) and "Irate" (strong frustration).

Tip: If your recovery rate is low, review and update your empathy phrases in Voice Experience. The AI learns which phrases work best over time, but starting with good phrases helps.
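
To make the metrics concrete, here is a minimal sketch of how numbers like these could be derived from a list of interventions. The record layout (a sentiment level plus a "recovered" flag) and the function names are assumptions made for illustration. BLEUM does not document its internal data format here, and the sample data is made up.

```python
# Hypothetical intervention records: (sentiment_level, caller_recovered).
# These values are invented sample data, not BLEUM output.
interventions = [
    ("Upset", True),
    ("Upset", True),
    ("Upset", False),
    ("Irate", True),
    ("Irate", False),
]


def recovery_rate(records):
    """Percentage of interventions after which the caller calmed down."""
    if not records:
        return 0.0
    recovered = sum(1 for _, ok in records if ok)
    return 100.0 * recovered / len(records)


def by_sentiment_level(records):
    """Return {level: (intervention_count, recovery_rate)} per sentiment level."""
    groups = {}
    for level, ok in records:
        groups.setdefault(level, []).append((level, ok))
    return {level: (len(rows), recovery_rate(rows)) for level, rows in groups.items()}


total_interventions = len(interventions)   # "Total Interventions"
overall = recovery_rate(interventions)     # "Recovery Rate"
breakdown = by_sentiment_level(interventions)  # "By Sentiment Level"

print(total_interventions, overall, breakdown)
```

In this sample, three of five callers recovered, so the overall recovery rate is 60%. The breakdown shows whether mild or strong frustration is harder to recover from.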

A/B Testing (Experiments)

A/B testing lets you compare two versions of a procedure to see which one performs better. For example, you might test a shorter greeting against a longer one, or compare two different question orders.

How to run an experiment

1. Create a variant

Open a procedure in Flow Builder and click "+ Add Variant." This creates a copy you can modify.

2. Make changes

Edit the variant with your proposed improvements. Change wording, reorder steps, or try different approaches.

3. Publish the variant

When you publish, BLEUM automatically splits incoming traffic between the original and the variant.

4. Monitor results

Watch the A/B Testing page to see which version has better completion rates, shorter handle times, or higher satisfaction.

Understanding results

BLEUM tracks statistical confidence to help you make decisions. Wait until you have enough data (usually at least 100 calls per variant) before drawing conclusions. The system will show you when results are statistically significant.

Important: Do not end an experiment too early. Small sample sizes can be misleading. Let the experiment run until BLEUM indicates the results are statistically significant.
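
To illustrate why sample size matters, the sketch below applies a standard two-proportion z-test to completion rates. This is a common textbook method, shown only as an illustration: this article does not say which statistical test BLEUM uses, and the call counts and the 0.05 threshold are assumptions.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two completion rates.

    Returns (z, p_value). A small p_value suggests the difference is
    unlikely to be explained by chance alone.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled rate under the assumption that both versions perform the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same completion rates (70% vs 85%), different sample sizes (made-up data).
_, p_small = two_proportion_z_test(14, 20, 17, 20)      # 20 calls per version
_, p_large = two_proportion_z_test(140, 200, 170, 200)  # 200 calls per version

print(f"20 calls each:  p = {p_small:.3f}")
print(f"200 calls each: p = {p_large:.5f}")
```

With 20 calls per version, the 15-point gap is not significant at the assumed 0.05 threshold. With 200 calls per version and the same rates, it is. The apparent winner is the same in both cases, but only the larger sample supports a decision.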

Common Questions

How is traffic split between the original and the variant?

BLEUM randomly assigns each caller to either the original or the variant. The split is roughly 50/50. Each caller consistently gets the same version if they call back during the experiment.
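
One common way to get both a roughly even split and a consistent version for repeat callers is to hash a caller identifier. The sketch below shows that technique under stated assumptions: `assign_version`, the caller and experiment IDs, and the use of SHA-256 are all hypothetical, and this article does not describe how BLEUM implements assignment.

```python
import hashlib


def assign_version(caller_id: str, experiment_id: str) -> str:
    """Map a caller to "original" or "variant", stable across repeat calls.

    Hypothetical helper: hashing the experiment and caller IDs together
    gives each caller a fixed bucket from 0 to 99 for that experiment.
    """
    key = f"{experiment_id}:{caller_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "variant" if bucket < 50 else "original"


# A repeat caller lands on the same version every time.
first = assign_version("+15550100", "greeting-test")
again = assign_version("+15550100", "greeting-test")
print(first, again, first == again)

# Across many callers, the split comes out close to 50/50.
versions = [assign_version(f"caller-{i}", "greeting-test") for i in range(10_000)]
variant_share = versions.count("variant") / len(versions)
print(f"variant share: {variant_share:.1%}")
```

Because the assignment depends only on the two IDs, no per-caller state needs to be stored to keep the experience consistent during the experiment.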

Can I run more than one experiment at the same time?

Yes, you can run experiments on different procedures simultaneously. However, avoid running multiple experiments on the same procedure at the same time, because overlapping changes make the results harder to interpret.