Kenya Scale Up Tracker

🟢 Above Threshold
🔴 Below Threshold
🟡 Not Currently Tracking
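
| Category | Measurement | Methodology | Threshold | Current Value | Last Update (DD/MM/YYYY) | Status |
|---|---|---|---|---|---|---|
| Evidence that the bot is Safe | Safety Score | HumanSignal Annotation | 99% | 100% | 16/06/2025 | 🟢 |
| Evidence that the bot is Safe | Crisis classifier sensitivity (true positives > trigger) | Synthetic Data Generation/Testing | 99% | 100% | 09/06/2025 | 🟢 |
| Evidence that the bot is Safe | Crisis classifier specificity (true negatives < trigger) | Synthetic Data Generation/Testing | 50% | 99.06% | 09/06/2025 | 🟢 |
| Evidence that the bot is Safe | After crisis trigger, next message is a referral | Synthetic Data Generation/Testing | 99% | 100% | 09/06/2025 | 🟢 |
| Evidence that the bot is Accurate | Accuracy Score | HumanSignal Annotation | 95% | 100% | 16/06/2025 | 🟢 |
| Evidence that the bot is Accurate | Referral block triggers correctly | Synthetic Data Generation/Testing | 99% | 100% | 09/06/2025 | 🟢 |
| Evidence that the bot is Accurate | Referrals Provided Proactively by Bot as Expected | Synthetic Data Generation/Testing | 99% | 100% | 09/06/2025 | 🟢 |
| Evidence that the bot is Accurate | After user confirms referral request, next message bot asks for user location | Synthetic Data Generation/Testing | 99% | 99.16% | 09/06/2025 | 🟢 |
| Evidence that the bot is Accurate | After user location, bot provides accurate referral info (contact name, phone number, address) | Synthetic Data Generation/Testing | 99% | 99.16% | 09/06/2025 | 🟢 |
| Evidence that the bot is Accurate | After abortion message, next message is a referral | Synthetic Data Generation/Testing | 99% | 100% | 09/06/2025 | 🟢 |
| Evidence that the brand is not at risk | Authenticity Score | HumanSignal Annotation | 90% | 100% | 16/06/2025 | 🟢 |
| Evidence that the brand is not at risk | Acceptability Score | HumanSignal Annotation | 90% | 98.7% | 16/06/2025 | 🟢 |
| Evidence that the brand is not at risk | Overall Score (1 to 5) | HumanSignal Annotation | 4 | 4.6 | 16/06/2025 | 🟢 |
| Evidence that the brand is not at risk | Acceptable Sheng quality | HumanSignal Annotation | 90% | 92.7% | 16/06/2025 | 🟢 |
| Evidence that the bot responds quickly enough | Average latency per response | Langfuse Analysis | 15 seconds | 8.62 seconds | 11/06/2025 | 🟢 |
| Evidence that the bot responds quickly enough | Percentage messages with latency under threshold | Langfuse Analysis | 90% | 99.16% | 11/06/2025 | 🟢 |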
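
The tracker does not say how the status colours or the classifier percentages are computed. The sketch below is one plausible reading, not the tracker's actual logic: it assumes sensitivity and specificity follow the standard confusion-matrix definitions, that most metrics pass when the current value is at or above the threshold, and that latency passes when it is at or below the threshold. The names `Metric`, `status`, `sensitivity`, and `specificity` are hypothetical, and the counts in the last two lines are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    name: str
    threshold: float
    current: Optional[float]          # None means the metric is not tracked yet
    higher_is_better: bool = True     # assumption: False for latency-style metrics


def status(metric: Metric) -> str:
    """Map a metric to a legend colour (assumed rule, not confirmed by the tracker)."""
    if metric.current is None:
        return "yellow"               # Not Currently Tracking
    if metric.higher_is_better:
        passed = metric.current >= metric.threshold
    else:
        passed = metric.current <= metric.threshold
    return "green" if passed else "red"


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Percentage of real crisis messages that fired the trigger."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Percentage of non-crisis messages that did not fire the trigger."""
    return 100.0 * true_neg / (true_neg + false_pos)


# Values taken from the table above.
print(status(Metric("Crisis classifier specificity", 50.0, 99.06)))
print(status(Metric("Average latency per response (s)", 15.0, 8.62, higher_is_better=False)))

# Made-up counts, only to show the formulas.
print(sensitivity(true_pos=50, false_neg=0))
print(specificity(true_neg=90, false_pos=10))
```

Keeping the comparison direction on each metric avoids treating a low latency as a failure when the same check is applied to every row.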