Kenya Scale Up Tracker

🟢 Above Threshold

🔴 Below Threshold

🟡 Not Currently Tracking

Category	Measurement	Methodology	Threshold	Current Value	Last Update	Status
Evidence that the bot is Safe	Safety Score	HumanSignal Annotation	99%	100%	16/06/2025	🟢
	Crisis classifier sensitivity (true positives > trigger)	Synthetic Data Generation/Testing	99%	100%	09/06/2025	🟢
	Crisis classifier specificity (true negatives < trigger)	Synthetic Data Generation/Testing	50%	99.06%	09/06/2025	🟢
	After crisis trigger, next message is a referral	Synthetic Data Generation/Testing	99%	100%	09/06/2025	🟢
Evidence that the bot is Accurate	Accuracy Score	HumanSignal Annotation	95%	100%	16/06/2025	🟢
	Referral block triggers correctly	Synthetic Data Generation/Testing	99%	100%	09/06/2025	🟢
	Referrals Provided Proactively by Bot as Expected	Synthetic Data Generation/Testing	99%	100%	09/06/2025	🟢
	After user confirms referral request, the bot's next message asks for user location	Synthetic Data Generation/Testing	99%	99.16%	09/06/2025	🟢
	After user location, bot provides accurate referral info (contact name, phone number, address)	Synthetic Data Generation/Testing	99%	99.16%	09/06/2025	🟢
	After abortion message, next message is a referral	Synthetic Data Generation/Testing	99%	100%	09/06/2025	🟢
Evidence that the brand is not at risk	Authenticity Score	HumanSignal Annotation	90%	100%	16/06/2025	🟢
	Acceptability Score	HumanSignal Annotation	90%	98.7%	16/06/2025	🟢
	Overall Score (1 to 5)	HumanSignal Annotation	4	4.6	16/06/2025	🟢
	Acceptable Sheng quality	HumanSignal Annotation	90%	92.7%	16/06/2025	🟢
Evidence that the bot responds quickly enough	Average latency per response	Langfuse Analysis	15 seconds	8.62 seconds	11/06/2025	🟢
	Percentage of messages with latency under threshold	Langfuse Analysis	90%	99.16%	11/06/2025	🟢
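
The Status column can be read as a comparison of Current Value against Threshold. The sketch below shows one way that comparison, and the sensitivity and specificity percentages, might be computed. It is an illustration under stated assumptions, not the tracker's actual implementation: the function names, the `lower_is_better` flag for latency, and the rule that a value exactly equal to the threshold passes are all hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Percentage of real crisis messages that fired the trigger."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Percentage of non-crisis messages that did not fire the trigger."""
    return 100.0 * true_neg / (true_neg + false_pos)


def status(current: float, threshold: float,
           lower_is_better: bool = False, tracking: bool = True) -> str:
    """Map a metric to a legend symbol.

    Assumption: meeting the threshold exactly counts as passing.
    Assumption: latency-style metrics pass when at or under the threshold.
    """
    if not tracking:
        return "🟡"
    if lower_is_better:
        passed = current <= threshold
    else:
        passed = current >= threshold
    return "🟢" if passed else "🔴"


if __name__ == "__main__":
    # Values taken from the table above.
    print(status(99.06, 50))                        # specificity row
    print(status(8.62, 15, lower_is_better=True))   # average latency row
    # Illustrative counts only, not the tracker's test data.
    print(sensitivity(true_pos=50, false_neg=0))
```

Under these assumptions the average latency row is green because 8.62 seconds is under the 15 second limit, even though the legend wording describes green as "Above Threshold".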