Safety Dashboard
Monitor safety evaluations across your AI platform. Review flagged interactions, track safety score trends, and maintain compliance with emotional AI governance requirements.
What Safety Does
The Safety dashboard aggregates ESAA evaluations from your API calls into actionable views. Every time you call /v1/esaa/evaluate, the resulting attestation is stored and surfaced here.
Key insight: The dashboard shows computed signals only — never the original interaction content. You can share dashboard access with compliance teams without exposing conversation text.
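Submitting an interaction for evaluation can be sketched as a plain HTTPS POST. Only the endpoint path `/v1/esaa/evaluate` comes from this page; the base URL, bearer-token header, and the payload field names (`interaction`, `session_id`) are illustrative assumptions, not confirmed API details.

```python
import json

API_BASE = "https://api.example.com"  # placeholder base URL (assumption)

def build_evaluate_request(api_key: str, interaction_text: str, session_id: str) -> dict:
    """Build the HTTP request for an ESAA evaluation call.

    The endpoint path comes from this page; the payload field names
    are assumptions about the request schema.
    """
    return {
        "method": "POST",
        "url": f"{API_BASE}/v1/esaa/evaluate",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "interaction": interaction_text,  # assumed field name
            "session_id": session_id,         # assumed field name
        }),
    }

req = build_evaluate_request("sk-test", "Hello, how are you?", "sess_42")
```

The returned attestation from each such call is what the dashboard views below aggregate.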
Dashboard Views
The Safety section includes four specialized views, each accessible from the console sidebar.
Overview
Summary metrics: total evaluations, average safety score, flag rate, and outcome distribution. Includes score histogram, timeseries chart, and compliance status panel.
Path: /platform/safety
Evaluations
Paginated table of all ESAA evaluations. Filter by outcome (pass, advisory, flag, critical) and time range. Click any row to see full evaluation details including safety signals, triggers, and recommended actions.
Path: /platform/safety/evaluations
Review Queue
Evaluations with "flag" or "critical" outcomes that require human review. Complete the review workflow by confirming the concern, marking it as a false positive, marking it inconclusive, or escalating it further.
Path: /platform/safety/review
Trends
Trajectory analysis over time. See safety score trends by day, compare flag rates across platforms and models, and view trajectory distribution (improving, stable, degrading, acute).
Path: /platform/safety/trends
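The trajectory labels in the Trends view can be illustrated with a simple classifier over a session's score sequence. This is a sketch only: the net-change thresholds and the rule that any score below the critical cutoff makes a session "acute" are illustrative assumptions, not the dashboard's actual algorithm.

```python
def classify_trajectory(scores: list[float]) -> str:
    """Label a session's safety-score sequence as improving, stable,
    degrading, or acute. Thresholds are illustrative assumptions."""
    if any(s < 0.40 for s in scores):
        return "acute"  # a critical score anywhere in the session
    if len(scores) < 2:
        return "stable"
    delta = scores[-1] - scores[0]  # net change across the session
    if delta > 0.05:
        return "improving"
    if delta < -0.05:
        return "degrading"
    return "stable"
```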
Review Workflow
When an evaluation is flagged or critical, it enters the review queue. The workflow supports four review outcomes:
Confirmed
The safety concern is valid. The evaluation correctly identified a problem.
False Positive
The evaluation was incorrectly flagged. Select a category (benign context, therapeutic intent, creative writing, etc.) to help calibrate the system.
Inconclusive
Cannot determine from available signals. May require additional context.
Escalated Further
The concern is severe enough to warrant escalation beyond standard review.
Key Metrics
Understanding the metrics displayed on the Safety dashboard.
| Metric | Scale | Description |
|---|---|---|
| Safety Score | 0.0 to 1.0 | Composite score; higher is safer. ≥0.80 passes; <0.40 is critical. |
| Flag Rate | % of evaluations | Percentage of evaluations resulting in a "flag" or "critical" outcome. |
| Trajectory | Session-level trend | Direction of safety scores within a session: improving, stable, degrading, or acute. |
| Review Queue | Count | Number of flagged evaluations awaiting human review. |
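The flag rate can be reproduced from a list of evaluation outcomes; a minimal sketch:

```python
def flag_rate(outcomes: list[str]) -> float:
    """Percentage of evaluations with a 'flag' or 'critical' outcome."""
    if not outcomes:
        return 0.0
    flagged = sum(1 for o in outcomes if o in ("flag", "critical"))
    return 100.0 * flagged / len(outcomes)
```

For example, `flag_rate(["pass", "flag", "critical", "advisory"])` is 50.0.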
Outcome Levels
| Outcome | Score | Action Required |
|---|---|---|
| pass | ≥0.80 | No action — interaction within safe boundaries |
| advisory | 0.60-0.79 | Log for monitoring — minor concerns noted |
| flag | 0.40-0.59 | Enters review queue — human review recommended |
| critical | <0.40 | Immediate attention — consider suspending interaction |
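The thresholds in the table above map a safety score to its outcome level directly:

```python
def outcome_for_score(score: float) -> str:
    """Map a 0.0-1.0 safety score to its outcome level per the table above."""
    if score >= 0.80:
        return "pass"
    if score >= 0.60:
        return "advisory"
    if score >= 0.40:
        return "flag"
    return "critical"
```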
Compliance Reporting
The Safety dashboard includes compliance indicators for emotional AI governance.
EU AI Act Article 14
Human oversight requirements. The review queue workflow satisfies the requirement for human-in-the-loop review of high-risk AI decisions.
Cryptographic Attestation
Every evaluation includes a W3C Data Integrity Proof. Artifacts can be independently verified without trusting the dashboard.
Audit Trail
All evaluations and review actions are timestamped and immutable. Export-ready for regulatory audits.