Trust Scores
The Trust Score is a 0-100 metric representing an agent's trustworthiness based on its configuration and behavior.
Calculation
Trust Score = (Risk Profile Score × 40%) + (Behavioral × 35%) + (Alignment × 25%)
| Component | Weight | Source | Range |
|---|---|---|---|
| Risk Profile Score | 40% | Risk scoring (Assess phase) | 0-100 |
| Behavioral | 35% | Policy compliance (Authorize + Monitor) | 0-100 |
| Alignment | 25% | Goal consistency (Verify phase) | 0-100 |
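The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function name `trust_score`, the input validation, and the rounding to two decimals are assumptions. Only the weights come from the table.

```python
# Weights taken from the component table above.
RISK_PROFILE_WEIGHT = 0.40
BEHAVIORAL_WEIGHT = 0.35
ALIGNMENT_WEIGHT = 0.25


def trust_score(risk_profile: float, behavioral: float, alignment: float) -> float:
    """Combine the three 0-100 component scores into a 0-100 Trust Score.

    Hypothetical helper: rejecting out-of-range inputs and rounding to two
    decimals are assumptions, not documented behavior.
    """
    for name, value in (
        ("risk_profile", risk_profile),
        ("behavioral", behavioral),
        ("alignment", alignment),
    ):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    total = (
        risk_profile * RISK_PROFILE_WEIGHT
        + behavioral * BEHAVIORAL_WEIGHT
        + alignment * ALIGNMENT_WEIGHT
    )
    return round(total, 2)
```

For instance, `trust_score(98, 100, 100)` combines 39.2 + 35 + 25.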
Components
Risk Profile Score (40%)
Based on the agent's inherent risk profile:
- Configured at agent creation
- 14 parameters across three weighted categories: Base Security (25%), AI-Specific (45%), Impact (30%)
- Produces a Risk Profile Score (0–100) and a Risk Tier (1–4)
- Static unless re-assessed
- Higher score = lower inherent risk
Behavioral Score (35%)
Based on runtime compliance:
- Starts at 100 for new agents
- Violations reduce the Behavioral score, not the Trust Score directly; the effect on the Trust Score is scaled by the 35% weight
- Increases with compliant behavior
- Updated continuously
Penalties applied to the Behavioral score per violation (Trust Score impact = penalty × 35%):
- Minor violation: -5 pts (→ -1.75 pts Trust Score)
- Major violation: -15 pts (→ -5.25 pts Trust Score)
- Critical violation: -25 pts (→ -8.75 pts Trust Score)
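The penalty arithmetic above can be checked with a short sketch. The names `PENALTIES`, `apply_violation`, and `trust_score_impact` are hypothetical, and flooring the Behavioral score at 0 is an assumption, since the source does not say what happens when penalties exceed the remaining score.

```python
# Penalty points per violation severity, from the list above.
PENALTIES = {"minor": 5, "major": 15, "critical": 25}
BEHAVIORAL_WEIGHT = 0.35  # weight of the Behavioral component in the Trust Score


def apply_violation(behavioral: float, severity: str) -> float:
    """Return the Behavioral score after one violation.

    Assumption: the score is floored at 0.
    """
    return max(0.0, behavioral - PENALTIES[severity])


def trust_score_impact(severity: str) -> float:
    """Points removed from the Trust Score by one violation of this severity."""
    return PENALTIES[severity] * BEHAVIORAL_WEIGHT
```

Because the penalty is applied to the component first, a single critical violation moves the Trust Score by 8.75 points rather than 25.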
Alignment Score (25%)
Based on goal consistency:
- Starts at 100 for new agents
- Updated per session based on goal alignment checks
- Uses LLM evaluation (configurable)
Calculation per session:
Session Alignment = avg(operation_alignment_scores)
Overall Alignment = weighted_avg(recent_sessions, decay=0.95)
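One plausible reading of the two formulas above is an exponentially decayed average, sketched below. The source does not define how `decay` is applied, so the weighting scheme here is an assumption: the most recent session gets weight 1, the one before it `decay`, then `decay²`, and so on. The function names are hypothetical.

```python
from typing import Sequence


def session_alignment(operation_scores: Sequence[float]) -> float:
    """Average the per-operation alignment scores of one session."""
    if not operation_scores:
        raise ValueError("a session needs at least one operation score")
    return sum(operation_scores) / len(operation_scores)


def overall_alignment(session_scores: Sequence[float], decay: float = 0.95) -> float:
    """Decay-weighted average of session scores, ordered oldest to newest.

    Assumption: the newest session has weight 1 and each older session is
    weighted by one more factor of `decay`.
    """
    if not session_scores:
        raise ValueError("at least one session score is required")
    n = len(session_scores)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * s for w, s in zip(weights, session_scores))
    return weighted_sum / sum(weights)
```

Under this reading, with `decay=0.95` a session from 14 sessions ago still carries about half the weight of the latest one, so a single poor session fades gradually rather than abruptly.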
Score Ranges
| Risk Profile Score | Risk Tier | Risk Level | Description |
|---|---|---|---|
| 0% – 24% | Tier 1 | Low | Read-only, public data access |
| 25% – 49% | Tier 2 | Medium | Internal data, non-critical actions |
| 50% – 74% | Tier 3 | High | PII, financial data, critical actions |
| 75% – 100% | Tier 4 | Critical | System admin, destructive actions |
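The table above amounts to a threshold lookup, which can be sketched as follows. The function name `risk_tier` is hypothetical, and the handling of fractional scores between bands (for example 24.5) is an assumption: each band is treated as starting at its lower bound and running up to, but not including, the next band's lower bound.

```python
import bisect

# Lower bounds of Tier 2, Tier 3, and Tier 4, from the table above.
TIER_BOUNDS = [25, 50, 75]
TIER_LEVELS = ["Low", "Medium", "High", "Critical"]


def risk_tier(risk_profile_score: float) -> tuple[int, str]:
    """Map a 0-100 Risk Profile Score to (tier number, risk level).

    Assumption: bands are half-open, so 24.9 is Tier 1 and 25 is Tier 2.
    """
    if not 0 <= risk_profile_score <= 100:
        raise ValueError("score must be between 0 and 100")
    index = bisect.bisect_right(TIER_BOUNDS, risk_profile_score)
    return index + 1, TIER_LEVELS[index]
```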
Score Display
Figure: Trust Score card on the Assess tab, showing the score, tier badge, and component breakdown.
Color coding:
| Tier | Color |
|---|---|
| Tier 1 (0% – 24%) | Green |
| Tier 2 (25% – 49%) | Blue |
| Tier 3 (50% – 74%) | Yellow |
| Tier 4 (75% – 100%) | Red |
Score Evolution
New Agents
Initial Trust Score:
├── Risk Profile: (from risk profile) × 40%
├── Behavioral: 100 × 35% = 35
├── Alignment: 100 × 25% = 25
└── Total: varies by risk profile
The Behavioral and Alignment components start at 100 for new agents and together contribute 60 points, so the initial Trust Score is 60 plus 40% of the Risk Profile Score.
Example: Risk Profile Score = 98, Behavioral = 100, Alignment = 100 → Trust Score = (98 × 0.40) + (100 × 0.35) + (100 × 0.25) = 39.2 + 35 + 25 = 99.2 → Tier 1
Over Time
Day 1: 92 ━━━━━━━━━━━━━━━━━━ Tier 1
Day 7: 88 ━━━━━━━━━━━━━━━━━━ Tier 2 (minor violations)
Day 14: 84 ━━━━━━━━━━━━━━━━━━ Tier 2 (stable)
Day 21: 86 ━━━━━━━━━━━━━━━━━━ Tier 2 (recovering)
Day 30: 89 ━━━━━━━━━━━━━━━━━━ Tier 2 (approaching Tier 1)
Recovery
To improve a degraded score:
- Consecutive compliance: no violations for 7 or more days
- High operation volume: a larger number of compliant operations
- HITL success: human-in-the-loop requests that are approved
- Goal alignment: consistently high alignment scores
Recovery rate:
- Tier 1-3: +1 pt/day
- Tier 4: +0.5 pt/day
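As a rough planning aid, the recovery rates above can be turned into an estimate of how long a degraded score takes to reach a target. This sketch rests on several assumptions not stated in the source: recovery is linear, no new violations occur, and the agent's tier (and therefore its rate) stays fixed for the whole period. The function names are hypothetical.

```python
import math


def recovery_rate(tier: int) -> float:
    """Points recovered per day for a given tier, from the list above."""
    if tier not in (1, 2, 3, 4):
        raise ValueError("tier must be 1, 2, 3, or 4")
    return 0.5 if tier == 4 else 1.0


def days_to_recover(current: float, target: float, tier: int) -> int:
    """Whole days needed to climb from `current` to `target`.

    Assumptions: linear recovery, no new violations, tier unchanged.
    """
    if target <= current:
        return 0
    return math.ceil((target - current) / recovery_rate(tier))
```

Under these assumptions, regaining 10 points takes 10 days in Tiers 1 to 3 and 20 days in Tier 4.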
Related
- Trust Tiers - How scores map to trust controls
- Assess Phase - Configure the Risk Profile component
- Adapt Phase - Watch trust evolve over time