HalfCheetah PPO — Speed-Randomized Training (After)

Deployment Robustness: PASS. Return 100% (jitter / delay / spike)
Stress Robustness: FAIL. Return 38% (5x speed)
Deployment Ready
Agent maintains performance under deployment timing conditions. No immediate action needed. Consider adding timing awareness for enhanced performance at extreme speeds.

1. Robustness Test — Timing Perturbations

This test wraps the environment with timing perturbations. The agent itself runs unmodified, with no internal intervention. Deployment scenarios (jitter, delay, spike) model realistic conditions. Stress scenarios (5x speed) test resilience at an extreme the agent is not expected to meet in deployment.
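The wrapping described above can be sketched in a few lines of Python. This is an illustrative sketch only, not deltatau-audit's implementation: `ToyEnv`, `SpeedWrapper`, `DelayWrapper`, `jitter_speed`, and `spike_speed` are hypothetical names, "speed" is assumed to mean repeating the agent's action for that many environment steps, and "2 +/- 1" is assumed to mean an integer speed drawn uniformly from {1, 2, 3}.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: the observation is a step counter."""

    def __init__(self, horizon=20):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon  # obs, reward, done


class SpeedWrapper:
    """Assumed meaning of 'speed': repeat each agent action `speed` env steps."""

    def __init__(self, env, speed_fn):
        self.env = env
        self.speed_fn = speed_fn  # maps agent step index -> integer speed
        self.k = 0

    def reset(self):
        self.k = 0
        return self.env.reset()

    def step(self, action):
        speed = max(1, int(self.speed_fn(self.k)))  # always step at least once
        self.k += 1
        total = 0.0
        for _ in range(speed):
            obs, reward, done = self.env.step(action)
            total += reward
            if done:
                break
        return obs, total, done


class DelayWrapper:
    """Return the observation from `delay` steps ago instead of the current one."""

    def __init__(self, env, delay=1):
        self.env = env
        self.buffer = deque(maxlen=delay + 1)

    def reset(self):
        obs = self.env.reset()
        self.buffer.clear()
        for _ in range(self.buffer.maxlen):
            self.buffer.append(obs)  # pad with the initial observation
        return obs

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.buffer.append(obs)
        return self.buffer[0], reward, done  # oldest buffered observation


def jitter_speed(rng):
    """Speed jitter (2 +/- 1): assumed uniform over {1, 2, 3} per agent step."""
    return lambda k: rng.choice([1, 2, 3])


def spike_speed(start, length):
    """Mid-episode spike (1-5-1): speed 5 inside the window, 1 elsewhere."""
    return lambda k: 5 if start <= k < start + length else 1
```

The agent only ever sees the wrapper's `reset` and `step`, which is what "no internal intervention" amounts to in this sketch: the policy is not told the current speed or delay. The 5x stress scenario would then be `SpeedWrapper(env, lambda k: 5)`.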

[Figure: Robustness Under Timing Perturbations (robustness bars)]

Robustness Detail

Category    Scenario                     Return (% nominal)  95% CI       RMSE ratio  Return Change
Deployment  Speed jitter (2 +/- 1)       121%                112%–132%    0.52x       -21.3%
Deployment  Observation delay (1 step)   148%                136%–161%    0.80x       -48.0%
Deployment  Mid-episode spike (1-5-1)    113%                102%–126%    0.92x       -13.2%
Stress      5x Speed (unseen frequency)  38%                 34%–44% ***  1.01x       +61.6%
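For readers who want to reproduce figures of this kind from their own per-episode returns, the sketch below computes a return as a percentage of nominal and a 95% interval around it. The report does not state how deltatau-audit derives its intervals, so the percentile bootstrap used here is an assumption, and `percent_of_nominal` and `bootstrap_ci` are hypothetical helper names. Note also that the values in the last column of the table appear to equal 100% minus the return, so a positive entry there corresponds to a drop relative to nominal.

```python
import random
import statistics


def percent_of_nominal(perturbed_returns, nominal_returns):
    """Mean perturbed return as a percentage of mean nominal return.

    Assumes the nominal mean is positive; a ratio is not meaningful otherwise.
    """
    return 100.0 * statistics.mean(perturbed_returns) / statistics.mean(nominal_returns)


def bootstrap_ci(perturbed_returns, nominal_returns, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not confirmed by the report)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample episodes with replacement, independently for each condition.
        p = rng.choices(perturbed_returns, k=len(perturbed_returns))
        n = rng.choices(nominal_returns, k=len(nominal_returns))
        stats.append(percent_of_nominal(p, n))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

With 30 episodes per condition, as in this report, each list would hold 30 episode returns. Resampling both conditions lets the interval reflect noise in the nominal baseline as well as in the perturbed runs.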

Recommendation

Agent maintains performance under deployment timing conditions. No immediate action needed. Consider adding timing awareness for enhanced performance at extreme speeds.

Speeds tested: [1, 2, 3, 5, 8] | Episodes per condition: 30 | Intervention support: False

Generated by deltatau-audit v0.3.4