HalfCheetah PPO — Timing Audit

Deployment Robustness: FAIL | Return 4% (jitter / delay / spike)
Stress Robustness: FAIL | Return -9% (5x speed)

Deployment Fragile
Agent degrades under deployment timing conditions. Recommended fix: train with speed randomization (jitter/delay/spike augmentation).

1. Robustness Test — Timing Perturbations

This test wraps the environment with timing perturbations. The agent runs normally, with no internal intervention. Deployment scenarios (jitter, delay, spike) model realistic conditions; stress scenarios (5x speed) test extreme resilience.
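
A minimal sketch of how such environment-side perturbations could be wrapped, for illustration only. It is not the deltatau-audit implementation. It assumes that "speed" means the number of environment steps executed per agent action and that "delay" means the agent sees an observation from an earlier step. ToyEnv, SpeedWrapper, DelayWrapper and the spike window are hypothetical names and values.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: the observation is the step counter."""

    def __init__(self, horizon=20):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon


class SpeedWrapper:
    """Repeat each agent action speed_fn(k) times (assumed meaning of 'speed')."""

    def __init__(self, env, speed_fn):
        self.env = env
        self.speed_fn = speed_fn
        self.k = 0  # index of the current agent step

    def reset(self):
        self.k = 0
        return self.env.reset()

    def step(self, action):
        total, done, obs = 0.0, False, None
        for _ in range(max(1, self.speed_fn(self.k))):
            obs, reward, done = self.env.step(action)
            total += reward
            if done:
                break
        self.k += 1
        return obs, total, done


class DelayWrapper:
    """Return the observation from `delay` agent steps ago."""

    def __init__(self, env, delay=1):
        self.env = env
        self.buf = deque(maxlen=delay + 1)

    def reset(self):
        obs = self.env.reset()
        self.buf.clear()
        self.buf.append(obs)
        return obs

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.buf.append(obs)
        return self.buf[0], reward, done  # oldest buffered observation


# Scenario schedules; the exact definitions used by the report are assumptions.
jitter = lambda k: random.choice([1, 2, 3])   # "2 +/- 1"
spike = lambda k: 5 if 5 <= k < 10 else 1     # "1-5-1", window is hypothetical
stress = lambda k: 5                          # "5x speed"
```

The agent's policy is untouched in every case: only the `reset` and `step` calls it sees are altered, which matches the "no internal intervention" description.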

Figure: Robustness Under Timing Perturbations (bar chart)

Robustness Detail

Category | Scenario | Return (% nominal) | 95% CI | RMSE ratio | Return Change
Deployment | Speed jitter (2 +/- 1) | 25% | 24%–28% *** | 2.38x | +74.6%
Deployment | Observation delay (1 step) | 4% | 2%–5% *** | 3.79x | +96.2%
Deployment | Mid-episode spike (1-5-1) | 91% | 86%–98% *** | 1.07x | +9.1%
Stress | 5x Speed (unseen frequency) | -9% | -11%–-8% *** | 4.82x | +109.3%
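
The two return columns appear to be related: in every row, Return Change is close to 100% minus Return (% nominal), which suggests that a positive change denotes a drop from nominal. This is an inference from the numbers, not a documented definition. The short check below uses only the values in the table; the 0.5-point tolerance is an assumption that allows for the return column being rounded to whole percentages.

```python
# (return as % of nominal, reported return change in %), copied from the table
rows = {
    "Speed jitter (2 +/- 1)": (25, 74.6),
    "Observation delay (1 step)": (4, 96.2),
    "Mid-episode spike (1-5-1)": (91, 9.1),
    "5x Speed (unseen frequency)": (-9, 109.3),
}

# Gap between the implied drop (100 - return) and the reported change
gaps = {name: abs((100 - ret) - change) for name, (ret, change) in rows.items()}

for name, gap in gaps.items():
    print(f"{name}: implied drop differs from reported change by {gap:.1f} points")
```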

Recommendation

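The agent degrades under deployment timing conditions: return falls to 25% of nominal under speed jitter and to 4% under a one-step observation delay, while the mid-episode spike retains 91%. Recommended fix: train with speed randomization (jitter/delay/spike augmentation).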
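
A hedged sketch of what speed randomization during training could look like. It assumes, as an illustration only, that a base speed (environment steps per agent action) is resampled at every reset and jittered at every step. SpeedRandomization, CountingEnv, the speed set and the jitter width are hypothetical, not part of deltatau-audit or of the audited training setup.

```python
import random


class CountingEnv:
    """Hypothetical toy environment used only to make the sketch runnable."""

    def __init__(self, horizon=40):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon


class SpeedRandomization:
    """Resample the base speed per episode and jitter it per step."""

    def __init__(self, env, speeds=(1, 2, 3, 5), jitter=1, seed=None):
        self.env = env
        self.speeds = speeds
        self.jitter = jitter
        self.rng = random.Random(seed)
        self.base = speeds[0]

    def reset(self):
        # New base speed for each training episode
        self.base = self.rng.choice(self.speeds)
        return self.env.reset()

    def step(self, action):
        # Per-step jitter around the base speed, never below one env step
        repeat = max(1, self.base + self.rng.randint(-self.jitter, self.jitter))
        total, done, obs = 0.0, False, None
        for _ in range(repeat):
            obs, reward, done = self.env.step(action)
            total += reward
            if done:
                break
        return obs, total, done
```

Wrapping the training environment this way exposes the policy to a range of effective control frequencies. Observation delay could be randomized in the same manner by sampling a delay length at reset.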

Speeds tested: [1, 2, 3, 5, 8] | Episodes per condition: 30 | Intervention support: False

Generated by deltatau-audit v0.3