Recovery: Readiness Scores Compared

Category: monitoring Updated: 2026-04-01

Whoop and Oura recompute their HRV-anchored readiness scores once per sleep period (roughly every 24 hours), while Garmin Body Battery updates continuously; Düking et al. 2018 found low-to-moderate agreement between wearable readiness scores and gold-standard lab markers.

Key Data Points
Measure | Value | Unit | Notes
HRV measurement agreement (vs. ECG) | r=0.82-0.96 | correlation | Photoplethysmography (PPG) wrist-based HRV approximates ECG-derived HRV; accuracy varies by motion and skin tone
Readiness score vs. lab performance | Low-to-moderate | agreement | Düking et al. 2018 found wearable readiness indices do not reliably predict same-day performance test outcomes
Oura sleep stage accuracy | ~79% | vs. PSG | Polysomnography comparison from Altini & Kinnunen 2021; best among consumer wearables
Whoop Recovery update frequency | Every 24 | hours | Updates after each sleep period; requires consistent sleep data for accurate daily score
Garmin Body Battery range | 5-100 | points | Proprietary energy reservoir model based on HRV, sleep, and activity; recharges during sleep, depletes with activity
Oura Readiness score range | 0-100 | points | Composite of resting HR, HRV, body temperature, sleep, and activity balance factors

Wearable readiness scores give athletes a daily number attempting to answer: ‘How recovered am I?’ The algorithms behind them differ significantly in inputs and design.

Düking et al. (2018; PMID 29742032) evaluated wearable monitoring devices in athletes and found low-to-moderate agreement between wearable-derived readiness indices and gold-standard laboratory markers of physiological state. No consumer device reliably predicts same-day performance test outcomes. What these devices do well is track relative trends over time within an individual.
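One way to read a score as a within-individual trend rather than an absolute value is to compare today's number with the athlete's own recent baseline. The sketch below is an illustration only: the function name `baseline_z`, the five-day history, and the choice of a z-score are assumptions for the example, not something any vendor documents.

```python
from statistics import mean, stdev

def baseline_z(history, today):
    """Express today's score in standard deviations from a personal baseline.

    history: recent scores from the same person and the same device.
    Returns 0.0 when the history has no variation.
    """
    if len(history) < 2:
        raise ValueError("need at least two baseline scores")
    spread = stdev(history)
    if spread == 0:
        return 0.0
    return (today - mean(history)) / spread

# Hypothetical values, for illustration only.
recent_scores = [60, 62, 58, 61, 59]
z = baseline_z(recent_scores, today=55)
```

A strongly negative value means today sits well below this person's usual range, which is the kind of relative signal the devices appear to capture best.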

Device Comparison Table

Device | Algorithm Inputs | Update Frequency | Validation Studies | Reliability | Key Limitation
Whoop Recovery | HRV (rMSSD), RHR, sleep duration/stages, respiratory rate | Every 24h (post-sleep) | Limited; primarily internal | Moderate within-subject | Requires consistent wear; no display
Garmin Body Battery | HRV, RHR, sleep, accelerometer (activity drain) | Continuous (depletes in real-time) | Limited independent studies | Moderate | Activity model is proprietary; poor with shift work
Oura Readiness | HRV, RHR, body temperature, sleep stages, activity balance | Every 24h (post-sleep) | Most published (Altini & Kinnunen 2021) | Moderate-to-good | Ring fit affects PPG accuracy
Apple Health readiness | RHR trend, HRV trend, sleep, walking HRV | Daily; limited synthesis | Minimal peer-reviewed data | Low-to-moderate | No unified readiness score; fragmented
HRV4Training (app) | Morning camera HRV, subjective wellness survey | Daily (manual measurement) | Plachta et al. 2022 (strongest independent validation) | Good for HRV trends | Requires active morning routine; no wearable passivity
Manual RMSSD (chest strap) | Single-lead ECG via Polar H10 | Daily (60-second morning measurement) | Gold standard consumer method | High (matches clinical ECG closely) | Requires dedicated hardware + app (Elite HRV, etc.)

Algorithm Inputs in Depth

Whoop, Garmin, and Oura all anchor primarily on HRV, specifically rMSSD (root mean square of successive differences), which reflects parasympathetic tone. Resting heart rate trend is the secondary input in all three. Beyond these, the approaches diverge: Oura adds finger skin temperature (deviations from baseline can flag illness or stress), Garmin uses real-time activity data to model energy depletion, and Whoop weights respiratory rate during sleep as a physiological stress indicator (Flatt & Esco, 2016; PMID 26964014).
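For readers who want to see what rMSSD actually computes, here is a minimal sketch of the standard definition applied to a list of RR intervals in milliseconds. The sample intervals are made up, and the vendors' own preprocessing (artifact removal, window selection) is not public and is not reproduced here.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals in milliseconds, for illustration only.
sample_rr = [800, 810, 790, 805, 795]
value = rmssd(sample_rr)  # about 14.4 ms for this sample
```

Because the measure depends only on beat-to-beat changes, a slow drift in heart rate across the recording has little effect on it, which is part of why it is favored for short morning measurements.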

Sleep stage accuracy matters because all systems use sleep quality as a readiness input. Altini & Kinnunen (2021 — PMID 33348753) found Oura’s sleep stage detection achieved approximately 79% agreement with polysomnography — the best published result among consumer wearables, though still imperfect.

How to use this data: Readiness scores are most valuable as rolling trend indicators, not daily pass/fail signals. Track a 7-day moving average. Adjust training intensity for trends, not single-day scores. Combine the readiness number with subjective feel (morning mood, motivation, soreness) — when both are low, a reduced session is clearly warranted. When they disagree, subjective feel often carries equal weight for same-day decisions.
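The 7-day moving average mentioned above can be computed from an exported list of daily scores. This is a rough sketch: `rolling_mean` is a hypothetical helper, the scores are invented, and returning `None` until a full week of data exists is a design assumption rather than a rule from any device.

```python
from statistics import mean

def rolling_mean(scores, window=7):
    """Trailing moving average; None until a full window is available."""
    averages = []
    for i in range(len(scores)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(mean(scores[i + 1 - window : i + 1]))
    return averages

# Hypothetical daily readiness scores, oldest first.
daily_scores = [70, 72, 68, 65, 60, 58, 55, 50]
trend = rolling_mean(daily_scores)
```

In this example the last two averages fall from 64 to roughly 61, a gentler decline than the raw daily drop from 55 to 50, which is the smoothing effect that makes the trend a steadier guide than any single score.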

Frequently Asked Questions

Which wearable readiness score is most accurate?

No consumer wearable has been validated against gold-standard performance metrics with consistently strong results (Düking et al. 2018). Oura has the most published sleep validation data. Whoop has more athlete-focused training load integration. Accuracy varies by individual physiology, skin tone, and consistency of wear. None should be used as a sole training decision tool.

Can I use two devices simultaneously to cross-validate?

Cross-validation is useful for identifying consistent signals. If both Whoop and Oura show low readiness on the same morning, the signal is more reliable. Disagreement between devices on a given day is common and normal — treat it as data uncertainty rather than conflicting ground truth.
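The logic in this answer can be sketched as a small classifier for one morning's pair of scores. The cutoffs of 50 are placeholders: the two devices use different scales internally, so each cutoff would need to be set from the wearer's own history, and the function and label names are invented for illustration.

```python
def classify_morning(score_a, score_b, low_a=50, low_b=50):
    """Label one morning from two devices' readiness scores.

    low_a and low_b are hypothetical per-device cutoffs for "low".
    """
    a_low = score_a < low_a
    b_low = score_b < low_b
    if a_low and b_low:
        return "consistent low"      # both devices agree: stronger signal
    if a_low or b_low:
        return "devices disagree"    # treat as uncertainty, not a verdict
    return "no low signal"

label = classify_morning(40, 45)
```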

How much should I adjust training based on a low readiness score?

A single low score warrants attention, not automatic deload. A trend of 3+ consecutive low scores is more meaningful. The research on using readiness scores for training modification shows mixed results — athletes who adjust training responsively based on HRV-anchored scores tend to perform slightly better over multi-week blocks, but the evidence is not strong for single-session decisions.
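As an illustration of the "3+ consecutive low scores" idea, the sketch below counts how many of the most recent days fall below a cutoff. The cutoff of 50 and the helper names are assumptions for the example; a sensible threshold depends on the device and on the individual's baseline.

```python
def trailing_low_streak(scores, low_threshold):
    """Count consecutive low scores ending at the most recent day."""
    streak = 0
    for score in reversed(scores):
        if score < low_threshold:
            streak += 1
        else:
            break
    return streak

def sustained_low(scores, low_threshold=50, min_streak=3):
    """True when the latest run of low scores is at least min_streak long."""
    return trailing_low_streak(scores, low_threshold) >= min_streak

# Hypothetical daily scores, oldest first.
week = [70, 45, 40, 38]
flagged = sustained_low(week)
```

Only the run ending today is counted, so an isolated low day earlier in the week does not trigger the flag on its own.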

Does the Garmin Body Battery measure recovery differently from HRV-based scores?

Yes. Body Battery uses a proprietary energy reservoir model that depletes with activity (using accelerometer and heart rate data) and recharges during sleep (using HRV). It is more activity-context-aware than pure HRV scores but less directly tied to autonomic nervous system state. It integrates more data types but with less physiological specificity.

What is the biggest limitation of all these scores?

All readiness scores are backward-looking — they summarize recovery from recent stress. They do not measure readiness for a specific type of future effort. A high readiness score does not mean optimal performance for a maximal power session; it means recovery from recent load is good. Training context and accumulated fatigue over weeks require human interpretation beyond any single daily score.
