NEW Apple Sleep Score - Scientific Test! Versus Oura and WHOOP!
By The Quantified Scientist
Summary
## Key takeaways
- **Apple Sleep Score Not Exclusive**: You don't need an Apple Watch to get Apple's sleep score; any device tracking sleep stages and syncing to Apple Health will suffice, provided its sleep tracking is reliable. [00:21], [00:48]
- **Retrospective Sleep Score Calculation**: Apple's sleep score appears to be calculated retrospectively, meaning you can obtain scores for past nights going back many years, which is useful for analyzing old data. [00:53], [01:02]
- **Apple Sleep Score Components**: The Apple sleep score is a combination of three factors: total sleep duration including deep and REM sleep (0-50 points), bedtime consistency (up to 30 points), and number/duration of wake-ups (up to 20 points). [01:06], [01:46]
- **Apple vs. Oura & Whoop Correlation**: The Apple Watch sleep score shows the highest correlation (0.67) with the Oura Ring, while the Whoop strap tends to deviate more, potentially due to its consideration of heart rate variability and resting heart rate. [03:28], [04:10]
- **Sleep Stage Accuracy Matters**: The reliability of your sleep score depends heavily on the accuracy of your device's sleep staging. If a watch cannot reliably track deep or REM sleep, the resulting score will be less dependable. [05:26], [05:34]
- **Avoid Cross-Brand Sleep Score Comparisons**: Comparing sleep scores derived from different brands (e.g., Apple Watch vs. Garmin) can be misleading, as they may use different methodologies and parameters, leading to score discrepancies unrelated to actual sleep quality. [09:15], [09:21]
Topics Covered
- Apple's Sleep Score: More Accessible Than You Think?
- How Apple Calculates Your Sleep Score: A 3-Part Formula.
- Why Do Top Sleep Trackers Disagree on Your Score?
- Garbage In, Garbage Out: Why Your Watch's Sleep Staging Matters.
- Why Switching Sleep Trackers Skews Your Data.
Full Transcript
I systematically tested the new Apple
Watch sleep score and I made a few
important discoveries. Now, I also
compared that sleep score to the sleep
score of the Oura Ring and the Whoop
strap. And I will show you comparison
graphs later in this video. Now, this
was very interesting, but I was still
left with a few questions. But let's
start with what I discovered. First of
all, you don't need an Apple Watch to
get Apple's sleep score. Any smartwatch
or other device that tracks sleep stages
and syncs them to Apple Health seems to
result in a sleep score. So even if you
use a Garmin, a Huawei watch or another
device that is able to sync those sleep
stages to Apple Health, you will get a
sleep score. Now, of course, if the
sleep tracking of the device is really
bad, the sleep score will be less
reliable. However, still, it's good to
know that you don't need an Apple Watch
to get Apple's sleep score. Another thing
I discovered is that I got sleep scores
even for all my past nights,
going back many years. So it appears to
be calculated retrospectively, which
is actually pretty cool if you would
like to analyze your old data. But how
is this sleep score calculated? Well,
your sleep score can go up to 100 and is
a combination of three things. First of
all, you get 0 to 50 points for your
total sleep duration and this includes
getting enough deep sleep and REM sleep.
So to get the maximum 50 points for
this part, total sleep time alone is not
enough. You also need enough deep sleep
and REM sleep. Second, you can get up to
30 points for your bedtime, so
basically for going to bed around the
time that you normally go to bed. At
least that's how I interpret it based on
everything I read. And the last 20
points reflect the total number of times
you woke up and the total time you spent
awake. So that makes a maximum of 100
points total. So, that's how it works.
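As a rough sketch of the breakdown just described (not Apple's actual algorithm, which the video does not spell out), the three parts can be capped at their maximums and added up. The class name, field names, and example values below are hypothetical and only illustrate the 50 + 30 + 20 split.

```python
from dataclasses import dataclass

# Maximum points per component, as described in the video.
MAX_DURATION = 50       # total sleep time, including enough deep and REM sleep
MAX_BEDTIME = 30        # going to bed around your usual time
MAX_INTERRUPTIONS = 20  # number of wake-ups and total time spent awake


def clamp(value: float, upper: float) -> float:
    """Keep a component between 0 and its maximum."""
    return max(0.0, min(value, upper))


@dataclass
class SleepScoreParts:
    """Hypothetical container for the three component scores of one night."""
    duration: float
    bedtime: float
    interruptions: float

    def total(self) -> float:
        # The overall score is the sum of the capped components (max 100).
        return (clamp(self.duration, MAX_DURATION)
                + clamp(self.bedtime, MAX_BEDTIME)
                + clamp(self.interruptions, MAX_INTERRUPTIONS))


# Made-up component values for one night, for illustration only.
night = SleepScoreParts(duration=42, bedtime=25, interruptions=15)
print(night.total())
```

How Apple arrives at each component (for example, how much deep or REM sleep counts as enough) is not covered here, so the sketch takes the component points as given.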
But how does this compare to the sleep
score from the two most popular sleep
trackers out there, the Oura Ring and
the Whoop Strap? Well, let's take a
look. I have data for 32 nights, where I
made sure that the Apple Watch sleep
score was based just on data from the
Apple Watch. So, just the sleep stages
from the Apple Watch, which for me most
of the time was this Apple Watch Ultra
2. And here we have those sleep scores
over time for the Apple Watch in black,
the Oura Ring in pink, blue, and the
Whoop strap in green. By the way, for
those of you who are new to the channel,
my name is Rob and I'm a post-doctoral
scientist specializing in biological
data analysis. Now, looking at the graph
right here, we can see that at least the
extremes mostly tend to agree. For
instance, the lower scores most of the
time agree, though the Whoop tends to deviate
a little bit more. But also right here,
my lowest score was also a low score for
all devices. However, interestingly,
this night, for instance, doesn't agree
so well between the Whoop strap, which
actually showed an increase in score,
whereas Apple and Oura showed a
decrease. But let's actually calculate
direct correlations because that will
tell us even more. And the correlations
between those sleep scores are shown in
these three plots. So on the left, we
compare the Apple Watch to the Oura
Ring. Then in the middle the Apple Watch
to the Whoop strap and then on the right
the Oura Ring to the Whoop strap. And we
then calculate what is basically called
the Pearson correlation, where we want to
see how well the sleep scores of the
Apple Watch agree with the Oura Ring and
the Whoop strap, and even how the Oura
Ring and the Whoop strap agree with each
other. And we actually see the highest
correlation between the Apple Watch and
the Oura Ring at 0.67. Now for this we
have 32 measurements. Actually for the
Whoop strap three nights were missing.
So we only have 29 when comparing them.
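For readers who want to repeat this kind of comparison on their own exported scores, here is a minimal sketch of a Pearson correlation that only uses nights where both devices reported a score, which is how the missing Whoop nights would drop out. The function name and the example numbers are assumptions for illustration; they are not the data or the code used in the video.

```python
from math import sqrt
from typing import Optional, Sequence


def pearson(xs: Sequence[Optional[float]], ys: Sequence[Optional[float]]) -> float:
    """Pearson correlation over nights where both devices reported a score."""
    # Drop nights where either device has no score (marked as None).
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired nights")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Made-up scores for illustration only; not the data from the video.
apple = [82, 74, 91, 60, 88]
whoop = [79, None, 85, 70, 90]  # None marks a night with no Whoop score
r = pearson(apple, whoop)
print(f"Pearson r over paired nights: {r:.2f}")
```

A value near 1 means the two devices rank nights similarly, a value near 0 means their scores are largely unrelated. With only about 30 nights, a single outlier night can shift the result noticeably.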
But we see that at least the very low
scores and the very high scores tend to
agree between the Apple Watch and the
Oura Ring. If we compare the Apple Watch
to the Whoop strap, we still see that in
most cases they agree. But for instance
here we have two outliers where the
Whoop still gave a high score and the
Apple Watch gave a low score. And I
suspect these are potentially nights
where I had a really good heart rate
variability. So a high heart rate
variability, maybe a low resting heart
rate, and the Whoop strap takes this into
account, whereas the Apple Watch looks
at the total time asleep and those sleep
stages. So that's a different approach
that those take, and we see something
similar if we compare the sleep scores
of the Oura and the Whoop strap.
They just don't perfectly agree with one
another. However, I hope to create an
even bigger data set so we can get even
better comparisons between these. But I
think the results are already pretty
striking. You do get quite different
calculations in terms of sleep scores
from the Oura Ring, the Apple Watch, and
the Whoop strap. And the Whoop strap
appears to be the most different in
its approach, or at least gives the most
different values. Before I forget to
mention, all these results were obtained
with a public beta of iOS and watchOS.
So there could be changes in the future,
but these are the results as I have them
now. So that was really interesting, but
it does leave me with a few questions.
The first is how does the sleep score
handle multiple inputs? What if, for
instance, you wear a Garmin watch and an
Apple Watch? Does it then prioritize the
Apple Watch? What happens in those
cases? I have no idea yet. Or what if
you wear two non-Apple devices like a
Garmin and a Huawei? Well, this is
something I'll have to ask Apple and
I'll let you know as soon as they tell
me. But based on what I see now, I think
there are two major things to conclude
here. The first is that the quality of
the sleep staging actually matters. So,
if your watch cannot reliably track deep
sleep or REM sleep, you're not going to
get a reliable sleep score, at least not
the way that Apple intended it. As we
say in data science, garbage in, garbage
out. Luckily, the Apple Watch is one of
the best sleep stage trackers out there.
Let me quickly show you the testing that
I did, but also testing from scientific
papers. And let's start with the testing
that I did over many years by now where
we use different reference devices. Now
these are all EEG or PSG reference
devices. So all of them measure brain
waves, eye movements and other things.
And then we compare different watches to
them. Now I won't explain all the
details. You can check out some other
videos for that. But basically the
better the agreement of a watch with the
reference, the more to the top right
they are. And you can see all the Apple
Watches I tested are all the way to the
top right. So, they are really some of
the best performers. We have the Apple
Watch Series 9, the Apple Watch Series
10, the Apple Watch Ultra 2, the Apple
Watch Ultra 1. All of them are amongst
those best devices. So, they're doing
really well. And there are only three
other brands that do about the same. Those are
the Eight Sleep Pod, my favorite sleep
improvement device. It's really
expensive, but it actually cools and
heats each side of the bed. So, it's not
a wearable. It actually goes around your
mattress. It's one of my favorite
devices out there, but very expensive.
If you want the best discount possible,
use my affiliate link below. Um, there's
also the Oura Ring, which does really
well. And finally, we have the NUA or
Sleep 2 app. And these are the only four
brands that do super well in my testing.
But they don't just do well in my
testing. Some of them were also tested
in scientific literature. So, let's take
a look at that. And those results are
right here. So, these are all data taken
from scientific papers. Now, some of
them were partially paid for by the
brand themselves. So take them with a
grain of salt, but the overall patterns
match very well. The new Oura
algorithm is very good. The Apple Watch
does really well and the Whoop strap and
Fitbit devices are sort of second tier.
Though Google is soon releasing a new
sleep tracking algorithm for Fitbit and
Google, so stay tuned for those results.
And if you look at other devices like
Garmin and Polar, they really don't do
that well. And if you're wondering what
these gray ones are, these were tracked
in people with a sleep disorder, which
is likely why none of the devices did so
well. So, with an Apple Watch, at least
based on my testing and some scientific
studies, you will more likely get
reliable sleep stages, but less so for
some other brands. And this means that
the watch brand you use really does
matter. And that brings me to my second
remark. But before doing that, I run
this YouTube channel next to my
full-time job as a scientist, and it's
not cheap. I've bought all my Apple
Watches myself and I have about a dozen
or even more at the moment. And of
course, I also have to pay my editor,
Alex. I edited this video myself, but
most of them are edited by him. If you
want to financially support the channel,
the most direct way of doing that is by
becoming a YouTube member, which is
basically like Patreon on YouTube, and
you'll get early access to many of my
videos. Another way of supporting is by
using one of the affiliate links in the
description below. If you buy a specific
device from many of them, you get the
best discount possible. Or if you buy
anything at all on Amazon for that
matter, if you first click my link, I
get a small kickback and it doesn't cost
you any extra. You can even bookmark it
if you like. I think it's command or
control D. But of course, only if you
want to and even subscribing, liking,
and commenting already really helps. But
back to my second remark that I wanted
to make, that's about switching between
devices. If you switch between them, one
night you sleep with one, one night with
the other, you may see big differences
in your sleep score that actually don't
reflect a change in sleep quality. I
saw, for instance, that when I quickly
went through the data, the sleep score
based on Garmin data seemed to give
little deep or REM sleep, which provided
me with a lower sleep score even though
I slept enough and probably also of
sufficient quality. So, I wouldn't
compare Apple sleep scores if the data
comes from different sources, so
different brands. And as we saw, the
sleep scores that are natively
calculated by other brands like Oura and
Whoop are quite different from the Apple
sleep score. Part of the reason will be
that Oura and Whoop potentially also take
other parameters into account like your
heart rate variability, your resting
heart rate and maybe even the sleep of
previous nights and none of this is done
by Apple. Now, if you do end up buying
an Apple Watch, an Oura Ring, a Whoop
strap, an Eight Sleep Pod, or anything at all on
Amazon for that matter, please consider
using one of my affiliate links below,
many of which give you the best discount
possible. And I think you will like this
video on the new Apple Watches or this
video on the Eight Sleep Pod.