Comparing Three Sleep Quality Metrics
If you downloaded my September data (you can do it here, absolutely free!), you probably noticed that the research agenda behind that month's data collection focused primarily on diet, exercise (#fitsperiment!), and sleep. I finally found time to look more closely at some of that data, and in this post I will share some interesting results of my sleep data analysis.
First of all, you probably noticed that the dataset has only 19 days' worth of records. That is because I collected data every weekday, starting Tuesday, September 4th (Monday, September 3rd was Labor Day, an official holiday here in the US). I do not collect data on weekends (Friday-to-Saturday and Saturday-to-Sunday nights), because my schedule is usually unpredictable, which makes it hard to follow the measurement routines (e.g., weigh myself at 7:15 am, or log the caloric content of every meal). I may, however, change that in the future.
The objective of this first analysis was to compare three different methods for measuring sleep quality. Every night, I went to bed with a Bodymedia tracker strapped to my arm and an iPhone running the Sleep Time app placed next to me on the mattress, under the sheets. The next morning, about ten to fifteen minutes after waking up, I would also answer two questions: “Was my sleep long enough?” and “Was my sleep restful?”, using a 10-point scale (1 = Not At All, 10 = Extremely). The arithmetic average of the two responses was logged as a subjective score of that night’s sleep quality. Both Bodymedia and the Sleep Time app define sleep efficiency as time slept divided by time spent lying in bed. I believe this definition captures both the length and the quality of sleep, and thus should be roughly comparable to my subjective definition. I have also been logging “dream recall”, which is simply a yes/no answer to the question “Do I remember any dreams?”. I read somewhere that if you don’t remember your dreams, it means you slept very well. Unfortunately, there was not enough variability in the September data (there were only 2 nights out of 19 after which I did not remember my dreams) to include dream recall in this analysis.
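The two scoring rules described above are simple enough to write down. Here is a minimal Python sketch; the function names are hypothetical, and the example values are made up for illustration and do not come from my data.

```python
def sleep_efficiency(minutes_asleep, minutes_in_bed):
    """Sleep efficiency in percent: time slept over time spent lying in bed."""
    if minutes_in_bed <= 0:
        raise ValueError("minutes_in_bed must be positive")
    return 100.0 * minutes_asleep / minutes_in_bed


def subjective_score(long_enough, restful):
    """Arithmetic average of the two morning answers, each on a 1-10 scale."""
    for answer in (long_enough, restful):
        if not 1 <= answer <= 10:
            raise ValueError("answers must be between 1 and 10")
    return (long_enough + restful) / 2


# Made-up values, for illustration only
print(sleep_efficiency(420, 500))    # 84.0
print(subjective_score(7, 8))        # 7.5
print(subjective_score(7, 8) * 10)   # 75.0, on the same 0-100 scale as efficiency
```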
If you look at the descriptive statistics, all three metrics look similar. Both Bodymedia and the Sleep Time app reported sleep efficiency in similar ranges (between 71% and 94% for Bodymedia, and between 77% and 96% for the Sleep Time app), and even had the same average score (84%). If I multiply my subjective sleep scores by 10, they fall between 60 and 90, with an average score of 72.
And this is how all three measurements look in the same graph (I multiplied subjective scores by 10 to get them on the same scale with other readings).
Both Bodymedia and the Sleep Time app also report actual time slept (as opposed to time spent lying in bed), which I recorded in minutes. According to Bodymedia, the shortest sleep lasted 330 minutes and the longest 427 minutes, with an average of 387 minutes. Sleep Time’s estimates were 332 minutes for the shortest sleep, 463 minutes for the longest, and 398 minutes on average.
Things, however, get messier when you start looking at the correlations. Theoretically, if all three metrics reflect the same thing (in this case, sleep quality), you would expect them to be positively correlated, and moderately or highly so. Since we are dealing with non-normally distributed data, I first transformed the scores into ranks (using the RANK function in Excel, in ascending order), and then computed Spearman rank-order correlations using SAS. You can also use Statwing, which will give you Cramér’s V. I must also point out that since we are dealing with such a small number of time points (only 19 days), it makes sense to look only at the direction and strength of the association, and to avoid any conclusions about statistical significance.
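For readers without Excel or SAS, the rank-then-correlate procedure can be sketched in plain Python. This is only an illustration of how Spearman's rho is computed, not the exact pipeline I ran. The function names are hypothetical and the sample values are made up. One assumption to flag: tied values get the average of their ranks here, which is the usual convention for Spearman's rho and may differ from how Excel's RANK function treats ties.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for four nights, for illustration only
device_efficiency = [82, 90, 77, 85]
subjective_times_10 = [75, 80, 60, 70]
print(round(spearman_rho(device_efficiency, subjective_times_10), 2))  # 0.8
```

With only a handful of points, as in my 19-night dataset, the coefficient is best read as a rough indication of direction and strength.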
I started by computing correlations between the Bodymedia score and the other two metrics. To my surprise, both correlation coefficients suggested that there is little or no positive relationship within these pairs! The correlation between the Bodymedia score and the Sleep Time score was a meager .06, and the correlation between the Bodymedia score and my subjective sleep assessment score was actually negative (-.33)! In other words, at least according to my data, Bodymedia is measuring a completely different construct than Sleep Time and my own subjective assessment!
Could it be that Bodymedia and the Sleep Time app measure time slept and time lying in bed differently? A quick look at the relationship between the “time slept” estimates indeed showed that there was not much concordance between them: the correlation coefficient was only .10! This is consistent with the idea that Bodymedia and the Sleep Time app somehow capture sleep characteristics differently.
Things get even more interesting when you compare the Sleep Time score with my own subjective assessment of sleep quality. The correlation between these two metrics is positive and relatively strong: rho = .72! This suggests that the Sleep Time app and my own scale are on the same page, and tap into the same sleep characteristics.
So which readings should I accept as the more accurate estimate of sleep quality, those of Bodymedia or those of the Sleep Time app? Why are their sleep efficiency scores so discordant? Bodymedia reports only total time slept and total time spent lying in bed. The Sleep Time app gives a more granular report, including REM+Deep Sleep, Light Sleep, and Awake time. Could it be that efficiency in the Sleep Time app is calculated differently (e.g., by taking into account only REM+Deep Sleep)? Also, the slightly higher min and max values of the Sleep Time scores suggest that the iPhone’s accelerometer may be more sensitive to my movements than Bodymedia’s accelerometer. But why is Sleep Time’s score more likely than Bodymedia’s to reflect my own subjective assessment of sleep quality?
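To make the "calculated differently" question concrete, here is a small Python sketch of two candidate efficiency formulas. Both are guesses on my part: I do not know how either product actually computes its score, the function names are hypothetical, the minute values are made up, and I am assuming that time in bed equals the sum of the three reported stages.

```python
def efficiency_all_sleep(deep_rem, light, awake):
    """Candidate A: all sleep (REM+Deep plus Light) over time in bed, in percent."""
    in_bed = deep_rem + light + awake  # assumption: the stages add up to time in bed
    return 100.0 * (deep_rem + light) / in_bed


def efficiency_deep_only(deep_rem, light, awake):
    """Candidate B: only REM+Deep sleep over time in bed, in percent."""
    in_bed = deep_rem + light + awake  # same assumption as above
    return 100.0 * deep_rem / in_bed


# Made-up night, in minutes, for illustration only
night = {"deep_rem": 200, "light": 200, "awake": 100}
print(efficiency_all_sleep(**night))  # 80.0
print(efficiency_deep_only(**night))  # 40.0
```

Comparing each candidate against the score the app reports for the same night would show which formula, if either, matches.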
To be honest, I have no idea how to answer these questions, but I promise to look into this mystery more closely. This month, I added another tool to my sleep measurement inventory: the Zeo Mobile Sleep Manager. By the end of October I should have enough data points to compare all four metrics. I may also contact both the Bodymedia and Sleep Time research teams for help. Stay tuned!