In this post, I will discuss three invisible temporal patterns that are likely to be present in your self-tracking data and which, if ignored during analysis, may lead to erroneous conclusions and interpretations. I am talking about trends, social rhythms and intra-day variability.
The most common mistake that people make when analyzing their self-tracking data is treating it as stationary. In other words, they assume that the variables that they track are independent of time, and are affected only by changes in the treatment or routines. The variables, however, may be slowly changing over time due to some other major, unrelated long-term forces (e.g., learning effect, lifestyle changes, etc.). As a result, they share a common factor – time, which becomes a confounding factor and thus often results in correlation between otherwise unrelated variables.
A good example would be tracking weight and weather during spring and summer. Many people start dieting and working out at the first signs of spring. As a result, they start losing weight. If they were to look at the correlation between the average daily temperature and body weight, they would find a moderate negative correlation: as the temperature increases, their body weight decreases. This may lead to the erroneous conclusion that warmer temperature is associated with (or even causes) weight loss. In reality, the “relationship” between weight and daily temperature is nothing more than a statistical artefact due to the third “lurking” variable that they have in common: time. Both the weight loss and the rise in temperature are in fact functions of time, and the association between the two is merely a spurious relationship.
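To make the weight and temperature example concrete, here is a minimal Python sketch. The numbers are simulated, not real measurements: the starting values, slopes and noise levels are arbitrary assumptions chosen only so that both series drift with time while their day-to-day noise is generated independently.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)  # fixed seed so the simulation is repeatable
days = range(90)         # roughly three months of daily records

# Simulated (made-up) data: weight drifts down, temperature drifts up.
# The only thing the two series share is the day index t.
weight = [80.0 - 0.05 * t + rng.gauss(0, 0.3) for t in days]
temperature = [10.0 + 0.15 * t + rng.gauss(0, 2.0) for t in days]

r_raw = pearson(weight, temperature)
print("correlation of raw series:", round(r_raw, 2))
```

The printed correlation comes out clearly negative even though neither simulated series influences the other. All of the association is produced by the shared time trend.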
Thus, before analyzing your data, it is important to test it for trends and remove potential long-term effects in order to flesh out the underlying relationships. If a significant trend is detected, you can transform the data by detrending it. There are several ways to detrend data; the simplest technique, which can be done in Excel, is called “first-order differencing”. Basically, you take the score on day t+1, S(t+1), and subtract from it the score from the previous day, S(t). On rare occasions, first-order differencing is not enough and you may have to do it again, but most often one pass should remove the time effect completely.
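For readers who prefer code to spreadsheets, the differencing step can be sketched in a few lines of Python. The function names `linear_slope` and `difference` and the example scores are illustrative assumptions. The slope is only a rough indicator of a trend, not a significance test.

```python
def linear_slope(series):
    """Least-squares slope of the series against its day index (units per day)."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    return sxy / sxx

def difference(series, order=1):
    """Apply differencing `order` times: each pass computes S(t+1) - S(t)."""
    for _ in range(order):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series

# Made-up daily scores with a slow downward drift (illustration only).
scores = [7.0, 6.8, 6.9, 6.5, 6.6, 6.2, 6.3, 6.0]

print("slope before differencing:", linear_slope(scores))
changes = difference(scores)            # first-order differencing
print("slope after differencing:", linear_slope(changes))
second_pass = difference(scores, 2)     # only if one pass is not enough
```

Note that each pass shortens the series by one value, because the first day has no previous day to subtract. When you correlate two differenced variables, both must be differenced the same number of times so that the days still line up.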
In future posts, I will discuss how to perform trend analysis and detrending of your self-tracking data in Excel. In the meantime, here is a chart of my physical energy scores during the summer, with the original series showing the trend (blue) and the detrended series (orange). I am still trying to figure out what caused the small but significant and steady decline in my energy levels over the course of the summer.
In the 1990s, several scientists from Western Psychiatric Institute in Pennsylvania developed a questionnaire for measuring the regularity of major daily behaviors, called the Social Rhythm Metric (SRM). The idea behind the SRM is that certain everyday routines, like waking up, having breakfast, lunch and dinner, or going to bed, are performed on a relatively consistent schedule. Deviations from the normal rhythm may be due to major changes in lifestyle, cognitive or psychological health, etc. Unless you are Tim Ferriss, you most likely have different SRM values for 9-to-5 work days vs. weekends and holidays. Your work days are more structured, consistent and “streamlined” when it comes to sleep, meals, and social schedule, whereas weekends and holidays are more flexible and unpredictable. This often translates into different patterns of relationships among the same variables when you compare work day data with weekend data.
Here is just one example, using data from my “100 Summer Days” project. If you look at the correlations between the time slept and daily average alertness (I used detrended data because both variables exhibited a linear trend), you will see that the correlation on off-workdays was noteworthy and statistically significant (rho = .51), whereas on workdays there was no statistically significant association. In other words, if I sleep more on holidays and weekends, I am more alert during the day; on workdays, however, this pattern disappears. The most obvious explanation for this discrepancy is caffeine. I tend to drink a lot of coffee and Yerba Mate throughout the day while at work. As a result, the effect of caffeine masks the effect of sleep on my mental alertness.
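A split analysis like this one can be sketched in Python as follows. This is not the exact procedure behind the numbers above, only one way to do it. It assumes each record is a hypothetical `(date, hours_slept, alertness)` tuple, treats only Saturday and Sunday as off-workdays (holidays would have to be flagged by hand), and computes Spearman's rho as the Pearson correlation of the ranks.

```python
from datetime import date

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5  # fails if a series is constant

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

def split_by_day_type(records):
    """Group (date, sleep, alertness) records into workdays and off-days."""
    groups = {"workday": ([], []), "off": ([], [])}
    for day, sleep, alertness in records:
        key = "off" if day.weekday() >= 5 else "workday"  # 5 = Sat, 6 = Sun
        groups[key][0].append(sleep)
        groups[key][1].append(alertness)
    return groups
```

With a full log loaded into `records`, calling `spearman(*split_by_day_type(records)["off"])` and the same for `"workday"` gives one rho per day type. If the variables show a trend, detrend them before splitting.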
In future blog posts, I will discuss how to track and compute the Social Rhythm Metric, both passively and manually.
When it comes to tracking traits and other relatively unstable variables, I am a proponent of bi-daily and tri-daily measurements, that is, measuring two or three times a day. The reason for that is simple: some variables may have different values when measured at different times of the same day, due to circadian rhythms, lifestyle or other factors. A lot of self-trackers assume that such instability is typical only for subjective variables, like mood or fatigue, but many objectively measured variables (e.g., alertness, body temperature) may also exhibit intra-day differences.
The best way to assess intra-day variability is to track a variable bi- or tri-daily for a couple of weeks or even a month, then look at the data. Sometimes a simple chart in Excel is enough to spot the patterns; often, however, you may have to employ statistical analysis to test the differences. Here, for example, are my average alertness scores (measured using the Mind Metrics app) logged four times a day: in the early morning (around 8 am), late morning (around noon), afternoon (around 4 or 5 pm), and evening (around 9-10 pm). The peak in the late morning and the drop in the evening on workdays are actually statistically significant.
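As one possible way to test such differences outside Excel, here is a minimal Python sketch. It is an assumption on my part, not the test used for the chart above: it averages the scores per time slot and then compares two slots with an exact paired sign-flip (permutation) test. The slot names and scores are made up, and the exact test enumerates every sign pattern, so it is practical only for about 20 days or fewer.

```python
from itertools import product
from statistics import mean

def slot_means(readings):
    """readings maps a slot name to a list of scores, one score per day."""
    return {slot: mean(scores) for slot, scores in readings.items()}

def paired_sign_flip_p(a, b):
    """Two-sided exact sign-flip test on the paired differences a[i] - b[i].

    Under the null hypothesis of no difference between slots, each daily
    difference is equally likely to be positive or negative, so we count
    how many sign patterns give a total at least as extreme as observed.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float sums
            hits += 1
    return hits / total

# Made-up scores for eight days (illustration only).
readings = {
    "late_morning": [6, 7, 6, 8, 7, 6, 7, 6],
    "evening":      [5, 5, 5, 5, 5, 5, 5, 5],
}

print(slot_means(readings))
p_value = paired_sign_flip_p(readings["late_morning"], readings["evening"])
print("p-value:", p_value)
```

The test pairs readings taken on the same day, so both lists must cover the same days in the same order. If you compare several pairs of slots, keep in mind that running many tests raises the chance of a false positive.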
In summary, it is important to take these potential temporal patterns into consideration when analyzing your self-tracking data. Plus, the patterns themselves can help you learn more about yourself.