Data Visualisation and the Rugby World Cup

We’ve now had the first week of the Rugby World Cup 2015, held in England, and by tonight, when New Zealand face Namibia, every team will have played at least once.  Rugby is a complex and data-rich sport, so I was interested to see what data visualisations I could find about the tournament.  Perhaps because rugby is so complex, collecting and coding data is quite an effort. Top-quality data is offered by the likes of Opta (and their Opta Jonny Twitter account), who then sell those data feeds, along with their visualisations and editorial, commercially, typically to the major broadcasters and media publications.  There are also detailed match statistics on the Rugby World Cup official site in both their Match Centre (see the stats for ENG vs FJI) and their Stats Hub. They’re detailed, professional and authoritative.

There are some other places to get rugby data that may be more open and in formats better suited to creating your own data visualisations:

I haven’t yet seen many interesting visualisations using Rugby World Cup data but more might emerge as the tournament progresses.  Tableau have provided two of the more interesting ones.

In The History of Rugby World Cups since 1987 they compile a dashboard of summary and comparative data.  It’s pretty good and features summaries, progression heat maps, head to head records and detailed team overviews.  Despite my efforts at banter with my antipodean friends, the evidence here strongly supports the strength of Southern Hemisphere teams in the competition.  France really do emerge as the best team never to have won a World Cup and Ireland, you would think, should finally improve their rather underwhelming World Cup record.

Their second visualisation is described as a Visual Prediction.  It’s hard to see why, as there’s no obvious model behind it, but it does offer a good head to head history of two teams, showing that England’s epic clash against Wales on Saturday evening will likely be closely fought.  Rather confusingly, there are also player statistics in here somewhere, but there’s no obvious navigation or filter and the Tableau workbook for this visualisation isn’t available to download.

Taking a Local Peek at Strava Labs Clusters

Clusters is a project by Strava Labs to visualize the top routes from within their data.  Their labs projects often explore some of the more interesting things you can do with the data when millions of people around the world log their cycling and running activity.

Races dominate the Reading clusters for running, with the Reading Half in the centre and both local parkruns popular.

Taking a look at running in Reading, formal races and training sessions predominate.  There is the Reading Half Marathon route at the top but also plenty of local 10k races such as the O2O, Dinton Pastures, Shinfield and Royal Berkshire 10k events.  The Reading and Woodley parkruns take 2nd and 3rd spot, with the Reading Roadrunners track sessions at Palmer Park coming in 4th.

All local activities show an interest in triathlon and cycling events but not much evidence of regular commuting.

When you look at all activity types in Reading you get some more triathlon and duathlon events and the Wokingham bikeathon.

The picture for London shows a lot of cycle commuting and regular (all hours of the day, days of the week and months of the year) cycling in Richmond Park.

Cycle commuting into London is more popular, as are cycling around Richmond Park and Bushy Parkrun.

Looking just at running in London, the marathon ranks 2nd but it is Bushy Parkrun that is the most popular route, with London’s parks and commons being well utilised by runners. The most intriguing is a 6km route simply labelled Recovery Run, a year-round, working-day lunchtime escape.

Amidst more formal race or Parkrun routes is this daily lunch break between London’s twin cities.

Running along the Thames and back between the City of London and the City of Westminster seems to be a regular way for city workers to relax and escape work at lunchtime.

It’s unclear if the recovery is from harder running or from work but it’s nice to imagine city workers jogging along the Thames from the City to Westminster and back at lunchtime, or enjoying a 10k loop of north and south banks as part of a lunchtime escape or run commute.

Also enjoyed as part of a lunch break or run commute is this 10k route taking in the north and south bank of London’s Thames.

The visualisations are obviously based only on data uploaded to Strava, not all logged activity, and there isn’t much information about how the data is processed but it’s another interesting example of the kinds of insight you can get from the quantification of regular exercise and might give local councils and planners pause for thought.

My Quantified Self Experiment

Despite being a (slightly) Quantified Athlete who maps and logs my runs, I’ve never paid much attention to the Quantified Self movement or measured my life by logging any other aspect of it.

I’ve been, in fact, quite sceptical of plugging myself into a nexus of devices and services that would constantly track my behaviour and illuminate the biometric data generated by my everyday activity:

  • Viewed positively, the insight from visualising our data promises to help me form good habits, meet life goals and make me happier and/or healthier.
  • More darkly, I fear the connotations of surveillance, normativity and conformance; the industrialisation of everyday life, making every action subservient to the demands of productivity, that this kind of self-tracking work might imply; and a reluctance to fully include women.
  • Part of me is also curious about what the implications are for humanity when we align our bodies so closely with devices that augment them, and potentially allow us to hack them, and when we combine our analogue feelings with digital measurement.

I wonder: is self-tracking an empowering or an oppressive phenomenon?

One way to think about these issues is not just to read about them but to experience them and try an activity tracker myself. Researching sport informatics and analytics while training for a half marathon (and being in full control of my schedule and eating habits) seemed like a good time to try it.

Choosing a Tracker

The first problem was deciding what tracker to use. There are an increasing number on the market, all doing pretty much the same thing but with slightly different specifications and form factors and backed by a variety of applications and online dashboards.  In fact the choice was quite overwhelming. Whilst there are plenty of reviews around, the selection will also be deeply personal and involve deciding what will work best for you. It’s no easy task to figure this out from other people’s impressions and opinions.

Eventually, after visiting several websites to read product information, reading several reviews, checking prices and ratings in a number of online shops, and signing up to take a look at several of the apps and dashboards (many of which allow you to log basic metrics manually without needing an associated device), I decided to order a FitBit Charge HR.

One of the things that I love about running is being able to switch off and so I am a bit apprehensive that wearing a tracking device will be a distraction during my workouts. I don’t run with a watch, just with a phone application in my back pocket giving quiet updates and cues every so often. The FitBit Charge HR collects a lot of data but wears its power lightly, offering a subtle, configurable OLED display. I hope I’ll be able to configure the device so it stays discreet and silent even during vigorous activity.

Heart rate (HR) measurement at the wrist is a growing development that will allow me to track a piece of data I haven’t tracked previously because I don’t like wearing a chest strap, even though chest straps are still slightly more accurate.  The Charge HR also looked like it has a small design, which would suit my female sized wrists better than some of the bulky smart watch designs, and a secure strap like a watch, rather than a clip.  From this I realised the form factor was more important to me than the functionality, accuracy or analytical features.

Your Steps; Your Questions

Decision made, I listened to Inquiring Minds podcast #91, which interviewed Rachel Kalmar, a neuroscientist and data scientist, on the power of wearable technology.

 

There were some interesting things in this interview that informed my perspective on what trackers do and can do, and how to set my expectations accordingly.

Firstly, accuracy is relative. Whilst they are continually improving, trackers can’t measure us with absolute accuracy and certainly not in a way that is directly comparable with other people or other trackers. As Rachel explained, what is important is that they are “internally consistent”. Once they have established benchmark metrics for an individual they should continue to measure those metrics in the same way, so that individual can tell how their behaviour differs from before and whether it is trending in the right direction.  We shouldn’t expect them to be clinical grade or be able to replicate a sport science lab in terms of absolute precision … yet.
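The “internally consistent” idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any tracker actually works: the function name, the seven-day window and the sample step counts are all made up. It compares a recent window against the wearer’s own earlier baseline, so the absolute accuracy of the numbers matters less than the device measuring them the same way each day.

```python
from statistics import mean

def change_from_baseline(readings, window=7):
    """Relative change between the first and last `window` readings.

    readings: daily values from ONE device for ONE person, oldest first.
    The window length of 7 days is an arbitrary illustrative choice.
    """
    if len(readings) < 2 * window:
        raise ValueError("need at least two full windows of readings")
    baseline = mean(readings[:window])   # the person's own benchmark
    recent = mean(readings[-window:])    # the most recent window
    return (recent - baseline) / baseline

# Made-up daily step counts: a benchmark week followed by a more active week
daily_steps = [8000] * 7 + [8800] * 7
trend = change_from_baseline(daily_steps)
print(f"Change from my own baseline: {trend:+.0%}")
```

Even if the device miscounts every day by a similar proportion, the relative change is still meaningful to the person wearing it.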

Secondly, a ‘step’ is used quite generically as a unit of activity. A modern activity tracker may have taken its terminology from pedometers and may mostly track contact with the ground, but it is now also about broader measurement of everyday movement. In the interview Rachel poses the question: what does a step mean when you are doing yoga? The continued use of ‘step’ is perhaps slightly misleading now that trackers aim to combine the continuous stream of data from all the sensors packed into the device with software algorithms to calculate, more broadly, how many motion units an individual accumulates in a day.  Crazy upper body kitchen dancing counts, albeit in a different way than walking. The combination of sensor position, software and person creates huge amounts of variability in ‘step’ count, so the device will need to be calibrated and your goals configured accordingly once some initial data from your typical usage is available.

Finally, a reminder of the difference between informatics and analytics.  Trackers make sport and wellbeing informatics much easier. It is simple to capture, collect, organise and store a continuous stream of very precise data. Applications and dashboards enable this data to be visualised as semantic content, but the benefits of these analytical offerings are less clear.  What they won’t really provide is much insight about your own life unless you know how you want to interpret and analyse the data and you understand what you are looking for, or at least know what you would like to achieve.  Rachel suggested you need to bring your own questions (BYOQ) in order to maximise use of a tracker (and perhaps control it rather than have it control you!).

Definitely an interesting listen with some useful input on expectation setting when approaching activity tracking.   This is where my experiment might fall down.  I don’t really have any interesting questions, or even any specific goals (beyond, say, the general and ongoing fight against the entropic forces of middle age) and I am therefore a bit uncertain what activity tracking data will tell me or really help me achieve.  I feel like I already know what is ‘good’ and ‘bad’ about my behaviour and yet I keep eating croissants.  Perhaps the data will confirm or deny my suspicions.   So I’m not expecting more data to be more informative, and I doubt it will influence my behaviour much.    I wouldn’t say I’m excited, but I am intrigued about the experience.


A reminder that my survey on athlete information and data use is now open.  If you are an amateur runner, cyclist, triathlete, swimmer or rower please take the survey and let me know how using information resources or getting insight from your training data helps you in your sporting goals.

Inform to Perform: Take the Survey

I am looking for amateur athletes to participate in my research project, Inform to Perform.

TakeTheSurvey

To participate you take an online questionnaire that takes 10-15 minutes to complete if you answer all the questions.  The questionnaire is anonymous and you do not need to provide any identifying personal information to take part.  Further information on the survey methodology is available on the project information page.

If you:

  • are aged over 20
  • are an amateur (not a professional) athlete
  • participate regularly in one of these sports:
    • running
    • cycling
    • swimming
    • triathlon
    • rowing

then please take the survey on information resources used by amateur athletes and contribute to my research on the information needs of athletes.

Inform to Perform: Sport Informatics and Amateur Athletes

I wrote previously about the theory of sport informatics and analytics. I used the following definitions:

Sport informatics involves collecting data, things known or assumed to be fact, then arranging it in ways that convey or represent something about sporting performance or participation.

Interpreting this information, using sport analytics, involves identifying patterns that tell us something about the sport and extend our knowledge of it.

This post introduces Inform to Perform, my masters research project that asks:

What does sport information and analysis look like to the average athlete?

Defining the Amateur Athlete

According to Sport England, 15.5 million of us participate in sport at least once a week. These people will participate in sport at various levels and for various reasons and will therefore need different information in order to support their personal sporting aims.

Some of those will be casual participants or those who are just getting started. Others will be talented and dedicated performers who are capable of representing their country, like BBC presenter Louise Minchin, who has qualified in her age group for the World Triathlon Championships after only 2 years in the sport.

The group I am interested in surveying lies somewhere in the middle: athletes who have developed a training habit and compete in events, or are thinking about it, but don’t necessarily expect to win. For them sport is challenging and fulfilling, a leisure pursuit they enjoy and devote time to.

Here’s a group of typical amateur athletes celebrating the end of a long day competing/supporting at a triathlon event.

These athletes probably aren’t sports scientists and probably don’t have access to an extensive network of sport science, medicine and conditioning experts. Increasingly though many of us have access to some of this knowledge via the internet and carry miniature sport science laboratories around with us in our phones or attached to our wrist.

My research will hopefully find out something about how this cohort of athletes uses information to support their training and competitive tasks. This post provides some context by describing some of the information habits of myself and my friends that motivated this study.

The Activity Information Cycle

For most athletes their sporting participation exists in phases. An annual training calendar will be built around target goals or races and involve build-up, peak and tapering phases interspersed with periods of rest, recovery and maintenance. Within this macro cycle is the mini cycle of what activities you do each week. An athlete’s information needs may change during different phases of this training cycle or in response to certain events, such as injury or plateau.

In my study of information I’m placing the activity itself at the centre creating a nano cycle of pre-activity, during activity and post-activity. I call this the activity information cycle.

The Athlete Information Cycle

Pre-Activity

Pre-activity covers both general information gathering relating to your sport and specific task oriented activity. It is the bubble of information and knowledge you as an athlete exist in and helps prepare you for training or competitive activity.

UKRunChat (and UKCycleChat and UKTriChat) use Twitter and the Web to create an online community for information sharing and socialising.

The nature of this kind of information is vast and diverse and will depend very much on the athlete’s needs and motivation. I’m hoping you are going to tell me a lot more about what this kind of information looks like and how easy you find it to interpret and incorporate into your activities.

In our group, examples include other people, such as friends or club mates, social media, such as Facebook and Twitter, magazines and an array of books.

A selection of the sport related books on my shelf.

Training plans are a way of arranging data to convey something about sporting performance: in this case, the activities we need to undertake to meet a certain goal, usually a race, at a certain time. The athletes in my pilot group all get the information needed for our training plans in very different ways and use very different tools to put together training plans and refer to them.

I’ve used many methods to put together a training plan over the years, including following plans from books and magazines or plans created by algorithms based on my previous times and target times.  I’ve never had a running coach, but as I’m returning from injury I wanted a bit more structured guidance for my current plan, so I am using a half marathon plan created by the coaches at Full Potential that is provided free as part of the Garmin Connect online platform.  I can view it through Garmin Connect, in my calendar via an iCal subscription and as a printed copy.

During Activity

During activity covers the period you are physically active, whether training or competing. Most athletes will receive some kind of information feedback during their activity from some kind of device, even if it is simply the activity duration from the humble stopwatch. Even rate of perceived exertion (RPE) is a form of information feedback that we estimate from the data our brains are receiving from our body’s sensors.

iSmoothRun, the phone application I use for activity tracking.

Again, this will vary depending on the athlete’s needs and motivations and may differ when training or when competing (or doing a time trial or other benchmark training session). I only get 10-15 time cues from a phone application stuck in my back pocket when I am exercising and I don’t even wear a watch because I find it distracting; my partner won’t even go for a leisurely jog without being strapped into an array of devices capturing an array of metrics and buzzing angrily whenever out of target heart rate zone.

Garmin Forerunner 610, an example of a GPS watch.

Hopefully my survey will provide further examples of how athletes like to use in-activity feedback and what devices are used.

Post-Activity

Post-activity is the time for storing, arranging and interpreting all the data collected from the activity. This can be a lot of data!

Storing Training Data

This is a trail half marathon race I did last year as captured on my phone (using iSmoothRun, the application I use to track my activity):

This one race captured 1644 tracking points of data along with several pieces of metadata such as the name of the race, the shoes I was wearing, how I was feeling and the date.  The application also performs some initial calculations on the raw data to provide me with some derived information such as the distance I covered, my total steps, my average stride length, my average cadence and my average pace and speed.
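To give a feel for how those derived figures can come out of raw tracking points, here is a minimal Python sketch. It is not how iSmoothRun does it, which I don’t know: the `summarise` function, the tuple layout, the made-up sample points and the use of the haversine formula are all my own assumptions. It also treats stride length as simply distance divided by steps.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def summarise(points, total_steps):
    """points: (seconds_elapsed, latitude, longitude) tuples in time order."""
    # Sum the distance between each consecutive pair of tracking points
    distance_m = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(points, points[1:])
    )
    duration_s = points[-1][0] - points[0][0]
    km = distance_m / 1000
    return {
        "distance_km": km,
        "pace_min_per_km": (duration_s / 60) / km,
        "speed_kmh": km / (duration_s / 3600),
        "cadence_steps_per_min": total_steps / (duration_s / 60),
        "stride_m": distance_m / total_steps,  # distance covered per step
    }

# Made-up sample: three points one minute apart, heading due north
sample = [(0, 51.4500, -0.9700), (60, 51.4518, -0.9700), (120, 51.4536, -0.9700)]
stats = summarise(sample, total_steps=340)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

A real application would also need to smooth GPS noise and handle pauses, which this sketch ignores.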

The data itself is really just a series of tracked data points measuring things like the time and my longitude and latitude.

How my race looks as raw data.

This is really hard for the average athlete to make use of but fortunately we don’t have to.  As well as providing the phone application interface, iSmoothRun allows me to export this data as a GPX file so I can store it and analyse it using other tools.
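For the curious, a GPX file is just XML, so the tracking points can be read with a few lines of Python. This is a hedged sketch using only the standard library. The sample document below is made up and assumes the GPX 1.1 namespace; a real export may use a different version or carry extra extension elements (heart rate, cadence) that this ignores.

```python
import xml.etree.ElementTree as ET

# Assumes GPX 1.1; a GPX 1.0 file uses a different namespace URI
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

def read_track_points(gpx_text):
    """Return a list of dicts with lat, lon and time for every track point."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time_el = trkpt.find("gpx:time", GPX_NS)
        points.append({
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "time": time_el.text if time_el is not None else None,
        })
    return points

# Made-up two-point track for illustration
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="51.4500" lon="-0.9700"><time>2015-06-26T09:00:00Z</time></trkpt>
    <trkpt lat="51.4518" lon="-0.9700"><time>2015-06-26T09:01:00Z</time></trkpt>
  </trkseg></trk>
</gpx>"""

track = read_track_points(SAMPLE_GPX)
print(len(track), "points; first:", track[0])
```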

Interpreting Training Data

How we make sense of this data will again vary depending on the data that is collected and the tools each athlete uses to store and analyse their data. These can range from paper training logs to the growing range of online platforms for organising and sharing training. These services enable us to visualise our data in different ways and provide additional calculated data, graphs and reports to help us understand our data and use it to monitor our progress.

I use several tools to visualise my training data and help me understand it.  I save a copy to a desktop application called Rubitrack and export it to Strava, Garmin Connect and Runkeeper.  You can see that the same race looks very different across each of these services.

Different tools provide different views of that raw data and different guidance on interpreting it, and it’s a personal choice which we use.  They all do a similar thing and often it’s the user interface, the social community they offer or the ease of synchronising data that sways use rather than different features.  I can’t decide which is my favourite, which is why I export my data to several services and use them for slightly different things.

How we analyse our data and what we do with it depends on our goals, motivations and what questions we want to ask of our training data.  I will confess I think my training data looks great visualised in these ways and I use the services to keep a training diary but I don’t really do a great deal of analysis of my data.  I check how my average pace compares with how I felt and that’s about it.  I also quite enjoy how Strava compares the runs I do on the same route and tells me if I’m improving or, as seems to be the case at the moment, slowing down!

Strava automatically detects activities over the same route and compares them.

I am a lazy analyst though.  I take what the applications easily give me and I use it to calibrate my RPE and gut feel.  I check my weekly totals to make sure I’m not risking injury.  I don’t spend a great deal of time digging into my data.  My partner spends much more time (hours, days, weeks, eternities) poring over training data using even more powerful tools.  She also has a coach, so she uses Training Peaks to get her training plan and report data back to her coach so he can analyse it.
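That weekly-totals check is simple enough to sketch in Python. This is an illustration only: the function names and the sample runs are made up, and the 10% threshold is a commonly quoted rule of thumb I have assumed here, not something the applications above necessarily apply.

```python
from collections import defaultdict
from datetime import date

def weekly_totals(runs):
    """runs: iterable of (date, distance_km). Returns {(iso_year, iso_week): km}."""
    totals = defaultdict(float)
    for day, km in runs:
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += km
    return dict(totals)

def flag_big_jumps(totals, max_increase=0.10):
    """Weeks whose total exceeds the previous logged week by more than max_increase.

    The 10% default is an assumed rule of thumb, not a medical guideline.
    """
    weeks = sorted(totals)
    flagged = []
    for prev, cur in zip(weeks, weeks[1:]):
        if totals[prev] > 0 and totals[cur] > totals[prev] * (1 + max_increase):
            flagged.append(cur)
    return flagged

# Made-up training log covering three weeks in June 2015
runs = [
    (date(2015, 6, 1), 5.0), (date(2015, 6, 4), 8.0),     # ISO week 23
    (date(2015, 6, 8), 6.0), (date(2015, 6, 11), 8.0),    # ISO week 24
    (date(2015, 6, 15), 10.0), (date(2015, 6, 18), 10.0), # ISO week 25
]
totals = weekly_totals(runs)
risky_weeks = flag_big_jumps(totals)
print(totals)
print("Weeks with a big jump:", risky_weeks)
```

Note that the comparison is against the previous logged week, so a week with no runs at all is skipped rather than counted as zero.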

Over to You!

Hopefully that covers some of the things that amateur athletes will understand as sport informatics and analytics. The examples from my experience already suggest there is a great deal of variety in how athletes use information and why we find, collect, store, share and analyse information in relation to our sports.

Now I need your help to provide data collected in a more rigorous way to explore this topic further so I can start to map out the ways athletes are using these types of information.

How do you find the information you need for your sport? What are your trusted sources?

Are you an information omnivore or do you just find what you need to get by?

Are you a technology fan who has all the latest gadgets or are you unsure about how these devices could help you?

What do you do with all the information you collect and how does it help you achieve your sporting goals?

Also what information gaps do you have? Do you find it hard to find or access some information or are you not quite sure how to make best use of some of the information or data you have?

If you are aged over 20 and are a runner, cyclist, swimmer, triathlete or rower please take the survey to answer a few questions on your information use. It will just take 10-15 minutes and will really help my research.

Homo cyberneticus: our growing relationship with fitness trackers

How we discovered the dark side of wearable fitness trackers
Rikke Duus, UCL and Mike Cooray, Ashridge Business School

Really interesting article published in The Conversation recently about a study of 200 women and their usage of Fitbit tracking devices.  The women surveyed suggested that tracking devices, particularly those designed to be worn constantly, become incorporated into their sense of self.

“most users embraced the devices as part of themselves and stopped treating it as an external technology”.

 

The authors connect this finding to the research of Sherry Turkle on the Tethered Self and how this changes social existence and the nature of public space: from communal to collective.   Applied to the sporting domain, if more meaningful connections increasingly take place in a virtual realm, how will this change the experience of racing for example?

On the positive side, the researchers found devices that measure activity can be encouraging companions and coaches. The potentially darker side effects include constraining our behaviour or devaluing activity if it isn’t logged.  These so-called ‘Always-On, Always on Me’ devices may amplify some of these effects, which are described as alarming, but I know a fair few people who might say similar things about their coach or training group!

This may say more about the motivational effects of observation and accountability with regard to exercise goals than about the technology itself, though this research does suggest Fitbit and social exercise do extend those same forces into the realm of casual leisure participants.  It would also be interesting to tease out the real differences between Web 2.0 type social monitoring and Web 3.0 type data monitoring of everyday wellbeing goals.

Whilst the findings from the research that are discussed are interesting, the most striking aspect of the article is the final hypothesis: that technology is changing the nature of humanity itself, potentially transforming homo sapiens into a new species that doesn’t solely rely on biological information processors to sustain, monitor and calibrate its life systems.