Who can beat 3:47 for our data challenge?

As with parkrun events, this challenge is not about winning but about bettering oneself.  We’d love someone to beat our best time of 3:47 for our challenge and tell us how they did it.  If you can’t, then check out our solution (coming soon to this blog) and learn how anyone can.


To get ready for the challenge

  • Go to this link to get the source data files
  • Download the files onto your PC
  • Extract the zipped folder into a location of your choice
  • Open one of the spreadsheets to view the data
  • Plan and test your intended method for solving the challenge
  • Close all spreadsheets without saving
  • Now you’re ready for the challenge!
Park Run Data Google Drive


The data challenge

Your challenge is to see how quickly you can calculate the number of known runners in each run in the data set.  It may or may not help your approach to know that the number of known runners is the total number of runners minus the number of “Unknown” runners.
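To make the calculation concrete, here is a minimal Python sketch of the known-runners arithmetic. It only illustrates the subtraction and is not the fast method this post promises. The sample rows, their (run number, runner name) shape and the function name `known_runners_per_run` are assumptions made for illustration; the real files' column layout is not described here.

```python
from collections import Counter

# Hypothetical sample rows: (run number, runner name).
# The layout of the real data files is an assumption, not taken from the post.
rows = [
    (1, "Alice"),
    (1, "Unknown"),
    (1, "Bob"),
    (2, "Unknown"),
    (2, "Carol"),
    (2, "Unknown"),
    (2, "Dave"),
]


def known_runners_per_run(rows):
    """Return {run number: (known, total)} for the given result rows."""
    total = Counter(run for run, _ in rows)
    unknown = Counter(run for run, name in rows if name == "Unknown")
    # Known runners = total runners minus "Unknown" runners.
    return {run: (total[run] - unknown[run], total[run]) for run in sorted(total)}


summary = known_runners_per_run(rows)
for run, (known, total) in summary.items():
    print(f"Run {run}: {known} Known of {total} Total")
```

With the sample rows above, run 1 has 2 known of 3 total and run 2 has 2 known of 4 total, the same "Known of Total" shape as the check value given in the rules.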


Rules

This challenge is self-timed, so to standardise how we all do this, here are the rules:

  • Viewing the data and trialling techniques before starting is permitted
  • No formulas, code or references may be written before you start
  • The time starts the moment you open the first workbook or write the first formula or piece of code
  • The timer stops when you have a table showing the run number and total known runners
  • You must be able to repeat the solution to within 5% of your claimed time

Post and mark your results in this form: Park Run Challenge

Just to make sure you’re on the right track:

  • Run 224 answer: 1809 Known of 1933 Total
  • It’s normal for a regular Excel user to take 30-35 mins
  • If you hit 45 mins, please stop 🙂

Have fun with us and post your thoughts, ideas and questions in the comments section.

 


Background to the challenge

The Story

All the members of our team (and our families) have been participating in parkrun events for a few years now and love the way they bring together communities and create an opportunity to get out and exercise.  One of our local parkrun events is consistently the most attended in the country (and in fact the world).  A unique issue we have is that we can’t process finishers quickly enough, which causes a backup of runners.  This led us to ask: what is the limiting processing rate of finishers?

(FYI: The answer to this question is about 70 runners per minute, or roughly 0.85 seconds per runner)
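The two figures are the same rate expressed two ways, reading the second as seconds per runner. A quick sketch of the conversion, using the rounded numbers quoted above (the variable names are ours):

```python
# Convert a finisher processing rate into time per runner, and back.
runners_per_minute = 70  # approximate figure quoted above
seconds_per_runner = 60 / runners_per_minute
print(f"{seconds_per_runner:.2f} seconds per runner")

# Going the other way from the quoted 0.85 gives a little over 70 per minute.
implied_rate = 60 / 0.85
print(f"{implied_rate:.1f} runners per minute")
```

Exactly 70 runners per minute works out to about 0.86 seconds each, and 0.85 seconds each works out to about 70.6 runners per minute, so both quoted values agree once rounded.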

This question has led to a series of interesting projects as we’ve worked with the data.  We’ve also noticed that many people are completely unaware of the data processing tools that exist in software they either already have or can get for free, so we decided to put out a challenge to see how people would tackle this problem.

What is parkrun?

Wikipedia says it better than I could:

Parkrun (styled as parkrun) is the name given to a collection of five-kilometre running events that take place every Saturday morning in fifteen countries across five continents. Each Parkrun territory has its own sponsors. Events are run by volunteers, and participation is free of charge.

At each event, runners’ barcodes are scanned at the finish (those who don’t have one are recorded as “Unknown”).  The event’s results are uploaded to the local run’s website, where they are available to the public.

To compare processing rates from race to race, we had to combine a number of these result datasets and would like to share with you a really quick way of doing this that will help you with similar tasks in your work.  But before we do that, take the challenge to see how quickly you can do it now…