How much of the story does your social data really provide? We’ve discussed previously the challenges rich data tools pose for understanding the social space. There are a multitude of data points available and just as many data providers, each of whom is ready and willing to fill your inbox with reports and data – and charge you a retainer for monitoring. As you probably realize by now, asking the right question is just as crucial as getting the right data. Equally important, however, is having some way to validate the data you get – validation is what moves you from statistic to insight, from just knowing something to truly understanding it.
To think this through, let’s talk about lunch. For the last three months I’ve been tracking people checking in via Foursquare to Bite, my favorite lunchtime establishment, in hopes of discovering when the crowd is thinnest. As a (very) frequent customer, I’m interested in swooping in and waiting about a minute for my lunch order to be ready.
To do this I hacked together a script using Foursquare’s API that runs every minute and grabs a bunch of data. It records the total number of checkins, a “here now” count, tips, a timestamp, and numerous other data points about the venue.
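The original script isn’t shown here, but a minimal sketch of that kind of poller might look like the following. The field names mirror the shape of Foursquare’s v2 venue response (`stats.checkinsCount`, `hereNow.count`, `tips.count`); the venue-fetching function, output filename, and polling cadence are all placeholders, not the author’s actual setup:

```python
import csv
import time
from datetime import datetime

def extract_stats(venue):
    """Pull the counts we care about out of a venue payload (v2-style shape)."""
    return {
        "timestamp": datetime.now().isoformat(),
        "total_checkins": venue["stats"]["checkinsCount"],
        "here_now": venue["hereNow"]["count"],
        "tips": venue["tips"]["count"],
    }

def poll(fetch_venue, outfile="bite_checkins.csv", interval=60, iterations=None):
    """Call fetch_venue() every `interval` seconds and append one CSV row per call."""
    fields = ["timestamp", "total_checkins", "here_now", "tips"]
    with open(outfile, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if f.tell() == 0:
            writer.writeheader()
        n = 0
        while iterations is None or n < iterations:
            writer.writerow(extract_stats(fetch_venue()))
            f.flush()
            n += 1
            if iterations is None or n < iterations:
                time.sleep(interval)
```

In practice `fetch_venue` would wrap an authenticated HTTP request to the venue endpoint; keeping it injectable makes the logger trivial to test without hitting the API.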
I’m happy to report that, after a little trial and error, I have cracked the first part of this mystery. In aggregate, the most checkins occur between 2:00 pm and 3:15 pm, with the busiest stretch being 2:45-3:00 pm. Arrive any time before noon or after 4:30 pm and you’re in the clear – lunch can be had in no more time than it takes them to make it.
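Finding that peak is essentially a matter of bucketing the logged checkin timestamps into 15-minute windows and counting. A sketch of that aggregation (the sample timestamps below are invented, not the real Bite data):

```python
from collections import Counter
from datetime import datetime

def busiest_window(timestamps, minutes=15):
    """Bucket ISO-format timestamps into fixed windows; return the fullest one."""
    buckets = Counter()
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        window_start = t.minute - t.minute % minutes  # snap to window boundary
        buckets[(t.hour, window_start)] += 1
    (hour, minute), count = buckets.most_common(1)[0]
    return f"{hour:02d}:{minute:02d}", count

# Invented example: three checkins land in the 14:45 window, one at 12:10.
sample = ["2012-05-01T14:47:00", "2012-05-01T14:51:00",
          "2012-05-01T14:58:30", "2012-05-01T12:10:00"]
# busiest_window(sample) → ("14:45", 3)
```

With three months of minute-by-minute rows, the same function run over the full log is what surfaces the 2:45-3:00 pm peak.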
Let me be the first to say that this is hardly scientific. Whenever you perform an experiment like this you’re bound to create more questions than you set out to answer in the first place. In some regards, however, that’s the whole point. If nothing else, an experiment like this reveals several starting points for uncovering a more complete story.
Raising More Questions Than Answers
Is 2:45-3:00 pm really the busiest time at Bite, or is it simply when the tech crowd shows up? After all, Foursquare’s office is only a couple blocks away, and Bite sits on the edge of what used to be considered “Silicon Alley.” The checkin time at any venue is going to vary, and that variance could easily have knocked the accuracy of my data down a couple points. A long list of variables was left out for simplicity’s sake. The weather, for example, plays a role in how many people are willing to stand outside and wait for their lunch, but it is not something I accounted for in my initial data model.
This is so often the case when you get metrics back about your social performance. Knowing the frequency of likes, comments, or checkins is a great start to the story, but insight – knowledge that can drive action – requires us to understand the meaning behind the activities as much as the activity itself.
John Winterkorn, a fellow Associate Strategist at Undercurrent, thought it would be a good idea to try and correlate my Foursquare data with actual customer counts – and it is. We went out a couple of times, in half-hour increments, to record the activity, matching head-counts to the figures Foursquare provided. That is already providing richer insight, but we need a more substantial sample before we can say anything meaningful.
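Once that sample exists, checking how closely the “here now” figures track the observed head-counts comes down to a simple correlation. A sketch using a plain Pearson coefficient – the paired counts below are invented for illustration, not our field notes:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented pairs: (Foursquare "here now" count, observed head-count)
here_now    = [2, 5, 9, 4, 7]
head_counts = [6, 11, 20, 9, 15]
```

A coefficient near 1 would suggest the Foursquare figure is a reliable proxy for the real crowd; a weak one would mean the checkin data needs a correction factor before it can drive any decisions.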
Set Yourself Up For Experimentation
What experiments like this do better than anything else is remind us to keep asking questions about the data we’re being presented. Data from tools like Foursquare Merchant Metrics, Facebook Insights, and the like are a great place to start, but they’re by no means the entirety of what you should measure your success by. For precisely this reason, we’ve built numerous data mining scripts that explore these data sources in ways that are unique to our clients’ objectives. Often, we use these tailored scripts to provide the larger context for the results we get from off-the-shelf solutions – that is, to validate the data we’re seeing and surface knowledge we can turn into insight.
If you’re interested in running your own experiment, give this post some love by tweeting about it, and in return I’ll share the source code that collected the data.