Going through week 2 of LAK11, I could not help thinking about which data is more appropriate – BIG or small. In a discussion forum exchange, George Siemens volunteered his view on the definition of BIG Data:
Most discussion about big data centres on quantity. Chris Anderson considers the implications of big data (new methods of science). Marissa Mayer's talk (video next week) is focused on what is driving data abundance (computing speed, scale, and new data sources – i.e. sensors). Social media obviously increases quantity as well, with Twitter, Facebook, Flickr and YouTube all contributing more data.
The other elements you mention – implications, new models, new decision-making approaches – all flow from this abundance of data. I intentionally selected PW Anderson's article “More is Different” as our opening article this week. Increased data quantity requires new approaches (and Hadoop and MapReduce are evidence of this).
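For readers who have not met MapReduce, here is a minimal sketch of the map/shuffle/reduce idea in plain Python – a toy word count, with function names of my own choosing rather than anything from Hadoop itself:

```python
from collections import defaultdict

def map_phase(records):
    # Emit (key, 1) pairs; here the keys are words in each record.
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values; here a simple sum.
    return {key: sum(values) for key, values in groups.items()}

records = ["big data big analytics", "small data"]
counts = reduce_phase(shuffle(map_phase(records)))
print(counts)  # {'big': 2, 'data': 2, 'analytics': 1, 'small': 1}
```

The point of the pattern is that map and reduce are independent per key, so the work can be spread across many machines – which is exactly what data abundance demands.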
So I posted in the discussion forum – Is small beautiful? Look at the following links.
When I think of Big Data, I wonder how BIG it needs to be before it can be useful. This week's reading on insurers, and the work done by Levitt and Dubner in Freakonomics, tell us clearly that data not previously thought relevant or causal can be an effective predictor.
Secondly, strategies designed on BIG data (telephone usage follows a power law, so roll out small-denomination recharge coupons) may overpower small-data strategies (enable a community of five friends to communicate at lower prices).
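To make the power-law point concrete, here is a hedged sketch: if monthly usage were Pareto-distributed (the parameters and the Rs 50 coupon are illustrative assumptions, not real telecom data), one small denomination would already serve the great majority of users:

```python
import random

random.seed(42)  # reproducible sketch

# Hypothetical monthly spend (in rupees) drawn from a Pareto
# distribution: many light users, a heavy tail of big spenders.
alpha, x_min = 1.5, 10.0
usage = [x_min * random.paretovariate(alpha) for _ in range(10_000)]

# Share of users a small-denomination coupon (say Rs 50) would serve.
small = sum(1 for u in usage if u <= 50) / len(usage)
print(f"{small:.0%} of users spend Rs 50 or less")
```

Under these assumed parameters roughly nine in ten simulated users fall under the small coupon – which is how a power-law view of the aggregate can drown out strategies aimed at particular small communities.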
Thirdly, BIG data also has BIG impacting factors. For example, mobile number portability arrived in India today. It will shift individual companies' demographics and usage data, as well as change mobile phone usage patterns. For some BIG data, policy decisions, technological differences and the like will make a vast difference. You can homogenize who-called-whom-and-where-and-when, but that robs the data of context and obscures its diversity. Similarly, BIG analytics, robbed of context, cannot predict negative reactions to a strategy that makes no provision for contingency planning.
Fourthly, actions taken on BIG data will have big consequences, perhaps rendering the initial analytics obsolete – would BIG analytics follow Heisenberg's uncertainty principle?
Lastly, if everybody, big or small, started using BIG analytics to make decisions (say, on customer profiles in the XYZ insurance segment), companies would in any case lose the competitive differentiator that analytics brings them.
Corresponding to the question of how big BIG needs to be, the question I have is: how small, really, is small? We know from Chaos and Complexity Theory that defining patterns emerge from very small pieces of data (e.g. synchronicity). A small observation of behavior can provide a gateway to a constellation of attributes defining an organization or a culture.
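The synchronicity example can be made tangible with a toy model from complexity science: the Kuramoto model, in which a coherent pattern emerges from just a handful of coupled oscillators. Every parameter below is an illustrative assumption of mine, not a claim about any real system:

```python
import cmath
import math
import random

random.seed(1)  # reproducible toy run

# Five oscillators with different natural frequencies, coupled together.
n, coupling, dt, steps = 5, 4.0, 0.01, 2000
freqs = [random.gauss(0.0, 0.5) for _ in range(n)]
phases = [random.uniform(0.0, 2.0 * math.pi) for _ in range(n)]

def order(ph):
    # Order parameter |<e^{i*theta}>|: near 0 when phases are scattered,
    # near 1 when all oscillators move in lockstep.
    return abs(sum(cmath.exp(1j * p) for p in ph)) / len(ph)

r_start = order(phases)
for _ in range(steps):
    # Kuramoto update: each phase is pulled toward the others.
    pulls = [sum(math.sin(pj - pi) for pj in phases) / n for pi in phases]
    phases = [p + (w + coupling * g) * dt
              for p, w, g in zip(phases, freqs, pulls)]
r_end = order(phases)
print(f"order parameter: {r_start:.2f} -> {r_end:.2f}")
```

With strong enough coupling the order parameter climbs towards 1 – a global pattern readable from a very small system, which is the sense in which small data can still reveal defining structure.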
Connectivism plays to both ends – BIG and small data. At one end it looks at how tools for SNA and the analysis of BIG data can apply to Learning and Knowledge Analytics; at the other, it embraces how small changes can cause long-term variations. I can tell you that it's easier to handle BIG data defined just by its size. It is definitely not easy to analyze the small data.
Perhaps Learning Analytics will require us to look at small data – data that cannot be aggregated into huge databases, data that is based on the individual (how can you really personalize or adapt systems according to aggregate analysis? mass personalization is not enough), data that is small enough not to be generalizable.
In my opinion, that is at least as important as looking at BIG data, if not more so.