The amazing amount of data gathered over the last fifteen years and, more than that, the overwhelming quantity of data we are gathering every day have created a new type of raw material that we are only now starting to leverage.
Clearly data have been around for millennia. The first documented examples are the Assyrian clay tablets listing the amount of oil and wheat that were produced in a given area in a particular year. Digitization (and storage/access) has multiplied the availability of data, and processing is making it possible to combine and analyze data, generating even more data (meta-data). The interesting part is that these meta-data have a higher value than the original data, because the processing has the goal of producing more relevant information. A datum is a self-standing “fact”. Such a self-standing fact is, most of the time, irrelevant to most people and to most businesses.
Through processing and contextualization we can transform data into information. The timetable of railway services in Italy is a (set of) data. Pinpointing the time of the train that fits my needs next week is information (to me). Information has higher value than data.
The same holds for business. The huge collection of data indicating what people are buying can be used to pinpoint a trend in spending, which in turn can streamline production of a certain shirt in a given size and color. This is important information: it results in that shirt being available and cuts the amount of unsold product.
Three components are involved in this value creation, that is, in the transformation of data into information:
- the availability of relevant data (data that in some way “contain” the information)
- the processing of data to extract the meta-data
- the knowledge of the context that provides value to the meta-data by pinpointing the information.
These three components, all together, are at the basis of the “data economy”.
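To make the three components concrete, here is a minimal sketch in Python built on the shirt example. The purchase records, field names and the two helper functions (`extract_metadata`, `pinpoint_information`) are all made up for illustration; a real retailer would work with far more data and far more sophisticated processing.

```python
from collections import Counter

# Component 1: available data. Each record is a self-standing "fact".
# These records are hypothetical, invented for this sketch.
purchases = [
    {"item": "shirt", "size": "M", "color": "blue"},
    {"item": "shirt", "size": "M", "color": "blue"},
    {"item": "shirt", "size": "L", "color": "white"},
    {"item": "socks", "size": "one-size", "color": "black"},
    {"item": "shirt", "size": "M", "color": "blue"},
    {"item": "shirt", "size": "S", "color": "blue"},
    {"item": "socks", "size": "one-size", "color": "black"},
]


def extract_metadata(records):
    """Component 2: processing. Count purchases per (item, size, color)."""
    return Counter((r["item"], r["size"], r["color"]) for r in records)


def pinpoint_information(counts, item):
    """Component 3: context. A producer of `item` only cares about
    the best-selling variant of that item."""
    relevant = {key: n for key, n in counts.items() if key[0] == item}
    if not relevant:
        return None  # no purchases of this item in the data
    return max(relevant, key=relevant.get)


counts = extract_metadata(purchases)          # meta-data
print(pinpoint_information(counts, "shirt"))  # ('shirt', 'M', 'blue')
```

The raw list is of little use to anyone on its own. The counts are already more valuable, and it is only the context (someone who makes shirts and needs to decide what to produce) that turns them into information.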
Clearly the starting point is the availability of data and, in general, the more data available, and the more diverse they are, the better. This explains why GAFA (Google, Apple, Facebook and Amazon) are so important in the data economy. Each of them holds a huge amount of data, and the digital relationship they have with their clients increases their advantage every day.
- Google handles 2.3 million searches per second (March 2016). It is quite secretive about the amount of data it stores, but a good estimate based on the power used by its servers puts the figure at 10 exabytes or more.
- Apple has sold over 1 billion iOS devices and manages a growing number of payment transactions (1 million Apple Pay activations in the first 3 days after launch). It has over 2 million apps in its store, with downloads exceeding 50 billion, plus tens of billions of songs, and an amount of data storage in the range of one exabyte.
- Facebook has over 1 billion users active every day, watching over 8 billion videos for a total viewing time of 100 million hours (a day!), and stores 0.3 exabytes of user data.
- Amazon has 244 million active users and saw 2 billion products purchased in 2014; an amazing 44% of people buying on the Internet go to Amazon first to see what is available. Its data centres are estimated to run on some 5 million servers, with storage capacity on the order of a hundred petabytes.
Of course there is more than GAFA out there, and this is important, as I'll discuss in the next post.