Smart cities and Tech Evolution - XXXVI Building the Data Fabric

The amount of data created, stored and used by the US Government has kept growing over the last ten years and it will keep growing over the coming ones. Credit: Scality 2014

As data keep growing, new technologies become available to process them. Credit: Booz & Company

Over and over, be it when discussing sensors, infrastructures, video walls or personal screens, the leitmotif is one: data.

Data used to be something one would feed into a computer to be crunched and transformed into other data. They are now so numerous that no one really knows how much data is out there, or how much is being created and exchanged. There are, of course, estimates. In 2000 and then again in 2002 people at Berkeley tried to measure the amount of data created under a certain number of assumptions (how much data is a song? Well, it depends on how you encode it and how you compress the code…). More recently Cisco started to release its estimates of the amount of data exchanged, and consulting companies periodically publish reports on the amount of data certain market segments are producing or using.

These figures seldom match across the different “studies”, surely because the methodologies differ but also because it is extremely difficult to really pinpoint these numbers.

On one thing everybody agrees: the amount of data created, exchanged and used is huge, and it keeps growing. Just one number to grasp the magnitude: by 2020 the US Department of Defense will be operating 30,000 drones, each of them generating 43 TB per day. That is roughly 1.3 exabytes per day!
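The drone figure can be checked with a few lines of arithmetic. This is only a sketch: the two inputs are the numbers quoted above, and the use of decimal (SI) units, where 1 EB equals one million TB, is an assumption. The function and constant names are made up for illustration.

```python
# Figures quoted in the text; decimal (SI) units are an assumption.
DRONES = 30_000
TB_PER_DRONE_PER_DAY = 43
TB_PER_EB = 1_000_000  # 1 EB = 10**6 TB in SI units


def daily_volume_eb(drones: int, tb_each: float) -> float:
    """Return the total daily volume in exabytes."""
    return drones * tb_each / TB_PER_EB


total_eb = daily_volume_eb(DRONES, TB_PER_DRONE_PER_DAY)
print(f"{total_eb:.2f} EB per day")
```

Under these assumptions the total comes to about 1.29 EB per day, that is, on the order of one exabyte every day.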

How much data does a city generate? Again, quite a difficult question to address. What is a city? Is it just the public infrastructure, or does it include its citizens, the private businesses, the vehicles moving around…

We will be moving over the next ten years from the Exabyte Era to the Zettabyte Era. These quantities are simply beyond our capability to grasp. Managing and leveraging these data requires ever more powerful technologies.
A lot has been done if we look back to the time of the first databases and heuristic problem solving. Around the end of the last century the flow and availability of data, along with sufficient processing power, radically changed the way we approach problems and solve them.

Whereas in the past the focus was on the design of sophisticated algorithms (or other computational methods), now the approach favours quantity of input, with statistical analysis that filters out imprecise input. This has been affecting many areas.

A city used to have a few (expensive) air quality sensors and spent money on ensuring their effectiveness. Now we can have sensors placed on hundreds of taxis that, by moving around every day, generate many thousands of samples all around the city. Many of these samples are repeated over and over by several taxis passing through the same place, and these multiple measurements are used to cancel out the inaccuracy of individual sensors.
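One way to picture how repeated measurements compensate for cheap sensors is to group the samples by location and take a robust statistic per group. The sketch below is an illustration only, not a description of any deployed system: the grid size, the choice of the median, the function name and the sample readings are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import median


def aggregate_by_cell(readings, cell_size=0.01):
    """Group (lat, lon, value) readings into grid cells; return the median per cell.

    cell_size is in degrees and is an arbitrary choice for this sketch.
    """
    cells = defaultdict(list)
    for lat, lon, value in readings:
        key = (round(lat / cell_size), round(lon / cell_size))
        cells[key].append(value)
    return {key: median(values) for key, values in cells.items()}


# Hypothetical samples: four taxis pass the same spot, one with a faulty sensor.
readings = [
    (45.0701, 7.6861, 40.0),
    (45.0702, 7.6862, 42.0),
    (45.0699, 7.6859, 41.0),
    (45.0700, 7.6860, 95.0),  # outlier from an inaccurate sensor
    (45.1201, 7.7302, 20.0),  # a different part of town
]
result = aggregate_by_cell(readings)
print(result)
```

The median is used here rather than the mean so that a single wildly wrong reading barely moves the result; with enough passes per cell, either statistic would smooth out ordinary sensor noise.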

The continuous creation of messages in various forms by citizens can be leveraged through sentiment analysis technologies, providing cities with an understanding of their citizens’ feelings.

The digital signatures of a city create a multi-GB data set that is continuously evolving and can be used for awareness, raising red flags if something unexpected happens, often well before it is perceived by citizens.
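A red flag of this kind can be as simple as comparing each new value of a city indicator with its recent history. The following is a minimal sketch, assuming a single numeric series and a rolling z-score test; the window length, the threshold, the function name and the sample numbers are hypothetical, and real systems would likely use richer models.

```python
from statistics import mean, stdev


def red_flags(series, window=5, threshold=3.0):
    """Return the indices whose value deviates strongly from the preceding window."""
    flags = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # Flag the point if it lies more than `threshold` standard deviations away.
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flags.append(i)
    return flags


# Hypothetical hourly counts of some city signal, with one sudden spike.
counts = [100, 102, 98, 101, 99, 100, 180, 101]
flagged = red_flags(counts)
print(flagged)
```

In this made-up series only the spike at index 6 is flagged; the value right after it is not, because the spike itself inflates the baseline spread for the following window.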

Just as in the past a city was defined by its landmarks, in the future it will be defined by its data. A municipality should move rapidly to aggregate, manage and leverage data, opening them up so that others can leverage them as well, creating wealth.


In most cases, municipalities look at data as “problems” and as “costs”. The perspective needs to change radically, seeing them as “resources” and “revenue generators”.

Author - Roberto Saracco

© 2010-2018 EIT Digital IVZW. All rights reserved.