Who has the data and who has not? That is the question!

Meet our teachers: Marco Brambilla

The amount of data is growing every day. That creates opportunities for medium-sized and large businesses and risks for small businesses. “Companies need to innovate their business models”, says Marco Brambilla, Professor of Data Engineering and lead of the Data Science Lab at Politecnico di Milano. He teaches the EIT Digital Professional School course Data Science for Business Innovation, which is about the business of data science. “Business is about who has the data and who has not.”

The amount of data available keeps growing. The Global DataSphere Forecast of market intelligence provider IDC says that the amount of data created over the next three years will be more than the data created over the past 30 years. It continues: the world will produce more than three times as much data over the next five years as it did in the previous five.

(Mis)use of data

“The growth is not going to stop”, says Brambilla. “Actually, just the opposite: data collection practices are becoming more and more widespread and cover all aspects of life around us. We are going to see more refined data descriptions for each person, object, and business. Our life will be described by the data we produce. That opens possibilities. Today, we can capture data all the time from all kinds of sources. We can analyse these data and transform them into actionable value to, for instance, improve business processes. Data Science can deliver benefits for individuals and society. On the other side, there are also risks in terms of misuse of these data, like identity theft. Data Science should therefore also be about preventing, or being able to fight, possible misuse of data.”

Quicker reactions

The growth of data can help society and business in the long term by enabling quicker reactions to changes and optimisation problems, Brambilla foresees. “We already started to see automatic notification of changes in services. For instance, when travelling, you get automatically redirected if disruptions happen. Notifications could come from crowdsourced information, meaning that people detect disruptions and notify a system. When this becomes a standard process, it becomes another source of information and enters the data science processing flow.”

New business opportunities

More data creates new business opportunities and innovations. “Companies will be able to exploit this data integration and data availability and improve a lot on, for instance, the cost of cross-selling, cross-marketing, or cross-platform devices. Companies start getting information on what other people do and where they go and can use it to improve the customer experience at a broader level. The point is what companies can do within their business field in terms of continuous interaction with the users.”

Examples of new business innovations

Brambilla gives an example. “We have automated assistants to prepare the next day's appointments for you, based on your availability and wishes. Services like these will grow in number and in depth. Not everything will be perfect though: we will also perceive fragmentation because we will not have a homogeneous integration of all these data sources. For instance, I have different accounts on different platforms to put activities on my agenda. To have a consolidated agenda of my activities, I need to integrate data from many accounts. That is an example of how a trivial day-to-day problem can become a challenge. The same applies to everything else. We have a lot of data spread around and there is a high risk of losing control over it. From an individual perspective, the growing amount of data might become hard to manage. Maybe there will be automated agents that will do that for us.”

The Data Science Lab of Politecnico di Milano works on data science projects with strong attention to application-oriented problems. These can be physical, industry-oriented business challenges such as Industry 4.0, which relate to industrial plant monitoring and anomaly detection, or smaller-scale specific applications, for instance studying brand reputation online. A share of the lab's problems comes from industrial companies, via joint projects or consultancy activities.

On the more research-oriented side, the lab is researching techniques and methods, for instance on the optimisation of language models. In a recent Large-Scale Analysis of On-line Conversation about Vaccines before COVID-19, Brambilla and his team in the Data Science Lab at Politecnico di Milano investigated the impact of disinformation on vaccination. This research found a correlation between the dynamics of social media shares of fake, unreliable, partial, or biased news about vaccination and whether people get vaccinated or not. At the same time, the researchers are also trying to work out how to counter fake accounts and social media bots that generate misinformation.

Most of the projects the lab conducts come from European or national public institutions or research grants. Brambilla’s team is a research partner within the European research consortium PERISCOPE, running from November 2020 until October 2023, that investigates socio-economic and behavioural impacts of the COVID-19 pandemic. 

Mid- and large-size businesses

When talking about the business of data science, Brambilla says it is typically the mid- to large-size companies that can extract value from data. Small companies are not reaping the benefits of Data Science. “To run relevant analysis and approaches on data, it is essential that you have a certain amount of data. Sometimes small entities do not have this amount of data. The other problem is the initial investment. There is an initial investment to put these approaches in place. Small entities may not have the means to do that.”

Data Divide

Small companies have difficulty keeping pace with Data Science developments. That is what Brambilla calls the Data Divide. “It is the modern evolution of the digital divide. Data divide is about who has the data and who has not, who can access data and who cannot, and who can process the data and who cannot. This divide is critical, even at a societal and economic level. It will have a big impact, on market models and so on. The huge amounts of data within large companies such as Amazon create huge potential business opportunities. Small businesses may not be able to keep this pace. That is challenging. Assuming that small businesses have the power to leverage this in a significant way is an optimistic perception at the moment. The quantitative trend may, in the long term, kill all the small competitors.”

Small business strategy

The challenge for small businesses is at a strategic level. Brambilla gives some suggestions. “Small businesses could try to benefit as much as possible by maybe using some existing platforms applied to their small amounts of data and see what they can get. The other way to go could be to build a federation of small businesses. The data is not just collected by one hairdresser or one bakery but within a network of small businesses at the town level. This network shares information for optimising the quality of life and business in the cities.”

A third option Brambilla suggests is creating alliances between small and big entities to counteract the dynamics of big companies pushing small companies out of business.

“Building mechanisms for creating alliances or coordination between small and big entities could work if both get benefits. For example, Italy introduced a regulation that large malls should be closed at weekends. Because of this, people somehow discovered again the small stores in the city centres and the benefits that these can bring. An alliance between big and small businesses could leverage a combined data-driven valorisation. It is impossible to compete in terms of pricing with the big players; you need to use the value you have.”

That brings up another option: working more closely together in the value chain. “One could try to integrate the provider and clients in one value chain. Small stores buy things from some big warehouse or even directly from the producer. A pasta producer or car producer might be interested in building a stronger alliance with their sales channel and maybe delivering some kind of data-driven platform along the pipeline.

“They also face the challenge now of making sure their products are not only sold on Amazon or Alibaba. They want to keep all the sales channels open in order to survive themselves. That could also benefit the small actor in the chain. The automotive field already has a data-driven optimisation value chain in place that goes from the initial producer to the final salesmen.”

Blind to the value of data

Despite all the growth opportunities of Data Science, a lot of companies are still blind to the value of data. This ‘blindness’ has, according to Brambilla, to do with the fact that companies do not know which data they possess, or they lack the technical skills to get value from the available data. And on top of that: “it is hard to extract value from the data in a way that the data could be an infinite gold mine of value.”

Be ready for change

Brambilla says that all companies should rethink their strategies. Staying ignorant of the impact of data science on your business is not an option. “You then become marginal on the market. You must be ready for a change. If you want to scale up your business, you need to have a data-driven mindset focused on the business objective. That can be: ‘I want to become more cost effective’ or ‘I want to become more efficient in reaching out to our customers’. The course Data Science for Business Innovation, for example, is just about this: how to change, starting from the business need.”
