With the amount of data generated daily growing exponentially, we need fast and reliable big data analytics tools. If we think about the number of Google searches or Facebook posts each of us contributes every day, we can appreciate the opportunities this data opens up. Analysing the aggregated data collectively could produce something of great value and, consequently, improve business decision-making, for example by enabling personalized recommendations of songs or products based on user behavior, predicting and avoiding customer churn, or optimizing industrial production based on massive sensor data.
In 2008, Professor Volker Markl laid out a vision: to develop a big data analytics system jointly with his team at the Technische Universität Berlin and their collaborators at Humboldt University of Berlin and the Hasso Plattner Institute in Potsdam. In 2012, they delivered the first open-source release of a system that would later become Apache™ Flink™.
Interview with Professor Volker Markl
What makes Apache Flink unique?
For near real-time big data analytics, Flink is superb, and many, including me, consider its ability to analyse streaming data efficiently to be a unique selling proposition. Due to its pipelined architecture, Flink is a perfect match for big data stream processing in the Apache stack. Apache Flink is fast, reliable, scalable, and easy to use. It is compatible with the Hadoop ecosystem and runs on top of HDFS and YARN. We have a growing community of users and committers from academia, young startups, and established companies.
Flink was chosen to be an Apache Software Foundation Top-level Project. What does this mean?
Basically, it means that Apache™ Flink™ has great potential. In January 2015, the Apache Software Foundation (ASF), an all-volunteer developer community, announced that Apache™ Flink™ had graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that Flink’s community and products are well governed under the ASF's meritocratic process and principles.
How did EIT Digital support the creation of Apache Flink?
German Research Foundation (DFG) grants via the Stratosphere Project laid the first foundations for what would go on to become Apache Flink. However, it was EIT Digital that empowered us to identify and work with numerous researchers across Europe, including members of SICS (the Swedish Institute of Computer Science), INRIA (the French Institute for Research in Computer Science and Automation), and SZTAKI (the Institute for Computer Science and Control of the Hungarian Academy of Sciences). EIT Digital grants enabled us to carry out technology maturation and open-source dissemination, and helped build traction around a fledgling open-source project.
You are part of a start-up that was created earlier this year to drive Apache Flink development. Can you elaborate on your role as an entrepreneur and a professor?
My career as an innovator can be traced back to my years at the IBM Almaden Research Center in San Jose, California, where I played an active role in turning industrial lab research into viable commercial products and services. I have been fortunate to play a pivotal role in enabling innovation in large companies and to be an entrepreneur, both in Europe and the US, helping several startups to take flight. Academic environments, in contrast, are ideal because they enable researchers to focus on deep, disruptive technological innovations that hold great promise, i.e., that are highly likely to make an impact, as opposed to the sustaining innovations already prevalent in companies.
As a professor, I invent groundbreaking technologies, build entrepreneurially minded research teams composed of postdoctoral researchers, PhD students, and Master’s students, and concentrate on the pipeline, i.e., pioneering research followed by the transition to innovation and eventual productisation (via the establishment of a startup or the transfer of technology). In my opinion, the biggest challenge is the lack of public funding for technology maturation and productisation.
I hope that EIT Digital will continue to advance technology innovation and entrepreneurship in Europe. In this context, I would like to emphasize again the important role that technology maturation and open-source booster catalysts play, and the vital support that universities, such as TU Berlin and KTH Stockholm, and research institutions, such as DFKI, SICS, INRIA, and SZTAKI, provide in bootstrapping, creating ecosystems around, and capitalizing on substantial innovations such as Apache Flink.
The next global Flink community meeting, Flink Forward, is taking place in Berlin on 12-13 October. What can the participants expect from the conference?
Flink Forward 2015 is a conference targeted at Flink users, as well as data scientists interested in large-scale data analysis. The conference will consist of two parallel sessions: a presentation and a training session.
Participants can expect:
- technical presentations on Apache Flink by project committers, e.g., on system internals, as well as the project’s roadmap for future releases
- use case presentations on Big Data projects using Apache Flink
- presentations about related Big Data projects in the Apache ecosystem and beyond
The parallel training session will feature two days of hands-on training workshops by Flink committers. No prior experience with the system is required.
The primary target audience of Flink Forward is developers and data scientists working with Big Data tools and programming languages such as Java, Scala, and Python to make sense of streams of data. As the ecosystem and the community around Apache Flink are growing, Flink Forward offers a unique opportunity to meet the community, discuss the future directions of the project, and form future collaborations. Prior knowledge of Flink is not assumed; indeed, the workshop session is the perfect venue to get started with your first Flink programs.
Visit flink-forward.org for more information.
Author - Suvi Lavinto