Modern Data Stack Predictions With Unravel Data Advisory Board Member, Tasso Argyros

Unravel lucked out with the quality and strategic impact of our advisory board. Collectively, they hold a phenomenal track record of entrepreneurship, leadership, and product innovation, and I am pleased to introduce them to the Unravel community. Looking into the year ahead, we asked two of our advisors for their perspective on what 2019 holds as we dive into the next few quarters. Our first guest, Herb Cunitz, was featured in Part 1 of our Prediction series (read it here) and discussed breakout modern data stack technologies, the role of artificial intelligence (AI) and automation in the modern data stack, and the increasing modern data stack skills gap. Now, in Part 2, Tasso Argyros, founder and CEO of ActionIQ, will outline his take on the upcoming trends for 2019.

Tasso is a widely recognized and respected innovator, with awards and accolades from the World Economic Forum, BusinessWeek and Forbes, and has a storied career with more than a decade’s experience working with data-focused organizations and advising companies on how to accelerate growth and become market leaders. He is the CEO and founder of ActionIQ, a company giving marketers direct access to their customer data, and previously founded Aster Data, a pioneer in the modern data stack, which was ultimately acquired by Teradata. He was also a founder of the early-stage big data seed fund Data Elite, which helped incubate Unravel Data.

In 2018, big data matured and hit a point of inflection, where an increasing number of Fortune 1000 enterprises deployed into production critical modern data applications that depend on the modern data stack. What effect will this have on the product innovation pipeline and adoption for 2019 and beyond? This is an excerpt of my recent conversation with Tasso:

Unravel: Looking back in the ‘rear-view mirror’ to the past year, what were the most exciting developments and innovations in the modern data stack?

TA: While some say innovation has slowed down in big data, I’m seeing the opposite and believe it has accelerated. When we started Aster Data in 2005, many thought that database innovation was dead. Between Oracle, IBM, and some specialty players like Teradata, investors and pundits believed that all problems had been solved and that there was nothing else to do. Instead, it was the perfect time to start a database company as the seeds of the Big Data revolution were about to be planted.

Since then, the underlying infrastructure has experienced massive, continual changes at roughly three- to four-year intervals. For example, in the mid-2000s, the primary industry trend was moving from expensive proprietary hardware to more cost-effective commodity hardware, and in the early 2010s, the industry spotlighted open source data software. Now, for the past few years, the industry has been focused on the introduction of and transition to cloud solutions, the increasing volume of streaming data, and the debut of Internet-of-Things (IoT) technologies.

As we focus on finding better ways to manage data, introduce new technologies and databases, and explore the ecosystem that lies on top of the big data layer, these will be the underlying trends that continue to drive innovation in 2019 and beyond. Whereas initially big data was more about collection, aggregation and experimentation, in 2018 it became clear that big data is a crucial, mission-critical aspect of the next generation of applications, and there is much more to learn.

Unravel: What breakout technology will move to the forefront in 2019?

TA: There has been a definite increase in the number and variety of data-centric applications (versus data infrastructure) that are being created and are in use today. As a result, there is rising interest in the industry in learning how to manage data for specific systems and in different environments, including on-premises, hybrid, and across multiple clouds. In 2019, the industry will start empowering these organizations with tools that help non-experts become self-sufficient at managing their data operations processes end to end, irrespective of where code is executing.

Unravel: Which industries or use cases have delivered the most value and have seen the most significant adoption of a mature modern data ecosystem?

TA: Digital-native organizations were the first companies to jump in at scale, which is not a surprise, as they have historically advanced more rapidly and been ahead of those who have some form of legacy to consider. Although heavily regulated, financial services institutions saw the value of an effective modern data strategy early on, as did industries that struggled with the cost and complexity of traditional data warehousing approaches when the 2008-2009 recession hit. In fact, few realize that the big recession was one of the key catalysts that accelerated the adoption of new modern data stack technologies.

Big data started with a heavy analytics focus and then, as it matured, turned operational. Now, it’s coming to the point where streaming data is driving innovation, and many different industries and verticals are set to benefit from this next step. For example, one compelling modern data use case is delivering improved customer experiences through real-time customer data gathering, inference and personalization.

Moreover, the convergence of data science and big data has accelerated adoption, as it activates the use of big data for critical business decision-making through optimized machine learning. By offering the ability to filter and prepare data, extract insights from large data sets, capture complex patterns and develop models, big data becomes a critical value driver for modern data application areas like fraud and risk detection, and for industries such as telecom and healthcare.

Unravel: Is 2019 the year where ‘Big data’ gives way to just ‘Data’, as the lines and technologies between the two become increasingly hard to separate? A data pipeline is a data pipeline, after all.

TA: In the early days, there was confusion between big data and data warehousing. Data warehousing was the buzzword during the two decades prior, whereas big data became the hot trend more recently. A data warehouse is a central repository of integrated data; it is rigid and organized. A technology category such as big data is instead a means to store and manage large amounts of data, from many different sources and at a lower cost, to make better decisions and more accurate predictions. In short, modern data stack technologies have been more efficient at processing today’s high-volume, highly variable data pipelines, which continue to grow at ever-increasing rates.

With that in mind, however, nothing stands still for very long, especially with technology innovation. The delineation between categories, as with any maturing market, continues to evolve, and high degrees of fragmentation, often led by open source committers, are often juxtaposed with the evolution of existing adopted technologies. SQL is a good example of this, where the landscape of traditional SQL, NoSQL, NewSQL and serverless solutions like AWS Athena starts to blur the lines between what is ‘big’ and what is just ‘data’. One thing is for sure: we have come a long way in a short space of time, and ‘Big Data’ is much more than on-premises Hadoop.

Unravel: What role will AI and automation, and capabilities like AIOps, play in the modern data stack in the coming year?

TA: Technologies like Spark, Hive and Kafka are very complex under the hood, and when an application fails, it requires a specialist with a unique skill set to comb through massive amounts of data and determine the cause and solution. Data operations frameworks need to mature to permit separation of roles rather than relying on a single data engineer to solve all of the problems. Self-service for application owners will relieve part of this bottleneck, but fully operationalizing a growing number of production data pipelines will require a different approach that relies heavily on machine learning and artificial intelligence.

In 2019, as the industry continues to strive for higher efficiency, automation will rise as a solution to the modern data stack skills problem. For example, AI for Operations (AIOps), which combines big data, artificial intelligence, and machine learning functionality, can augment and replace many IT operations processes: for example, to accelerate the time it takes to identify performance issues, proactively tune resources to reduce cost, or automate configuration changes to prevent app failures before they occur.

Unravel: What major vendor shake-ups do you predict in 2019?

TA: The industry now understands that there is more to a big data ecosystem than just Hadoop. Hadoop, for many years, was the leading open source framework, but the rising popularity of Spark and Kafka has proven that the stack will continue to evolve rapidly in ways we have not yet thought of. Complexity will be with us for a very long time, and along with it, some incredible innovative new companies, a newly emerging incumbent (Cloudera/Hortonworks) and the cloud giants will jockey for customer mindshare.