

Three Venture Capitalists Weigh In on the State of DataOps 2022


The keynote presentation at DataOps Unleashed 2022 featured a roundtable panel discussion on the State of DataOps and the Modern Data Stack. Moderated by Unravel Data Co-Founder and CEO Kunal Agarwal, this session features insights from three investors who have a unique vantage point on what’s changing and emerging in the modern data world, the effects of these changes, and the opportunities being created.

The panel’s three venture capitalists—Matt Turck, Managing Director at FirstMark, Venky Ganesan, Partner at Menlo Ventures, and Glenn Solomon, Managing Partner at GGV Capital—all invest in companies that are both users and creators of the modern data stack. They are either crunching massive amounts of data and converting it into insights or are helping other companies do that at scale.


Today every company is a data company. And these data pipelines, AI, and machine learning models are creating tremendous strategic value for companies. And these companies are depending on them more than ever before. Now, while the advantages of becoming data-driven are clear, what’s often hard to grasp is what’s changing in this landscape.

 The discussion covered a broad range of topics, loosely revolving around a couple of key areas. Here are just a handful of the interesting observations and insights they shared.

What are the top data industry macro-trends?

Glenn Solomon: I don’t think companies are nearly as advanced as you’re likely to believe. Most companies are in the early innings of trying to figure out how to manage data. We see that even with born-digital companies. And the complexity is compounded by the fact that there is a lot of noise in the startup universe. Figuring out which decisions to make as a company is difficult and challenging. So I think the best-of-breed vs. platform balancing act that companies had to go through in software is also going to happen in the data stack.

Matt Turck: The big driver is the rise of modern cloud data warehouses, and the lake houses as well. So for me, that’s been the big unlock in the space. We finally have that base level in the data hierarchy of needs, where we can take all this data, put it somewhere, and actually process it at scale. Now this whole thing is becoming very real, no longer experimental. Now the whole thing needs to work.

Venky Ganesan: The digital transformation that was happening just got super accelerated by the pandemic. All these analog business processes were digitized. And now that they are digitized, they can be tracked, stored, analyzed, evaluated and acted upon. I think the data stack has got to be one of the most important stacks in a company because your success long term is going to be based on how good is your data stack? How good is your DataOps? And then how do you build the analytics on top of it?

What trends are you seeing within DataOps specifically?

Venky: I would say the biggest trend I’m seeing is pushing these data workloads to the cloud. And I think it’s a really interesting game-changer. One of the things we are seeing now as we move to the cloud is suddenly you can separate out the storage and compute, have the infrastructure handle it, and then have the data warehouses such as Snowflake, Databricks. Now there are new sets of problems that come into play around DataOps when you move the data to Snowflake or Databricks or any other cloud providers, which is that you need to still understand the workloads, still need to optimize them. I think there’s going to be a whole DataOps category that helps you both migrate workloads to the cloud and also monitor them, because you can’t have the fox guarding the henhouse, you need some third party there to help you make sure you’re optimizing the workload, because the cloud provider is not interested in optimizing the workload for efficiency.

Glenn: A driver that I’m seeing accelerate and gain momentum is quite simply the need for speed. In organizations there’s a tremendous amount of momentum around real-time streaming and real-time analytics. Companies are growing the number of business processes for which they want real-time data to make decisions. And that is having a big impact on this whole world.

Another trend I’d point out is the rise of open source and the impact open source is having on many, many areas within the DataOps world. It looks like Kafka has had a massive impact on streaming as a result. That shows me that open source can really standardize markets. It can standardize technologies and standardize workflows. We’ll have to see how this all plays out, but I think open source—when it works well—is a very, very powerful trend.

Moving data to the cloud—why or why not?

Venky: I think if you are a company that has data on premises, you just have numerous issues. On prem is very heterogeneous: heterogeneous in hardware, heterogeneous in environment. And in a world of labor shortages, what happens if you don’t have the people? If you have the kind of turnover you’re seeing, can you get the people to manage it? So if you don’t move to the cloud, you’re going to be trapped on an island with fewer and fewer resources that cost more and more.

But I actually think the most important part of moving into the cloud is that it gives you an opportunity to standardize data, think about the data you want. And then once you move it to the cloud, you can unlock new generations of AI technologies that come into the cloud and allow you to get more insights from data. And so to me, eventually, data is worthless if it doesn’t translate to insights. Your best way of getting that insight is to figure out a scalable way of moving to the cloud, cheaper, and also unlock a lot of the new AI techniques to get insight from it.

Glenn: I think we’re on a continuum where there are still lots of companies who are reticent to move all their data to the cloud. I think the view is, hey, we have regulatory obligations and there’s risk if we don’t manage things ourselves. For data that is viewed as too sensitive, too risky, too valuable to move to the cloud—it’s just a matter of time. The value that can be driven from having data up in the cloud is just too great. But how do you safely move data into the cloud? And then once it’s there, how do you manage the applications that consume that data in a way that is rational? And if you want to use [cloud services] rationally, and use them the right way, and in a cost-efficient way, then you really do need other tools to make sure that things don’t run away from you. 

Matt: For many years, there was an almost cognitive dissonance, where everything I read, and all the conversations I was having with execs and people in the industry, was all cloud, cloud, cloud. But our customers all wanted to be on prem. Actually, zero people wanted to be in the cloud. It feels like in the last year and a half that cognitive dissonance has disappeared, and suddenly I started seeing all these customers, almost all at once (and the pandemic certainly accelerated all this), saying, okay, now is the time I will move those workloads and the data to the cloud. So it feels like there’s been an inflection point of some sort. And it is very anecdotal, I realize, but it is very, very clear.

The only nuance to all of this is I think there’s a little bit of a growing realization and concern around the cost of being in the cloud. When you start in the cloud, you actually save a lot of money. Once you’ve configured your organization to actually run in the cloud, you save a lot of money for a while. But then there’s a moment when it starts actually being pretty expensive. And I think that’s a problem that’s starting to come to the fore. 

What’s the impact of the talent shortage?

Glenn: It’s very difficult to amass the kind of talent you need to really effectively both manage the data and then ultimately evaluate and analyze it for good purpose in your business. If you split the world into managing data—DataOps and the data engineer and all the challenges and complexities there, where there’s definitely a labor shortage—and then analyzing the data, getting value from it (data scientists up through business analysts), where there’s also a shortage—we have a human problem in both. One of my colleagues used the term “unbundling the data engineer.” If you look at all the tasks a data engineer would need to do to get a well-functioning data stack in place, there just aren’t enough of those people. Companies are picking off and automating various aspects of that workflow.

But on the other side, on analyzing the data, I think there are a lot of interesting things to be done there. How do you make data scientists, because there aren’t enough of them, more efficient? What tools and technologies do they need? I think we’ll see more solutions on the analysis side because we have that same human capital problem there too.

Matt: It’s an obvious problem that is only going to get worse, because the rate at which we as a society produce technical people (data engineers, ML engineers, data scientists) is nowhere near the pace we need to meet the demand. Again, every company is becoming not just a software company but a data company. That has two consequences. One, we need products and platforms that abstract away the complexity. That’s empowering people who are somewhat technical but not engineers to do more and more of what’s needed to make the whole machinery work. And the second, related consequence is the rise of automation: making a lot of those technologies just work in a way where no human is required. There’s plenty of opportunity there, especially AI-driven automation of system optimization, tuning, anomaly detection, auto-repair, and the like.

Venky: Whether we’re talking about DataOps or security, these are things that will get automated at scale. It won’t replace humans. They will be complemented by technology that does most of the mundane stuff, and humans will deal with the exceptions. The mundane stuff gets done automatically, the exceptions get kicked to humans. That’s the only way forward. There’s no way to build the human capital required.

Watch the entire panel discussion on demand here. Hear more war stories, anecdotes, and expert insights, including predictions for the coming year.