Reflections on “The Great Data Debate 2022” from Big Data London

This year’s Big Data LDN (London) was huge: over 150 exhibitors and 300 expert speakers across 12 technical and business-led conference theaters. It was like being the proverbial kid in a candy store, and I had to make some tough decisions on which presentations to attend (and which I’d miss out on).

One that looked particularly promising was “The Great Data Debate 2022,” a panel discussion hosted by industry analyst and conference chair Mike Ferguson, with panelists Benoit Dageville, Co-founder and President of Products at Snowflake; Shinji Kim, Founder and CEO of Select Star; Chris Gladwin, CEO of Ocient; and Tomer Shiran, Co-founder and Chief Product Officer of Dremio. You can watch the one-hour Great Data Debate 2022 recording below.

[Video: The Great Data Debate 2022]

The panel covered a lot of ground: the rise of component-based development, how the software development approach has gate-crashed the data and analytics world, the challenges of integrating all the new tools, best-of-breed vs. single-platform strategies, the future of data mesh, metadata standards, data security and governance, and much more.

Sometimes the panelists agreed with each other, sometimes not, but the discussion was always lively. The parts I found most interesting revolved around migrating to the cloud and controlling costs once there.

Moderator Mike Ferguson opened up the debate by asking the panelists how the current economic climate has changed companies’ approach—whether they’re accelerating their move to the cloud, focusing more on cost reduction or customer retention, etc.

All the panelists agreed that more companies are increasingly migrating workloads to the cloud. Said Benoit Dageville: “We’re seeing an acceleration to moving to the cloud, both because of cost—you can really lower your cost—and because you can do much more in the cloud.” 

Chris Gladwin added that the biggest challenge among hyperscale companies is that “they want to grow faster and be more efficient.” Shinji Kim echoed this sentiment, though from a different viewpoint, saying that many organisations are looking at how they want to structure the team—focusing more effort on automation or tooling to make everyone more productive in their own role. Tomer Shiran made the point that “a lot of customers now are leveraging data to either save money or increase their revenue. And there’s more focus on people asking if the path of spending they’re on with current data infrastructure is sustainable for the future.”

We at Unravel are also seeing an increased focus on making data teams more productive and on leveraging automation to break down silos, promote more collaboration, reduce toilsome troubleshooting, and accelerate the DataOps lifecycle. But piggybacking on Tomer’s point: while the numbers certainly bear out that more workloads are indeed moving to the cloud, we are seeing that among more mature data-driven organisations (those that already have 2-3 years of experience running data workloads in the cloud under their belt), migration initiatives are “hitting a wall” and stalling out. Cloud costs are spiraling out of control, and companies find themselves burning through their budgets with little visibility into where the spend is going and little ability to govern expenses.

As Mike put it: “As an analyst, I get to talk to CFOs and a lot of them have no idea what the invoice is going to be like at the end of the month. So the question really is, how does a CFO get control over this whole data and analytics ecosystem?”

Chris was first to answer. “In the hyperscale segment, there are a lot of things that are different. Every customer is the size of a cloud, every application is the size of a cloud. Our customers have not been buying on a per usage basis—if you’re hammering away all day on a cluster of clusters, you want a price based on the core. They want to know in advance what it’s going to cost so they can plan for it. They don’t want to be disincented from using the platform more and more because it’ll cost more and more.”

Benoit offered a different take: “Every organisation wants to become really data-driven, and it pushes a lot of computation to that data. I believe the cloud and its elasticity is the most cost-effective way to do that. And you can do much more at lower costs. We have to help the CFO and the organisation at large understand where the money is spent to really control [costs], to define budget, have a way to charge back to the different business units, and be very transparent to where the cost is going. So you have to have what we call cost governance. And we tell all our customers when they start [using Snowflake] that they have to put in place guardrails. It’s not a free lunch.”

Added Shinji: “It’s more important than ever to track usage and monitor how things are actually going, not just as a one-time cost reduction initiative but something that actually runs continuously.”

Benoit summed it up by saying, “Providing the data, the monitoring, the governance of costs is a very big focus for all of us [on the panel], at different levels.”
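
What might those guardrails look like in practice? Here’s a minimal sketch in Python of the kind of continuous budget check the panelists describe: spend is grouped by business unit, compared against an agreed budget, and flagged before it spirals. The budget figures, business-unit names, and the input records are entirely hypothetical, and a real setup would feed this from your cloud billing data and run it on a schedule rather than as a one-off.

```python
from dataclasses import dataclass

# Hypothetical monthly budgets per business unit, in dollars (illustrative only).
BUDGETS = {
    "marketing-analytics": 40_000,
    "fraud-detection": 75_000,
    "data-science": 60_000,
}

ALERT_THRESHOLD = 0.8  # warn once 80% of the monthly budget is consumed


@dataclass
class SpendRecord:
    business_unit: str
    month_to_date_cost: float  # dollars, e.g. aggregated from billing exports


def check_guardrails(spend: list[SpendRecord]) -> list[str]:
    """Compare month-to-date spend against each unit's budget and return warnings."""
    warnings = []
    for record in spend:
        budget = BUDGETS.get(record.business_unit)
        if budget is None:
            warnings.append(f"{record.business_unit}: no budget defined (untracked spend)")
            continue
        utilisation = record.month_to_date_cost / budget
        if utilisation >= 1.0:
            warnings.append(f"{record.business_unit}: OVER budget ({utilisation:.0%})")
        elif utilisation >= ALERT_THRESHOLD:
            warnings.append(f"{record.business_unit}: {utilisation:.0%} of budget consumed")
    return warnings


if __name__ == "__main__":
    # Made-up sample data; in practice this check would run continuously
    # against live billing data, not once with hard-coded numbers.
    sample = [
        SpendRecord("marketing-analytics", 34_500.0),
        SpendRecord("fraud-detection", 81_200.0),
    ]
    for warning in check_guardrails(sample):
        print(warning)
```

The point of the sketch is Shinji’s: this isn’t a one-time cost-reduction exercise but a check that runs continuously, with chargeback-style visibility per business unit, exactly the transparency Benoit argues CFOs need.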

It’s interesting to hear leaders from modern data stack vendors as diverse as Snowflake, Select Star, and Dremio emphasise the need for automated cost governance guardrails. That’s exactly the problem we focus on: nobody does cloud cost governance for data applications and pipelines better than Unravel.

Check out the full Great Data Debate 2022 panel discussion.