Inside Unravel

Adobe Migrates to the Cloud with Unravel Data


Adobe is a legendary Silicon Valley company. From the desktop publishing era of the 1980s, powered by the Adobe Postscript page description language, through the creation and marketing of Photoshop, Illustrator, and other creative power tools, the digital revolution is unthinkable without Adobe. 

But the company is more than a legend. Adobe is now one of the fastest-growing companies in tech, which is saying something. The company’s value has increased roughly twenty-fold in ten years, and it is now valued at more than $200 billion, making Adobe a true tech giant. The company is steadily moving up the Fortune 500 list.

Application engineering manager Kevin Davis has been with the company for about ten years – all through its recent, record-setting growth. He has seen the employee headcount, for instance, more than double.

Kevin works on Adobe’s corporate data platform, the engine behind all of the company’s offerings, old and new. He calls the company’s recent growth “monumental.”

Watch A Journey to the Cloud for Adobe’s Corporate Data Platform (24:13)

Kevin was a main stage speaker at DataOps Unleashed, the first-ever DataOps-specific industry conference, founded and hosted virtually by Unravel. Kevin’s talk was the best-attended of the entire conference. He brought deep insight into a new round of creativity at Adobe, where the company’s next phase of growth will be powered by the cloud. And Adobe’s move to cloud is getting a big boost from Unravel Data. 

Where Adobe Stands Today

“Adobe has transformed its business.” – Tomasz Tunguz

This is actually Adobe’s second “move to cloud.” The first one began ten years ago, when Adobe moved from selling boxed software to a software as a service (SaaS) model. You may remember the hoopla, for decades, around any major Adobe Photoshop upgrade, featuring giant cardboard boxes stuffed with dozens of floppy disks and hundreds of pages of printed manuals. 

Adobe’s move to SaaS was accomplished quickly and effectively. As famous tech investor and commentator Tomasz Tunguz wrote in 2015, “In 2.5 years, Adobe has transformed its business from a software license business into a SaaS business.” 

And Adobe achieved outstanding results. Adobe used their improved product delivery pipeline to move fast and, well, make things. Adobe has not only introduced new products, but whole new lines of business, such as the analytics-heavy Experience Cloud, which supports marketing and content management. 

Customers are happy, the share price has rocketed upward, and the company is a top-tier creative force – in the business world, and in the lives and work of its customers. What could possibly go wrong?

More Money, More Problems

“We have thousands of users, we have petabytes of data, and literally millions of monthly job executions… we’ve just literally reached our limit.” – Kevin Davis

With all these good things happening, Adobe saw the need to implement the back-end equivalent to SaaS: infrastructure as a service (IaaS), delivered by the public cloud providers. 

For years now, Adobe has run its business on an on-premises Hadoop platform. According to Kevin, “we’ve seen significant challenges in keeping up with growth. We have thousands of users, we have petabytes of data, and literally millions of monthly job executions.”

He continues, “The biggest challenge is that we’re simply running out of capacity. We don’t have any more room in the data center after this year to deploy any additional infrastructure. We’ve just literally reached our limit.” 

Adobe runs a benchmark job to track the performance of their complex stack. The benchmark’s results vary widely from day to day, and even from hour to hour, which reflects users’ experience. “A job that may run in 30 minutes one day may take two or three times that to run on another day,” he says. 

“In order to try to help manage that,” he continues, “we’ve actually put growth quotas on teams – not allowing them to grow more than 20% this calendar year, or we’ll simply run out of capacity and won’t have anything available for these teams to run all of their critical data workloads.”

Before solving the problem, Adobe set out to understand it better, identifying its key pain points. The top three: 

  • Compute efficiency (scale up/down, faster compute) 
  • Data quality (implement data quality rules, notify on anomalies)
  • Data discovery (search and understand data)

As Davis describes the challenges: “Given the limitations of our on-premises platform, it’s no surprise that compute efficiency was one of the biggest pain points… The other ones that really jumped out were data quality and data discovery. Data is such a critical component of this platform.”

Only one fundamental change can solve problems at Adobe’s scale: moving to the cloud. 

Preparing for Cloud Migration 

“The data on the platform is just as important as the platform itself.” – Kevin Davis

Adobe has recently completed an extensive research and planning effort, and is now executing on its move into the cloud. However, Adobe doesn’t just want to fix existing problems.

According to Davis, “We’re not just trying to take our on-premises infrastructure, pick it up, and move it into the cloud. We want to make sure we’re adding more value and services into this platform, to solve some of the key pain points and key challenges that our customers have been facing. In terms of the vision for our platform, we’re calling our journey to the cloud ‘data and platform as a service.’”

The top ten cloud capabilities Adobe users want:

  • Scale
  • Data science
  • Automation/isolation
  • Advanced/specialized analytics
  • Batch processing
  • Monitoring and alerting
  • Notebook
  • Data catalog automation
  • Streaming processing 
  • Other options

Adobe has designed an architecture to solve these challenges. As Davis describes it, “The data on the platform is just as important as the platform itself. That data is a valuable asset that all of our customers want access to. That’s why we chose to call it data and platform as a service (DPaaS).” 

Davis describes the architecture in some detail. “Across the top, those are all of the personas that we’re trying to cater to. On the left, the tenant admin. These are the people who are going to deploy the infrastructure, manage who has access to that infrastructure, and also be responsible for managing the cost of that infrastructure.” 

“They need to be able to see what’s there, who’s accessing it, and how much is it costing them. Across the top to the right are some of the other personas that Kunal mentioned that are part of that key DataOps strategy, but all of those different personas access the platform in a different way.” (As Davis mentions, it’s no accident that these roles map to the role definitions in Unravel’s DataOps infinity loop.)

“The data engineer is building data pipelines. The data analyst and BI developer may need access to the platform, but they’re not building any new data assets. They’re building dashboards on top of what the data engineers have built. And the data scientists obviously need a whole different toolset to do the work that they’re doing. We needed to make sure that our platform had services and capabilities that met the needs of all of those different personas.”

Unravel’s Data-Driven Approach to Migration 

(Before Unravel,) “We had a really hard time understanding all of the workloads that were running on-premises and figuring out how to get those workloads into the cloud.” – Kevin Davis

So far, so good – Adobe had a well-researched plan. But cloud migration is famously hard. As Davis says, “One of the other things that we found as we started to plan this migration is that it was really challenging to gather the insights that we needed to properly plan our migration.”

Adobe found out the hard way that they had many questions about the “before” state – the on-premises shop, powered by Hadoop. “We didn’t really fully understand all of the details of all of the workloads that we were running on premises,” said Davis. “We have so many different teams across the company that are running workloads. We didn’t have all of the information about those workloads, the resources they use, the data assets that they leveraged. We needed to find a way to get some of those insights in a more automated manner.”

The questions kept piling up. “Which workloads consume the most resources? Was it most advantageous to move those to the cloud first? Or was it better to move the ones that were simpler, had the fewest dependencies upstream and down? When these workloads made it into the cloud, we wanted to know: How much is it going to cost to run this workload in the cloud?”

Concludes Davis: “In terms of trying to answer some of the questions that I mentioned – how do I know which workloads I should migrate to the cloud first, which workloads are consuming the most resources, and which would be the easiest to move to the cloud because they don’t have as many dependencies? Those are insights that have been challenging to get.” 
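The prioritization logic Davis describes – move workloads with the fewest dependencies first, and among those, the biggest resource consumers – can be sketched as a simple sort. The workload names and numbers below are invented for illustration; they are not Adobe’s actual data.

```python
# Hypothetical sketch of ranking workloads for migration: prefer jobs
# with few dependencies (easier to move) and high resource use (bigger
# payoff once moved). All names and numbers here are invented.
workloads = [
    {"name": "nightly-etl",   "dependencies": 12, "cpu_hours": 900},
    {"name": "ad-hoc-report", "dependencies": 1,  "cpu_hours": 40},
    {"name": "ml-training",   "dependencies": 3,  "cpu_hours": 600},
]

# Sort: fewest dependencies first; break ties by most CPU-hours first.
migration_order = sorted(
    workloads, key=lambda w: (w["dependencies"], -w["cpu_hours"])
)
print([w["name"] for w in migration_order])
# → ['ad-hoc-report', 'ml-training', 'nightly-etl']
```

In practice a tool like Unravel gathers the dependency and resource data automatically; the hard part Adobe faced was collecting those inputs, not the ranking itself.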

Partnering with Unravel Data

“We’re getting ready to take that next step in our cloud migration journey, as partners with Unravel.” – Kevin Davis

“We’ve been partnering with Unravel to try to help,” says Davis. Adobe started by using Unravel to optimize workloads in the on-premises Hadoop estate. This made life much easier on-premises, as Adobe was taking the initial steps in their move to cloud. 

Davis continues, “Adobe has had a relationship with Unravel for a couple of years. And Unravel has been helping us to optimize some of our on-premises workloads. You can see there a couple of statistics as to how Unravel has been helping not only to optimize the workloads, but also to enable our people to be more effective in doing root cause analysis when a job is failing or running long.”

Moving to the Cloud, One Workload at a Time

“Do I need to keep meeting this SLA and incur these (high) costs, or am I OK reducing this SLA and trying to achieve some cost savings?” – Kevin Davis

Cloud migration is often spoken of as if it were a single process. But those who have succeeded in this effort know that it’s really cloud migrations. Organizations usually move workloads in groups, not all at once. And they are often flying blind in deciding just how to do that: which cloud platform to move to, which cloud services to use, what workloads to move first, and how much it will cost – per workload, and taken as a whole. 

A desirable part of the solution is to optimize workloads on-premises before making the move-to-cloud decision. That seems daunting, given all the work needed to move the workloads to the cloud. But, as Adobe has found, Unravel makes on-premises optimization much easier. 

The next step is to study the optimized workloads and decide which ones to move to the cloud first. As Davis puts it: “Unravel has capabilities that allow you to analyze the workloads that are running on-premises, and not only have deep insights into what those workloads look like, but also what it might cost to run those workloads in the cloud. We’re getting ready to take that next step in our cloud migration journey, as partners with Unravel.” 

Davis continues, “I’ve used the Unravel tool to try to understand what the workload looks like, in terms of cost, when migrated to the cloud, and also to understand whether there are ways I can reduce the cost of that workload running in the cloud.”

“To maintain 100% SLA, Unravel tells me, it’s going to cost you this much to run your workload in the cloud,” Davis demonstrates. “But if you don’t need to maintain that SLA, you can save costs. You can reduce the cost of that workload by reducing your SLA, thereby reducing the infrastructure that’s needed. This gives you the insight to start some of those trade off decisions. Do I need to maintain this SLA and incur these costs? Or am I okay reducing my SLA and trying to achieve some cost savings?” 
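The trade-off Davis walks through can be made concrete with a toy model. The sketch below assumes a crude diminishing-returns scaling curve and invented rates – none of these numbers come from Adobe or Unravel – but it shows the shape of the decision: more nodes meet a tighter SLA, yet cost more per run.

```python
# Hypothetical sketch of the SLA-vs-cost trade-off Davis describes.
# Meeting a tighter deadline needs more nodes, but parallel efficiency
# drops as nodes are added, so total node-hours (and thus cost) go up.
# All rates and numbers here are invented, not Adobe's actual figures.
NODE_HOUR_COST = 0.50  # hypothetical $/node-hour
SERIAL_HOURS = 100.0   # hypothetical work: 100 hours on a single node

def job_cost(nodes, efficiency=0.9):
    """Estimate (runtime_hours, dollar_cost) on `nodes` nodes, assuming
    each added node contributes `efficiency` times as much as the last
    (a crude diminishing-returns scaling model)."""
    speedup = sum(efficiency ** i for i in range(nodes))
    runtime = SERIAL_HOURS / speedup
    return runtime, nodes * runtime * NODE_HOUR_COST

# A tighter SLA (more nodes) finishes sooner but costs more per run.
for n in (4, 8, 16):
    runtime, cost = job_cost(n)
    print(f"{n:2d} nodes: ~{runtime:5.1f} h runtime, ~${cost:6.2f}/run")
```

The value of a tool like Unravel here is supplying the real numbers for this curve per workload, so the SLA decision becomes an informed business trade-off rather than a guess.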

Saving Half the Cost with Auto-Scaling

“Unravel says by deploying this in an auto-scaling type environment, you can save more than 57% of the cost.” – Kevin Davis

“Another example of how Unravel is helping give us some of these insights is a look at what this workload might require in the cloud in an auto-scaling environment,” explains Davis. “Unravel looks at how this workload is running on-premises and sees that for one hour during the day, every day, this workload requires significant resources. But for the rest of the day, the resource requirements are significantly reduced.”

“This is the kind of workload that would make sense to deploy in the cloud in an auto-scaling model where the infrastructure could scale up during that hour, that the resource utilization was high. But then scale back down for the rest of the day,” Davis continues. 

“In this example, you can see that Unravel says, ‘By deploying this in an auto-scaling type environment, you can save more than 57% of the costs.’ These are some of the critical insights that we needed, and we’re excited to partner with Unravel to leverage their product; to understand our workloads and to start trying to forecast what’s it going to cost when we run these in the cloud, and also what’s the best infrastructure deployment.” 
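The arithmetic behind an auto-scaling estimate like this is straightforward once the usage profile is known. The sketch below uses invented node counts and a hypothetical rate (so it yields a different percentage than Unravel’s 57% figure for Adobe’s workload), but the calculation has the same shape: compare paying for peak capacity around the clock with paying for peak capacity only during the peak hour.

```python
# Hypothetical illustration of auto-scaling savings for the pattern
# Davis describes: heavy resources needed 1 hour/day, light otherwise.
# All node counts and rates here are invented, not Adobe's numbers.
PEAK_NODES = 40        # nodes needed during the daily peak hour
IDLE_NODES = 4         # nodes needed for the remaining 23 hours
NODE_HOUR_COST = 0.50  # hypothetical cost per node-hour in dollars

# Fixed provisioning: pay for peak capacity around the clock.
fixed_daily_cost = PEAK_NODES * 24 * NODE_HOUR_COST

# Auto-scaling: peak capacity for 1 hour, idle capacity for 23 hours.
autoscale_daily_cost = (PEAK_NODES * 1 + IDLE_NODES * 23) * NODE_HOUR_COST

savings = 1 - autoscale_daily_cost / fixed_daily_cost
print(f"fixed: ${fixed_daily_cost:.2f}/day, "
      f"auto-scaled: ${autoscale_daily_cost:.2f}/day, "
      f"savings: {savings:.0%}")
```

The hard input is the on-premises usage profile itself – which hours are peak, and how high the peak is – which is exactly what Unravel measures from the running workload.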

Creating the Roadmap

“We had a real blind spot and Unravel helped us uncover some of the insights that we need… to help us plan our migration into the cloud.” – Kevin Davis

Adobe has made a lot of progress with Unravel – optimizing on-premises workloads, profiling them to identify those which can use auto-scaling, and taking a hard look at SLAs to help reduce costs in the cloud. So Adobe is now positioned to make the move in a carefully planned and effective manner. 

Davis explains, “We had a real blind spot, and Unravel has really helped us to uncover some of the insights that we need, understanding the workload resources, interdependencies between workloads to help us plan our migration of those into the cloud. We’ve also learned that it’s really important to have that migration analysis, to be able to look at what this workload might cost when it runs in the cloud. We’ve also learned that many of the workloads that were running on-premises weren’t necessarily optimized. We’ve asked every team across the company to start optimizing workloads, so that we can make sure we’re not incurring unnecessary costs when they’re running in the cloud.”

Unravel’s role won’t end as workloads are moved to the cloud. “And then finally, even though we don’t have workloads in the cloud yet, we’ve realized that we want the same insights for those workloads when they’re running in the cloud, as we have today, when they’re running on premises,” explains Davis. “Again, this is an area where we can partner with Unravel to deploy some of their APM capabilities in the cloud, so that when a workload migrates from on-premises to the cloud, we have the same insights about how that workload is running.”

To the Cloud, and Beyond

“We can partner with Unravel to deploy some of their APM capabilities in the cloud so that, when a workload migrates from on-premises to the cloud, we have the same insights about how that workload is running.” – Kevin Davis

Davis concludes, “So we’re excited to continue on this journey. We have a big objective for the year: getting the infrastructure deployed and 20% of our workloads migrated. And we’re excited to partner with Unravel through this journey to give us some of the insights that we’ve been missing.”

This blog post is a useful summary, but you can also view the entire Adobe presentation (24:13) directly. And you can view all the videos from DataOps Unleashed here. You can also download The Unravel Guide to DataOps, which was made available for the first time during the conference.