Healthcare leader uses AI insights to boost data pipeline efficiency

One of the largest health insurance providers in the United States uses Unravel to ensure that its business-critical data applications are optimized for performance, reliability, and cost in its development environment—before they go live in production.

Data and data-driven statistical analysis have always been at the core of health insurance. But over the past few years the industry has seen an explosion in the volume, velocity, and variety of big data: electronic health records (EHRs), electronic medical records (EMRs), and IoT data produced by wearable medical devices and mobile health apps. As the company’s chief medical officer has said, “Sometimes I think we’re becoming more of a data analytics company than anything else.”

Like many Fortune 500 organizations, the company has a complex, hybrid, multi-everything data estate. Many workloads are still running on premises in Cloudera, but the company also has pipelines on Azure and Google Cloud Platform. Further, its Dev environment is fully on AWS. As the key technology manager for the Enterprise Data Analytics Platform team says, “Unravel is needed for us to ensure that the jobs run smoothly because these are critical data jobs.” Unravel helps the team understand and optimize performance and resource usage.

The data team’s highest priority is guaranteeing that its thousands of data jobs deliver reliable results on time, every time, so the team finds Unravel’s automated AI-powered Insights Engine invaluable. Unravel auto-discovers everything the company has running in its data estate (both in Dev and Prod), extracting millions of contextualized granular details from logs, traces, metrics, events, and other metadata, horizontally and vertically, from the application down to the infrastructure and everything in between. Unravel’s AI/ML then correlates all this information into a holistic view that “connects the dots” as to how everything works together. AI and machine learning algorithms analyze these details in context to detect anomalous behavior in real time, pinpoint root causes in milliseconds, and automatically provide prescriptive recommendations on where and how to change configurations, containers, code, resource allocations, etc.

Platform Technology Diagram

Application developers rely on Unravel to automatically analyze and validate their data jobs in Dev, before the apps ever go live in Prod. First, Unravel identifies inefficient code (the code that is most likely to break in production). Second, it shows each data engineer exactly where and why code should be fixed, so they can tackle potential problems themselves via self-service optimization. The end result is that performance inefficiencies never see the light of day.

The Unravel AI-powered Insights Engine similarly analyzes resource usage. The company leverages the chargeback report capability to understand how the various teams are using their resources. (But Unravel can also slice and dice the information to show how resources are being used by individual users, individual jobs, data products or projects, departments, Dev vs. Prod environments, budgets, etc.) For data workloads still running in Cloudera, this helps avoid resource contention and lets teams queue their jobs more efficiently. Unravel even enables teams to kill a job instantly if it is causing another mission-critical job to fail. 
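To make the chargeback idea concrete, here is a minimal Python sketch of how usage might be rolled up by team or by environment. It is an illustration only, not Unravel's implementation or API: the `JobRun` record, the `chargeback` function, and the two unit rates are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRun:
    """One finished job run (hypothetical record, not an Unravel type)."""
    team: str
    environment: str        # e.g. "dev" or "prod"
    vcore_hours: float
    memory_gb_hours: float


# Assumed unit prices, invented for this example.
VCORE_HOUR_RATE = 0.05
MEMORY_GB_HOUR_RATE = 0.01


def job_cost(run):
    """Cost of a single run under the assumed rates."""
    return (run.vcore_hours * VCORE_HOUR_RATE
            + run.memory_gb_hours * MEMORY_GB_HOUR_RATE)


def chargeback(runs, key=lambda run: run.team):
    """Sum run costs per group; `key` picks the grouping dimension."""
    totals = defaultdict(float)
    for run in runs:
        totals[key(run)] += job_cost(run)
    return dict(totals)


runs = [
    JobRun("claims", "prod", vcore_hours=100, memory_gb_hours=400),
    JobRun("claims", "dev", vcore_hours=20, memory_gb_hours=80),
    JobRun("pricing", "prod", vcore_hours=50, memory_gb_hours=100),
]

by_team = chargeback(runs)
by_environment = chargeback(runs, key=lambda run: run.environment)
```

Passing a different `key` function is what gives the "slice and dice" behavior in this sketch: the same run records can be grouped by team, environment, user, or project without changing the cost logic.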

For workloads running in the cloud, Unravel provides precise, prescriptive AI-fueled recommendations on more efficient resource usage—usually downsizing requested resources to fewer or less costly alternatives that will still hit SLAs. 

As the company’s technology manager for cloud infrastructure and interoperability says, “Some teams use humongous data, and every year our users are growing.” With such astronomical growth, it has become ever more important to tackle data workload efficiency proactively: everything has simply gotten too big, too complex, and too business-critical to respond reactively. The company has leveraged Unravel’s automated guardrails and governance rules to trigger alerts whenever jobs are using more resources than necessary.
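A guardrail of this kind can be pictured as a simple rule that compares what a job requested with what it actually used. The Python sketch below is a simplified illustration under stated assumptions, not how Unravel defines or evaluates its rules: `JobUsage`, `overprovisioned_alerts`, and the 50% waste threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobUsage:
    """Requested vs. observed memory for one job (hypothetical record)."""
    job_id: str
    requested_memory_gb: float
    peak_memory_gb: float


def overprovisioned_alerts(jobs, max_waste_ratio=0.5):
    """Return one alert message per job whose unused share of requested
    memory exceeds `max_waste_ratio` (an assumed threshold)."""
    alerts = []
    for job in jobs:
        if job.requested_memory_gb <= 0:
            continue  # nothing requested, so nothing to compare against
        waste = 1 - job.peak_memory_gb / job.requested_memory_gb
        if waste > max_waste_ratio:
            alerts.append(f"{job.job_id}: {waste:.0%} of requested memory unused")
    return alerts


jobs = [
    JobUsage("etl_a", requested_memory_gb=64, peak_memory_gb=16),
    JobUsage("etl_b", requested_memory_gb=32, peak_memory_gb=28),
]

alerts = overprovisioned_alerts(jobs)
```

In this example only `etl_a` triggers an alert, because it used a quarter of the memory it asked for, while `etl_b` stays within the threshold.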

Top benefits:

  • Teams can now view their queues at a glance and run their jobs more efficiently, without conflict.
  • Chargeback reports show how teams are using their resources (or not), which helps set the timing for running jobs—during business vs. off hours. This has provided a lot of relief for all application teams.