Whether you are building a lakehouse with Databricks on AWS for data engineering, analytics, BI, and data science or deploying your first cloud data pipeline, Unravel’s AI-powered data observability for Databricks simplifies the challenges of data operations, improves performance, saves critical engineering time, and optimizes cost. Unravel provides AI insights to proactively pinpoint and resolve data pipeline performance issues, ensure data quality, and define automated guardrails for predictable spend.
COST GOVERNANCE
Understand, optimize and actively govern your costs
AI-enabled cost governance identifies where you’re spending more than you have to (and how to fix it), with guardrails to proactively manage costs and prevent budget overruns.
OPTIMIZATION
Optimize for performance and cost before you deploy
Automated AI recommendations eliminate trial-and-error tuning. Unravel cuts to the chase to tell you exactly how to change code or configurations for better performance and cost.
TROUBLESHOOTING
Less effort, more problem-solving—faster, easier
No more spending hours (or days) doing time-consuming manual detective work. Unravel’s automated root cause analysis pinpoints why jobs fail or where pipelines are bottlenecked.
DATA QUALITY
Automatically correlate external data quality checks with AI-driven insights
Unravel integrates data quality check results from other tools, correlates all data details into a workload-aware context, and applies AI analysis for automated insights.
CLOUD MIGRATION
Avoid landmines and setbacks before, during, and after migration
Avoid migration setbacks and cost overruns. Unravel’s deep intelligence and automation enable confident, data-driven decisions before, during, and after your move to the cloud.
EXPLORE KEY FEATURES
See how Unravel’s top features and capabilities work
2-minute demo videos and self-paced guided tours walk you through the “best of” Unravel.
No. Databricks Units (DBUs) are reference units of Databricks Lakehouse Platform capacity used to price and compare data workloads. DBU consumption depends on the underlying compute resources and the data volume processed. Cloud resources such as compute instances and cloud storage are priced separately. Databricks pricing is available for Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). You can estimate costs online for Databricks on AWS, Azure Databricks, and Databricks on Google Cloud, then add estimated cloud compute and storage costs with the AWS Pricing Calculator, the Azure pricing calculator, and the Google Cloud pricing calculator.
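As a rough illustration of how the two separately priced parts add up, the sketch below combines a DBU charge with a cloud compute charge. The function name, parameters, and all rates are hypothetical placeholders rather than published prices, and storage is left out; use the pricing calculators named above for real figures.

```python
def estimate_workload_cost(
    dbus_per_node_hour: float,
    dbu_price_usd: float,
    instance_price_usd_per_hour: float,
    node_count: int,
    hours: float,
) -> dict:
    """Add the DBU charge and the separately billed cloud compute charge."""
    node_hours = node_count * hours
    # DBU charge: DBUs consumed multiplied by the price per DBU.
    dbu_cost = dbus_per_node_hour * node_hours * dbu_price_usd
    # Cloud compute charge: billed by the cloud provider, not included in DBUs.
    cloud_compute_cost = instance_price_usd_per_hour * node_hours
    return {
        "dbu_cost": round(dbu_cost, 2),
        "cloud_compute_cost": round(cloud_compute_cost, 2),
        "total": round(dbu_cost + cloud_compute_cost, 2),
    }


# Placeholder inputs: 4 nodes for 3 hours, with made-up rates.
estimate = estimate_workload_cost(
    dbus_per_node_hour=2.0,
    dbu_price_usd=0.25,
    instance_price_usd_per_hour=0.50,
    node_count=4,
    hours=3,
)
print(estimate)
```

The point of the sketch is only that the DBU line and the cloud compute line are computed independently and then summed.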
Learn more about FinOps for data teams
Cost 360 for Databricks provides trends and chargeback by app, user, department, project, business unit, queue, cluster, or instance. You can see a cost breakdown for Databricks clusters in real time, including related services such as DBUs and VMs for each configured Databricks account on the Databricks Cost Chargeback details tab. In addition, you get a holistic view of your cluster, including resource utilization, chargeback, and instance health, with automated AI-based cluster cost-saving recommendations and suggestions.
Learn more about cost governance
Databricks offers general tips and settings for certain scenarios, for example, auto optimize to compact small files. Unravel provides recommendations, efficiency insights, and tuning suggestions on the Applications page and the Jobs tab. With a single Unravel instance, you can monitor all your clusters, across all instances and workspaces in Databricks, to speed up your applications, improve your resource utilization, and identify and resolve application problems.
Learn more about AI-enabled optimization
Real-time monitoring and alerting with Databricks Overwatch requires a time-series database. Databricks refreshes your billable usage data about every 24 hours, and AWS Cost and Usage Reports are updated once a day in comma-separated values (CSV) format. Because Cost Explorer includes the usage and costs of other services, you should tag your Databricks resources, and you may want to create custom tags for granular reporting on your Databricks cluster resource usage. Unravel simplifies this process with Cost 360 for Databricks, which provides full cost observability, budgeting, forecasting, and optimization in near real time. Cost 360 includes granular details about the user, team, data workload, usage type, data job, data application, compute, and resources consumed to execute each data application. In addition, Cost 360 provides insights and recommendations to optimize clusters and jobs, as well as estimated cost improvements to help prioritize workload optimization.
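To show why tagging matters for granular reporting, here is a minimal sketch that totals cost per tag from a CSV export. The column names (`cluster_id`, `team_tag`, `cost_usd`) and the sample rows are invented for illustration and do not reflect the actual layout of a Databricks usage export or an AWS Cost and Usage Report.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data; a real report would have a different schema.
SAMPLE_REPORT = """cluster_id,team_tag,cost_usd
c-001,analytics,1.50
c-002,analytics,2.50
c-003,ml,3.25
c-004,,0.75
"""


def cost_by_tag(report_csv, tag_column="team_tag", cost_column="cost_usd"):
    """Sum the cost column per tag value; rows with no tag go to 'untagged'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_csv)):
        tag = row[tag_column].strip() or "untagged"
        totals[tag] += float(row[cost_column])
    return {tag: round(total, 2) for tag, total in totals.items()}


chargeback = cost_by_tag(SAMPLE_REPORT)
print(chargeback)
```

Untagged rows end up in a single bucket that cannot be attributed to any team, which is the gap that consistent tagging is meant to close.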
Learn more about cost governance
Data teams spend most of their time preparing data (aggregating, cleansing, deduplicating, synchronizing, and standardizing it, and ensuring its quality, timeliness, and accuracy) rather than delivering insights from analytics. Everybody needs to be working off a “single source of truth” to break down silos, enable collaboration, eliminate finger-pointing, and empower more self-service. Although the goal is to prevent data quality issues, assessing and improving data quality typically begins with monitoring and observability, detecting anomalies, and analyzing the root causes of those anomalies.
Learn more about flexible data quality
Databricks collects monitoring and operational data in the form of logs, metrics, and events for your Databricks job flows. Databricks metrics can be used to detect basic conditions such as idle clusters and nodes or clusters that run out of storage. Troubleshooting slow clusters and failed jobs involves a number of steps such as gathering data and digging into log files. Data application performance tuning, root cause analysis, usage forecasting, and data quality checks require additional tools and data sources. Unravel accelerates the troubleshooting process by creating a data model using metadata from your applications, clusters, resources, users, and configuration settings, then applying predictive analytics and machine learning to provide recommendations and automatically tune your Databricks clusters.
Learn more about automated troubleshooting
Virtual Private Cloud (VPC) peering lets you create a network connection between Databricks clusters and your AWS resources, even across regions, so that you can route traffic between them using private IP addresses. For example, if you are running both an Unravel EC2 instance and a Databricks cluster in the us-east-1 region, but each is configured with a different VPC and subnet, there is no network access between the Unravel EC2 instance and the Databricks cluster by default. To enable network access, you can set up VPC peering to connect Databricks to your Unravel EC2 instance.
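One practical check before setting up peering: AWS does not allow a peering connection between VPCs whose IPv4 CIDR blocks overlap. The sketch below uses Python's standard `ipaddress` module to test two ranges; the CIDR values and variable names are hypothetical examples, not settings taken from an Unravel or Databricks deployment.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Hypothetical address ranges for the two VPCs.
unravel_vpc_cidr = "10.0.0.0/16"
databricks_vpc_cidr = "10.1.0.0/16"

if cidrs_overlap(unravel_vpc_cidr, databricks_vpc_cidr):
    print("CIDR blocks overlap; choose non-overlapping ranges before peering.")
else:
    print("CIDR blocks do not overlap; peering is possible on this point.")
```

This only covers the address-range precondition; route tables and security groups still need to be configured for traffic to flow.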
Unravel provides granular insights, recommendations, and automation before, during, and after your Spark, Hadoop, and data migration to Databricks.
Get granular chargeback and cost optimization for your Databricks workloads. Unravel for Databricks is a complete data observability platform to help you tune, troubleshoot, cost-optimize, and ensure data quality on Databricks. Unravel provides AI-powered recommendations and automated actions to enable intelligent optimization of big data pipelines and applications.
Learn more about cloud migration