Big Data is no longer a side project. Hadoop, Spark, Kafka, and NoSQL systems are steadily becoming part of the core IT fabric in most organizations. From product usage reporting to recommendation engines, organizations now run a wide range of Big Data applications in production to deliver greater value to their customers than ever before.
But Big Data applications are not easy to run, and ongoing operations management presents organizations with never-before-seen challenges. Application developers often complain about applications missing their delivery commitments (SLAs) or failing outright. Meanwhile, Big Data operations teams struggle with everyday tasks such as debugging, scheduling, and allocating resources. These ongoing management and performance challenges make it difficult for organizations to rely on their Big Data investment – let alone profit from it.
Shivnath Babu and I saw that Big Data professionals were spending the majority of their time managing the chaos and complexity of these systems rather than delivering results to the business from their Big Data stack. We also saw that these problems weren't unique to one organization; they were common across companies employing Big Data technology. Complicating matters, the Big Data ecosystem is expanding so rapidly that practitioners are unable to keep up. Lack of expertise is often cited as one of the primary reasons Big Data projects fail or stall.
We believed there had to be a better way to cope with this complexity, one that would let enterprises focus their attention on delivering value quickly from their Big Data stack. So we set out on a mission to radically simplify Big Data operations.
Companies have engineers responsible for Big Data operations. Their job is to keep clusters healthy and Big Data applications performing reliably and efficiently. Today, they rely on a fragmented set of tools, command-line interfaces, and home-grown scripts to keep an eye on their Big Data stack. None of these tools shows the complete picture, which makes it very hard to root-cause and solve problems. To troubleshoot a slowdown in a critical Big Data application, for example, engineers have to examine the entire stack, because the root cause could lie in the application code, configuration settings, data layout, or resource allocation.
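To make this concrete, here is a minimal sketch of the kind of home-grown script such teams often write, polling the YARN ResourceManager REST API (a standard Hadoop endpoint) to flag long-running applications. The hostname and runtime threshold are placeholders, not from any real deployment:

```python
# A minimal sketch of a typical home-grown monitoring script: it polls
# the YARN ResourceManager REST API (a stock Hadoop endpoint) and flags
# applications that have been running longer than a threshold.
# RM_URL and THRESHOLD_MS are placeholders for illustration.
import requests

RM_URL = "http://resourcemanager.example.com:8088"  # placeholder host
THRESHOLD_MS = 2 * 60 * 60 * 1000  # flag apps running longer than 2 hours

def flag_slow_apps():
    # /ws/v1/cluster/apps is part of the standard ResourceManager REST API
    resp = requests.get(f"{RM_URL}/ws/v1/cluster/apps",
                        params={"states": "RUNNING"})
    resp.raise_for_status()
    apps = resp.json().get("apps") or {}
    for app in apps.get("app", []):
        if app["elapsedTime"] > THRESHOLD_MS:
            print(f"SLOW: {app['id']} ({app['name']}) has run "
                  f"{app['elapsedTime'] // 60000} min in queue {app['queue']}")

if __name__ == "__main__":
    flag_slow_apps()
```

A script like this can tell you that an application is slow, but nothing about why: it sees none of the application code, configuration, data layout, or resource-allocation context where the root cause actually lives.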
Therefore, we had to create a management solution that looked at the stack holistically, not at just one part of it. We also had to go beyond charts and dashboards, which most users struggle to make sense of, and provide 'performance intelligence' that simplifies ongoing management and makes data teams more productive.
Creating software like Unravel requires a mix of industry experience and deep scientific knowledge. We have therefore assembled a team of innovators from companies such as Cloudera, Oracle, IBM, and Netflix, along with scientists from Duke University, IIT, MIT, and Stanford. Together, the Unravel team brings the experience in distributed computing and enterprise software that is crucial to solving this major problem our industry faces today.
I am excited to announce that Unravel is already being used in production by several leading web and Fortune 100 companies. We couldn't be happier to see Unravel helping organizations rely on Big Data by ensuring that applications are fast and error-free and that the underlying cluster is utilized to its full potential.
See how Unravel is mission-critical for running Big Data in production here!