Next-gen Cloud OLAP technology can solve the problems of traditional OLAP, helping you achieve speed-of-thought analytics on massive cloud data
The massive volume of data involved in enterprise analysis today has a pronounced adverse effect on the responsiveness of analytics queries. What took a few seconds with gigabytes a few years ago takes minutes or hours with terabytes or petabytes today. These lags are a tremendous creative hindrance to analysts who require “speed of thought” responsiveness.
Regaining an acceptable level of responsiveness with today’s volume of data is beyond the Big Data approach of simply throwing more commodity servers at the problem. Not all of the problems scale linearly, and adding more servers or compute raises costs to infeasible levels. How can we restore speed-of-thought responsiveness without substantial increases in overall cost?
The disciplined answer is to implement the time-honored OLAP methodology of “process-once, access many times”. In this blog, we’ll explore how Kyvos Smart OLAP™ is a rare example of a technology that further dissolves the proverbial business axiom of “fast, good, and cheap … pick two” by covering all three towards the goal of speed-of-thought responsiveness.
Analysis is a Creative Process
Analysis is a process. It’s a sequence of queries, each based on the previous queries. It’s a creative exercise by human analysts using their human intelligence to find strategic opportunities. It goes something like this:
- Where can I best apply marketing funds?
- Which products are showing little or no gain?
- In what regions are these products showing the least gain?
- What do the customers in these regions buy the most?
- Which specific customers seem like good candidates for our products?
This “slice and dice” pattern of inquiry is so innate to our thought process that it underlies the user experience of major analytics tools such as Tableau, MicroStrategy, and Power BI. These tools are generally judged by criteria such as their selection of graphs and ease of use. But if the query results used to render those graphs aren’t almost immediate, the train of thought is derailed. No value is gained by waiting. How many cups of coffee can you get?
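As a rough illustration of that chain of questions, the sketch below runs two of the steps over a tiny in-memory table. The rows, column layout, and the `total_by` helper are all hypothetical; a real BI tool would issue equivalent queries against the warehouse.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, customer, gain)
SALES = [
    ("widget", "east", "acme", 50),
    ("widget", "west", "globex", 40),
    ("gadget", "east", "acme", 5),
    ("gadget", "west", "initech", 1),
    ("gadget", "west", "globex", 2),
]


def total_by(rows, key_index):
    """Sum the gain column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


# Step 1: which product is showing the least gain?
by_product = total_by(SALES, 0)
weakest_product = min(by_product, key=by_product.get)

# Step 2: slice to that product, then dice by region.
product_rows = [row for row in SALES if row[0] == weakest_product]
by_region = total_by(product_rows, 1)
weakest_region = min(by_region, key=by_region.get)
```

Each step depends on the answer to the one before it, which is why a slow answer anywhere in the chain stalls the whole line of inquiry.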
Cloud-Scale Data and the Need for Speed-of-thought Analytics
The answers to analytical questions such as those above are processed from data stored in data warehouses. These databases, especially when cloud-based, can be massive, housing billions to trillions of transactions collected over the years. The analytical queries submitted to these data warehouses usually involve a large number of those transactions each time. Sometimes all of them, over and over.
For each query to a cloud data warehouse, massive amounts of data are moved from cheap storage to expensive compute clusters, which takes a lot of time and adds substantially to your cloud provider bill. Further, each BI query generally involves heavy processing, such as joining many tables and performing expensive calculations.
Data Warehouse and its Shortcomings
In all fairness, most cloud data warehouse platforms such as Snowflake and Azure Synapse implement some sort of caching mechanism. Caching mechanisms preserve the results of processing to be leveraged later. The idea of caching is that if we asked for some set of data once, chances are good we’ll ask for that same data again. When we do, we don’t need to move data from storage to compute and crunch numbers quite as much.
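The caching idea can be sketched in a few lines. This is a simplified illustration, not how Snowflake or Azure Synapse implement their caches: the `ResultCache` class and the `slow_scan` stand-in are hypothetical names, and the cache is keyed on the exact query text.

```python
class ResultCache:
    """Keep the result of each distinct query so repeats skip the scan."""

    def __init__(self, run_query):
        self._run_query = run_query  # the expensive path (hypothetical)
        self._results = {}
        self.hits = 0
        self.misses = 0

    def get(self, query):
        if query in self._results:
            self.hits += 1
            return self._results[query]
        # Not seen before: pay the full cost, then remember the answer.
        self.misses += 1
        result = self._run_query(query)
        self._results[query] = result
        return result


def slow_scan(query):
    """Stand-in for moving data to compute and crunching the numbers."""
    return f"result of {query}"


cache = ResultCache(slow_scan)
cache.get("sales by region")
cache.get("sales by region")   # repeat: served from the cache
cache.get("sales by product")  # new ad-hoc question: full scan again
```

Only the repeated question benefits here. Every new question still takes the slow path.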
The problems are:
- Data not already cached will still need to go through the heavy process of reading the data and crunching those numbers, leading to inconsistent performance.
- Analytics user behavior is based on business progression and insights that drive towards new business models and data products. What is working and not working in their business? Where are the opportunities for growth? These questions lead to ad-hoc analytics where relevant and useful cache is an erratically moving target.
- The rules for deciding what to cache are either far too cumbersome to manage manually or implemented too simplistically when managed automatically.
And that’s precisely where Kyvos Smart OLAP™ comes in. Successfully addressing the above points leads to:
- Orders-of-magnitude performance gains
- Decreased cost from the data warehouse
- Significantly less manual effort required by IT to manage the formidable tasks involved
- Most importantly, a smooth “speed of thought” analytics experience
You Need OLAP on the Cloud. But Nay, Not the Traditional One.
Today, Online Analytical Processing (OLAP) generally refers to “analytics query patterns”. However, OLAP is more than just a query pattern. It is an architecture for accelerating the performance of a data warehouse in the least intrusive manner and presenting the data to end-users as a sensible “multi-dimensional data model”. OLAP is the specific optimization of the slice and dice analytics pattern.
An OLAP system automatically manages the daunting task of smartly pre-aggregating data for users and adjusting what is pre-aggregated based on changing conditions. In other words, results are pre-calculated before the user asks for them. An OLAP system also involves various levels of caching other expensive calculations.
These pre-aggregations and caching schemes promote fast and consistent responsiveness by minimizing on-the-fly and redundant processing. The number of these pre-aggregations can be in the hundreds to thousands. So manual management of these pre-aggregations by IT is infeasible, and automated management requires a thoughtful, sophisticated approach – OLAP.
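A minimal sketch of the pre-aggregation idea follows, assuming a tiny hypothetical fact table with three dimensions. It materializes a total for every combination of dimensions, which is feasible only at toy scale. A production engine would choose a subset, and how Kyvos makes that choice is not described here.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "region", "year")

# Hypothetical fact rows: one per transaction.
FACTS = [
    {"product": "widget", "region": "east", "year": 2023, "amount": 10},
    {"product": "widget", "region": "west", "year": 2023, "amount": 20},
    {"product": "gadget", "region": "east", "year": 2024, "amount": 5},
    {"product": "widget", "region": "east", "year": 2024, "amount": 15},
]


def build_aggregates(facts, dimensions):
    """Process once: total 'amount' for every combination of dimensions."""
    aggregates = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in facts:
                totals[tuple(row[d] for d in dims)] += row["amount"]
            aggregates[dims] = dict(totals)
    return aggregates


def query(aggregates, **filters):
    """Access many times: look up a stored total without scanning facts."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return aggregates[dims].get(key, 0)


AGGREGATES = build_aggregates(FACTS, DIMENSIONS)
total_all = query(AGGREGATES)  # grand total across all rows
widget_east = query(AGGREGATES, product="widget", region="east")
```

After the one-time build, each question is a dictionary lookup regardless of how many fact rows sit behind it. The cost is the extra storage for the aggregates, which doubles with each added dimension in this naive form.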
The OLAP architecture and methodology were laid out decades ago by Dr. E.F. Codd, the same person who laid out the relational model behind workhorse relational databases such as Oracle and SQL Server that we’ve loved for decades. In fact, SQL Server Analysis Services (SSAS) was a great implementation of OLAP in its day, accelerating the responsiveness of data warehouses held on relational databases since 1998.
Unfortunately, SSAS is a product of the old scale-up architecture, meaning SSAS’s capacity grows only through installation on an ever-larger, ever-more expensive server. Its capacity is therefore limited to the performance of one server. The capacity of such a server, barring multi-million dollar supercomputers, was outgrown long ago.
Smart OLAP™: Cloud-Native Technology for Speed and Scale
Kyvos Smart OLAP™ is a cloud-based implementation of OLAP, capable of scaling to cloud-level heights of data volume. It sits between your cloud data warehouse or data lake storage and your analytics tool. Like a turbocharger, it seamlessly accelerates responsiveness for your analysts by orders of magnitude. That acceleration is facilitated by clever pre-aggregations, the hallmark of OLAP, which minimize redundant processing.
Notice that I say analysts in the plural. This is because the performance benefits from the pre-aggregations and caching mechanisms of Kyvos Smart OLAP™ also promote much higher concurrency. Rather than merely a few analysts hitting the data warehouse, dozens to hundreds can be served concurrently through Kyvos. This level of concurrency opens doors to new analytics applications such as embedded BI and assisting with data science tasks such as data profiling.
Kyvos is built on a scale-out architecture for cloud-scale volumes. That means, like all cloud-based applications, it is implemented on clusters that scale “infinitely” through the addition of more commodity servers. It is capable of handling not just a few TB of data, but hundreds of TB and more. Kyvos reduces the overall cost of the BI system by minimizing compute costs associated with the data warehouse through the reduction of redundantly processed data. Expensive compute is traded for cheap storage of the pre-aggregated data.
Lastly, Kyvos’ fixed-fee model results in a much more predictable overall cost for the BI system. The fee model for most cloud data warehouse platforms is based on some form of “pay-as-you-go”, either a charge per query or a charge for very expensive compute consumption.
In either case, the learning and discovery nature of analytics means that querying patterns aren’t very predictable. You try this, then that, and then scrap it all and head in another direction. Should expensive queries be executed many times, there could be a surprise bill from the data warehouse and cloud vendor. Kyvos’ fixed-fee model ensures that unpredictable and compute-intensive querying directly to the data warehouse doesn’t result in such a surprise. You can query Kyvos as much as you want at prices that do not generate alarms from your CFO.
OLAP Continues to Live
OLAP has been an indispensable part of business intelligence for a couple of decades. There is no getting around the sensibility of avoiding redundant processing, especially when it can cost large sums of time and money. That’s true even at relatively small scales of data. New strategic use cases requiring orders of magnitude more data can pop up at any moment. Kyvos’ Smart OLAP™ effectively fills that venerable space in this day of cloud-scale volumes of data.