
 

Aggregate/Smart aggregation

 

Aggregation is the process of summarizing multiple values into a single value. Aggregations can be understood as precalculated summaries of data from leaf cells. Aggregations improve query response time by preparing the answers before the questions are asked.

For example, daily sales data can be aggregated into a value for the week, weekly data into a value for the month, and so on. If a query requests the weekly sales totals for a particular product line, answering it can take a long time when every row in the fact table has to be scanned and summed at query time. But if the summary data needed to answer the query has been precalculated using aggregations, the answer can be looked up directly and the response is much faster.
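The contrast between scanning fact rows and reading a precalculated summary can be sketched in a few lines of Python. This is only an illustration: the rows, the `week_key` helper, and the function names are hypothetical and are not part of any Kyvos API.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, product line, amount).
fact_rows = [
    (date(2024, 1, 1), "Bikes", 100.0),
    (date(2024, 1, 2), "Bikes", 150.0),
    (date(2024, 1, 8), "Bikes", 200.0),
    (date(2024, 1, 9), "Helmets", 40.0),
]


def week_key(d):
    """Return (ISO year, ISO week number) for a date."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def weekly_total_by_scan(rows, product_line, week):
    """Answer the query at query time by scanning every fact row."""
    return sum(
        amount
        for d, line, amount in rows
        if line == product_line and week_key(d) == week
    )


def build_weekly_aggregate(rows):
    """Precalculate weekly totals per product line in a single pass."""
    totals = defaultdict(float)
    for d, line, amount in rows:
        totals[(line, week_key(d))] += amount
    return dict(totals)


# Built once, ahead of time; each later query is a dictionary lookup.
weekly_aggregate = build_weekly_aggregate(fact_rows)
bikes_week_1 = weekly_aggregate[("Bikes", (2024, 1))]
```

Both paths return the same total. The difference is when the work happens: the scan touches every row for each query, while the aggregate pays that cost once during the build.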

The most frequently used aggregation operator is Sum, but there are many other operators, such as Average, First, Last, Minimum, and Maximum.
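As a minimal sketch, the operators named above can be modeled as plain Python functions applied to the same list of leaf values. The `AGGREGATION_OPERATORS` table and the `aggregate` helper are illustrative names, and First and Last are assumed to mean the first and last value in the order given.

```python
from statistics import mean

# Each operator reduces a non-empty, ordered list of values to one value.
AGGREGATION_OPERATORS = {
    "Sum": sum,
    "Average": mean,
    "First": lambda values: values[0],
    "Last": lambda values: values[-1],
    "Minimum": min,
    "Maximum": max,
}


def aggregate(values, operator):
    """Summarize a list of values into a single value with the named operator."""
    if not values:
        raise ValueError("cannot aggregate an empty list of values")
    return AGGREGATION_OPERATORS[operator](values)


# Hypothetical daily sales for one week, in calendar order.
daily_sales = [120, 80, 100, 140, 60]
summary = {name: aggregate(daily_sales, name) for name in AGGREGATION_OPERATORS}
```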

Aggregation strategy

In the world of OLAP, an aggregation strategy includes all the aggregates defined for a cube. Although precalculating every possible aggregation in a cube might give the fastest response time for all queries, doing so requires significant processing time and storage. So there is a tradeoff between storage requirements and the percentage of possible aggregations that are precalculated. If no aggregations are precalculated, the processing time and storage space for a cube are minimized, but query response time will be slow. It is therefore important to define an aggregation strategy that balances cube build time against query response time.
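To get a feel for why precalculating everything is expensive, the sketch below counts the level combinations in a small cube. It assumes, for illustration only, that an aggregate is defined by choosing one level per dimension (or rolling the dimension up entirely to "All"); the dimensions and levels are hypothetical.

```python
from math import prod

# Hypothetical cube design: each dimension with its hierarchy levels.
dimension_levels = {
    "Time": ["Year", "Quarter", "Month", "Day"],
    "Product": ["Category", "Product Line", "SKU"],
    "Geography": ["Country", "Region", "City"],
}


def count_level_combinations(levels_by_dimension):
    """Count the ways to pick one level (or 'All') from every dimension.

    The count includes the leaf-level combination, which is the base data.
    """
    return prod(len(levels) + 1 for levels in levels_by_dimension.values())


combinations = count_level_combinations(dimension_levels)  # 5 * 4 * 4
```

Under this assumption, three small dimensions already yield 80 combinations, and each added dimension multiplies the count, which is why only a subset is normally precalculated.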

Kyvos comes with built-in intelligent algorithms to select aggregations for precalculation, and it can quickly compute some aggregated values from other precalculated values. For example, if the aggregations are precalculated for the Month level of a Time hierarchy, the calculation for a Quarter level requires summing only three monthly values, which can be done quickly on demand. This technique saves processing time and reduces storage requirements, with minimal effect on query response time.
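The Month-to-Quarter case can be sketched as follows. The monthly figures and the `quarter_total` helper are hypothetical, and the sketch assumes an additive measure such as Sum; this is not a description of how Kyvos computes it internally.

```python
# Hypothetical precalculated Month-level totals, keyed by (year, month).
monthly_sales = {
    (2024, 1): 100.0,
    (2024, 2): 120.0,
    (2024, 3): 90.0,
    (2024, 4): 110.0,
    (2024, 5): 130.0,
    (2024, 6): 105.0,
}


def quarter_total(monthly, year, quarter):
    """Derive a Quarter-level total on demand from three Month-level values."""
    first_month = 3 * (quarter - 1) + 1
    return sum(monthly[(year, month)] for month in range(first_month, first_month + 3))


q1_total = quarter_total(monthly_sales, 2024, 1)  # 100 + 120 + 90
q2_total = quarter_total(monthly_sales, 2024, 2)  # 110 + 130 + 105
```

This shortcut works directly for additive operators such as Sum. An operator like Average cannot be rolled up by averaging the monthly averages unless the row counts are also stored.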

Smart aggregation

Kyvos comes with a built-in ML-powered Smart Aggregation engine that uses query history, data profiles, and test builds to identify and recommend an optimal set of aggregates to be built for a given cube design. This ensures that you aggregate only the most frequently used combinations in your data. Further, the system can recommend new aggregations based on changes in your usage patterns.
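The idea of letting query history drive the choice of aggregates can be illustrated with a deliberately simple frequency count. This is a toy heuristic, not the Kyvos Smart Aggregation engine, which the text describes as also using data profiles and test builds; the query log, the `budget` parameter, and `recommend_aggregates` are all hypothetical.

```python
from collections import Counter

# Hypothetical query log: the level combination each past query asked for.
query_history = [
    ("Month", "Product Line"),
    ("Month", "Product Line"),
    ("Week", "Product Line"),
    ("Month", "Product Line"),
    ("Week", "Product Line"),
    ("Day", "SKU"),
]


def recommend_aggregates(history, budget):
    """Return the `budget` most frequently queried level combinations."""
    usage = Counter(history)
    return [combination for combination, _count in usage.most_common(budget)]


# With room for two aggregates, the two most used combinations are chosen.
recommended = recommend_aggregates(query_history, budget=2)
```

Rerunning the same function on a newer query log yields a different recommendation when usage patterns shift, which mirrors the point above about recommending new aggregations over time.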

After the aggregations have been created, if the cube design changes, or if data is added to or changed in the cube's source tables, it is advisable to request fresh recommendations for the cube's aggregations and build the cube again.
