What Is Cardinality?

In databases, cardinality most often describes how unique the data values in a specific column of a table are. The term can refer to two things:

  • Relationship Cardinality: Used when designing a database, it describes how rows in one table relate to rows in another: one-to-one, one-to-many (or many-to-one) or many-to-many.
  • Data Cardinality: The number of distinct values in a column relative to the number of rows in the table. This is the kind that matters most for query performance.

Cardinality is usually described as high or low. A column with many distinct values has high cardinality; a column with many repeated values has low cardinality. Cardinality also affects query performance because it influences the query execution plan.
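As a rough illustration of data cardinality, the sketch below computes the ratio of distinct values to rows for a column. The function name `cardinality_ratio` and the sample columns are hypothetical, and there is no standard threshold that separates "high" from "low"; the ratio is only a convenient way to compare columns.

```python
def cardinality_ratio(values):
    """Return (distinct_count, row_count, ratio) for one column of values."""
    rows = len(values)
    distinct = len(set(values))
    ratio = distinct / rows if rows else 0.0
    return distinct, rows, ratio


# Hypothetical sample columns, 1000 rows each.
user_ids = [f"user-{i}" for i in range(1000)]   # every value is unique
countries = ["US", "IN", "DE", "FR"] * 250      # four values, heavily repeated

high = cardinality_ratio(user_ids)    # (1000, 1000, 1.0)   -> high cardinality
low = cardinality_ratio(countries)    # (4, 1000, 0.004)    -> low cardinality
```

A ratio near 1 means almost every row carries its own value, as with an ID column; a ratio near 0 means a handful of values repeat across many rows.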

How Does Cardinality Impact Query Performance?

When customers visit your e-commerce website, they browse products, fill in subscription forms, select products, make payments and so on. These activities can help business units such as sales, marketing and finance analyze the customer journey and gain insights that help them retain customers. To analyze the whole journey, they need to count events, for instance the distinct cookies set on the website by different customers, or distinct visits. They then compare these metrics year over year or week over week across various dimensions to understand how the business is performing and how it can be improved.

This is where they hit an obstacle. Many large-scale enterprises deal with columns that contain millions of distinct values, and calculating distinct counts at that cardinality is difficult. Some of the problems are:

  • Overall execution time increases as the size of the data increases.
  • Calculating a distinct count requires storing the distinct values seen so far, which needs a lot of memory. Using hashing with compression reduces the memory needed but still increases execution time.
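The memory trade-off described above can be sketched in Python. The first function is an exact distinct count, whose memory grows with the number of distinct values. The second is a K-minimum-values (KMV) estimator, one well-known approximate technique chosen here purely for illustration; the source does not say which algorithm any particular product uses. The function names, the parameter `k` and the sample data are assumptions.

```python
import hashlib
import heapq


def exact_distinct(values):
    """Exact count: memory grows with the number of distinct values."""
    return len(set(values))


def _hash01(value):
    """Map a value to a pseudo-random float in [0, 1] using SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_distinct(values, k=1024):
    """Approximate count that keeps only the k smallest hash values."""
    heap = []        # negated hashes, so heap[0] is the largest kept hash
    kept = set()     # hashes currently in the heap, to skip repeats
    for v in values:
        h = _hash01(v)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return len(heap)          # fewer than k distinct values: exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


# Hypothetical event stream: 100,000 events from 20,000 distinct cookies.
events = [f"cookie-{i % 20000}" for i in range(100000)]
exact = exact_distinct(events)    # 20000
approx = kmv_distinct(events)     # an estimate close to 20000
```

The exact version holds every distinct value in memory, while the estimator holds at most `k` hashes regardless of input size, at the cost of a small error and the extra hashing work per row.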

How Does Kyvos Help in Solving This Issue?

Kyvos has solved this problem for many large-scale enterprises by providing a way to calculate both accurate and approximate distinct counts.

Kyvos enables them to slice and dice these values interactively against any number of dimensions using their existing BI tools. It also lets enterprises create a scalable data model using both accurate and approximate distinct count measures, so they can get insights in seconds, depending on the cardinality.

When new data is added, Kyvos also provides an incremental refresh of the data model, which helps users keep their data and distinct counts up to date.
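One reason incremental refresh matters for distinct counts is that they are not additive: adding a new batch's distinct count to the stored total double-counts any value already seen. The sketch below is a generic illustration of that point, not a description of how Kyvos implements its refresh; the function `refresh_distinct` and the sample IDs are hypothetical.

```python
def refresh_distinct(stored_ids, new_batch):
    """Merge a new batch into the stored set of IDs and return the new count."""
    stored_ids.update(new_batch)
    return len(stored_ids)


stored = {"c1", "c2", "c3"}              # distinct customers seen so far
previous_count = len(stored)             # 3

new_batch = ["c3", "c4", "c4", "c5"]     # c3 was already counted

naive = previous_count + len(set(new_batch))   # 6, counts c3 twice
updated = refresh_distinct(stored, new_batch)  # 5, the correct distinct count
```

Keeping the underlying set (or a mergeable summary of it) is what allows the count to be updated from the new rows alone instead of rescanning all historical data.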

Final Thoughts

Understanding and managing cardinality is crucial for ensuring efficient query performance, especially when dealing with massive datasets. With a platform like Kyvos, organizations can overcome these limitations through advanced processing, scalable modeling and accurate or approximate distinct count calculations, enabling fast, interactive analytics even on billions of rows.
