Apache Pinot: The Columnar OLAP Database Explained
StarTree, Thursday, August 17, 2023
In 1970, E.F. Codd published a paper that started the relational database management system (RDBMS) revolution. It encouraged us to think of data as tables made of rows, with each row made up of the same set of columns. The typical RDBMS implementation stored the data row by row on disk.
In 2005, Michael Stonebraker et al. published the C-Store paper, which catalyzed the columnar database revolution. The paper presented a column-oriented design, which organizes storage by columns, not by rows.
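To make the difference between the two layouts concrete, here is a minimal Python sketch. It is only an illustration, not how Pinot or any particular database stores data; the sample records and the names row_store, column_store, avg_latency_row, and avg_latency_column are hypothetical.

```python
# Hypothetical sample data: three page-view events.
records = [
    {"user": "a", "country": "US", "latency_ms": 120},
    {"user": "b", "country": "DE", "latency_ms": 95},
    {"user": "c", "country": "US", "latency_ms": 210},
]

# Row-oriented layout: all fields of one record are stored together.
row_store = [(r["user"], r["country"], r["latency_ms"]) for r in records]

# Column-oriented layout: all values of one column are stored together.
column_store = {
    "user": [r["user"] for r in records],
    "country": [r["country"] for r in records],
    "latency_ms": [r["latency_ms"] for r in records],
}


def avg_latency_row(store):
    # Walks every full record even though only one field is needed.
    return sum(rec[2] for rec in store) / len(store)


def avg_latency_column(store):
    # Touches only the single column the query asks about.
    values = store["latency_ms"]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(avg_latency_row(row_store))
    print(avg_latency_column(column_store))
```

Both functions return the same average. The difference is in how much data each one has to read: the row version visits every field of every record, while the column version reads one contiguous list. That gap is what makes the columnar layout attractive for analytical queries that aggregate a few columns over many rows.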
Column vs. Row-Oriented Databases
This blog post contrasts row-based and column-based databases: their architectures, performance, data retrieval efficiency, and suitable applications. Most importantly, we'll explore why Apache Pinot is columnar and how it can deliver real-time analytics at scale.