Column Oriented Database
By Levi Brown
Column Oriented Database History & BI
There has been a significant amount of excitement and recent work on column-oriented database systems (“column-stores”). These systems have been shown to perform more than an order of magnitude better than the status quo, which is predominantly row-oriented database systems (“row-stores”), on analytical workloads such as those found in data warehouses, decision support, and business intelligence applications.
All the legacy relational databases offered today were, and still are,
designed primarily to handle online transaction processing (OLTP)
workloads. A transaction typically maps to one or more rows in a
relational database, and all traditional RDBMS designs are based on a
per-row paradigm. A column store, by contrast, only has to read the
attributes that a query actually accesses, which makes it more
I/O-efficient for analytical queries. This simplistic view leads to the assumption that we can obtain
the performance benefits of a column store using a row store: either by vertically partitioning the schema, or by indexing every column so that columns can be accessed independently. In this article we show that this assumption is false. We compare the performance of a row store under a variety of configurations with that of a column store, and show that the row store is significantly slower on a recently proposed data warehouse benchmark. We then analyze the performance difference and show that there are important differences between the two systems at the query executor level (in addition to the obvious differences in the storage layer). Using the column store, we then tease these differences apart, demonstrating the performance impact of a variety of column-oriented query execution techniques, including vectorized query processing, compression, and a new join algorithm that we introduce in this article. We conclude that, although it is not impossible for a row store to achieve some of the performance advantages of a column store, changes must be made to both the storage layer and the query executor to obtain the full advantages of a column-oriented approach.
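To make the storage-layer difference concrete, here is a minimal Python sketch of the two layouts. It is an illustration only, not the implementation of any particular database: the table, its column names, and the helper functions (total_row_store, total_column_store, rle_encode, count_matching) are all hypothetical. It also shows run-length encoding, one common form of column compression, and how a simple query can run on the compressed column directly.

```python
from itertools import groupby

# Hypothetical sales table; names and values are made up for illustration.
rows = [
    ("alice", "DE", 10),
    ("bob", "DE", 20),
    ("carol", "US", 5),
    ("dave", "US", 7),
]


def total_row_store(rows):
    """Row layout: every record is visited whole, even though only
    the third field (amount) is needed."""
    return sum(row[2] for row in rows)


# Column layout: each attribute is stored contiguously on its own.
columns = {
    "customer": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_column_store(columns):
    """Column layout: only the 'amount' column is touched."""
    return sum(columns["amount"])


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def count_matching(rle_column, target):
    """Count rows equal to target without expanding the runs."""
    return sum(length for value, length in rle_column if value == target)


country_rle = rle_encode(columns["country"])
```

Both total functions return the same answer; the difference is how much data each one has to walk through. Run-length encoding pays off most when a column is sorted or has few distinct values, which is why it tends to suit column layouts better than row layouts.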
For transaction-based systems, the row-oriented architecture is well suited to handling incoming data. The end result for column databases is the ability to query and return results against either moderate amounts of data (tens or hundreds of gigabytes) or large amounts of data (one or more terabytes) in much less time than a standard row-oriented RDBMS can.
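A rough back-of-the-envelope calculation shows where the time savings can come from. The sketch below is an assumption-laden estimate, not a benchmark: the column widths, the row count, and the function names are hypothetical, and it ignores compression, indexes, caching, and page overhead. It simply compares the bytes a full scan would read when every column must be fetched with the bytes read when only the queried columns are fetched.

```python
def bytes_scanned_row_store(num_rows, column_widths):
    """A row-store scan reads every column of every row."""
    return num_rows * sum(column_widths.values())


def bytes_scanned_column_store(num_rows, column_widths, needed):
    """A column-store scan reads only the columns the query needs."""
    return num_rows * sum(column_widths[name] for name in needed)


# Hypothetical fixed column widths in bytes, chosen for illustration.
column_widths = {"order_id": 8, "customer": 32, "country": 2, "amount": 8}
num_rows = 1_000_000

row_bytes = bytes_scanned_row_store(num_rows, column_widths)
col_bytes = bytes_scanned_column_store(
    num_rows, column_widths, needed=["country", "amount"]
)
ratio = row_bytes / col_bytes
```

Under these assumed widths, a query that needs only country and amount reads one fifth of the data in the column layout. The gap widens as tables get wider and queries touch fewer columns, which is typical of analytical workloads.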