Explaining OLAP vs OLTP in 2023 with Databricks in mind
2 min readDec 1, 2023
The aim of this page📝 is to explain the concept of databases and their types, based on the particular example of Databricks and OLTP databases.
- Databases are organized collections of data, stored and accessed electronically.
- Databricks is a unified analytics platform that provides a Lakehouse architecture for data engineering, machine learning, and analytics.
- There is an old (some say originally marketing-driven) distinction between OLAP (Online Analytical Processing) and OLTP (Online Transaction Processing) are two types of data processing that serve different purposes.
- OLTP is used for processing database transactions in real-time (i.e. supporting your app)
- OLAP is used for complex data analysis, trend identification, and report generation (i.e. supporting your analytics).
- While there are tools that aim to bridge the gap between OLAP and OLTP, they still have distinct roles in data management.
- Databricks is not typically used as an OLTP system.
- For high-volume, highly transactional scenarios, a specialized OLTP database system would be a more suitable choice.
- There are several databases that are well-suited for OLTP and can efficiently handle real-time transactions.
- Examples of OLTP databases include Microsoft SQL Server, Oracle Database, IBM DB2, MySQL, and PostgreSQL.
- These databases are designed to handle high transaction volumes and are optimized for quick, efficient reads and writes of small amounts of data.
- They can satisfy the transaction ordering and consistency requirements of an OLTP database with ACID (Atomicity, Consistency, Isolation, Durability) compliance.
- NoSQL databases are used to manage large volumes of unstructured or semi-structured data and are often used in high-velocity data processing environments.
- If you are in the position of needing or wanting a NoSQL database for transaction processing, you need to restrict yourself to NoSQL databases with ACID properties.
- Examples of NoSQL databases include MongoDB, Cassandra, Couchbase, and Amazon DynamoDB.
- These databases are designed to handle high transaction volumes and are optimized for quick, efficient reads and writes of small amounts of data.
- They can satisfy the transaction ordering and consistency requirements of an OLTP database with ACID (Atomicity, Consistency, Isolation, Durability) compliance.