Tuesday, January 28, 2014

Dimensional Modeling vs. Relational Modeling

As someone with a background and experience in relational modeling, I often compare what I learn in our course to my existing knowledge of how to model data in a standard way. The following are the main differences between dimensional modeling and relational modeling, according to my learning:

1. There are two distinct groups of people who support the two approaches. There are the “Kimballites,” who believe in Ralph Kimball’s approach in which it is stated that the data warehouse should be modeled using a dimensional model/star schema. Then, there are the “Inmonites,” who follow Bill Inmon's approach in which it is stated that the data warehouse should be modeled using an E-R model/normalized model.

2. In dimensional modeling, data is partitioned into "facts," which generally refer to numeric transaction data, and "dimensions," which provide context to the "facts." For example, a sales transaction can be broken up into "facts" such as the number of products ordered and the price paid for the products, and into "dimensions" such as order date, customer name, product number, order ship-to and bill-to locations, and salesperson responsible for receiving the order.

In relational modeling, however, data is stored as "entities," which have some kind of physical or abstract existence in the real world and hold distinct attributes or properties that define what or who they are. In addition, the "relationships" between the "entities" are modeled to link them together for querying and analysis.

3. Because of the de-normalized form of a dimensional model, it tends to be easier to understand than the relational model. In addition, it reduces the time for data retrieval, since it does not require linking potential hundreds of tables together to run a query. Nonetheless, because the dimensional model is de-normalized, it is more difficult to maintain data integrity and more complicated to load data from different operational systems.

On the other hand, the relational model allows for a straightforward data entry process, but it makes it tougher for users who do not have a precise understanding of the data source and data warehouse structure.

4. With all that said, the two approaches are not mutually exclusive; there are hybrid approaches and dimensional models can be normalized to a certain degree without losing its core advantage of quick processing.

Check out the comparison table below for a few other key differences between the two methods:

Sources:

No comments:

Post a Comment