About the Author
Jim McHugh is the Vice President of National Intelligence Service – Emerging Markets Portfolio. Jim is responsible for the delivery of Analytics and Data Management to the Intelligence Community.
Today, continuing through our look at the advanced modeling techniques of dimensional data warehouses, we will take a look at simple hierarchical dimensions.
Simple Hierarchical Dimensions
Hierarchical dimensions are those dimensions which have a parent/child relationship. In simple hierarchies every child has a parent at the level above with no skipping of levels. A date (day, month, quarter, year) is the most common example of a simple hierarchical dimension and one that is used by all dimensional data warehouses.
The question for the data modeler is how to model the hierarchy? Let’s use the date dimension for some initial examples. The first and most common way to model the date hierarchy is to flatten the date into a single dimension where you have every piece of information you need concerning the date in a single location.
In this flattened dimensional hierarchy, you can see all of the parent, grandparent, great-grandparent, etc. information of the date in a single table. The selecting of a date will provide you with the detailed information about the date (day of week name, if the day is a holiday, etc.), as well as the information about the week, month, quarter and year of the calendar year as well as that of the fiscal year. One simple select will provide you with a large amount of detail.
Snowflaking Hierarchical Dimensions
A different way to model this information would be to snowflake the dimension. As can be seen below.
Here you will obtain the same information but, as I pointed out in a previous post, it will take you several additional joins to obtain the same information you could obtain from a single join when using a totally flat hierarchical dimension.
The snowflake is often used to address the challenge of having dimensions which need to use the date at different parts of the hierarchy. For instance, I want to capture the measures at a monthly or quarterly level, not at a specific date. To address this issue without snowflaking one would simply create new dimensions with the unique data from that point forward. I have modeled the DIM_MONTH and DIM_QUARTER dimensions below.
One would use these dimensions as needed, always using the dimension with the most detailed information for the grain of the fact.
Other examples of simple hierarchical dimensions can be seen using a store dimension. A store can be in a district and a group of districts can encompass a region.
Next week I will discuss advanced dimensional design techniques for ragged hierarchies, hierarchies which can skip one or more levels.
About the Author
Jim McHugh is the Vice President of National Intelligence Service – Emerging Markets Portfolio. Jim is responsible for the delivery of Analytics and Data Management to the Intelligence Community.