Monday, 31 October 2022

What is mean by SCD and Types of SCD?

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records.

SCD Type 0 : 

This refers to attributes that never change and will not be updated in the data warehouse. 
for eg. date of birth or an employee's start date with a company

SCD Type 1 : 

History is not maintained and the table always shows the recent data. This effectively means that such dimension table is always updated with recent data whenever there is a change, and because of this update, we lose the previous values

SCD Type 2 : 

A type 2 dimension table tracks the historical changes by creating separate rows in the table with different surrogate keys. 

SCD Type 3 : 

A type 3 dimension stored the history in a separate column instead of separate rows. So unlike a type 2 dimension which is vertically growing, a type 3 dimension is horizontally growing. 

SCD Type 4 : 

Provides a solution to handle the rapid changes in the dimension tables. The concept lies in creating a junk dimension or a small dimension table with all the possible values of the rapid growing attributes of the dimension.

SCD Type 6 : 

It is a hybrid of type 1, 2 and 3 (1+2+3) which acts very similar to type 2, but only you add one extra column to denote which record is the current record.



No comments:

Post a Comment

How to perform regression testing in ETL?

Regression testing in ETL (Extract, Transform, Load) ensures that changes or updates to the ETL process do not negatively impact existing fu...