slowly changing dimensions

less than 1 minute read

It is for dimension tables where changes are less in source rdbms which we want to get into datawarehouse or hdfs

Type of SCD

SCD Type 1

History is not maintained, data is overwritten.

SCD Type 2

Maintain history. It can be done in below three ways, lets take emaple of product data. All products are stored in products table then:

  • version: maintain multiple version (increasing int value) of same product latest version is the latest details of the product

  • flag: keep flag representing latest details of the product

  • effective date: maintain start date and end date, this way we can keep track of the product details over period of time.

Comments