Tuesday, 22 January 2013

Handling Inferred Members in SSIS


In datwarehouse there are times ,when the dimensional data arrives late than the fact transactional data . The problem these records create is, during the processing of the fact table we cant get the dimension surrogate keys from the dimension table as they are not yet present in the dimension table .Therefore during fact table processing ,if we exempt these transactions ,than our totals will be wrong as some transactions are lost .

The solution to this problem is to insert inferred rows in to the dimension table for the transactional records which are not present in the dimension .For these inferred rows you need to insert the related natural keys from transactional data into the dimension,than set the inferred member indicator  column value to true and set the remaining attributes to unknown value in the dimension  with leaving metadata columns as it is .

After performing the insertion of inferred members into dimension as described previously,than load the fact tables ,so that when you perform the natural key lookup with the dimension table natural key you will not loose those transactions as those natural keys are already loaded onto the dimension table inferred records prior to loading fact table

At this stage we need to note that, even though the rows are included in the fact table they are not truly associated with the actual dimensional data , we need to create an etl logic to process dimension, so that when the actual row appears from the dimension table source ,it will update those inferred records with the actual attributes from the dimension source

SSIS implementation of inferred members

I'm using only two  dimensions employee , date and one fact table for this demo.



The  Dimensional source data looks like below



The transactional source data looks like below



As you can observe that the transactional data contains two records with employee id 16,17 which haven't got the relevant employee id in the dimensional data

I have described the implementation of the  ssis package later in  the article ,but to start the demo ,When you run the package with source  data  as shown above,the employee dimension looks like below ,as you can see the package  inserted two new records for the employeekeys from transactional data  into the dimension table and set the inferred members indicator to true (i.e 1) and  also inserted unknown value into the employee name attribute of the employee dimension ,which is the only single other attribute we have included a part from natural key and metadata columns.



In the next run  i will insert two more records into the  dimensional source which correspond to the previous inferred rows 16,17   of the trasancational data  as shown below



Now after i run the package ,the employee dimension updates inferred rows with employeeid 16,17 with there actual dimensional data and updates its inferred indicator to false as shown below



The developed SSIS Package for this demo looks like below



The package starts with truncating a stage table which contains the employeeid values of facttable as these values are needed to insert the inferred members int to the dimension ,this staging table is loaded before loading the fact data as highlighted in the below image



After truncating the staging table ,we load the dimension ,for this dimension i have used scd component for updating the inferred members and added few metadata columns as shown below



In the next step we load only column empid into the staging stable which we will use in the next for inserting inferred records into the dimension ,the staging load is shown below



In the next step we use a execute sql task to insert inferred records as shown below



And finally we will load the fact table .

No comments:

Post a Comment

Note: only a member of this blog may post a comment.

Power BI Report Server

Power BI Report Server  is a new product which can be used to host both Power BI reports and SSRS reports. Currently beta version is ava...