The solution to this problem is to insert inferred rows in to the dimension table for the transactional records which are not present in the dimension .For these inferred rows you need to insert the related natural keys from transactional data into the dimension,than set the inferred member indicator column value to true and set the remaining attributes to unknown value in the dimension with leaving metadata columns as it is .
After performing the insertion of inferred members into dimension as described previously,than load the fact tables ,so that when you perform the natural key lookup with the dimension table natural key you will not loose those transactions as those natural keys are already loaded onto the dimension table inferred records prior to loading fact table
At this stage we need to note that, even though the rows are included in the fact table they are not truly associated with the actual dimensional data , we need to create an etl logic to process dimension, so that when the actual row appears from the dimension table source ,it will update those inferred records with the actual attributes from the dimension source
SSIS implementation of inferred members
I'm using only two dimensions employee , date and one fact table for this demo.
The Dimensional source data looks like below
The transactional source data looks like below
As you can observe that the transactional data contains two records with employee id 16,17 which haven't got the relevant employee id in the dimensional data
I have described the implementation of the ssis package later in the article ,but to start the demo ,When you run the package with source data as shown above,the employee dimension looks like below ,as you can see the package inserted two new records for the employeekeys from transactional data into the dimension table and set the inferred members indicator to true (i.e 1) and also inserted unknown value into the employee name attribute of the employee dimension ,which is the only single other attribute we have included a part from natural key and metadata columns.
In the next run i will insert two more records into the dimensional source which correspond to the previous inferred rows 16,17 of the trasancational data as shown below
Now after i run the package ,the employee dimension updates inferred rows with employeeid 16,17 with there actual dimensional data and updates its inferred indicator to false as shown below
The developed SSIS Package for this demo looks like below
The package starts with truncating a stage table which contains the employeeid values of facttable as these values are needed to insert the inferred members int to the dimension ,this staging table is loaded before loading the fact data as highlighted in the below image
After truncating the staging table ,we load the dimension ,for this dimension i have used scd component for updating the inferred members and added few metadata columns as shown below
In the next step we load only column empid into the staging stable which we will use in the next for inserting inferred records into the dimension ,the staging load is shown below
In the next step we use a execute sql task to insert inferred records as shown below
And finally we will load the fact table .
No comments:
Post a Comment
Note: only a member of this blog may post a comment.