Friday, 1 February 2013

Implementing Data Alerts using SQL Server Reporting Services 2012 in SharePoint Integrated Mode


SQL Server Reporting Services 2012 has some great new features. Data Alerts is one of them: a data-driven alerting solution which sends important and interesting report data according to the conditions and schedule we specify. This feature enables us to create useful business alerts using a combination of operators. Another major change is that Reporting Services in SharePoint integrated mode is now a SharePoint shared service. This shared service architecture allows Reporting Services to be configured from the SharePoint Central Administration site, and also to take advantage of SharePoint's scalability features.

Before diving into the demo, I want to share a common error which I encountered when implementing a SharePoint farm with SQL Server 2012; it is related to SP1. If you haven't installed SP1 for SharePoint 2010, you will encounter the error shown below.




To start the demo:

The first step is to create a new web application. Go to Central Administration of the SharePoint Server 2010, choose Application Management, click Manage web applications and create a web application. In the next step, go to Application Management again, go to the Site Collections section and create a site collection with the Business Intelligence Center template selected, as shown below.


This creates the top-level website. In the next step we will add a Reporting Services service application: go to Central Administration, Application Management, click Manage service applications and create a Reporting Services service application, as shown below.




In the next step, go to the site collection and open Site Settings to enable all the Reporting site collection features, and also the SharePoint Server Enterprise features, as shown below.



Also in Site Settings, go to Site Actions, Site Features and enable the Report Server File Sync feature.

In the next step, go to Site Actions, More Options and select Report Library.


In the next step, go to the newly created report library, open Library Settings and click the Add from existing site content types link, which can be found on the same page, and select the Report Builder Model, Report Builder Report and Report Data Source content types to add to our library.


In the next step, on the same report library page, go to Documents, New Document and click Report Data Source; provide the connection string properties and, for credentials, use stored credentials, as this data alerts demo requires stored credentials.



In the next step, on the report library page, click New Document and select Report Builder Report to create a new report. You will be prompted with a ClickOnce deployment of Report Builder; permissions need to be granted in the site settings to enable Report Builder to connect to the site and use the contents of the site library. In Report Builder, click New Report and select the data source from the SharePoint report library which we created in the earlier step.





In the next step, create a new dataset, select the required fields from the existing tables, click Auto Detect Relationships and then edit the relationships, as in most cases auto detect cannot detect the complex relationships.



The report developed in this demo looks like the one below; it is a simple report using AdventureWorksDW2012 as the data source.





Deploy the report to the report library, then open the report and, from the actions menu of the report, select New Data Alert as shown below.






In the New Data Alert window, create the condition for the data alert as shown below.






As you can see in the diagram above, I have created a data alert to send an email when the condition is met: a product category of Bikes, a sales territory of Canada and a sales value greater than 600,023. In the next step, go to the report, click on its dropdown list and select Manage Data Alerts.
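Conceptually, the alert condition behaves like a row filter over the report data; here is a minimal Python sketch of that idea (the field names are assumed for illustration and are not the actual SSRS internals):

```python
# Sample report rows (illustrative data only).
rows = [
    {"category": "Bikes",    "territory": "Canada", "sales": 700000.0},
    {"category": "Bikes",    "territory": "Canada", "sales": 500000.0},
    {"category": "Clothing", "territory": "Canada", "sales": 900000.0},
]

def alert_matches(row):
    # Condition from the demo: Bikes AND Canada AND sales > 600,023.
    return (row["category"] == "Bikes"
            and row["territory"] == "Canada"
            and row["sales"] > 600023)

# Rows matching the condition are what the alert email would report.
matching = [r for r in rows if alert_matches(r)]
```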




The alert created is shown below


In the next step, right-click on the data alert and run it for testing. If the data alert runs successfully, it shows when the data alert was run, and it sends the alert if the required condition is met, as shown below.






The data alert was sent successfully, and the sent mail looks like the one below; it shows the condition which we set for the data alert and the data filtered by that condition, as marked in the diagram below.























Tuesday, 22 January 2013

Handling Inferred Members in SSIS


In a data warehouse there are times when dimensional data arrives later than the fact (transactional) data. The problem these records create is that, during processing of the fact table, we cannot get the dimension surrogate keys from the dimension table, as they are not yet present there. If we simply exclude these transactions during fact table processing, our totals will be wrong, as some transactions are lost.

The solution to this problem is to insert inferred rows into the dimension table for the transactional records which are not yet present in the dimension. For these inferred rows, insert the related natural keys from the transactional data into the dimension, set the inferred member indicator column to true, and set the remaining attributes to an unknown value, leaving the metadata columns as they are.

After inserting the inferred members into the dimension as described above, load the fact tables. When you perform the natural key lookup against the dimension table, you will not lose those transactions, because their natural keys were already loaded into the dimension as inferred records prior to loading the fact table.

At this stage we need to note that, even though these rows are included in the fact table, they are not yet truly associated with the actual dimensional data. We need to build ETL logic into the dimension processing so that, when the actual row arrives from the dimension source, it updates those inferred records with the actual attributes.

SSIS implementation of inferred members

I'm using only two dimensions, Employee and Date, and one fact table for this demo.



The  Dimensional source data looks like below



The transactional source data looks like below



As you can observe, the transactional data contains two records with employee IDs 16 and 17 which have no corresponding employee ID in the dimensional data.

I have described the implementation of the SSIS package later in the article, but to start the demo: when you run the package with the source data shown above, the employee dimension looks like below. As you can see, the package inserted two new records for the employee keys from the transactional data into the dimension table, set the inferred member indicator to true (i.e. 1), and inserted an unknown value into the employee name attribute of the employee dimension, which is the only other attribute we have included apart from the natural key and metadata columns.



In the next run, I will insert two more records into the dimensional source which correspond to the previous inferred rows 16 and 17 of the transactional data, as shown below.



Now, after I run the package, the employee dimension updates the inferred rows with employee IDs 16 and 17 with their actual dimensional data and sets their inferred indicator to false, as shown below.



The developed SSIS Package for this demo looks like below



The package starts by truncating a stage table which holds the EmployeeID values of the fact table, as these values are needed to insert the inferred members into the dimension. This staging table is loaded before loading the fact data, as highlighted in the image below.



After truncating the staging table, we load the dimension. For this dimension I have used the SCD component to update the inferred members, and I have added a few metadata columns, as shown below.



In the next step, we load only the EmpID column into the staging table, which we will use in the next step for inserting inferred records into the dimension. The staging load is shown below.



In the next step, we use an Execute SQL Task to insert the inferred records, as shown below.
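The kind of statement such an Execute SQL Task runs can be sketched as follows with an in-memory SQLite database (schema names assumed for illustration): insert a placeholder row for every staged business key that is missing from the dimension.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DimEmployee (EmployeeID INTEGER, EmployeeName TEXT, IsInferred INTEGER);
INSERT INTO DimEmployee VALUES (15, 'Existing Employee', 0);
CREATE TABLE StageEmployeeIDs (EmployeeID INTEGER);
INSERT INTO StageEmployeeIDs VALUES (16), (17);
""")

# Insert an inferred (placeholder) member for each staged key not yet in the
# dimension: natural key from the transaction, 'Unknown' attributes, flag set.
con.execute("""
INSERT INTO DimEmployee (EmployeeID, EmployeeName, IsInferred)
SELECT DISTINCT s.EmployeeID, 'Unknown', 1
FROM StageEmployeeIDs s
WHERE NOT EXISTS (SELECT 1 FROM DimEmployee d WHERE d.EmployeeID = s.EmployeeID)
""")

inferred = con.execute(
    "SELECT EmployeeID FROM DimEmployee WHERE IsInferred = 1 ORDER BY EmployeeID"
).fetchall()
```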



And finally we will load the fact table .

Monday, 21 January 2013

Implementing Hybrid Slowly Changing Dimension using SSIS

A hybrid slowly changing dimension is a specific type of SCD which implements both Type 1 and Type 2 in the same dimension.


Business users in certain domains have a requirement to analyse the data both in its current version and from a historic perspective in order to gain useful insights. As we know, a Type 1 SCD supports analysing the data in its current version, and a Type 2 SCD enables analysing the complete history of the data.

To be specific, consider the customer data shown below.



I will use the customer Jimmy Adams to show a change of occupation in our demo. Currently, Jimmy Adams is from Cambridge, he is a customer of our imaginary company, and his current occupation is student, as shown above.

Let's imagine that Jimmy Adams starts working and his occupation changes to working, as shown below.




Our requirement is to analyse the data both by the current status and by the complete employment status history, which is essentially Type 1 SCD and Type 2 SCD together, as shown below.




I will be implementing this requirement with SSIS 2012. I have created an SSIS package; the control flow of the package looks like below.




It contains a Foreach Loop container which loops through the text files in a folder and maps each file's location to a variable. I will be using that variable in an expression on my customer source connection string in the data flow, so that it can extract all the existing .txt files in that folder.

The Foreach Loop container collection settings are shown below.



The Foreach Loop variable mapping is shown below.



After these settings in the control flow, we go to the data flow, create a flat file source and set the connection string of the flat file adapter using an expression in the Properties window. The expression maps to our Foreach Loop variable and gets the file location, as shown below.
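As a sketch, the property expression looks something like the following; the variable name is whatever you mapped in the Foreach Loop, and `User::CustomerFileLocation` here is assumed purely for illustration:

```
ConnectionString = @[User::CustomerFileLocation]
```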



To keep it simple, I'm using the built-in Slowly Changing Dimension component; I will explain later how to configure it for our scenario.

The data flow starts by extracting the customer files; then we add a Copy Column transformation to create another column to meet our Type 1 and Type 2 requirement, as we need an extra column to hold the employment status history. Our data flow looks like below.



We will configure the columns, including the newly created employment status column called Employment Status History, in the SCD wizard. Double-click the SCD component, connect to our destination customer dimension, map the source columns to the customer dimension columns and select the business key, as shown below.



We are not mapping the Start Date, End Date and Status columns, as they are metadata columns. Our customer dimension looks like below; please ignore the data in the diagram at this stage, I will explain it at the end.



After the previous settings in the SCD wizard, click Next and configure the attribute settings. At this stage you need to determine which columns are of which type: in my case, except for the Employment Status Previous column, all the columns are set to changing attributes, which are Type 1, and the Employment Status Previous column, which we created earlier with the Copy Column transformation, is set to a historical attribute to make it Type 2. The settings are shown in the image below.




After the previous setting, click Next in the wizard and configure the changing attributes as shown below. By ticking the check box on this page, we are telling the SCD component to update all the changing attributes (Type 1) to the current latest value.



Click Next in the wizard and make the settings for our historical attribute; the Employment Status Previous column is the only column we mapped as a historical attribute. We will be using the metadata columns Start Date and End Date together with the system time variable to manage the Type 2 changes; I will not go through the details, as there are a lot of articles on the net which explain them. The advantage of the Start Date and End Date timestamps is that they allow us to query the employment status at any point in time. The image below shows the settings we described.



Click Next and accept the default settings for the inferred members configuration, as we are not implementing any inferred members, and finally finish the wizard configuration.
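The combined behaviour the wizard configures can be sketched with an in-memory SQLite database. The column names are assumed for illustration and the SCD component's internals differ, but the net effect on the rows is the same: a change in the Type 2 column expires the current row and inserts a new one, while the Type 1 column is overwritten on every row for that customer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DimCustomer (CustomerID INTEGER, Name TEXT,
                          EmploymentStatusCurrent TEXT,  -- Type 1 (always latest)
                          EmploymentStatusHistory TEXT,  -- Type 2 (per version)
                          StartDate TEXT, EndDate TEXT);
INSERT INTO DimCustomer VALUES
    (1, 'Jimmy Adams', 'Student', 'Student', '2013-01-01', NULL);
""")

def apply_status_change(con, cust_id, name, new_status, change_date):
    # Type 2: expire the current row, then insert a new current row.
    con.execute("UPDATE DimCustomer SET EndDate = ? "
                "WHERE CustomerID = ? AND EndDate IS NULL", (change_date, cust_id))
    con.execute("INSERT INTO DimCustomer VALUES (?, ?, ?, ?, ?, NULL)",
                (cust_id, name, new_status, new_status, change_date))
    # Type 1: overwrite the current-status column on every row for this customer.
    con.execute("UPDATE DimCustomer SET EmploymentStatusCurrent = ? "
                "WHERE CustomerID = ?", (new_status, cust_id))

apply_status_change(con, 1, 'Jimmy Adams', 'Working', '2013-06-01')
rows = con.execute("""
SELECT EmploymentStatusCurrent, EmploymentStatusHistory, EndDate
FROM DimCustomer ORDER BY StartDate
""").fetchall()
```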

After we run the package for the first time with the source data shown earlier, our dimension will be loaded with the customer attributes as shown below. Currently the current and previous employment statuses are the same, as we haven't had any changes yet.



We will now insert a new record for customer Jimmy Adams with the new employment status of working, as shown below.



Now we will run the package again; it will insert a new record and update the end date of the previous record for Jimmy Adams, as shown below.



As shown above, the current employment status of Jimmy Adams is updated to the current value, which in our case is the updated working status, and the previous status column contains all the values, including both student and working; the expiry of the previous record is shown in the End Date column.

Saturday, 19 January 2013

Implementing Slowly Changing Dimension Type 3 (SCD 3 ) with SSIS

This article describes Slowly Changing Dimension Type 3, its uses and its implementation using SSIS.


Slowly Changing Dimension 3:

Sometimes in business, a customer's regional grouping changes from one region to another over time, and there is a requirement to analyse the complete data both by the new region and by the old region. SCD Type 3 makes this possible.


Imagine a customer who belonged to the West region group and has been moved to the Northwest group over time, and we want to analyse all the sales data by the West group as well as the Northwest group.


Type 1 doesn't meet the requirement, as it overwrites all the old values with the new values; in this case all the values would be updated from West to Northwest for that customer.


Type 2 doesn't meet the requirement either, as it associates all the old values with the West region and only the new values with the Northwest.


The customer source data which I have prepared for this demo contains customer information along with their region information, and it looks like this:

The fact source data looks like this:

I have created three separate packages: one for loading the customer dimension, another for loading fact sales, and a master package. The date dimension is a standard date dimension taken from the AdventureWorksDW2012 database.


The customer dimension control flow looks like this

The customer dimension Data flow looks like this

The data flow starts by extracting the data from the customer source. The extracted records are referenced against the customer dimension to check for new customers, and for existing customers we pick up the current region and previous region fields.


The column mapping for new customers looks like the image below; as you can notice, we don't map the previous region, as there will be no previous region for new customers.

In the next step, I'm using a Conditional Split to check for a change of region, as shown below.

At this stage, I'm collecting the regions which have changed into a staging table and updating the dimension using an Execute SQL Task, as shown below.
As you can see in the query, what I'm doing is updating the current region with the new region and the previous region with the current region. After I run the master package, the customer dimension and fact sales are loaded with the initial region values, as shown below; at this stage we have only current regions.
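That update can be sketched with an in-memory SQLite database (table and column names assumed for illustration): the current region is shifted into the previous-region column at the same time as the new value is taken. In both SQLite and T-SQL, the right-hand sides of the SET clauses read the pre-update values, so the assignments don't clobber each other.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DimCustomer (CustomerID INTEGER, Name TEXT,
                          CurrentRegion TEXT, PreviousRegion TEXT);
INSERT INTO DimCustomer VALUES (10, 'Tony', 'West', NULL);
CREATE TABLE StageRegionChanges (CustomerID INTEGER, NewRegion TEXT);
INSERT INTO StageRegionChanges VALUES (10, 'Northwest');
""")

# SCD Type 3: previous region takes the old current value, current region
# takes the staged new value, in one statement.
con.execute("""
UPDATE DimCustomer
SET PreviousRegion = CurrentRegion,
    CurrentRegion = (SELECT s.NewRegion FROM StageRegionChanges s
                     WHERE s.CustomerID = DimCustomer.CustomerID)
WHERE CustomerID IN (SELECT CustomerID FROM StageRegionChanges)
""")

row = con.execute(
    "SELECT CurrentRegion, PreviousRegion FROM DimCustomer WHERE CustomerID = 10"
).fetchone()
```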


I'm inserting a new customer record and its two associated sales records, for which there is a change of region, as shown below.


The customer dimension and fact sales after the insertion of the new records are shown below.


As you can see, the customer's region changed from West to Northwest, along with the associated fact records for the customer Tony; now we can analyse the data both with the new and the old region.


One key point about SCD Type 3 is that the history is overwritten from the second change onwards, as we are holding only one column for previous history. If you need to analyse data across more changes, then you need to include more columns and change the ETL logic accordingly.
