Skip to main content

Slowly changing dimensions

Slowly changing dimensions in MSTR

Slowly changing dimensions (SCDs) are a common characteristic in many business intelligence environments. Usually, dimensional hierarchies are presented as independent of time. For example, a company may annually reorganize their sales organization or recast their product hierarchy for each retail season. “Slowly” typically means after several months or even years. Indeed, if dimensional relationships change more frequently, it may be better to model separate dimensions.
SCDs are well documented in the data warehousing literature. Ralph Kimball has been particularly influential in describing dimensional modeling techniques for SCDs (see The Data Warehouse Toolkit, for instance). Kimball has further coined different distinctions among ways to handle SCDs in a dimensional model. For example, a Type I SCD presents only the current view of a dimensional relationship, a Type II SCD preserves the history of a dimensional relationship, and so forth.
The discussion below is based on an example sales organization that changes slowly in time as the territories are reorganized; for example, sales representatives switch districts in time.

As-is vs. as-was analysis

One of the capabilities available with slowly changing dimensions is the ability to perform either “as-is” analysis or “as-was” analysis:
“As-is” analysis presents a current view of the slowly changing relationships. For example, you can display sales by District according to the way Districts are organized today.
“As-was” analysis presents a historical view of the slowly changing relationships. For example, you can display sales by District according to the way Districts were organized at the time the sales transactions occurred.
The techniques described here provide the flexibility to perform either type of analysis. They also provide you an easy way to specify which type of analysis you would like to perform.

Example 1: Compound key with Effective Date and End Date

One way to physically store an SCD is to employ Effective Date and End Date columns that capture the period of time during which each element relationship existed. In the example below, Sales Rep Jones moved from District 37 to District 39 on 1/1/2004, and Kelly moved from District 38 to 39 on 7/1/2004.
For information on compound keys, please refer to Lookup tables: Attribute storage.
LU_SALES_REP
Sales_Rep_ID
Sales_Rep_Name
District_ID
Eff_Dt
End_Dt
1
Jones
37
1/1/1900
12/31/2003
2
Smith
37
1/1/1900
12/31/2099
3
Kelly
38
1/1/1900
6/30/2004
4
Madison
38
1/1/1900
12/31/2099
1
Jones
39
1/1/2004
12/31/2099
3
Kelly
39
7/1/2004
12/31/2099
When using this type of dimensional lookup table, the fact table must include a date field, such as a transaction date.
FACT_TABLE
Sales_Rep_ID
Trans_Dt
Sales
1
9/1/2003
100
2
9/10/2003
200
3
9/15/2003
150
1
3/1/2004
200
2
3/10/2004
250
3
3/15/2004
300
2
9/5/2004
125
3
9/15/2004
275
4
9/20/2004
150

To specify the MicroStrategy schema

1Create a logical view to represent just the current District-Sales Rep relationships.
LVW_CURRENT_ORG
select Sales_Rep_ID, District_ID
from LU_SALES_REP
where End_Dt = '12/31/2099'
2Create another logical view that performs the “as-was” join between the lookup table and fact table, resulting in a fact view at the District level.
The resulting view is an “as-was” or historical view, which captures the Sales Rep-District relationships that existed at the time the transactions occurred.
LVW_HIST_DISTRICT_SALES
select District_ID, Trans_Dt, sum(sales)
sales 
from LU_SALES_REP L
join FACT_TABLE F
on(L.Sales_Rep_ID = F.Sales_Rep_ID)
where F.Trans_Dt between L.Eff_Dt and
L.End_Dt
group by District_ID, Trans_Dt
3Create a table alias LU_CURRENT_DISTRICT for LU_DISTRICT.
4Define the following attributes:
Sales Rep:
@ID = sales_rep_id; @Desc = sales_rep_name
Tables: LU_SALES_REP (lookup), LVW_CURRENT_ORG, FACT_TABLE
Current District:
@ID = district_id; @Desc = district_name
Tables: LU_CURRENT_DISTRICT (lookup), LVW_CURRENT_ORG
Child: Sales Rep
Historical District:
@ID = district_id; @Desc = district_name
Tables: LU_DISTRICT (lookup), LU_SALES_REP, LVW_HIST_DISTRICT_SALES
Child: Sales Rep
Date:
@ID = date_id, trans_dt
Tables: LU_TIME (lookup) , FACT_TABLE, LVW_HIST_DISTRICT_SALES
Month:
@ID = MONTH_ID
Tables: LU_TIME (lookup)
5Define the Sales fact:
Expression: sales
Tables: FACT_TABLE, LVW_HIST_DISTRICT_SALES
6Define the metric as required:
Sales: SUM(sales)
The result of this is a logical schema that looks like the following:

As-was analysis

Specify the “as-was” analysis by using the Historical District attribute on reports:
Report definition: Historical District, Month, Sales
Resulting SQL
Select a11.District_ID District_ID,
max(a13.District_Name) District_Name,
a12.Month_ID Month_ID,
sum(a11.SALES) WJXBFS1
From (select District_ID, Trans_dt,sum(sales) sales
from LU_SALES_REP L
join FACT_TABLE F
on (L.Sales_rep_ID = F.Sales_rep_ID)
where F.trans_dt between L.EFF_DT and
L.END_DT
group by District_ID, Trans_dt)
a11
join LU_TIME a12
on (a11.Trans_dt = a12.Date_ID)
join LU_DISTRICT a13
on (a11.District_ID = a13.District_ID)
group by a11.Distrcit_ID,
a12.Month_ID
Report results

As-is analysis

Specify the “as-is” analysis by using the Current District attribute on reports:
Report definition: Current District, Month, Sales
Resulting SQL
select a12.District_ID District_ID,
max (a14.District_Name) District_Name,
a13.Month_ID Month_ID,
sum(a11.SALES) WJXBFS1
from FACT_TABLE a11
join (select Sales_rep_ID, District_ID
from LU_SALES_REP
where END_DT = '12/31/2099')a12
on (a11.Sales_Rep_ID =
a12.Sales_Rep_ID)
join LU_TIME a13
on (a11.Trans_dt = a13.Date_ID)
join LU_DISTRICT a14
on (a12.District_ID = a14.District_ID)
group by a12.District_ID,
a13.Month_ID
Report result

Example 2: New surrogate key for each changing element

A more flexible way to physically store a SCD is to employ surrogate keys and introduce new rows in the dimension table whenever a dimensional relationship changes. Another common characteristic is to include an indicator field that identifies the current relationship records. An example set of records is shown below.
LU_SALES_REP
Sales_Rep_CD
Sales_Rep_ID
Sales_Rep_Name
District_ID
Current_Flag
1
1
Jones
37
0
2
2
Smith
37
1
3
3
Kelly
38
0
4
4
Madison
38
1
5
1
Jones
39
1
6
3
Kelly
39
1
When using this type of dimensional lookup table, the fact table must also include the surrogate key. A transaction date field may or may not exist.
FACT_TABLE
Sale-Rep_CD
Sale
1
100
2
200
3
150
5
200
2
250
3
300
2
125
6
275
4
150

Specifying the MicroStrategy schema

1Create a logical view to represent just the current District-Sales Rep relationship.
LVW_CURRENT_ORG
select Sales_rep_ID, District_ID
from LU_SALES_REP
where Current_flag = 1
2Create a table alias LU_CURRENT_DISTRICT for LU_DISTRICT.
3Define the following attributes:
Sales Rep Surrogate:
@ID = sales_rep_cd
Tables: LU_SALES_REP (lookup), FACT_TABLE
Sales Rep:
@ID = sales_rep_id; @Desc = sales_rep_name
Tables: LU_SALES_REP (lookup), LVW_CURRENT_ORG
Child: Sales Rep Surrogate
Current District:
@ID = district_id; @Desc = district_name
Tables: LU_CURRENT_DISTRICT (lookup), LVW_CURRENT_ORG
Child: Sales Rep
Historical District:
@ID = district_id; @Desc = district_name
Tables: LU_DISTRICT (lookup), LU_SALES_REP
Child: Sales Rep
Date:
@ID = date_id, trans_dt
Tables: LU_TIME (lookup), FACT_TABLE
Month:
@ID = MONTH_ID
Tables: LU_TIME (lookup)
Child: Date
4Define the Sales fact:
Expression: sales
Tables: FACT_TABLE, LVW_HIST_DISTRICT_SALES
5Define the metric as required:
Sales: SUM(sales)
The result is a logical schema as follows:

As-was analysis

Specify the “as-was” analysis by using the Historical District attribute on reports:
Report definition: Historical District, Month, Sales
Resulting SQL
select a12.District_ID District_ID,
max(a14.Distrcit_Name) Distrcit_Name,
a13.Month_ID Month_ID,
sum(a11.SALES) WJXBFS1
from FACT_TABLE a11
join LU_SALES_REP a12
on (a11.Sales_Rep_CD =
a12.Sales_Rep_CD)
join LU_TIME a13
on (a11.Trans_dt = a13.Date_ID)
join LU_DISTRICT a14
on (a12.District_ID =
a14.District_ID)
group by a12.District_ID, 
a13.Month_ID
Report results

As-is analysis

Specify the “as-is” analysis by using the Current District attribute on reports:
Report definition: Current District, Month, Sales
Resulting SQL:
select a13.District_ID District_ID,
max(a15.Distrcit_Name) District_Name,
a14.Month_ID Month_ID,
sum(a11.SALES) WJXBFS1
from FACT_TABLE a11
join LU_SALES_REP a12
on (a11.Sales_Rep_CD =
a12.Sales_Rep_CD)
join (select Sales_rep_ID, District_ID
from LU_SALES_REP
where current_flag = 1) 
a13
on (a12.Sales_Rep_ID =
a13.Sales_Rep_ID)
join LU_TIME a14
on (a11.Trans_dt = a14.Date_ID)
join LU_DISTRICT a15
on (a13.District_ID =
a15.District_ID)
group by a13.District_ID,
a14.Month_ID
Report result

Comments

Popular posts from this blog

Fact tables levels tables in Microstrategy explained

Fact tables levels in Microstrategy: Fact tables are used to store fact data. Fact tables should contain attribute Id's and fact values which are measurable. All the descriptive information about the fact tables should stored in Dimension tables either in Star Schema fashion or Snow Flake Schema fashion which is best suited to your reporting solution. Since attributes provide context for fact values, both fact columns and attribute ID columns are included in fact tables. Facts help to link indirectly related attributes using these attribute ID columns. The attribute ID columns included in a fact table represent the level at which the facts in that table are stored. So the level of a fact table in the Fact_Item_Day_Customer can be the attribute Id's which is at Day, Item & Customer Id level. For example, fact tables containing sales and inventory data look like the tables shown in the following diagram: Base fact columns ver...

Case functions Microstrategy

Ca se functions Microstrategy Case functions return specified data in a SQL query based on the evaluation of user-defined conditions. In general, a user specifies a list of conditions and corresponding return values. Case This function evaluates multiple expressions until a condition is determined to be true, then returns a corresponding value. If all conditions are false, a default value is returned.  Case  can be used for categorizing data based on multiple conditions. This is a single-value function. Syntax Case ( Condition1 ,  ReturnValue1 ,  Condition2 , ReturnValue2 ,...,  DefaultValue ) Example Case(([Total Revenue] < 300000), 0, ([Total Revenue] < 600000), 1, 2) sum(Case (Day@DESC in (“Sat”,”Sun”), Sales, 0) {~+} Sum(Case(Category@DESC In("Books","Electronics"),Revenue,0)){~+} CaseV (case vector) CaseV  evaluates a single metric and returns different values according to the results. It can be used to perfo...

Optimizing queries in Microstrategy using VLDB properties

Optimizing queries in  Microstrategy using VLDB properties #vldb #vldbproperties The table b elow summarizes the Query Optimizations VLDB properties. Additional details about each property, including examples where necessary, are provided in the sections following the table. Property Description Possible Values Default Value Additional Final Pass Option Determines whether the Engine calculates...

Microstrategy "Error type: Odbc error. Odbc operation attempted

 "Error type: Odbc error. Odbc operation attempted: SQLExecDirect. [HYT00:0: on SQLHANDLE] [MicroStrategy][ODBC Oracle Wire Protocol driver]Timeout expired" is shown when executing reports from Web When users are trying to execute some reports in MicroStrategy web in particular, they may receive the Error “SQL Generation Complete Index out of range” and “Timeout expired” error as shown below: Possible Causes: One possible cause is that the MicroStrategy Intelligence Server using a cached database connection that was already dropped by the RDBMS. To resolve this: Admin should delete the database connection caches and create a new DSNs in case they are sharing DSNs to connect to different databases. In addition, change the settings for the ‘Connection lifetime’ and the ‘Connection idle time out’.  Follow the steps below to perform the mentioned changes and verify the report after each step and some of the settings require i-server r...

Control the display of null and zero metric values

Show   Control the display of null and zero metric values in a grid report You can determine how to display or hide rows and columns in a grid report that consist only of null or zero metric values. You can have MicroStrategy hide the rows and columns in the following ways: Hide rows and columns that consist only of null metric values Hide rows and columns that consist only of zero metric values Hide rows and columns that consist only of null or zero metric values (default) Once you have defined how MicroStrategy hides null and zero metric values in the grid, you can quickly show or hide the grid using the Hide Nulls/Zeros option in the Data menu, as described below, or by clicking the  Hide Nulls/Zeros  icon  in the Data toolbar. To determine how null and zero metric values are displayed or hidden in a grid report Open the report in Edit mode. From the  Tools  menu, select  Report Options . The Report Options...

Settings for Outer Join between metrics in MicroStrategy

Settings for Outer Join between metrics in MicroStrategy MicroStrategy adopts multi-pass logic to determine the execution plan for a report. This means that every metric is evaluated in separate SQL passes. Outer Joins come into play when MicroStrategy Engine merges the results from all SQL passes into one report. For a multi-pass report, different Outer Join behaviors can give the user completely different results. In addition, report metrics can be of different types which can, in some cases, influence the result of the outer join. In MicroStrategy, there are two settings that users can access to control Outer Join behavior : Formula Join Type and Metric Join Type . Metric Join Type: VLDB Setting at Database Instance Level Report and Template Levels Report Editor > Data > Report Data Options Metric Level   Metric editor > Tools > Metric Join Type Control Join between Metrics Formula Join Type: Only at Compound/Split...

System Manager workflow to execute on a schedule

Creating a System Manager workflow to execute on a schedule System Manager workflow can execute on a schedule or after an event has been triggered. This can be accomplished by creating a simple batch file, and scheduling that batch file to execute with a third-party tool like Microsoft Task Scheduler.   Note : To avoid user permission conflicts, the following steps must be performed with highest privileges.   In the below example, the workflow makes the i-server restarts every day.   1. The user must first have a valid workflow. This particular workflow is a template that is delivered out-of-the-box with System Manager.   2. Save the workflow in  .smw  format.   3. In a text editor (such as Notepad), enter the command line statement that the task scheduler should execute.     MASysMgr.exe -w C:\filename.smw” “UserName=User1 “Password=1234”   4. Save the file in  .bat  ...

Multi-Table Data Import(MTDI) from one or more supported data sources

Multi-Table Data Import(MTDI) from one or more supported data sources In MicroStrategy Analytics Enterprise Web 10 onewards, users can now simultaneously import two or more tables from one or more supported data sources, this feature is called Multi-Table Data Import (MTDI) which has been renamed as Super Cubes in MSTR 2019 (Does it sound like multisourcing for all the users without admin help?) Currently, all connectors in MicroStrategy Web 10 except " OLAP " and " Search Engine Indices " support Multi-Table Data Import. Users are able to add multiple tables/files when doing data import from single connector, as shown below: Users are also able to combine multiple tables/files from different sources and store them into one single Intelligent Cube, as shown below:
MicroStrategy Developer Preferences options are expanded so big that some options are being cutoff. Show the hidden objects in the  Microstrategy  developer MicroStrategy Developer Preferences options are expanded so big that some options are being cut off. The steps below given in the MSTR article may not work. This can be simple handled by using the steps below:  In the Microstrategy Developer go to Tools -> Preferences (Not my prefernces :) ) Under Developer category -> select Browsing on the browsing tab you see all the options like below: 3. Now using the mouse place the cursor on text box of 10000 which is next to 'Maximum number of monitoring objects displayed per page. 4. Then Hit Tab on Keyboard and hit another Tab on keyboard 5. Then press the space or down arrow on keyboard and click on OK or Enter. That will show the hidden objects in the Microstrategy developer   Normal Version ...

Fiscal Week, Fiscal Month, Fiscal Quarter and Fiscal Year calculations in Microstrategy

Fiscal Week, Fiscal Month, Fiscal Quarter and Fiscal Year calculations in Microstrategy FiscalWeek Returns the numeric position of a week within a fiscal year, for a given  input date. This function is useful in financial reporting when the start of the fiscal year is different than the start of the calendar year. Syntax FiscalWeek< firstWeekDay ,  firstMonth >( Date / Time ) Where: • Date / Time  is the input date or timestamp. • firstWeekDay  (default value is 1) is a parameter that determines which day of the week is considered as the first day of the week. You can type an integer value from 1 to 7, with 1 representing Sunday, 2 representing Monday, and so on until 7 representing Saturday. • firstMonth  (default value is 1) is a parameter that determines which month is considered as the start of the fiscal year. You can type an integer value from 1 to 12, with 1 representing January, 2 representing February, and so on until ...