Skip to main content

Star schemas and aggregate (or summary) fact tables


Aggregate tables can further improve query performance by reducing the number of rows over which higher-level metrics must be aggregated. 
However, the use of aggregate tables with dimension tables is not a valid physical modeling strategy. Whenever aggregation is performed over fact data, it is a general requirement that tables joined to the fact table must be at the same attribute level or at a higher level. If the auxiliary table is at a lower level, fact rows will be replicated prior to aggregation and this will result in inflated metric values (also known as "multiple counting").

With the above Time dimension table, a fact table at the level of Day functions correctly because there is exactly one row in DIM_TIME for each day. To aggregate the facts to the level of Quarter, it is valid to join the fact table to the dimension table and group by the quarter ID from the dimension table.

Sql
select DT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from DAY_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.DAY_ID = DT.DAY_ID)
group by DT.QUARTER_ID

If, however, there is an aggregate fact table already at the level of Quarter, the results will not be correct. This is because the query must join on Quarter ID, but the quarter ID is not a unique key of the dimension table. Because any given quarter of a year contains 90, 91 or 92 days, the dimension table will contain that many rows with the same quarter ID. Thus fact rows will be replicated prior to taking the sum, and the sum will be too high.

Sql
select FT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.QUARTER_ID = DT.QUARTER_ID)
group by FT.QUARTER_ID

This is a generally recognized problem with star schemas, and is not strictly a MicroStrategy limitation.

Star schemas will function correctly with MicroStrategy SQL Generation Engine 8.x as long as they obey the general data warehousing principle that fact tables should not be at a higher level than the dimension tables to which they are joined.

If aggregate tables are required, it is necessary to provide higher-level lookup tables with unique rows corresponding to each aggregate table's key. Logical views are a way to do this without adding tables or views to the warehouse. For example, LWV_LU_QUARTER may be defined using the following SQL statement:

Sql
select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
from DIM_TIME

 

With this logical view, it becomes possible for MicroStrategy SQL Generation Engine 8.x to query the quarter-level fact table as follows. Since the logical view has distinct rows per quarter, multiple counting will not occur in this query.

Sql
select FT.QUARTER_ID,
   max(LQ.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join (select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
         from DIM_TIME) LQ
      on (FT.QUARTER_ID = LQ.QUARTER_ID)
group by FT.QUARTER_ID

For more information on the use of logical views in MicroStrategy SQL Generation Engine 8.1.x and 9.x, consult the MicroStrategy Project Design Guide manual, Appendix B: Logical Tables, "Creating logical tables."

Aggregate tables store pre-summarized totals at a higher level of aggregation than the most granular fact table. They allow reports to be generated from small, rather than large, tables; therefore, performance is enhanced. A successful aggregation strategy seeks to choose aggregate tables that will have the most impact while taking the least amount of space.

Aggregation decisions are driven by the following factors:

  • Usage patterns: Build aggregate tables that are likely to be used the most.
  • Compression ratios: The compression ratio between two tables is defined as the size of the aggregate compared to the size of the smallest table from which the aggregate can be derived.
  • Volatility: Changes in hierarchies over time impact the accuracy of aggregate tables. Sometimes aggregate tables must be rebuilt as a result of changes in dimensions.
A good candidate for aggregation should have at most 10-15 percent of the size of the smallest table from which it is derived.

EXAMPLE:
The MicroStrategy Tutorial project uses aggregate tables by default. A simple metric sum (Revenue) will go to different aggregate tables depending on the attributes on the template.

  1. Create a report with Year on the rows and Revenue on the columns.
  2. Execute the report and view the SQL:

  3. Drill from Year to Item and view the SQL:

The query will go from using ORDER_FACT to ORDER_DETAIL. When Year is on the template, the engine selects the smaller table and the fact is calculated as:

sum(a11.ORDER_AMT)
    instead of:

    sum((a11.QTY_SOLD * (a11.UNIT_PRICE - a11.DISCOUNT)))

    Comments

    1. Is there any solution now to use aggregate tables with star schema without creating logical tables?

      ReplyDelete

    Post a Comment

    Popular posts from this blog

    MicroStrategy URL API Parameters

    MicroStrategy URL Structure The following table summarizes the root URL structure used for every request to MicroStrategy Web. Environment Main Application URL Administration URL J2EE http://webserver/MicroStrategy/servlet/mstrWeb http://webserver/MicroStrategy/servlet/mstrWebAdmin .NET http://webserver/MicroStrategy/asp/Main.aspx http://webserver/MicroStrategy/asp/Admin.aspx Every request sent to MicroStrategy Web calls a central controller. Parameters are appended to  Main.aspx  or  mstrWeb  (in a .NET and J2EE environment, respectively) to indicate to the controller how the request should be internally forwarded and handled. The following examples show a URL for accessing a MicroStrategy folder when the user does not have an existing session. The URL contains not only the parameters needed to connect to MicroStrategy Web, but also the parameters needed to log on and create a session. J2EE environment: <a href="http:...

    No 'Alert' option appear when trying to create an alert-based subscription in MicroStrategy Distribution Services

    The 'Alert' option does not appear when attempting to create an alert-based subscription in MicroStrategy Distribution Services In MicroStrategy Distribution Service 9.x and 10.x, and 11.x versions it is possible to create an alert-based subscription. When right-clicking the metric header of a report in MicroStrategy Web 9.0.x, the 'Alerts' option does not appear:    Cause : This issue occurs because the user attempting to create the alert does  not have all of the necessary privileges on alerts.   Fix : In order to create an alert-based subscription, the following privileges are required: In order ti get permissions to create alerts the user should be given the following privileges by the admin: New Version of Microstrategy 11.x: Server- Distribution: Older Versions of Microstrategy 9.x, 10.x etc..: Web Reporter > Web user Web Analyst > Web create alert   ...

    Clean uninstall MicroStrategy Analytics Enterprise using the Uninstallation Cleanup Utility

    How to  clean uninstall MicroStrategy Analytics Enterprise using the Uninstallation Cleanup Utility CONSIDERATIONS/GUIDELINES : This Uninstallation Cleanup Utility is designed to remove the obsolete or leftover services, files and registries after uninstalling MicroStrategy products.  The goal of this cleanup utility is to remove the leftover files and registries after uninstallation to make the machine "cleaner" for a second time installation (it can solve some known downgrade installation issues).  The basic assumption for removing files is that the user installed MicroStrategy  in a path that contains the string "MicroStrategy" . If not, the files and certain registry keys will not be removed. Using the cleanup utility incorrectly can cause system-wide problems that may require re-installation of the Operating System.  This utility should always be tested in a test environment before being run in a production environment.   Lin...

    Microstrategy Custom number formatting symbols

    Custom number formatting symbols If none of the built-in number formats meet your needs, you can create your own custom format in the Number tab of the Format Cells dialog box. Select  Custom  as the Category and create the format using the number format symbols listed in the table below. Each custom format can have up to four optional sections, one each for: Positive numbers Negative numbers Zeros Text Each section is optional. Separate the sections by semicolons, as shown in the example below: #,###;(#,###);0;"Error: Entry must be numeric" For more examples, see  Custom number formatting examples . To jump to a section of the formatting symbol table, click one of the following: Numeric symbols Character/text symbols Date and time symbols Text color symbols Currency symbols Conditional symbols Numeric symbols For details on how numeric symbols apply to the Big Decimal data type, refer to the  Project Design Guide . ...

    Microstrategy Attributes relationship using a relationship table

    Relationship tables in Microstrategy Relate tables store information about the relationship between two attributes when one a parent of the other or vice-versa.. Relate tables contain the ID columns of two or more attributes, which will define associations between them. Relate tables are often used to create relationships between attributes that have a many-to-many relationship to each other. With attributes whose direct relationship is one-to-many—in which every element of a parent attribute can relate to multiple elements of a child attribute—you define parent-child relationships by placing the ID column of the parent attribute in the lookup table of the child attribute. The parent ID column in the child table is called a foreign key. This technique allows you to define relationships between attributes in the attributes’ lookup tables, creating tables that function as both lookup tables and relate tables as shown in the following diagram: ...

    Case functions Microstrategy

    Ca se functions Microstrategy Case functions return specified data in a SQL query based on the evaluation of user-defined conditions. In general, a user specifies a list of conditions and corresponding return values. Case This function evaluates multiple expressions until a condition is determined to be true, then returns a corresponding value. If all conditions are false, a default value is returned.  Case  can be used for categorizing data based on multiple conditions. This is a single-value function. Syntax Case ( Condition1 ,  ReturnValue1 ,  Condition2 , ReturnValue2 ,...,  DefaultValue ) Example Case(([Total Revenue] < 300000), 0, ([Total Revenue] < 600000), 1, 2) sum(Case (Day@DESC in (“Sat”,”Sun”), Sales, 0) {~+} Sum(Case(Category@DESC In("Books","Electronics"),Revenue,0)){~+} CaseV (case vector) CaseV  evaluates a single metric and returns different values according to the results. It can be used to perfo...

    Use a Visualization to Filter the Data in Another Visualization in a Dossier

    Use a Visualization to Filter the Data in Another Visualization Once you add visualizations to a dossier, you can use one visualization to filter or highlight data in another visualization. Define one visualization as the source. Then, select the other visualizations you want to filter or highlight as targets. The target visualizations only display or highlight data that also appears in the source. Your target visualization can be in any chapter or page within your dossier. Open a dossier with two or more visualizations. To enable a visualization to filter or highlight the data in another visualization Open the dossier  you want to modify. Hover over the visualization to use as the source and click  More   in the top right and choose  Select Target . A   icon appears in the upper left corner of the source visualization. The name of the source visualization appears after  Use visualization  at the top of the screen. If the source visualization...

    Joint child relationships in MSTR

    Joint child relationships Some attributes exist at the intersection of other indirectly related attributes. Such attributes are called  joint children. Joint child relationships connect special attributes that are sometimes called  cross-dimensional attributes, text facts, or qualities. They do not fit neatly into the modeling schemes you have learned about thus far. These relationships can be modeled and conceptualized like traditional attributes but, like facts, they exist at the intersection of multiple attribute levels. Many source systems refer to these special attributes as  flags. Therefore, if flags are referenced in your source system documentation, these are likely candidates for joint child relationships. Joint child relationships are really another type of many-to-many relationship where one attribute has a many-to-many relationship to two otherwise unrelated attributes. For example, consider the relationship between three attributes: Promotion, Ite...

    Create an alert-based subscription in MicroStrategy Distribution Services

    Create an alert-based subscription in MicroStrategy Distribution Services on Web Subscription to a report or Report Services document which will be executed when a certain conditional threshold is met based on another executing report. For example, a scheduled report executes which shows the Revenue by day for the past week. If the Revenue on any one day falls below a certain value, a subscription to another report or Report Services document can be triggered and delivered to a recipient. An alert based subscription can only be created directly on a report; however, another report or Report Services document can be delivered when the alert based subscription is triggered. Note: you need a grid report to create an alert and you cannot create if you want to create on a document with text boxes. The following example will walk through the basic steps on how to setup a subscription based on an alert like this: Follow the brief  steps bel...

    display a group horizontally in MSTR document

    Display a group horizontally in MSTR document By default,  groups are displayed vertically in a document.  This means that the detail sections are displayed below the Group Header. For example, a document is grouped by Year. The Detail section includes revenue and profit information by region.  Displaying the group vertically yields the following document: For certain documents, displaying and printing the group horizontally is desired. When displayed horizontally, the detail sections are displayed next to the Group Header, running horizontally across the page. The example given above, if displayed horizontally, shows a row containing the year, and then, for each region, the Region, Revenue, and Profit. When the document is viewed as a PDF, it displays as shown below: When being designed, the document with horizontal display looks like the following in MicroStrategy Developer: The sections within the group are turned sideways and listed horizontally...