Skip to main content

Star schemas and aggregate (or summary) fact tables


Aggregate tables can further improve query performance by reducing the number of rows over which higher-level metrics must be aggregated. 
However, the use of aggregate tables with dimension tables is not a valid physical modeling strategy. Whenever aggregation is performed over fact data, it is a general requirement that tables joined to the fact table must be at the same attribute level or at a higher level. If the auxiliary table is at a lower level, fact rows will be replicated prior to aggregation and this will result in inflated metric values (also known as "multiple counting").

With the above Time dimension table, a fact table at the level of Day functions correctly because there is exactly one row in DIM_TIME for each day. To aggregate the facts to the level of Quarter, it is valid to join the fact table to the dimension table and group by the quarter ID from the dimension table.

Sql
select DT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from DAY_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.DAY_ID = DT.DAY_ID)
group by DT.QUARTER_ID

If, however, there is an aggregate fact table already at the level of Quarter, the results will not be correct. This is because the query must join on Quarter ID, but the quarter ID is not a unique key of the dimension table. Because any given quarter of a year contains 90, 91 or 92 days, the dimension table will contain that many rows with the same quarter ID. Thus fact rows will be replicated prior to taking the sum, and the sum will be too high.

Sql
select FT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.QUARTER_ID = DT.QUARTER_ID)
group by FT.QUARTER_ID

This is a generally recognized problem with star schemas, and is not strictly a MicroStrategy limitation.

Star schemas will function correctly with MicroStrategy SQL Generation Engine 8.x as long as they obey the general data warehousing principle that fact tables should not be at a higher level than the dimension tables to which they are joined.

If aggregate tables are required, it is necessary to provide higher-level lookup tables with unique rows corresponding to each aggregate table's key. Logical views are a way to do this without adding tables or views to the warehouse. For example, LWV_LU_QUARTER may be defined using the following SQL statement:

Sql
select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
from DIM_TIME

 

With this logical view, it becomes possible for MicroStrategy SQL Generation Engine 8.x to query the quarter-level fact table as follows. Since the logical view has distinct rows per quarter, multiple counting will not occur in this query.

Sql
select FT.QUARTER_ID,
   max(LQ.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join (select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
         from DIM_TIME) LQ
      on (FT.QUARTER_ID = LQ.QUARTER_ID)
group by FT.QUARTER_ID

For more information on the use of logical views in MicroStrategy SQL Generation Engine 8.1.x and 9.x, consult the MicroStrategy Project Design Guide manual, Appendix B: Logical Tables, "Creating logical tables."

Aggregate tables store pre-summarized totals at a higher level of aggregation than the most granular fact table. They allow reports to be generated from small, rather than large, tables; therefore, performance is enhanced. A successful aggregation strategy seeks to choose aggregate tables that will have the most impact while taking the least amount of space.

Aggregation decisions are driven by the following factors:

  • Usage patterns: Build aggregate tables that are likely to be used the most.
  • Compression ratios: The compression ratio between two tables is defined as the size of the aggregate compared to the size of the smallest table from which the aggregate can be derived.
  • Volatility: Changes in hierarchies over time impact the accuracy of aggregate tables. Sometimes aggregate tables must be rebuilt as a result of changes in dimensions.
A good candidate for aggregation should have at most 10-15 percent of the size of the smallest table from which it is derived.

EXAMPLE:
The MicroStrategy Tutorial project uses aggregate tables by default. A simple metric sum (Revenue) will go to different aggregate tables depending on the attributes on the template.

  1. Create a report with Year on the rows and Revenue on the columns.
  2. Execute the report and view the SQL:

  3. Drill from Year to Item and view the SQL:

The query will go from using ORDER_FACT to ORDER_DETAIL. When Year is on the template, the engine selects the smaller table and the fact is calculated as:

sum(a11.ORDER_AMT)
    instead of:

    sum((a11.QTY_SOLD * (a11.UNIT_PRICE - a11.DISCOUNT)))

    Comments

    1. Is there any solution now to use aggregate tables with star schema without creating logical tables?

      ReplyDelete

    Post a Comment

    Popular posts from this blog

    Best practices for using Distribution Services in Microstrategy

    Best practices for using Distribution Services MicroStrategy recommends the following best practices when scheduling Distribution Services subscriptions, in addition to the best practices given above: • For best results, follow the steps listed in  High-level checklist to set up a report delivery system . • PDF, plain text, and CSV file formats generally offer the fastest delivery performance. Performance can vary, depending on items including your hardware, operating system, network connectivity, and so on. • The performance of the print delivery method depends on the speed of the printer. • When sending very large reports or documents: ▫ Enable the zipping feature for the subscription so that files are smaller. ▫ Use bulk export instead of the CSV file format. Details on bulk exporting are in the  Reports  chapter of the  Advanced Reporting Guide . ▫ Schedule subscription deliveries to occur when your Intelligence Server is experiencing low ...

    Microstrategy "Error type: Odbc error. Odbc operation attempted

     "Error type: Odbc error. Odbc operation attempted: SQLExecDirect. [HYT00:0: on SQLHANDLE] [MicroStrategy][ODBC Oracle Wire Protocol driver]Timeout expired" is shown when executing reports from Web When users are trying to execute some reports in MicroStrategy web in particular, they may receive the Error “SQL Generation Complete Index out of range” and “Timeout expired” error as shown below: Possible Causes: One possible cause is that the MicroStrategy Intelligence Server using a cached database connection that was already dropped by the RDBMS. To resolve this: Admin should delete the database connection caches and create a new DSNs in case they are sharing DSNs to connect to different databases. In addition, change the settings for the ‘Connection lifetime’ and the ‘Connection idle time out’.  Follow the steps below to perform the mentioned changes and verify the report after each step and some of the settings require i-server r...

    Microstrategy Custom number formatting symbols

    Custom number formatting symbols If none of the built-in number formats meet your needs, you can create your own custom format in the Number tab of the Format Cells dialog box. Select  Custom  as the Category and create the format using the number format symbols listed in the table below. Each custom format can have up to four optional sections, one each for: Positive numbers Negative numbers Zeros Text Each section is optional. Separate the sections by semicolons, as shown in the example below: #,###;(#,###);0;"Error: Entry must be numeric" For more examples, see  Custom number formatting examples . To jump to a section of the formatting symbol table, click one of the following: Numeric symbols Character/text symbols Date and time symbols Text color symbols Currency symbols Conditional symbols Numeric symbols For details on how numeric symbols apply to the Big Decimal data type, refer to the  Project Design Guide . ...

    Microstrategy Dossiers explained

    Microstrategy  Dossiers With the release of MicroStrategy 10.9, we’ve taken a leap forward in our dashboarding capabilities by simplifying the user experience, adding storytelling, and collaboration.MSTR has  evolved dashboards to the point that they are more than dashboards - they are  interactive, collaborative analytic stories . Ultimately, it was time to go beyond dashboards, both in concept and in name, and so  the've  renamed VI dashboards to  ‘ dossiers ’.  Dossiers can be created by using the new Desktop product or Workstation or simply from the Web interface which replaces Visual Insights. All the existing visual Insights dashboards will be converted to Dossiers   With MicroStrategy 10.9, there was an active focus on making it easier to build dashboards for the widest audience of end users. To achieve this, some key new capabilities were added that make it easier to author, read, interact and collaborate on dashboards ...

    Configure a report for use with Bulk Export in MicroStrategy

    Configure a report for use with Bulk Export in MicroStrategy The Bulk Export feature enables a large report to be saved as a delimited text file. Using this feature, it is possible to retrieve result sets from a large dataset without having to load the entire dataset into memory. PS:  Once a report is setup for bulk export it cannot be used as a regular report. So if the report needs to be run as a normal report and as a bulk export report, the first step is to make a copy of the report for use with bulk export. Configure Bulk Export Bulk Export options are only available in MicroStrategy Developer. Open a 3-tier connection using MicroStrategy Developer and edit the desired report. Go to 'Data' on the top menu bar. Select 'Configure Bulk Export': Specify any additional desired configuration options. General Settings Bulk export database instance : This is the database instance to use to store the bulk export results. Temporary tables w...

    Bursting file subscriptions Microstartegy

    Bursting file subscriptions: Delivering  parts of reports across multiple files: Large MicroStrategy reports and documents are often broken up into separate pages by attributes. In a similar way, with Distribution Services, you can split up, or burst, a report or document into multiple files. When the subscription is executed, a separate file is created for each element of each attribute selected for bursting. Each file has a portion of data according to the attributes used to group data in the report (page-by axis) or document (group-by axis). Ex:, you may have a report with information for all regions. You could place Region in the page-by axis and burst the file subscription into the separate regions. This creates one report file for each region. As a second ex:, if you choose to burst your report using the Region and Category attributes, a separate file is created for each combination of Region and Category, such as Central and Books as a report, Central and Ele...

    Create an alert-based subscription in MicroStrategy Distribution Services

    Create an alert-based subscription in MicroStrategy Distribution Services on Web Subscription to a report or Report Services document which will be executed when a certain conditional threshold is met based on another executing report. For example, a scheduled report executes which shows the Revenue by day for the past week. If the Revenue on any one day falls below a certain value, a subscription to another report or Report Services document can be triggered and delivered to a recipient. An alert based subscription can only be created directly on a report; however, another report or Report Services document can be delivered when the alert based subscription is triggered. Note: you need a grid report to create an alert and you cannot create if you want to create on a document with text boxes. The following example will walk through the basic steps on how to setup a subscription based on an alert like this: Follow the brief  steps bel...

    exact string when searching for elements in an element prompt in MicroStrategy

    When a user types in keywords to tries to find element names in an element prompt, the search returns all objects containing the keywords in MicroStrategy Developer 9.4.x-10.x. However, the user would like to search for the exact phrase. It is suggested to use quotes to get exact phrase when there is a space between. Like "Black Panther" Using the MicroStrategy Tutorial Project as an example, a user wishes to search for an item named Minolta Maxxum Camera. The search results for Minolta Maxxum Camera return all items containing any or all of those words, as shown below: CAUSE: This occurs due to the search defaulting to 'ORing' the search terms. This means that any or all keywords that match the strings will be returned. The SQL for this search is shown below: SELECT ITEM_NAME FROM LU_ITEM WHERE (ITEM_NAME LIKE '%Minolta%' OR ITEM_NAME LIKE '%Maxxum%' OR ITEM_NAME LIKE '%Camera%') ACTION: To match an exact string, use...

    Control the display of null and zero metric values

    Show   Control the display of null and zero metric values in a grid report You can determine how to display or hide rows and columns in a grid report that consist only of null or zero metric values. You can have MicroStrategy hide the rows and columns in the following ways: Hide rows and columns that consist only of null metric values Hide rows and columns that consist only of zero metric values Hide rows and columns that consist only of null or zero metric values (default) Once you have defined how MicroStrategy hides null and zero metric values in the grid, you can quickly show or hide the grid using the Hide Nulls/Zeros option in the Data menu, as described below, or by clicking the  Hide Nulls/Zeros  icon  in the Data toolbar. To determine how null and zero metric values are displayed or hidden in a grid report Open the report in Edit mode. From the  Tools  menu, select  Report Options . The Report Options...

    Replace object names in bulk using MicroStrategy Repository Translation Wizard

    Replace object names in bulk using MicroStrategy Repository Translation Wizard Users may need to replace  object names  in bulk.  This can be done using MicroStrategy Repository Translation Wizard in MicroStrategy Developer 9.4.x - 10.x.  Follow the steps below for an example of how to do this.  Create an empty MD shell in Microsoft Access.  Run Repository Translation Wizard from the Start Menu -> Programs -> MicroStrategy-> Object Manager In the "Metadata Repository" screen, select the "Project Source Name" and check the "Export Translations" option, as shown in the following screen shot: In the "Languages" screen, select a project, a translation reference language and a language (choose English as default), as shown in the following screen shot: In the "Select objects" screen, if one needs to select a certain type of object, check the option  Use the results of a search object .  Then, click 'New' to...