Skip to main content

Star schemas and aggregate (or summary) fact tables


Aggregate tables can further improve query performance by reducing the number of rows over which higher-level metrics must be aggregated. 
However, the use of aggregate tables with dimension tables is not a valid physical modeling strategy. Whenever aggregation is performed over fact data, it is a general requirement that tables joined to the fact table must be at the same attribute level or at a higher level. If the auxiliary table is at a lower level, fact rows will be replicated prior to aggregation and this will result in inflated metric values (also known as "multiple counting").

With the above Time dimension table, a fact table at the level of Day functions correctly because there is exactly one row in DIM_TIME for each day. To aggregate the facts to the level of Quarter, it is valid to join the fact table to the dimension table and group by the quarter ID from the dimension table.

Sql
select DT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from DAY_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.DAY_ID = DT.DAY_ID)
group by DT.QUARTER_ID

If, however, there is an aggregate fact table already at the level of Quarter, the results will not be correct. This is because the query must join on Quarter ID, but the quarter ID is not a unique key of the dimension table. Because any given quarter of a year contains 90, 91 or 92 days, the dimension table will contain that many rows with the same quarter ID. Thus fact rows will be replicated prior to taking the sum, and the sum will be too high.

Sql
select FT.QUARTER_ID,
   max(DT.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join DIM_TIME DT
      on (FT.QUARTER_ID = DT.QUARTER_ID)
group by FT.QUARTER_ID

This is a generally recognized problem with star schemas, and is not strictly a MicroStrategy limitation.

Star schemas will function correctly with MicroStrategy SQL Generation Engine 8.x as long as they obey the general data warehousing principle that fact tables should not be at a higher level than the dimension tables to which they are joined.

If aggregate tables are required, it is necessary to provide higher-level lookup tables with unique rows corresponding to each aggregate table's key. Logical views are a way to do this without adding tables or views to the warehouse. For example, LWV_LU_QUARTER may be defined using the following SQL statement:

Sql
select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
from DIM_TIME

 

With this logical view, it becomes possible for MicroStrategy SQL Generation Engine 8.x to query the quarter-level fact table as follows. Since the logical view has distinct rows per quarter, multiple counting will not occur in this query.

Sql
select FT.QUARTER_ID,
   max(LQ.QUARTER_DESC) Quarter_Desc
   sum(FT.REVENUE) Revenue
from QTR_FACT_TABLE FT
   join (select distinct QUARTER_ID, QUARTER_DESC, YEAR_ID
         from DIM_TIME) LQ
      on (FT.QUARTER_ID = LQ.QUARTER_ID)
group by FT.QUARTER_ID

For more information on the use of logical views in MicroStrategy SQL Generation Engine 8.1.x and 9.x, consult the MicroStrategy Project Design Guide manual, Appendix B: Logical Tables, "Creating logical tables."

Aggregate tables store pre-summarized totals at a higher level of aggregation than the most granular fact table. They allow reports to be generated from small, rather than large, tables; therefore, performance is enhanced. A successful aggregation strategy seeks to choose aggregate tables that will have the most impact while taking the least amount of space.

Aggregation decisions are driven by the following factors:

  • Usage patterns: Build aggregate tables that are likely to be used the most.
  • Compression ratios: The compression ratio between two tables is defined as the size of the aggregate compared to the size of the smallest table from which the aggregate can be derived.
  • Volatility: Changes in hierarchies over time impact the accuracy of aggregate tables. Sometimes aggregate tables must be rebuilt as a result of changes in dimensions.
A good candidate for aggregation should have at most 10-15 percent of the size of the smallest table from which it is derived.

EXAMPLE:
The MicroStrategy Tutorial project uses aggregate tables by default. A simple metric sum (Revenue) will go to different aggregate tables depending on the attributes on the template.

  1. Create a report with Year on the rows and Revenue on the columns.
  2. Execute the report and view the SQL:

  3. Drill from Year to Item and view the SQL:

The query will go from using ORDER_FACT to ORDER_DETAIL. When Year is on the template, the engine selects the smaller table and the fact is calculated as:

sum(a11.ORDER_AMT)
    instead of:

    sum((a11.QTY_SOLD * (a11.UNIT_PRICE - a11.DISCOUNT)))

    Comments

    1. Is there any solution now to use aggregate tables with star schema without creating logical tables?

      ReplyDelete

    Post a Comment

    Popular posts from this blog

    Microstrategy Custom number formatting symbols

    Custom number formatting symbols If none of the built-in number formats meet your needs, you can create your own custom format in the Number tab of the Format Cells dialog box. Select  Custom  as the Category and create the format using the number format symbols listed in the table below. Each custom format can have up to four optional sections, one each for: Positive numbers Negative numbers Zeros Text Each section is optional. Separate the sections by semicolons, as shown in the example below: #,###;(#,###);0;"Error: Entry must be numeric" For more examples, see  Custom number formatting examples . To jump to a section of the formatting symbol table, click one of the following: Numeric symbols Character/text symbols Date and time symbols Text color symbols Currency symbols Conditional symbols Numeric symbols For details on how numeric symbols apply to the Big Decimal data type, refer to the  Project Design Guide . ...

    Reduce Intelligent Cube Size By Finding Intelligent Cube Objects Which Are Not In Use

    Reduce Intelligent Cube Size By Finding Intelligent Cube Objects Which Are Not In Use If the i-cubes can potentially be reduced in size an audit can be performed on the cube objects to see which cube objects are not being used by any of the view reports, documents, or dossiers.   The below are examples for a few of the common metadata database platforms . NOTE: To perform this audit, queries are run against the MicroStrategy metadata database. Ensure a metadata backup is taken prior to performing the below actions. Steps: 1) Identify the object ID of the Intelligent cube to be audited by checking the objects Property window 2) Identify the object ID of the project this cube exists within by opening the Project Configuration Sample Cube ID =   CFAF1E9B4D53990698C42E87C7AF2EB5 Sample Project ID =  B7CA92F04B9FAE8D941C3E9B7E0CD754   3) Run the below SQL against the metadata database by replacing the Cube ID and Project ID within the respective ...

    Create a transaction services photo uploader

    Create a transaction services photo uploader   1.  Create a new table "photo_upload" in Tutorial warehouse database (the default location: C:\Program Files\MicroStrategy\Tutorial Reporting\TUTORIAL_DATA_7200.mdb), as shown below:    2. The 'photo_upload' table has to be pre-populated with *exactly* 10 rows of data, the values for the 'ID' column should be 1-10 and the values for the 'uploaded' column should all be 0 3.  In MicroStrategy Desktop, create a freeform report "R1" based on the new table "photo_upload" in Tutorial data created at step 1, as shown below:   SELECT Location, Description, ID, uploaded, numbers FROM PHOTO_UPLOAD 4.  Create another table for transaction insert SQL. Make sure to create an 'autonumber' type ID as primary key for this table, or auto_increment ID for different DBs.                     5. Create...

    Scheduling a report or document to be sent to an FTP in MSTR

    Scheduling a report or document to be sent to an FTP server You can have a report or document automatically delivered to a location on your FTP server on a specific schedule. To do so, you must subscribe to the report or document, as described in the steps below. You can customize your subscription by typing macros in the  File Name ,  Sub-folder , or  Zip File Name  fields. These macros are automatically replaced with the appropriate text when the report or document is delivered. For example, you create a subscription to a document. If you type  {&Project}  in the  File Name field, the name of the project in which the document is saved is displayed in the name of the document when it is delivered. • This procedure assumes that an administrator has already added your FTP server as a new device in Developer. Steps to do so are included in the  System Administrator Help . To send a report or document to an FTP server on a schedule ...

    OLAP features in Microstrategy

    OLAP features in Microstrategy MSTR  OLAP Services uses Intelligent Cube Technology—an in-memory version of report data that can 1 About MicroStrategy OLAP Services  can be manipulated by the MicroStrategy Analytical Engine. MicroStrategy Desktop, Web, and Office users can slice and dice data in reports within Intelligent Cubes without having to re-execute SQL against the data warehouse.  Many of the standard OLAP features that MicroStrategy provides out of the box, such as: Page-by Pivoting Sorting Subtotals Banding Aliasing Outline mode Thresholds etc.. With an OLAP Services license, user can perform additional OLAP analysis, using the following features:  Displaying data on the fly: dynamic aggregation, page  Creating metrics on-the-fly: derived metrics, Defining attribute elements on-the-fly: derived elements,  Filtering data on the fly: view filters and metric filters,  Importing data as an Intelligent Cube

    Microstrategy Caches explained

    Microstrategy Caches Improving Response Time: Caching A  cache is a result set that is stored on a system to improve response time in future requests.  With caching, users can retrieve results from Intelligence Server rather than re-executing queries against a database. To delete all object caches for a project 1 In Developer, log into a project. You must log in with a user account that has administrative privileges. 2 From the  Administration  menu, point to  Projects , and then select  Project Configuration . The Project Configuration Editor opens. 3 Expand  Caching , expand  Auxiliary Caches , then select  Objects . To delete all configuration object caches for a server 1 Log in to the project source. 2 From the  Administration  menu in Developer, point to  Server , and then select  Purge Server Object Caches . 4 Click  Purge Now . To purge web cache follow the steps in the link ...

    MicroStrategy URL API Parameters

    MicroStrategy URL Structure The following table summarizes the root URL structure used for every request to MicroStrategy Web. Environment Main Application URL Administration URL J2EE http://webserver/MicroStrategy/servlet/mstrWeb http://webserver/MicroStrategy/servlet/mstrWebAdmin .NET http://webserver/MicroStrategy/asp/Main.aspx http://webserver/MicroStrategy/asp/Admin.aspx Every request sent to MicroStrategy Web calls a central controller. Parameters are appended to  Main.aspx  or  mstrWeb  (in a .NET and J2EE environment, respectively) to indicate to the controller how the request should be internally forwarded and handled. The following examples show a URL for accessing a MicroStrategy folder when the user does not have an existing session. The URL contains not only the parameters needed to connect to MicroStrategy Web, but also the parameters needed to log on and create a session. J2EE environment: <a href="http:...

    Microstrategy Dashboard performance improvements steps

    Microstrategy  Dashboard performance improvements steps: Many times, causes of poor performance can be simplified to specific components. To troubleshoot performance issues, users must identify these components, then make the appropriate modifications to the environment and/or to the MicroStrategy dashboard to reduce bottlenecks. Dashboard execution stages can be represented below: MicroStrategy Intelligence Server When an end user makes a  Document Execution Request  through any client (a web browser via MicroStrategy Web, the MicroStrategy Desktop/Developer client, the MicroStrategy Mobile app, or the MicroStrategy Office client), the request is sent to the MicroStrategy Intelligence Server, which processes the request and prepares the response. The MicroStrategy Intelligence Server will execute all children datasets on the dashboard by either generating SQL and running this against the data warehouse, or by fetching data from a cache. The Inte...

    Microstrategy Document Autotext macros:

    Autotext  code/macros in  Microstrategy Document/dashboard This is a list of the available auto text macros that the Report Services Document engine recognizes. The following auto text codes allow you to add  document variable information to your document. These auto text codes are automatically replaced by information about the document. Auto text codes for MSTR document/dashboard:  AUTOTEXT DESCRIPTION   {&PAGE}  Display the current page.  {&NPAGES}  Display the total number of pages.  {&DATETIME}  Display the current date and time.  {&USER}  Display the user name that is executing the Report Services Document.  {&DOCUMENT}  Display the document name.  {&DOCUMENTID}  Display the document ID.  {&DESCRIPTION}  Display the document description.  {&PROJECT}  Display the project name.  {&EXECUTIONTIME}  Dis...

    "System Prompt" and its uses in MicroStrategy

    System Prompt and its uses in MicroStrategy WHAT IS A "SYSTEM PROMPT"? "System Prompt" is a system object that was introduced back in version 8.0.0. The object is named as "User Login" and is implemented as a prompt object. The object can be found under Public Objects > Prompts > System prompts, as shown below: Unlike ordinary prompt objects, system prompts don't require any answers from the user. When a report containing a system prompt runs, the prompt is answered automatically with the login of the user who runs the report. On the other hand, like other prompt objects, answers to system prompts are used to match caches. Therefore, users don't share caches for reports that contain system prompts. For details on how caches are matched, refer to the following MicroStrategy Knowledge Base document: KB5300-7X0-0147 - How are caches matched in MicroStrategy Intelligence Server 7.x? WHEN ARE SYSTEM PROMPTS USED?    System pr...