Skip to main content

many-to-many and joint child relationships

Supporting many-to-many and joint child relationships in MSTR


Two forms of attribute relationships, many-to-many relationships and joint child relationships, can introduce additional complexity to the schema and warehouse design process. The following sections discuss the considerations you must make to ensure an effective warehouse design in light of the unique nature of these relationships.
While the topics are largely related to logical model design, a working knowledge of physical schemas is helpful when dealing with the challenges involved with these topics.
Before reading this section, you should know what logical data models and physical warehouse schemas are, and how to read and interpret them. Logical data models and physical warehouse schemas are discussed in The Logical Data Modeland Warehouse Structure for Your Logical Data Model respectively. These chapters discuss how to plan and create a conceptual framework for your business intelligence data.

Many-to-many relationships

The presence of many-to-many relationships introduces complexity during the warehouse design process. With the presence of many-to-many relationships, you must make additional considerations to effectively plan your design.
Below are some real-life examples of many-to-many relationships which must be carefully handled in the data model and schema:

In a certain organization, each salesperson can work in more than one calling center. Likewise, each calling center has many salespeople.


In a car manufacturing plant, many models of cars are produced, and each comes in several colors. That is, there are many colors for a single type of car, and many types of cars can be associated with the same color.
The following sections use the example of items and colors to demonstrate a many-to-many relationship and the options you have for dealing with them. One item can come in many colors, such as red hats, blue hats, and green hats, and one color can be associated with many items, such as red dress, red hat, red shoes, and red socks.
Potential problems with many-to-many relationships usually come in the following forms, both of which can be avoided by correctly modeling the relationship:

Loss of analytical capability


Multiple counting

Loss of analytical capability

With the color and item many-to-many relationship, there are usually two business questions for which users want answers:

1In what colors are certain items available?


2How much of a particular item/color combination was sold?
Answering the first question requires a table that contains a list of all possible item/color combinations. Recall that one-to-many relationships are usually in the child’s lookup table.
In many-to-many relationships this is not feasible. Rather, a distinct relationship table needs to be present in your warehouse. The following diagram shows the lookup and relationship tables for item and color:
external image color_item_relate_table.gif
The Rel_Color_Item table provides a row for every possible item/color combination.
Answering the second question requires a fact table that has sales information as well as color and item information. The following diagram shows the same scenario as before, but in addition it shows a simple fact table containing sales data keyed by item, color, and date.
external image color_item_relate_fact.gif
The fact table in the above diagram alone is not sufficient to answer the first question. Only item and color combinations that were actually sold, and therefore have sales recorded, can be retrieved from this table. If you have item and color combinations that are available but that have never been sold, this fact table cannot provide a complete list of item and color combinations to answer question one.
In summary, to prevent any loss of analytical flexibility when dealing with a many-to-many attribute relationship, the following requirements must be met:

A distinct relationship table to identify all the possible combinations of attribute elements between attributes


Both the attribute ID columns in the fact table
You can implement the above points in several different ways, which are discussed in Working with many-to-many relationships.

Multiple counting

When dealing with many-to-many relationships, loss of analytical capability is only one challenge. Another equally significant issue is multiple counting. Multiple counting occurs when all of the following takes place:

You attempt to aggregate data to the level of one of the attributes in the many-to-many relationship, or a higher level than one of the attributes in the many-to-many relationship.


The relationship exists in a distinct relationship table.


All of the attributes in the many-to-many relationship are not in the fact table.
Recall the example from above, but make the following change: remove color from the fact table.
external image multiple_counting.gif
Assume that there are three items, including hats, dresses, and socks. These items come in three colors, including red, blue, and green, with the exception of socks, which come in only green and blue. The following diagram shows this data in the lookup tables as well as some simple sales data:
external image multiple_counting2.gif
The risk of multiple counting occurs when you run a query requesting the sales by color, effectively aggregating to the item attribute level in the many-to-many relationship. This query would require both the fact table, which has the sales information by item, and the relationship table, since color is not recorded in the fact table.
The difficulty lies in the fact that color is not in the fact table. There is no way to directly relate the sales of an item in the fact table to the color of that particular item. For example, instead of calculating the sales of red items, the query aggregates sales for all items that come in red according to the relationship table. The sum includes all hats and all dresses, including blue ones and green ones. This obviously leads to numbers that are higher than the true sales for red items.
For example, using the given data, the following questions cannot all be answered accurately:

What are the total sales for hats?
The answer is $35, which can be calculated directly from the fact table.

What are the total sales for red items?
You cannot determine an accurate answer. The answer you get is $85, which is the total for all hats and dresses, since socks do not come in red. It is entirely possible that all the dresses sold are green; however, you cannot confirm this since color is not recorded in the fact table.

What are the total sales for red dresses?
Again, you cannot determine an accurate answer. If all the dresses sold are indeed green, the correct answer is $0, but the answer you will get based on the data in the fact table is $50.
The following section describes several ways to prevent multiple counting when dealing with many-to-many relationships.

Working with many-to-many relationships

As you can see, seemingly simple questions can require you to take a number of steps to answer them when many-to-many relationships are involved.
You can use one of three techniques to provide physical support to answer the types of questions that cannot be answered accurately when using many-to-many relationships. The three techniques all have differing levels of flexibility, and flexibility is always a trade-off with complexity. In all cases, the two fundamental components remain in place in one form or another:

A relationship table to define the attribute relationship


Both the attribute’s ID columns in the fact table
MicroStrategy builds the rules that MicroStrategy SQL Engine uses to generate SQL when a report request is made. If you make both of the above physical implementations, the SQL Engine uses the related table when no metric is included on the report. When a metric is included, the fact table is used to answer the query.
All of the following methods require additional data in the fact table. This means that you must capture the additional data in the source system. For example, you need to have data in the source system as to what the color is of each item sold. If this additional data was never captured in the source system, you cannot fully resolve the many-to-many relationship to calculate the amount of sales for items of a certain color.
Method 1
This method is the most straightforward way to effectively manage many-to-many relationships.
Method 1 requires you to create a separate relationship table (in this case, Rel_Color_Item) and add both attribute IDs to the fact table as shown in the following diagram.
external image MM_Method1.gif
Method 2
Method 2 eliminates the many-to-many relationship and the need for a distinct relationship table.
Here the many-to-many relationship is converted into a compound attribute relationship. You treat one attribute as a child of the other and have a compound key for the lower level attribute. Also, you add both attribute IDs, in this case Item_ID and Color_ID, to the fact table as shown in the following diagram.
external image MM_Method2.gif
While this method eliminates the need for a separate relationship table, you lose the ability to view items independent of color, or vice versa.
Method 3
Method 3 is the most versatile solution and has the following characteristics:

Further simplifies the compound attribute relationship from Method 2 into a simple attribute relationship


Provides the ability to view item and color together or independently


Requires only one attribute column in the fact table for complete flexibility, rather than two
Here you must create a new attribute, lower in level than either Color or Item. This attribute is essentially a concatenation of Color and Item, which gives it a one-to-many relationship between itself and each of its parent attributes. This is the SKU attribute, particularly common in retail data models or situations.
Finally, rather than including Color and Item in the fact table, you only need to include this new child attribute SKU, as shown in the following diagram.
external image MM_Method3.gif
This method is actually quite similar to Method 1. The major difference is that the distinct relationship table from Method 1 has an additional column, SKU, which extends the relationship of each item and color combination into a single value. Consequently, you can use this single value in the fact table.
The major disadvantage of Method 3 lies in creating the new attribute if your business model does not already use a similar structure, as well as possibly adding complexity to the ETL process.

Excerpt from https://www2.microstrategy.com/producthelp/10.7/ProjectDesignGuide/WebHelp/Lang_1033/Content/ProjectDesign/Supporting_many_to_many_and_joint_child_relationsh.htm#pd-attributes_overview_3631995489_1086149

Comments

Post a Comment

Popular posts from this blog

Microstrategy "Error type: Odbc error. Odbc operation attempted

 "Error type: Odbc error. Odbc operation attempted: SQLExecDirect. [HYT00:0: on SQLHANDLE] [MicroStrategy][ODBC Oracle Wire Protocol driver]Timeout expired" is shown when executing reports from Web When users are trying to execute some reports in MicroStrategy web in particular, they may receive the Error “SQL Generation Complete Index out of range” and “Timeout expired” error as shown below: Possible Causes: One possible cause is that the MicroStrategy Intelligence Server using a cached database connection that was already dropped by the RDBMS. To resolve this: Admin should delete the database connection caches and create a new DSNs in case they are sharing DSNs to connect to different databases. In addition, change the settings for the ‘Connection lifetime’ and the ‘Connection idle time out’.  Follow the steps below to perform the mentioned changes and verify the report after each step and some of the settings require i-server r...

Scheduling a report or document to be sent to an FTP in MSTR

Scheduling a report or document to be sent to an FTP server You can have a report or document automatically delivered to a location on your FTP server on a specific schedule. To do so, you must subscribe to the report or document, as described in the steps below. You can customize your subscription by typing macros in the  File Name ,  Sub-folder , or  Zip File Name  fields. These macros are automatically replaced with the appropriate text when the report or document is delivered. For example, you create a subscription to a document. If you type  {&Project}  in the  File Name field, the name of the project in which the document is saved is displayed in the name of the document when it is delivered. • This procedure assumes that an administrator has already added your FTP server as a new device in Developer. Steps to do so are included in the  System Administrator Help . To send a report or document to an FTP server on a schedule ...

Logical Views to specify an outer join between two attribute lookup tables when only attributes are on a report

Logical Views to specify an outer join between two attribute lookup tables when only attributes are on a report Apart from using the VLDB properties to create the left outer join,  article describes how to use the Logical View to specify an outer join between two attribute lookup tables when only attributes are on a report. This method exists as attribute only outer joins will not be generated on their own by the MicroStrategy SQL engine. This is because they are only necessary with r agged/unbalanced hierarchies which are not supported as null attribute IDs are not supported (parent elements with no child elements or child elements with no parents).  Brief instructions are provided using the example below. Consider, two attributes: Parent01 and Child01 have a parent-child relationship. Their Lookup tables are defined, as follows Parent01 Child01 Note that although there are 4 ID values for the attribute Parent01, there is no defined relationship ...

Microstrategy Caches explained

Microstrategy Caches Improving Response Time: Caching A  cache is a result set that is stored on a system to improve response time in future requests.  With caching, users can retrieve results from Intelligence Server rather than re-executing queries against a database. To delete all object caches for a project 1 In Developer, log into a project. You must log in with a user account that has administrative privileges. 2 From the  Administration  menu, point to  Projects , and then select  Project Configuration . The Project Configuration Editor opens. 3 Expand  Caching , expand  Auxiliary Caches , then select  Objects . To delete all configuration object caches for a server 1 Log in to the project source. 2 From the  Administration  menu in Developer, point to  Server , and then select  Purge Server Object Caches . 4 Click  Purge Now . To purge web cache follow the steps in the link ...

Microstrategy Dossier training videos

Microstrategy Dossier training videos Adding data to a Dossier: Creating a visualization filter in Dossier: Sending Dossier to User libraries: How to format Dossiers: Adding a designer filter to a Dossier: Use Bookmarks in Dossiers: Dynamic Links in Dossier: Recreating documents from Dossiers: Exploring and searching in Dossiers: Exploring sample dossiers in MSTR library: Using Page to Page Targets in Dossiers:

Fiscal Week, Fiscal Month, Fiscal Quarter and Fiscal Year calculations in Microstrategy

Fiscal Week, Fiscal Month, Fiscal Quarter and Fiscal Year calculations in Microstrategy FiscalWeek Returns the numeric position of a week within a fiscal year, for a given  input date. This function is useful in financial reporting when the start of the fiscal year is different than the start of the calendar year. Syntax FiscalWeek< firstWeekDay ,  firstMonth >( Date / Time ) Where: • Date / Time  is the input date or timestamp. • firstWeekDay  (default value is 1) is a parameter that determines which day of the week is considered as the first day of the week. You can type an integer value from 1 to 7, with 1 representing Sunday, 2 representing Monday, and so on until 7 representing Saturday. • firstMonth  (default value is 1) is a parameter that determines which month is considered as the start of the fiscal year. You can type an integer value from 1 to 12, with 1 representing January, 2 representing February, and so on until ...

Create a transaction services photo uploader

Create a transaction services photo uploader   1.  Create a new table "photo_upload" in Tutorial warehouse database (the default location: C:\Program Files\MicroStrategy\Tutorial Reporting\TUTORIAL_DATA_7200.mdb), as shown below:    2. The 'photo_upload' table has to be pre-populated with *exactly* 10 rows of data, the values for the 'ID' column should be 1-10 and the values for the 'uploaded' column should all be 0 3.  In MicroStrategy Desktop, create a freeform report "R1" based on the new table "photo_upload" in Tutorial data created at step 1, as shown below:   SELECT Location, Description, ID, uploaded, numbers FROM PHOTO_UPLOAD 4.  Create another table for transaction insert SQL. Make sure to create an 'autonumber' type ID as primary key for this table, or auto_increment ID for different DBs.                     5. Create...

Create an alert-based subscription in MicroStrategy Distribution Services

Create an alert-based subscription in MicroStrategy Distribution Services on Web Subscription to a report or Report Services document which will be executed when a certain conditional threshold is met based on another executing report. For example, a scheduled report executes which shows the Revenue by day for the past week. If the Revenue on any one day falls below a certain value, a subscription to another report or Report Services document can be triggered and delivered to a recipient. An alert based subscription can only be created directly on a report; however, another report or Report Services document can be delivered when the alert based subscription is triggered. Note: you need a grid report to create an alert and you cannot create if you want to create on a document with text boxes. The following example will walk through the basic steps on how to setup a subscription based on an alert like this: Follow the brief  steps bel...

Predictive modelling in Data Science using Microstrategy

Creating a predictive modelling in MicroStrategy MicroStrategy Data Mining Services has been evolving to include more data mining algorithms and functionality. One key feature is MicroStrategy Developer’s Training Metric Wizard. The Training Metric Wizard can be used to create several different types of predictive models including linear and exponential regression, logistic regression, decision tree, cluster, time series, and association rules. Linear and exponential regression The linear regression data mining technique should be familiar to you if you have ever tried to extrapolate or interpolate data, tried to find the line that best fits a series of data points, or used Microsoft Excel’s LINEST or LOGEST functions. Regression analyzes the relationship between several predictive inputs, or independent variables, and a dependent variable that is to be predicted. Regression finds the line that best fits the data, with a minimum of error. For example, you have a dataset ...

Settings for Outer Join between metrics in MicroStrategy

Settings for Outer Join between metrics in MicroStrategy MicroStrategy adopts multi-pass logic to determine the execution plan for a report. This means that every metric is evaluated in separate SQL passes. Outer Joins come into play when MicroStrategy Engine merges the results from all SQL passes into one report. For a multi-pass report, different Outer Join behaviors can give the user completely different results. In addition, report metrics can be of different types which can, in some cases, influence the result of the outer join. In MicroStrategy, there are two settings that users can access to control Outer Join behavior : Formula Join Type and Metric Join Type . Metric Join Type: VLDB Setting at Database Instance Level Report and Template Levels Report Editor > Data > Report Data Options Metric Level   Metric editor > Tools > Metric Join Type Control Join between Metrics Formula Join Type: Only at Compound/Split...