Free Training


C Language  |  CSS  |  MainFrame  |  VBScript  |  PHP  |  XML  |  C++ Tutorials  |  Ajax  |  JavaScript  |  CSS3  |  UML  |  jQuery  |  Microsoft AJAX

MSAS Tutorials

 
Home Tutorials MSAS
 

Tutorial 48: MSAS - Introducing Dimension and Cube Processing

 
The Analysis server is like a black box. Everything that happens in the server is recorded, but the end user sees a multidimensional cube created out of the data in the data warehouse and has very little taste of what really goes on behind the scenes. In this section we will be examining this black box to understand how the Analysis server processes Dimensions and cubes. We will also learn how to optimize cube processing and how to troubleshoot cube processing.

Processing Dimensions: Introduction


Dimensions are building blocks on which the cubes are constructed. When a dimension is processed the Analysis server creates an SQL statement to extract the necessary information from the data warehouse dimension table. At least one or two columns are extracted for each level in the hierarchy. If the member key column and the Member name column properties are the same, the Analysis service extracts only one of them, else it extracts both. One row is extracted for each distinct level key and the rows are sorted using the Member key Column property for the levels. For instance the user wants to process a region dimension containing the following rows and columns.


Region_id


Sales_region


Sales_city


Sales_country


5


Canada West


Vancouver


Canada


6


Canada West


Victoria


Canada


2


Mexico Central


Mexico City


Mexico


4


Mexico West


Guadalajara


Mexico


1


Central West


San Francisco


USA


3


South West


Los Angeles


USA


7


South West


San Diego


USA


8


South West


Coronado


USA


9


South West


National City


USA


10


South West


Lincoln Acres


USA



The data in the warehouse is organized from a relational database perspective. The user however wants the data from a hierarchical perspective. Analysis Services extracts data from the warehouse by creating a unique path that contains a component number for each level of the hierarchy. The Region dimension contains three levels. The region_id and the Sales_region belong to the same level and the Sales_city and Sales_country have one level each.


When the data is retrieved from the data warehouse, the Analysis services begins by creating the All Level member of the dimension.:-0-> 0-> 0. The server then creates the path for the next row which contains the members for Sales_country, Sales_city and Sales_region. In other words the server creates four members for the first row from the relational dimension.:- An ALL Level member and the members Canada West, Vancouver and Canada. Only three members are created for the rows that follow. At the end of the task the server would have created a unique path for each member of the dimension. The path contains the genealogy of each member. The Region dimension, then contains, the member names and paths shown below.


Complete member name


Member path


Member ID


[North America]


0->0->0


1


[North America].[Canada]


1->0->0


2


[North America].[Canada].[Canada West]


1->1->1


3


[North America].[Canada].[Canada West].[Vancouver]


1->1->1


4


[North America].[Mexico]


2->0->0


5


[North America].[Mexico].[Mexico Central]


2->1->0


6



Note that the path is created after sorting out the children of the member by using the value of the Member Key column property, regardless of the value of the Order By property. Also note that a separate ID for each member is created and the Order By property is taken into consideration while creating the ID. However the sorting is not on alphabetical order. The sequence number of the ID matches the sequence of the path numbers by and large and sorting is done on names by default. But the Analysis server follows the logic of the hierarchy order rather than the alphabetical order. If the region id of a member(whose name is lower on the alphabetical order) is higher than the region id of another member (whose name is higher on the alphabetical order) the former is placed above the latter. For instance if the region id of San Francisco(1) is higher than Coronado(8), then San Francisco will be placed above Coronado in the hierarchical order. The Multidimensional Expressions (MDX) query retrieves data in the hierarchical order.


The Analysis server then combines the paths from all the members and creates a map for the dimension. The dimension map allows the Analysis server to slice and dice hierarchies very quickly.


Processing Cubes: Introduction

The Analysis service is extremely efficient in creating relatively small cubes from massive data tables in the data warehouse. A conceptual understanding of how the process works would be useful in designing the cube and in retrieving data for reporting.



As stated earlier, cubes contain dimensions combined with one or more measures. These dimensions form a structure or organization for the data values in the cube. The dimensions used in the cube must be processed before the cube can be processed.


SQL statement is executed by Analysis services when a cube is processed. This statement retrieves values from the fact table. The columns retrieved completely identify each member and all the measures used. A compound path is then, created for each row. Let us look at the rows of a sample cube:


Year


Quarter


Month


Sales_country


Sales_region


Sales_City


Unit_sales


1998


1


1


Canada


Canada West


Vancouver


320


1998


1


1


Canada


Canada West


Victoria


300


1998


2


2


Mexico


Mexico Central


Mexico City


234


1998


1


3


Mexico


Mexico West


Guadalajara


567


1998


4


2


USA


Central West


San Francisco


876


1998


3


4


USA


South West


Los Angeles


234


1998


2


1


USA


South West


San Diego


213


1998


1


5


USA


South West


Coronado


345


1998


4


1


USA


South West


National City


987


1998


5


2


USA


South West


Lincoln Acres


231



The dimensions in the row are Region and Time. The server finds the leaf level member path for each dimension and combines the paths to create a row. Schematically the path would read something like this 1->1->1-> .1->1->1. This internal path of the cube is a number generated consisting of one sub number for each level of the dimension used in the cube. The dimensions in the above cube contain three levels each. Therefore it has a path containing 6 numbers one for each of the three levels of each dimension. The more dimensions a cube has the more numbers that are generated for the levels in the dimensions. The more the information stored in the cube.


Analysis Service creates a data file to store the cells of a cube. When a single row is processed, a single row is extracted from the fact table and the path for the leaf level cell is calculated. The server then checks to see whether a leaf level cell with that path already exists. If it exists, the server adds a new measure to the one already in the cell. If the cell does not exist, then, the server creates a new leaf level cell storing both the path and the value of the measure. This process is not done for ROLAP and HOLAP storage.



Next the Analysis server creates cells for the aggregations designed by the user. If the cell exists the measures are added to it, if not a new cell is created and the measures are stored in it. This completes the processing for a single row of the cube. This process is repeated until all rows are done.


During this process Analysis Server also creates a number of index files to facilitate rapid retrieval of the values. Once all the values have been accumulated the Process Log window announces the successful completion of the processing.



Read Next: Tutorial 49: MSAS - Processing Dimensions and Cubes



 

 

Comments



Post Your Comment:

Members Please Login
Your Name:*
e-mail ID:(required for notification)*
Image Verification: 
 
 Subscribe    

Weekly Offers

Sponsored Links