Saturday, January 5, 2008

MDC---PART 1

Multi dimensional clustered tables can be used to improve performance in many cases.With clustered index you can only cluster data in one dimension.With MDC you can cluster data in multiple dimensions.The other difference between MDC and clustered index is that MDC does not require reorgs as the data is automatically clustered.Below is a create statement for a MDC table for MAKE base table.
CREATE TABLE MAKEMDC(
ID INT,
MAKE CHAR(30),
MODEL CHAR(30)
)
ORGANIZE BY (MAKE,MODEL)

CELL: Every unique combination of dimensions form a cell.A cell can have one or more blocks.
BLOCK:It is a block of pages that contain a unique combination.Its size(blocking factor) is equal to the extent size defined.
A cell for unique combination of FORD(make) FOCUS(model) will be like this

DESIGN OF THE MDC TABLES:
Number of cells can be estimated by executing a select with distinct clause on the dimension columns to arrive at the unique combinations of the dimension values.
For example, for the above mentioned MDC table,the number of cells can be known by executing the following query:
SELECT DISTINCT MAKE,MODEL from MAKEMDC;
A large value of C/R might indicate the wastage of space where C indicates the number of cells and R indicates the total number of rows in the table.So,dimension columns should be wisely chosen.Choosing a firstname column as dimension column with employeeid might generate lot of cells and thereby wastage of space.A column like state with 10 possible values might be a good candidate as it would have only 10 possible values

Space occupied per cell(SC) = R/C * average row size
If SC is less than (extentsize*pagesize) then there is a wastage of space.So,the design should be done in a way such that SC matches up the (extentsize*pagesize) value.
Following are the options that can be used to design MDCs optimally:
1)Decrease the extent size if the blocks are not filled up.Page size can be increased if there are too many blocks per cell
2)Granularity can be varied to decrease the cells ..For example a new dimensional column can be added to the totalbalance table defined as totalbalance/100 to decrease the cells.
3)New dimensional keys can be added or existing ones can be removed depending on the situation.
4)A larger extent size might help in reducing the I/O but it might impact the usage of the space.
5)It is always better to do a comparison with the base table before the MDC table is actually used.For example MAKEMDC mentioned above should be compared with the MAKE table(base table) interms of space usage and performance before the decisions are actually taken.

Reference::Advanced DBA Certification Guide and Reference for DB2 UDB by Snow and Thomas






No comments: