SSAS Many to Many Optimization Technique #2 (using partitions)

Yep, I am finally getting around to posting my *second* idea on optimizing many-to-many relationships.  Really, its more than an idea as I have used the technique successfully on a couple client projects.   Here's the summary, then I'll walk through how to implement this with Adventureworks...   Partition your intermediate MGs by date and include the date dim relationship in the dimension usage.   Again, the same goal is in play, which is how to reduce the number of records that are in the intermediate measure group.  In the prior post, I walked through the Matrix idea to actually compress the number of rows in the intermediate measure group.  This approach is a little different:   M2M queries use *ALL* the common dimensions between the intermediate measure group and the other measure groups needed when a M2M dimension relationship is used. The date dimension is the most common dimension used to filter or slice in MDX queries THEREFORE: If you add the Date/time dimension to the intermediate measure group AND partition by the date dimension, then when M2M queries are run that use the date dimension to filter/slice, the number of rows needed for the runtime join *can* be significantly reduced.   That's it!  And it works, I worked with a client where we had an intermediate measure group with 150M rows. Including this reduced queries times in some cases by 10-50X.  Here's the example for AdventureWorks and the proof.   Implementation Example:   1.) In DSV, right-click on Fact Sales Reason
replace table -> with new names query   SELECT FISR.SalesOrderNumber, FISR.SalesOrderLineNumber, FISR.SalesReasonKey,
FIS.OrderDateKey, FIS.DueDateKey, FIS.ShipDateKey
FROM dbo.FactInternetSalesReason FISR
INNER JOIN dbo.FactInternetSales FIS
ON FISR.SalesOrderNumber = FIS.SalesOrderNumber
AND FISR.SalesOrderLineNumber = FIS.SalesOrderLineNumber   save/close DSV   2) Go to Partitions and expand the "Sales Reasons" partitions rename Internet_Sales_Reasons to "Internet_Sales_Reasons_2001"   Change the source to be a query binding and use the following query:   SELECT FISR.SalesOrderNumber, FISR.SalesOrderLineNumber, FISR.SalesReasonKey,
FIS.OrderDateKey, FIS.DueDateKey, FIS.ShipDateKey
FROM dbo.FactInternetSalesReason FISR
INNER JOIN dbo.FactInternetSales FIS
ON FISR.SalesOrderNumber = FIS.SalesOrderNumber
AND FISR.SalesOrderLineNumber = FIS.SalesOrderLineNumber
WHERE FIS.OrderDateKey <= '184'   Create 3 new identical partitions with different SQL WHERE filters...   "Internet_Sales_Reasons_2002"
...WHERE FIS.OrderDateKey >= '185' AND FIS.OrderDateKey <= '549'   Internet_Sales_Reasons_2003
...WHERE FIS.OrderDateKey >= '550' AND FIS.OrderDateKey <= '914'   Internet_Sales_Reasons_2004
...WHERE FIS.OrderDateKey >= '915' AND FIS.OrderDateKey <= '1280'   Here's what your partitions will now look like:   SSAS Partitions   3.) Go to the dimension usage tab and add relationships between the Sales Reasons measure group and create regular relationships to the Date, Ship Date, and Delivery Date dimensions.
Use the Time key... for the relationship.  Here's what it looks like: Dim Usage   Optional Step: Change the Internet Sales Detail to MOLAP.  The only thing this does is make it easier to see the, otherwise the M2M queries will submit SQL to the DB to get the result.  It still works both ways, but a little cleaner with MOLAP.   Here's the XMLA code of the project.   The Results
Now, turn on SQL Server Profiler and connect to SSAS.  Run the following query against the updated version of ADW...   SELECT [Measures].[Internet Order Quantity] ON COLUMNS
, [Sales Reason].[Sales Reasons].[Sales Reason] ON ROWS
FROM [Adventure Works]
WHERE [Date].[Calendar].[Calendar Quarter].&[2002]&[3]   Check out the trace.  Specifically look at the Partition entries for Sales Reasons partitions.   "Finished reading data from the 'Internet_Sales_Reasons_2002' partition."   Partitions 2003 and 2004 are not being scanned for the query... (NOTE: 2001 actually IS read and the only reason is because it only has 1454 rows, and SSAS will always read it)   Multiply this out to millions of records and dozens of partitions and your M2M queries that include a date filter or slice will perform way better.   The 2 keys to implementing this are: 1.) Include partitions in your M2M measure groups on Date ranges AND 2.) be sure to add measure group relationships to the date dimension(s) in the dimension usage.   That's it.   Regards, Erik  
 •  0 comments  •  flag
Share on Twitter
Published on July 12, 2007 14:22
No comments have been added yet.


Erik Veerman's Blog

Erik Veerman
Erik Veerman isn't a Goodreads Author (yet), but they do have a blog, so here are some recent posts imported from their feed.
Follow Erik Veerman's blog with rss.