The Cubewise Code division wrote a great article recently “7 tips to writing faster IBM TM1 and Planning Analytics rules”. They show different techniques of how you can write rules to improve their performance. I presented the same topic on the European Cubewise Conference in Paris this year and think that it deserves a bit of the background and some real-life examples. In this article, I want to focus on the “why”, “how” and “when” you can measure the performance of rules.
IBM Planning Analytics powered by TM1 is an example of an implementation of a Functional Database – a marriage between multidimensional in-memory OLAP database and a spreadsheet-like cell oriented calculation engine. It takes the best from both worlds and unifies them in one powerful tool.
Multidimensionality allows designing a model that reflects business complexity in your company and allows fast and efficient analysis in a pivot table style. TM1 uses a unique Sparse Data Consolidation Algorithm that allows the creation of huge cubes with a big number of dimensions and a literally unimaginable number of potential cells.
Each cell, like in Excel, can be a simple entered value or can be a function calculated from different cells, which in turn can be simple values or functions. The calculations are called “Rules” and are the heart of TM1 modelling, as they allow you to model all the complexity needed by your business.
TM1 \ PA is extremely scalable and can deal with huge datasets. Rules give the functionality of real-time calculations, that gather information from different modules and give one, reliable answer about the result of model calculations. They are relatively easy and quick to develop and can evolve with your business to reflects changing needs. But, as somebody said – with the great power comes great responsibility. When you write your rules, the same result can be acquired in different ways. Each time you plan the design of the rules, you should have few things in the back of your head:
Some of the goals are contradictory and achieving all of them is not always possible, so compromises must be made. Until Planning Analytics 2.0 measuring the performance of different calculations methods was complicated as there was no tool. Developers usually trusted their experience, gut feelings and myths while writing rules. Sometimes they used a stopwatch to check how fast rules calculated.
The situation changed for the better with Planning Analytics version 2.0 when the }StatsByRule cube was added to TM1’s Performance Monitor. Now you can monitor how many times each rule is called and how long it takes to calculate each rule line.
It is very easy to enable and start rule performance monitoring. You just need to set “RULE_STATS” property of a cube to “YES” and start Performance Monitor.
Now each time user or a process queries rule-calculated values in the cube, Performance Monitor tracks statistics and writes the results to the }StatsByRule cube. There are a few things that you need to be aware of how the mechanism works:
When you open the }StatsByRule cube you will see that it has three dimensions: }Cubes with cube list, }LineNumber with starting line of monitored rule and }RuleStats with statistics measures:
When you start the analysis, the first thing to check is Total Time (ms) measure. You should focus on the rule that takes the longest to execute. Using the techniques from Code’s article, you can try different approach and make the rule faster.
Sometimes you may find that a rule identified as slow might not be the one that causes the problem. When the rule execution is measured, the total time is shown as a result, including other rules called for pre-cursor dependent calculations. For example, let’s assume that rule was found in Cube A with execution time of 1011 milliseconds. After analysis of the data flow, it appeared that there is a chain of calculations to provide a result. Data transfer from C to B and from B to C is performing OK. The core reason for a problem is Cube C that performs slow allocation.
Feeders can also be analysed as well using }StatsByRule Cube. The time showed for feeders also include the time of all other feeders executed by the one measured. The Run Count shows how many times a particular feeder line was fired, not the number of fed cells. If there is low Total Run Count for feeder, but long execution time, there is a good chance that there is overfeeding in the feeder chain.
Although the result of measurements is very precise and can indicate problematic areas, you need to remember that results are environment-specific and can give different results, even on similar models. Some of the factors that affect the performance of the rules are:
Analysis of the rule performance is a time-consuming task, especially if you are trying different techniques to speed them up. You shouldn’t run it on a production server, as there may be problems with instance stability while doing the analysis. If you are trying to correct rule performance and changing rules often, it is a normal fact of development that you can expect reconciliation differences, locks and the other usual suspects.
I see three main scenarios in which it is worth the effort to check rule performance.
In the following article I will show how I performed an analysis on a client database and will show a comparison of the different techniques used. If you find this part interesting I highly recommend reading the second, as some of the findings are quite fascinating and sometimes counter-intuitive.