Jun 1 2020

How to use the Synchronize Function with RunProcess to Limit the Number of Threads

What Is RunProcess?

RunProcess is a TM1 TurboIntegrator function that allows you to run multiple processes in parallel, each using its own thread. This speeds up data processing and data loads, but there is currently no built-in logic or feature to limit the number of threads that RunProcess can start.

RunProcess threads need to be managed for a couple of reasons.

If too many threads are kicked off, all available CPUs will be used to try to run them. If there are more threads than available CPUs, the system becomes severely resource-constrained: TM1 (and every other program on the server) becomes slow or even unresponsive. If too many threads are released at once, the TM1 server may crash.
Each thread runs independently and can see only its own changes until they are committed. Therefore, if thread A and thread B read from or write to the same intersection, neither sees changes made by the other, or by any other simultaneous thread, until the final commit after all threads are completed. This can cause incomplete loads or, in some cases, data duplication. Any dependencies need to be carefully managed.

Because the TM1 Server does not manage the independent threads, and the RunProcess function has no native way to do so, it can be difficult to use RunProcess in a production environment where data integrity is key. What is the workaround?

The Synchronize Function

The Synchronize TurboIntegrator function (written Synchronized in TI code) can be used to force serial execution of multiple processes by using a defined “lock object”. The call can be placed anywhere in a TI script, but it applies to the entire TI process once it is encountered. Unless it is implemented via conditional logic, it is generally placed as one of the first lines on the Prolog tab.

Used in conjunction with RunProcess, this function lets the developer or end user manage the threads.

How do you Implement Synchronize with RunProcess?

There are a couple of ways to implement Synchronize. The simplest requires only a few lines of code and relies on modular arithmetic: Mod(nModDividend, nModulus) returns the remainder of nModDividend/nModulus. We can use the Mod function to turn a counter of executed processes into a lock object name, which limits the maximum number of concurrent threads to whatever nModulus is set to.

In the Wrapper Process:

In the declarations on the wrapper process we declare two variables with the following code:

### – Initialize dividend and modulus

nModDividend = 0;

nModulus = 25;

### – End

Before executing RunProcess, the following lines of code should be included:

### – Use modular arithmetic to create “groups” of concurrent instances of the below process

sRemainder = NUMBERTOSTRING(MOD(nModDividend, nModulus));

sLockObj = sThisProcName | sRemainder;

RUNPROCESS (Processname, param1, param2, param3…..'pLockObj', sLockObj);

### – After RunProcess is executed Increment dividend by 1

nModDividend = nModDividend + 1;

In the Child Process:

The child process needs a string parameter for pLockObj. The parameter should be left blank and only contain a value if passed in from the wrapper. Three lines of code should be placed at the very top of the process, before the declarations:

IF (pLockObj @<> '');

SYNCHRONIZED (pLockObj);

ENDIF;

That is all the code needed.

Since we have set nModulus to 25 to limit the number of threads, this is what happens:

On the first iteration, the lock object for remainder “0” is passed to the child TI, because Mod(0,25)=0. The child TI uses that lock object in the synchronize function. The thread executes immediately.
On the second iteration, the lock object for remainder “1” is passed to the child TI, because Mod(1,25)=1. The child TI uses that lock object in the synchronize function. The thread executes immediately.
On the 25th iteration, the lock object for remainder “24” is passed to the child TI, because Mod(24,25)=24. This is the maximum remainder for our Mod function. The child TI uses that lock object in the synchronize function. The thread executes immediately.
On the 26th iteration, the lock object for remainder “0” is passed to the child TI, because Mod(25,25)=0. This is our first repetition of a lock object. The child TI uses that lock object in the synchronize function. If the thread from the 1st iteration has not committed, the process will queue and enter a wait state until the lock is free.

The table below shows the full iterations.
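The same mapping can also be regenerated with a short script. The following is a minimal Python sketch (not TI code) that mirrors the wrapper's arithmetic; the process name "Wrapper" and the function names are hypothetical and chosen only for illustration.

```python
def lock_object_name(process_name, dividend, modulus):
    """Mirror sThisProcName | NUMBERTOSTRING(MOD(nModDividend, nModulus))."""
    return f"{process_name}{dividend % modulus}"


def iteration_rows(process_name, iterations, modulus):
    """Yield (iteration, dividend, remainder, lock name, reused) per RunProcess call."""
    for dividend in range(iterations):
        remainder = dividend % modulus
        reused = dividend >= modulus  # True once this remainder has appeared before
        name = lock_object_name(process_name, dividend, modulus)
        yield dividend + 1, dividend, remainder, name, reused


if __name__ == "__main__":
    # Print only the iterations discussed in the text, plus the one after.
    for row in iteration_rows("Wrapper", 27, 25):
        if row[0] in (1, 2, 25, 26, 27):
            print(row)
```

A row whose last field is True reuses a lock object from an earlier iteration, so that call may have to wait for the earlier one to finish.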

Now with this simple mechanism even if we release a stack of 1000 RunProcess calls, only 25 can execute at once.
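The cap can be checked outside TM1 with a rough simulation. The Python sketch below is an assumption-laden stand-in, not TI code: each thread plays a child process, a Python lock per name plays the lock object, and a short sleep plays the child's work. It releases the lock when the work ends, whereas TM1 releases the lock object when the process commits. The name "Wrapper" and the function `simulate` are hypothetical.

```python
import threading
import time
from collections import defaultdict


def simulate(total_calls, modulus, work_seconds=0.001):
    """Return the peak number of simulated child processes running at once."""
    locks = defaultdict(threading.Lock)  # one lock per lock object name
    guard = threading.Lock()             # protects the locks dict and the counters
    running = 0
    peak = 0

    def child(lock_name):
        nonlocal running, peak
        with guard:
            lock = locks[lock_name]
        with lock:                       # stands in for SYNCHRONIZED (pLockObj)
            with guard:
                running += 1
                peak = max(peak, running)
            time.sleep(work_seconds)     # stands in for the child's real work
            with guard:
                running -= 1

    threads = []
    for dividend in range(total_calls):  # stands in for the wrapper's loop
        lock_name = f"Wrapper{dividend % modulus}"
        thread = threading.Thread(target=child, args=(lock_name,))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return peak
```

Calling `simulate(1000, 25)` should never report a peak above 25, because at most one thread can hold each of the 25 lock names at a time. The actual peak may be lower, depending on how the threads happen to be scheduled.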

The number of threads that can be run in parallel requires some testing based on your environment and the CPUs available. Beyond a certain threshold, adding threads makes processes slower, and running slightly fewer threads performs better.