Introduction
The user-controlled threading tools in GAUSS are designed to provide great power and control to the user, while at the same time being easy to use. In this section, we will examine the GAUSS threading functions and the rules for their usage.
GAUSS threading functions
- threadfor and threadendfor: divide the loop iterations among the available threads.
- threadstat: marks a single line of code to be executed as a thread.
- threadbegin and threadend: mark the beginning and end of a multi-line block of code to be executed as a thread.
- threadjoin: completes the definition of a set of threads to be executed simultaneously (called “sibling threads”), waiting until they are all done before continuing on in the calling thread (or “parent thread”). Since users may create threads within threads, the calling thread may or may not be the main GAUSS execution thread. Every group of threadstat and/or threadbegin/threadend statements must be followed by a call to threadjoin.
We will begin by illustrating the most basic usage of these threading commands. Next, we will explain some important concepts required to correctly use GAUSS threads in nontrivial cases and then illustrate with a simple Monte Carlo experiment. Try running the following code:
//Create a 5x5 matrix of 1's
r = 5;
x = ones(r,r);
//Compute matrix addition simultaneously in separate threads
threadstat z = x + x;
threadstat z2 = x + x;
threadjoin;
As you would expect, after the above code, z and z2 should both be 5x5 matrices where every element is equal to two.
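The threadfor and threadendfor pair from the list above can be sketched in the same spirit. This is a minimal sketch, not a definitive usage: the names n and y are illustrative, and it assumes threadfor takes the same (start, stop, step) loop header as the for loops used elsewhere in this tutorial. Each iteration writes only to its own element of y, so no two iterations touch the same data.

```gauss
//Number of elements to fill (illustrative value)
n = 4;
//Pre-allocate the result so each iteration writes to its own element
y = zeros(n,1);
//The loop iterations are divided among the available threads
threadfor i(1,n,1);
    y[i] = i * i;
threadendfor;
print y;
```

If the sketch behaves as intended, y holds the squares 1, 4, 9 and 16.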
What are GAUSS threads?
User-defined GAUSS threads are sections of GAUSS code that can be run simultaneously. Each thread can access and use any previously created global symbols. In the example above, both of the threads created by the threadstat statements can access the global matrix x. If desired, they could also reference the scalar variable r.
GAUSS threads can create new global symbols or change the value of global symbols that already exist. Either enter the show command at your GAUSS command prompt or examine the Data Page in the GAUSS GUI to see that this example program created two new global variables, z and z2. These variables may be referenced by code that comes after the threadjoin statement in your program or used interactively after the program ends.
Data Integrity
You can see in the example above that we are writing to different variables inside the two separate threads. This is very important. Never write to the same global variable inside threads that could run simultaneously: two threads cannot write to the same variable at one time.
Further, if one thread writes to a global variable, no sibling threads (threads executing simultaneously) may read or write to that variable. This is called the writer-must-isolate rule. The reason for this rule is that we cannot know precisely when any given portion of code within a thread will be run. Therefore, if thread 1 reads from a variable that thread 2 is writing to, we cannot know whether thread 2 has written to the variable or is in the process of writing to it when thread 1 is trying to read it.
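The rule can be made concrete with a small sketch. The variable names a, b and c are illustrative and do not come from the tutorial's own examples. The commented-out block shows a pattern that breaks the rule, and the live code shows one way to stay within it.

```gauss
x = ones(3,3);
//NOT allowed: the second thread reads 'a' while its sibling writes to it
//threadstat a = x + x;
//threadstat b = a + x;
//threadjoin;

//Allowed: each thread writes to its own variable and reads only 'x',
//which no sibling thread writes to
threadstat a = x + x;
threadstat b = x + x + x;
threadjoin;
//Both threads have finished, so 'a' and 'b' are now safe to read
c = a + b;
```

Reading a and b only after threadjoin is the key design choice: by then neither thread can still be writing.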
To illustrate one method to deal with this restriction, let us return to our threading example above. Assume that the purpose of this program is to calculate x+x+x+x. Since we cannot write to our variable z twice, we are making two partial sums. z contains the first partial sum and z2 contains the second partial sum. After the threadjoin, we can add up the two partial sums:
r = 5;
x = ones(r,r);
//Compute partial sums
threadstat z = x + x;
threadstat z2 = x + x;
threadjoin;
//Final sum
z = z + z2;
Obviously, this is not the simplest method to calculate x+x+x+x. However, the concepts remain the same for more complicated examples. Also, remember that you could use this same process to compute a partial product or any type of intermediate value.
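As a hedged illustration of the partial-product idea, the sketch below multiplies the integers 1 through 8 in two halves. The names v, p1, p2 and p are illustrative; seqa and prodc are the standard GAUSS sequence and column-product functions.

```gauss
//Illustrative data: the integers 1 through 8
v = seqa(1,1,8);
//Each thread multiplies its own half of the vector
threadstat p1 = prodc(v[1:4]);
threadstat p2 = prodc(v[5:8]);
threadjoin;
//Combine the partial products after both threads have finished
p = p1 * p2;
```

Here p1 is 24 and p2 is 1680, so p should equal 8! = 40320.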
Example: Threaded Monte Carlo experiment
//Set seed for repeatable random numbers
rndseed 45234;
num_flips = 1000;
num_experiments = 1500;
max_heads = 0;
//Run loop 'num_experiments' times
for i(1,num_experiments,1);
    //Start this experiment with zero heads
    heads = 0;
    //Create 'num_flips' uniform random numbers
    x = rndu(num_flips,1);
    //Check each flip to see if it is a head
    for j(1,num_flips,1);
        //If it is a head, increase our heads counter
        if x[j] < 0.5;
            heads = heads + 1;
        endif;
    endfor;
    //Check to see if we have a new maximum number of heads
    if heads > max_heads;
        max_heads = heads;
    endif;
endfor;
//'ntos' turns a number into a string for our print statement
//The '$+' operator combines strings together
print "Over "$+ntos(num_experiments)$+" simulations, of "$+ntos(num_flips)$+" coin flips each";
print "the greatest number of heads in one simulation was "$+ntos(max_heads);
The above code should produce the following output:
Over 1500 simulations, of 1000 coin flips each
the greatest number of heads in one simulation was 546
Problem: Convert previous code to use threading
Take a moment to look over the single-threaded version of this program and think about how you might parallelize it (that is, run parts of it simultaneously in different threads). Also, consider the potential problems in threading this program. There are two primary issues to resolve. First, we have to decide how to break this problem up to be run in parallel. Second, we need to decide how to avoid writing to the same variables in different threads at the same time.
Solution
In response to the first issue, we will create two threads, each to run half of the total number of iterations. To deal with the second issue, we will create a GAUSS user-defined procedure that will run our simulation. Because the variables inside a procedure are local rather than global, each thread works on its own copies, so we do not need a separate set of variable names for each thread. The procedure also encapsulates some of our code and allows for code reuse. With the procedure in place, our new program file looks like this:
//Set seed for repeatable random numbers
rndseed 45234;
num_flips = 1000;
num_experiments = 1500;
threadbegin;
    max_heads1 = coinflips(num_flips,num_experiments/2);
threadend;
threadbegin;
    max_heads2 = coinflips(num_flips,num_experiments/2);
threadend;
threadjoin;
//Keep the larger of the two per-thread maximums
if max_heads1 > max_heads2;
    max_heads = max_heads1;
else;
    max_heads = max_heads2;
endif;
print "Over "$+ntos(num_experiments)$+" simulations, of "$+ntos(num_flips)$+" coin flips each";
print "the greatest number of heads in one simulation was "$+ntos(max_heads);
//Create procedure to perform computation from earlier code snippet
proc (1) = coinflips(num_flips,num_experiments);
    local heads, max_heads, x;
    max_heads = 0;
    //Run loop 'num_experiments' times
    for i(1,num_experiments,1);
        //Start this experiment with zero heads
        heads = 0;
        //Create 'num_flips' uniform random numbers
        x = rndu(num_flips,1);
        for j(1,num_flips,1);
            //Check each flip to see if it is a head
            if x[j] < 0.5;
                heads = heads+1;
            endif;
        endfor;
        //Check to see if we have a new maximum number of heads
        if heads > max_heads;
            max_heads = heads;
        endif;
    endfor;
    retp(max_heads);
endp;
Since we used the same random seed in this code snippet, if we did not make a mistake, we should get the same output as above:
Over 1500 simulations, of 1000 coin flips each
the greatest number of heads in one simulation was 546
After encapsulating the majority of the code inside of the new procedure, we are left with only two (temporary) global variables that we are writing to inside our two threads (max_heads1 and max_heads2). After reading the next section, come back to this program and use it as a starting point to explore the GAUSS threading functions further. Two sample problems you might try are:
- Modify the procedure and program so that it returns the minimum number of heads from a simulation.
- Modify the program to create four threads.
Summary
- User-defined threads in GAUSS are portions of code that may be executed simultaneously.
- These threads can use any previously created global symbols. However, the writer-must-isolate rule says that if a thread writes to a global variable, no other sibling thread may access that variable.
- The same procedure may be called from multiple sibling threads. Since the variables in a procedure are local in scope and not global, you do not have to worry about writing to variables with the same names inside a procedure.
FAQ
Q: Can I create GAUSS threads inside of other GAUSS threads?
A: Yes, you can create threads that call threads. GAUSS will allow you to nest threads as deeply as you would like. However, keep your system resources in mind.
Q: Can I create GAUSS threads inside of procedures?
A: Yes, you can create GAUSS threads inside of procedures. However, the threads created in your procedure must obey the writer-must-isolate rule for local variables in the procedure. For example, the following procedure is illegal:
proc myFunction(x,y);
    local z;
    z = 10;
    threadstat z = x'x;
    threadstat y = y'y+z; //Read from variable 'z',
                          //written to in other thread
    threadjoin;
    retp(y);
endp;
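One way to make this procedure legal is sketched below. It rests on two assumptions that the original does not state: that the intent was to add x'x to y'y, and that x and y have the same number of columns so the two products are conformable. The extra local w is hypothetical and introduced only for this illustration.

```gauss
proc (1) = myFunction(x,y);
    local z, w;
    //Each thread writes to a different local and reads only the inputs
    threadstat z = x'x;
    threadstat w = y'y;
    threadjoin;
    //'z' and 'w' are read only after both threads have finished
    retp(w + z);
endp;
```

Moving the addition after threadjoin removes the read of z from the sibling thread, which is what the writer-must-isolate rule requires.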
Q: Can I create GAUSS threads inside of for or do loops?
A: Yes, you can create GAUSS threads inside of for loops and inside of do loops.
Q: How many threads can I create in GAUSS?
A: GAUSS allows you to create an unlimited number of threads. You are limited only by your available hardware.
Q: How many threads should I create?
A: This is system and problem dependent. A good starting guideline is to create between 1 and 2 threads per core on your system. In some cases, however, you may be able to profitably create many more threads than that. See the next part in this tutorial series for more details on how to use threading most effectively.
Next section of this tutorial: GAUSS Threading Performance Considerations.