Below, we will give a short example of a profiler session. See also Profiling for the documentation of the profiler functions in detail. Consider the code:
global N A;
N = 300;
A = rand (N, N);
function xt = timesteps (steps, x0, expM)
global N;
if (steps == 0)
xt = NA (N, 0);
else
xt = NA (N, steps);
x1 = expM * x0;
xt(:, 1) = x1;
xt(:, 2 : end) = timesteps (steps - 1, x1, expM);
endif
endfunction
function foo ()
global N A;
initial = @(x) sin (x);
x0 = (initial (linspace (0, 2 * pi, N)))';
expA = expm (A);
xt = timesteps (100, x0, expA);
endfunction
function fib = bar (N)
if (N <= 2)
fib = 1;
else
fib = bar (N - 1) + bar (N - 2);
endif
endfunction
If we execute the two main functions, we get:
tic; foo; toc;
⇒ Elapsed time is 2.37338 seconds.
tic; bar (20); toc;
⇒ Elapsed time is 2.04952 seconds.
But this does not give much information about where this time is spent;
for instance, whether the single call to expm is more expensive
or the recursive time-stepping itself. To get a more detailed picture,
we can use the profiler.
profile on;
foo;
profile off;
data = profile ('info');
profshow (data, 10);
This prints a table like:
# Function Attr Time (s) Calls
---------------------------------------------
7 expm 1.034 1
3 binary * 0.823 117
41 binary \ 0.188 1
38 binary ^ 0.126 2
43 timesteps R 0.111 101
44 NA 0.029 101
39 binary + 0.024 8
34 norm 0.011 1
40 binary - 0.004 101
33 balance 0.003 1
The entries are the individual functions which have been executed (only
the 10 most important ones), together with some information for each of
them. The entries like ‘binary *’ denote operators, while other
entries are ordinary functions. They include both built-ins like
expm and our own routines (for instance timesteps). From
this profile, we can immediately deduce that expm uses up the
largest proportion of the processing time, even though it is only called
once. The second expensive operation is the matrix-vector product in the
routine timesteps. 1
Timing, however, is not the only information available from the profile.
The attribute column shows us that timesteps calls itself
recursively. This may not be that remarkable in this example (since it's
clear anyway), but could be helpful in a more complex setting. As to the
question of why is there a ‘binary \’ in the output, we can easily
shed some light on that too. Note that data is a structure array
(Structure Arrays) which contains the field FunctionTable.
This stores the raw data for the profile shown. The number in the first
column of the table gives the index under which the shown function can
be found there. Looking up data.FunctionTable(41) gives:
scalar structure containing the fields:
FunctionName = binary \
TotalTime = 0.18765
NumCalls = 1
IsRecursive = 0
Parents = 7
Children = [](1x0)
Here we see the information from the table again, but have additional
fields Parents and Children. Those are both arrays, which
contain the indices of functions which have directly called the function
in question (which is entry 7, expm, in this case) or been called
by it (no functions). Hence, the backslash operator has been used
internally by expm.
Now let's take a look at bar. For this, we start a fresh
profiling session (profile on does this; the old data is removed
before the profiler is restarted):
profile on;
bar (20);
profile off;
profshow (profile ('info'));
This gives:
# Function Attr Time (s) Calls
-------------------------------------------------------
1 bar R 2.091 13529
2 binary <= 0.062 13529
3 binary - 0.042 13528
4 binary + 0.023 6764
5 profile 0.000 1
8 false 0.000 1
6 nargin 0.000 1
7 binary != 0.000 1
9 __profiler_enable__ 0.000 1
Unsurprisingly, bar is also recursive. It has been called 13,529
times in the course of recursively calculating the Fibonacci number in a
suboptimal way, and most of the time was spent in bar itself.
Finally, let's say we want to profile the execution of both foo
and bar together. Since we already have the run-time data
collected for bar, we can restart the profiler without clearing
the existing data and collect the missing statistics about foo.
This is done by:
profile resume;
foo;
profile off;
profshow (profile ('info'), 10);
As you can see in the table below, now we have both profiles mixed together.
# Function Attr Time (s) Calls
---------------------------------------------
1 bar R 2.091 13529
16 expm 1.122 1
12 binary * 0.798 117
46 binary \ 0.185 1
45 binary ^ 0.124 2
48 timesteps R 0.115 101
2 binary <= 0.062 13529
3 binary - 0.045 13629
4 binary + 0.041 6772
49 NA 0.036 101
[1] We only know it is the binary multiplication operator, but fortunately this operator appears only at one place in the code and thus we know which occurrence takes so much time. If there were multiple places, we would have to use the hierarchical profile to find out the exact place which uses up the time which is not covered in this example.